For the last few months, our metrics team, headed by Caitlin Trasande, has been hard at work on a project with Nature Publishing Group. Yesterday, the fruits of that project went live: the Nature Publishing Index Global Top 50, a ranked list of the top 50 institutions in the world according to how many primary research articles they published in Nature research journals.

Caitlin joins us here on the Digital Science blog to tell us more about the work involved in pulling this index together, which was neither an easy nor a straightforward task. Science metrics is an area of the research process that’s riddled with inefficiencies and ambiguity, and one that can benefit from better use of technology and technical know-how. The Global Top 50 is a first example of what can be done in this space by pairing open and meaningful data from the science publishing world with information and metrics from other resources to surface new insights – in this case, the number of Nature articles published by each institution.

And with that, I’ll hand the mic over to Caitlin:

Technically, this begins as a task of deciphering and mapping the myriad institutional names (and forms) used by authors in Nature research journals and, in many cases, harmonising their affiliate institutions and campuses. Take, for example, “Max Planck Institute” – a network of nearly 80 research institutes in Germany. In some cases, a judgement call has to be made about when to aggregate data from those institutes or, in the case of individual campuses in the University of California system, when to let them stand alone. The data is messy and presents both matching and scalability challenges, which the Metrics team has been working diligently to make better sense of in collaboration with Nature Publishing Group.
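To make the matching problem concrete, here is a minimal sketch in Python of what name harmonisation can look like. It is an illustration only, not the pipeline used for the index: the function names (`normalise`, `canonical`, `count_by_institution`), the short campus list, the label "Max Planck Institutes" and the sample affiliation strings are all assumptions made for this example. It folds spelling variants into one normalised form, aggregates every Max Planck institute under a single label, and keeps University of California campuses separate.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny campus list; a real one would be far longer.
UC_CAMPUSES = ("berkeley", "san diego", "los angeles")


def normalise(raw: str) -> str:
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def canonical(raw: str) -> str:
    """Map a raw affiliation string to an illustrative canonical institution."""
    name = normalise(raw)
    # Judgement call (assumed here): aggregate all Max Planck institutes.
    # The prefix matches both English "institute" and German "institut".
    if "max planck institut" in name:
        return "Max Planck Institutes"
    # Judgement call (assumed here): let each UC campus stand alone.
    for campus in UC_CAMPUSES:
        if re.search(rf"\b(university of california|uc) {campus}\b", name):
            return f"University of California, {campus.title()}"
    # Anything unrecognised is passed through for manual review.
    return raw.strip()


def count_by_institution(affiliations):
    """Count affiliation strings per canonical institution."""
    return Counter(canonical(a) for a in affiliations)


if __name__ == "__main__":
    sample = [
        "Max Planck Institute for Biochemistry, Martinsried",
        "Max-Planck-Institut für Physik",
        "University of California, Berkeley",
        "UC San Diego",
    ]
    for institution, n in count_by_institution(sample).most_common():
        print(n, institution)
```

Hand-written rules like these do not scale to the full range of names authors use, which is where the scalability challenge mentioned above comes in.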

The organisational structure of modern scientific research institutions is complex, often poorly documented (if at all), and multi-dimensional. Alongside the more traditional divisions based on college or discipline, it now includes specialised affiliate institutions, satellite institutes, affiliated research hospitals, and so forth. Defining an institute’s boundary is challenging. Mapping journal articles and authors to that information therefore calls for a clean baseline data set, and building one was only the starting point for the Metrics team.

The Global Top 50 gave us the opportunity to peek under the hood and see first hand what the challenges are in this sort of task. That has given us a better understanding, and a stronger technical grasp, of how we can help in this space by providing better, more representative tools and information for the scientific community. Do stay tuned.

Also, we’d love to hear what you think about the index. Feedback can be sent to institutes@digital-science.com. For more on the index, check out Nature Publishing Group’s press release.

Kudos again to Caitlin, Dan Hagon and Johannes Goller on the Metrics team for their tremendous work on this project. Keep up the good work!