Harvest Tiny Data in Scientific Papers

17th May 2016
 | Guest Author

Xuan Yu is a postdoctoral researcher in the Department of Geological Sciences at the University of Delaware. He has been named a distinguished lecturer by EarthCube. In this role, Yu is giving lectures at universities and research institutions across the United States on best practices for data management in geoscience publications and on the benefits of open science engagement.

As an early career geoscientist, I am passionate about the future of geoscience. In the information age especially, traditional geoscience is transforming toward a community-driven, collaborative, and open research culture.

Currently, we use software and code to process and visualize data, and then embed the resulting plots in our journal papers. The software and data behind a plot (including intermediate data and analysis pipelines) can be as tiny as a few lines of code, numbers, or letters. Without access to this “tiny data”, readers can hardly reproduce or reuse the research.

 “According to our reproducing and learning procedure, we realized that what might be trivial for one modeling group may not be for the other.” —Alain N. Rousseau, Institut national de la recherche scientifique, Canada

Figure 1. A geoscience paper of the future will include both a traditional part and a digital part.

A modern publication strategy uses data repositories and computational workflows to link digital files to scientific stories. Under such a strategy, data are stored in public repositories (e.g., Figshare, GitHub), and each plot in the final paper is linked back to the code and data that produced it. Such a geoscience paper will not only tell the story through traditional publication methods (e.g., text, figures, and tables) but also make digital research outputs (e.g., software and raw, intermediate, and final data) persistent, linked, user-friendly, and sustainable (Figure 1). As a result, potentially large groups of readers from scientific communities, governments, and the public will be able to understand the findings and reuse the data for their own purposes.
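One way to picture the link between a plot and its tiny data is a small manifest stored alongside the files in the repository. The sketch below is only an illustration under stated assumptions: the function names, file names, data values, and repository URL are all hypothetical placeholders, and the manifest layout is not a standard prescribed by any journal or repository.

```python
import hashlib
import json


def sha256_of(content):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(figure, script, data_files, repository):
    """Record which script and data files produced one figure.

    data_files maps a file name to that file's raw bytes.
    """
    return {
        "figure": figure,
        "script": script,
        "repository": repository,
        "data": [
            {"file": name, "sha256": sha256_of(content)}
            for name, content in sorted(data_files.items())
        ],
    }


# Hypothetical example: every name and value here is a placeholder.
tiny_data = {"streamflow.csv": b"date,flow_m3s\n2016-03-22,1.8\n"}
manifest = build_manifest(
    figure="fig_streamflow",
    script="plot_streamflow.py",
    data_files=tiny_data,
    repository="https://example.org/placeholder-repository",
)
print(json.dumps(manifest, indent=2))
```

Storing a checksum for each data file lets a reader confirm that the file they downloaded is the same one used to draw the plot.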

“Copper as a mirror, one can dress.

History as a mirror, one can know the rise and fall.”— Emperor Taizong of Tang

Paper as a mirror, research can be understood and reused.

 

Figure 2. An anonymous comment after the EarthCube Distinguished Lecture at the United States Geological Survey (USGS) on World Water Day, 2016.

To engage the wider research community, a lecture series enabled by the US NSF EarthCube Distinguished Lecturer Program was initiated in Spring 2016. Feedback during the lectures showed that most scientists were convinced of the importance of data preservation (Figure 2), though specific difficulties with digital science publication still vary from case to case. If you are interested in finding out more, please follow the hashtag #whyearthcubedistinguishedlecture. As more conversations are generated, we hope to explore and improve best practices for shaping new digital science norms with a fair and open research culture.