As previously posted by Digital Science, the STEM Fellowship Big Data Challenge is a unique endeavor and learning experiment. Focused on developing the natural data analysis talents of a new generation of students, it is a creative and inspirational grass-roots event that Digital Science, Altmetric, Overleaf and Figshare were proud to support. The competition aspect added a nice buzz to the event, with four different sponsored prizes available for the winners, including a $1,000 prize from Digital Science.

The brains behind the event, Dr Sacha Noukhovitch, educator, Executive Director and Editor in Chief of STEM Fellowship, said:

“The 2017 Big Data Challenge for high school students was a true implementation of the new 21st-century pedagogy that revealed the natural data-analytics talents of this generation of students. Inspired by Altmetric scholarly impact data and attention, students took an initiative in defining the purpose of their inquiry and crowd-sourced the knowledge to define the trends and patterns in modern academic research. Moreover, Altmetric impact data combined with data analysis tools from IBM Big Data University and SAS have become a next generation learning environment driving and guiding students’ interdisciplinary inquiry to new heights of knowledge building.”

As one of the founding judges, I was lucky enough to attend the Toronto-based event on February 24th and see the eight high school finalist teams make their presentations, followed by a detailed Q&A with the judges. The students did an amazing job preparing high-level presentations that talked through their findings from interrogating the Altmetric data.

Each team had 15 minutes on the podium, presenting to a full house of some 60 people and finishing with a Q&A. During the presentations the teams had to explain how they applied different data analysis techniques, talking through their findings and conclusions. Each team took a slightly different approach to interrogating the Altmetric data, and the high-level topics included: deep dives into attention to cancer and diabetes research, the behavior of Mendeley and Twitter users, gender diversity in Altmetric attention, funding compared to attention, interdisciplinary studies, the Altmetric Top 100, and mapping forest fires in Canada.

Before the winning teams were announced, the Right Honourable Arnold Chan, Member of the Canadian Parliament, gave a short speech on the value of educating tomorrow’s science innovators. After his talk, Chan presented the prizes to the winning teams:

  • SAS monetary prize: Tanenbaum CHAT and team members Seth Damiani, Ronny Rochwerg, David Roizenman and Joseph Train. The purpose of their paper was to determine which social networks have the most engagement with articles related to oncology. The paper also looked at the optimal times to share articles on social media networks.
  • The Digital Science – Altmetric prize: Earl Haig Secondary School, Tony Xu, Cynthia Deng, and Shayan Khalili. Their presentation talked through the relationship between the number of Twitter and Mendeley views of scientific articles, and dug deeper into issues such as the GDP per capita of the country where the views came from and the length of the article title compared to attention. It also tried to uncover other reasons or correlations that potentially affect the level of attention for scientific articles. The team said:
“Initially, the five hundred thousand JSON files worth of data was intimidating and analysing it seemed like an improbable task. This quickly changed as once a clear focus was established, parsing through the files to gather what was needed for our report was quite straightforward. Through programming languages, such as R and Python, and data analysis programs, such as Excel, we were able to search through the files of the abstracts for keywords relating to the different types of cancer. The frequency of which different types of cancers appeared in the files quantified the amount of research attention that individual cancers received. Using this, we compared the amount of research attention to the death and diagnostic rates of the different types of cancers and determined that breast cancer was receiving more attention than warranted while lung cancer less attention than warranted.”
  • IBM Big Data University prize: Earl Haig Secondary School, Chandler Lei, Haolin Zhang, Peter Chou and Kevin Hong. This paper analysed the correlation between cancer research trends and real-world data. The team reviewed the number of research papers pertaining to different types of cancers and compared it against mortality and diagnosis rates to determine the research attention given to a type of cancer in relation to its overall danger to the general population. The team said:
“I am beyond grateful to the sponsors and organisers of the Big Data Challenge for dumping me into the world of data analytics. Overcoming obstacles on the journey of self-learning gave me a new appreciation for the importance of passion and its intrinsic relationship with perseverance, happiness, and success. The Big Data Challenge has been an incredibly rewarding adventure. The ability to iterate through 550 000 scientific articles to answer a question is akin to a superpower.”
  • SAS prize (tickets to the corporate booth at a Raptors game): Pierre Elliott Trudeau High School’s Leon Chen, Curtis Chong, Emily Huang, and Nathan Lo. This study aimed to determine the effects of climate change on forest fire trends in Canada by measuring correlations between weather conditions and the frequency and size of forest fires.
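For readers curious what the keyword search described in one of the team quotes above might look like in practice, here is a minimal Python sketch. It is not the students' code: the directory layout, the "abstract" field name, and the keyword list are all assumptions made for illustration, and the real Altmetric records may be structured differently.

```python
import json
from collections import Counter
from pathlib import Path

# Hypothetical keyword list; the team's actual search terms are not given.
CANCER_KEYWORDS = {
    "breast": ["breast cancer"],
    "lung": ["lung cancer"],
    "prostate": ["prostate cancer"],
}


def count_cancer_mentions(data_dir, abstract_field="abstract"):
    """Count how many JSON records mention each cancer type in their abstract.

    Assumes one article per *.json file and a top-level text field named
    `abstract_field` (an assumption, not the documented Altmetric schema).
    """
    counts = Counter()
    for path in Path(data_dir).glob("*.json"):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            continue  # skip malformed or unreadable files
        if not isinstance(record, dict):
            continue
        abstract = record.get(abstract_field)
        if not isinstance(abstract, str):
            continue  # no usable abstract in this record
        text = abstract.lower()
        for cancer_type, phrases in CANCER_KEYWORDS.items():
            # Count each article at most once per cancer type.
            if any(phrase in text for phrase in phrases):
                counts[cancer_type] += 1
    return counts
```

Calling count_cancer_mentions on a folder of records returns a Counter of articles per cancer type, which could then be set against mortality and diagnosis figures from a separate source.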

The winners will be publishing their full-length articles in the STEM Fellowship journal hosted by Canadian Science Publishing.

The event was a huge success, and not just for the students, as Derek Montrichard, Director, Credit Risk Modeling, Risk Management, CIBC (Canadian Imperial Bank of Commerce), noted:

“Any presentation that has me thinking afterwards is a very good presentation as far as I’m concerned. The overall intent of research is to get your audience to think and re-evaluate their position, and try to think for themselves what it all means, and the team did just that. I certainly did want to pass on my praise for all the work everyone did to make this event happen. I’m still blown away on how they were able to parse the data, and thankful some of them included code!”

The STEM Fellowship Big Data Challenge is an annual event, and there was some talk that next year’s event may focus on new machine learning technologies. Personally, I can only thank and congratulate Dr Noukhovitch and his team for such an inspirational and educational event. It’s clear a great deal of learning occurred in a fun and competitive team environment, and I’m sure all the participants will remember and use the skills learned during this event for many years to come!

Group picture of all 8 finalist teams at the STEM Fellowship Big Data Challenge


From left to right: Dr Sacha Noukhovitch (STEM Fellowship), Adrian Stanley (Digital Science), and Mohammad Hossein Asadi Lari (STEM Fellowship)