Creativity and insight were on display as a panel of judges announced the winners of the Duke ASA DataFest: COVID-19 Virtual Data Challenge on May 5th. The contest, which took place from April 8th through April 22nd, encouraged Duke students to use data science to explore unique effects of the COVID-19 pandemic on daily life and different aspects of the social fabric of the United States.
DataFests, which were founded at UCLA by Dr. Robert Gould and are now sponsored by the American Statistical Association and held on campuses around the world, are described by the ASA as “celebration[s] of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set.” This Virtual Data Challenge, co-sponsored by the Duke University Department of Statistical Science and the Duke AI Health Institute, was open to all undergraduate and master’s-level students across Duke.
Contest participants, working alone or in teams, were prompted to use publicly available data resources to gain insights into the cultural and societal impact of the global COVID-19 pandemic. The entries were judged by a panel of 15 experts drawn from academia and industry, with prizes awarded in categories that included “Most creative topic or data set”; “Best Visualizations”; “Best Interactive Dashboard”; “Best Insight”; and a “Judges’ Pick” award to recognize achievement outside of the other categories.
"I’m so impressed with the energy, enthusiasm, and insight our Duke students brought to the table in completing the DataFest COVID-19 Challenge!” said Duke Forge co-director and Sara & Charles Ayres Professor of Statistical Science Amy Herring, ScD, while commenting on the outstanding quality of submissions to the contest.
"Though this year’s challenge was virtual, many of the aspects that make DataFest special were still there,” said Assistant Professor of the Practice of Statistical Science Maria Tackett, PhD, who led the event. With support from the ASA DataFest Steering Committee, she helped transform DataFest, which under normal circumstances draws as many participants at Duke alone as a respectable statistics conference, into a successful COVID-focused virtual competition.
The entries tackled subjects ranging from the effects of the COVID shutdown on changes in air quality and mobility, to the differences in what we listen to on Spotify, and even to the vividness of our dreams. The analyses presented innovative data-driven insights that, according to Dr. Tackett, can be used to better understand many of the changes that have been observed or experienced in the past two months in the United States.
The COVID-19 Virtual Data Challenge Winning Projects
Duke’s Apoorv Jha won in the “Most Creative Topic or Data Set” category for “Dreams in the Time of COVID-19,” in which he explored how stressful times can directly affect how often and how vividly we dream, highlighting the unique opportunity that the ongoing contagion presents to the global science community to dive deep into the study and understanding of dreams and dreaming.
Shannon Houser and Jack Lichtenstein, who dubbed their collaboration “America’s Next Top Modellers,” took the “Best Visualizations” accolade for their presentation on social distancing in the United States. Using Google mobility reports, the team explored how factors such as population density, initial number of positive coronavirus cases per capita, governor’s political affiliation, and official shelter-in-place orders influence the magnitude of a state’s social distancing.
"All participants ranging from students in introductory statistics and data science courses to those in graduate-level courses honed their technical skills, learned how to collaborate with others remotely, and developed their presentation skills as they worked through the analysis and communicated their results,” said Dr. Tackett.
One example of this was Jie Cai’s “COVIS19” data visualization web app, which put COVID-19 and DATAVIS together to visually showcase the health and financial aftermath of the coronavirus pandemic, winning “Best Interactive Dashboard.” Her interactive visualizations tracked state-level COVID-19 health metrics with daily and weekly changes, as well as the stock valuations of several prominent publicly traded companies, allowing users to track how those valuations fluctuated as reported health metrics changed over time. The project included an interactive personal stock checker that lets an individual compare how their own stocks of interest are faring against the market average.
A team comprising Duke’s Jingxuan Liu, Jessie Ou, Linda Tang, Yunyao Zhu, and Justina Zou won the “Best Insight” award for their presentation, “Explore How Research Priorities Shift as COVID-19 Progresses,” which provided a clearer understanding of how the development and spread of COVID-19 affected research priorities and scientific development over time, with the general research focus shifting away from finding cures to seeking preventive measures to curb the spread of the virus.
Meredith Brown, Matt Feder, and Pouya Mohammadi’s analysis of COVID-19’s effect on different U.S. communities was selected in the “Judges’ Pick” category, which recognized outstanding achievement outside of the core award categories. The group’s work studied deaths per capita to highlight how the COVID-19 pandemic is disproportionately affecting low-income communities and people of color in the United States.
T"he projects were impressive in their creativity and depth as teams explored phenomena that are not always mentioned in the public conversation about the pandemic and social distancing,” noted Tackett. She added that DataFest attracts many participants who have not previously worked on data analysis projects outside of the classroom, so the contest gives them a glimpse of data science in practice and what it is like to analyze messy and complex real-world data.