New Student Cohort Joins DCRI-Duke Forge Health Data Science Internship Program

May 23, 2018
2018-2019 interns with their mentors. From left to right: Qi (Dylan) Liu, Yiwen Liu, Yingzhou Liu, Allison Dunning, Robert (AJ) Overton, Matias Benitez, Matt Phelan, and Yan Zhao

This January, the joint DCRI-Duke Forge Health Data Science Internship Program welcomed its second cohort of interns. The five Duke students chosen for the program are pursuing master’s degrees in fields that include statistical science, biostatistics and bioinformatics, and economics and computation. With the departure of the inaugural cohort, the current group has now made the transition to larger roles on Forge demonstration projects.

“We’re fortunate to have interns with diverse skills and backgrounds that will enrich the program and the projects,” said Silvana Lawvere de Moreno, PhD, associate director of business operations for DCRI Statistics. “I believe the new interns will benefit from the momentum created by the graduating interns and I’m excited to see how they will utilize both statistical and machine learning methods on their projects.”

Interns will continue their work for the next year, finishing the internship program as they graduate from their respective master’s programs. Over the summer they will work full-time as part of transdisciplinary teams on cutting-edge health data science projects. The Health Data Science (HDS) internship program offers a unique opportunity for enrolled interns. The program is an innovative partnership between the Duke Clinical Research Institute (DCRI), which manages the students and administers the internships, and Duke Forge, which convenes and catalyzes the projects. Each HDS intern is paired with a statistical mentor from the DCRI’s Center for Predictive Medicine and has regular one-on-one meetings with Forge quantitative faculty members.

Dr. Peter Merrill, a biostatistician at the Center for Predictive Medicine and a mentor in the HDS internship program, commented: “In the first year, the program gave the initial cohort of interns the new experience of working with ‘real world’ data in a collaborative environment. I saw the interns improve their knowledge in the machine learning field while simultaneously learning important job skills like managing a schedule and making presentations. Our new cohort of interns has already proven themselves to be very knowledgeable and talented. I look forward to working with them over the next year and helping them develop into leaders in this exciting new field.”

Meet the 2018-2019 HDS Interns

Matias Benitez, Master of Science in Economics and Computation, Class of 2019


Matias obtained his bachelor’s degree at the National University of Asunción in Paraguay. Prior to joining Duke, he worked in economic research for several years, where he developed an interest in statistical methodology and a desire to engage in research in a high-impact field. The HDS internship program affords him the opportunity to learn the complete process of a data science project, from data wrangling to the design and implementation of machine learning models. Matias is interested in neural networks and their application to natural language processing. After graduation, he plans to pursue a career in data science with a focus on this topic.

Qi (Dylan) Liu, Master of Statistical Science, Class of 2019


Before coming to Duke, Dylan earned his bachelor’s degree in economics at Zhejiang University in China. While an undergraduate, he developed an interest in data science during internships in the financial industry and while conducting research in econometrics. He immediately applied for the HDS program after seeing the opportunity posted on the Forge website. Dylan has a passion for applying his knowledge of statistical models to real-life projects, especially ones that involve significant issues such as healthcare. He is currently working on the “Goals of Care” project, which involves natural language processing, an area of particular interest for him. Dylan is enjoying his internship at DCRI, where he can learn not only statistics and programming skills, but also how to communicate effectively and collaborate with team members. He is enthusiastic about becoming a data scientist in the healthcare industry in the future, and he believes his experience as an intern in the HDS program will serve as an invaluable asset, bringing him closer to his career goals.

Yingzhou Liu, Master of Biostatistics, Class of 2019


Prior to arriving at Duke, Yingzhou earned his bachelor’s degree in biological sciences from China Agricultural University. During his undergraduate study, he not only built a solid base of biological knowledge but also received extensive exposure to the use of “big data” in biology and health fields. Through this exposure he learned multiple statistical methods, developed a variety of programming skills, and found his passion in applying cutting-edge methodologies to real problems. Yingzhou learned about the HDS internship program through interactions with senior interns and followed up by reading about the program online. The program allows him to apply his skills to real data while gaining hands-on experience in applying various methods to massive datasets. He is currently using deep learning methods to process serial CT scan images, with the aim of developing an algorithm that can identify organ anomalies. Over the course of this novel and challenging project, Yingzhou has also learned how to collaborate with statisticians and radiologists. In the future, he will continue solving data science problems and hopes to discover the valuable information hidden in real-world data.

Yiwen Liu, Master of Biostatistics, Class of 2019


Yiwen obtained her bachelor’s degree from Mount Holyoke College as a double major in biochemistry and statistics, and her first master’s degree in biochemistry and cellular biology from Rice University. She first became interested in health data science while learning about deep sequencing. When the HDS program came to her attention through departmental emails and conversations with senior interns, she saw it as a great opportunity to acquire hands-on experience with real data and machine learning methods. She is interested in integrating clinical data from different sources (labs, procedures, demographics) and compressing data into low-dimensional structures to predict disease outcomes using machine learning algorithms. She is currently working on the “Neonatal Intensive Care Unit” project, which uses demographic, metabolomics, and microbiome data to predict growth failure and necrotizing enterocolitis in infants. Her project allows her to use traditional biostatistics and bioinformatics tools alongside innovative machine learning and deep learning algorithms in predictive modeling. With the HDS internship experience in clinical data, she is more confident about her future career as a health data scientist.

Yan Zhao, Master of Statistical Science, Class of 2019


Before coming to Duke, Yan Zhao earned her undergraduate degree at UCLA, majoring in applied math. After graduating, she worked full-time as a business analyst at Amgen for eight months. Her passion for healthcare started in high school with a dream of becoming a doctor to help people and save lives. Discovering that her strengths lay in numbers and logic, Yan decided to apply her analytical skills to healthcare instead. She applied to the HDS program after hearing about it from a senior HDS intern. During her first semester in the program, Yan has enjoyed applying the lessons learned in her coursework to real-world problems. She is currently interested in using machine learning methods to improve disease identification. Yan feels that her programming skills have already significantly improved. She hopes that in the future, the models she develops can be deployed in real-world settings.