Data Science Academy

The Wharton Data Science Academy will bring state-of-the-art machine learning and data science tools to high school students. We aim to stimulate curiosity in the fast-moving field of machine learning through this rigorous yet approachable program. Building up statistical foundations together with empirical and critical thinking skills will be the main theme throughout.


We believe data science is not just a collection of techniques; it is foremost motivated by real-world problems. The data scientist of the 21st century must be able to identify relevant problems, provide sensible analyses, and ultimately communicate their findings in meaningful ways. By the end of the Wharton Data Science Academy, students will not only be equipped with essential data science techniques such as data visualization and data wrangling but will also be exposed to modern machine learning methodologies, which are all building blocks for today’s AI field. Along the way, students will develop a working proficiency with the R language, which is among the most widely used by professional data scientists in both academia and industry.

What students can expect:

  • Wharton instructors who are data science experts will lead the lectures and will also be available to students outside of class.
  • Students will advance their skills with data from real-world cases and will be challenged to articulate their findings with a final project.
  • Wharton undergraduate and graduate TAs will engage with students and share their experiences in studying data science.
  • Guest speakers will share their insights on data science as a career.
  • Students will work in teams to complete a final project and present to their peers at the end of the program.


All participants who complete the program will earn a Wharton Global Youth Certificate of Completion.


Academic classes are held Monday through Friday, with extracurricular activities available in the evenings and on weekends. Students move in on the Sunday before the program begins and move out on the final Saturday of the program. For more information on campus life, visit our residential experience page.

While each day varies slightly in format, a typical day includes:

  • 9:00-9:45am – Discussion on current events
  • 9:45-10:00am – Mid-morning break
  • 10:00-11:30am – Morning topics lecture
  • 11:30am-1:30pm – Lunch
  • 1:30-3:00pm – Topics lecture or guest speaker
  • 3:00-3:15pm – Afternoon break
  • 3:15-4:30pm – Recitations/group work

Session topics may include:

  • Acquiring, preparing, exploring, understanding, and visualizing data
  • Foundations of probabilities and statistics
  • Model-based modeling
  • Machine learning

In the evening, students will have a number of extracurricular activities to choose from.

Students can also opt to work on their final project with their group, meet with the program TAs, or relax at the dorm. Please note that some days may not follow this schedule, as there could be an off-campus site visit or a simulation in lieu of the usual lecture and recitation schedule.



The program is open to high school students currently enrolled in grades 10-11 who have a strong background in math and coding and an interest in data analytics. Previous understanding of statistics is preferred. Students must be open to the challenge of a rigorous curriculum similar to that of an intermediate Wharton undergraduate course. International applicants are welcome.


Admission to the Data Science Academy is selective. Wharton will select approximately 75 students to participate in the Academy. Selections are based on a record of academic excellence and a demonstrated background in mathematics and/or statistics. Interested students are strongly encouraged to submit an application by the priority deadline.

Please note that participation in the Data Science Academy does not guarantee admission into Penn.

Instructional Team

Program Leader: Linda Zhao

Linda Zhao is a full professor of statistics at the Wharton School. An expert in machine learning, she has been teaching a modern data mining course to undergraduate, MBA, master's, and Ph.D. students from across the Penn campus. Students comment that her data mining course is one of the most fun and useful courses offered at Penn. In addition to teaching regular Wharton students, Linda served as a co-director of the Wharton–SAC (Securities Association of China) executive program, which she successfully ran and taught. Having taught students at many levels, Linda is able to design and deliver state-of-the-art machine learning instruction to students from all backgrounds. A fellow of the IMS, a leading international statistics organization, Linda has been actively engaged in her academic career. Her specialties include modern machine learning methods, the replicability crisis in science, high-dimensional data, housing price prediction, and Bayesian methods. Her work has received NSF support for over 20 years. She is also an avid ballroom dancer and loves to travel around the world.

Teaching Assistants

Teaching assistants include both undergraduate and graduate students from the University of Pennsylvania. TAs facilitate small-group discussions, lead small-group lab work, check student understanding, assist with final project development, and hold office hours to answer student questions.

“My favorite part of the Data Science Academy was the final group project where my team and I were able to put our statistical learning skills to the test with a completely new set of data!” - Ramya S., California, USA