The essence of data science is the study of data to extract useful insights for business.
During the 2023 Women in Data Science@Penn Conference, held at the University of Pennsylvania’s Perry World House, Linda Zhao, a Wharton School professor of statistics and data science, put it like this: “You want to solve real-world problems? Get the data, make sure you have clean data, do the analysis, and make a story out of it. It has to make sense and you have to be able to present it beautifully to a group of people.”
This is Professor Zhao’s mantra as she introduces high school students to a data-centric mindset during the Data Science Academy, a summer program held on Wharton’s Philadelphia campus each July.
Gas Price Trends
At the recent WiDS Conference, sponsored by the Wharton School and Penn Engineering, a team of high school students who attended Zhao’s summer academy traveled to campus from around the world to show off their data-science dexterity, dazzling the crowd with their problem solving around a timely economic issue: escalating gasoline prices.
The four students — Karen W. from New Jersey, Brian L. from California, Jennifer L. from Texas and Christine L. from Hong Kong — detailed their data mining process and findings in the presentation “Hey, What’s Up? Gas Prices: Analyzing the Influences of U.S. Gas Price Trends.”
“When we started our investigation this past summer, gas prices in the U.S. were at an all-time high with very high fluctuations, as well,” said Christine, a junior at Hong Kong International School. “Recent events such as the Russian-Ukraine War and Covid-19 were impacting U.S. retail gas prices dramatically. Many Americans were feeling the effects of these expensive costs and couldn’t help but wonder: What factors are affecting these prices? We thought it might be interesting to look at what was causing this unprecedented rise.”
During their study, the group made use of several data-science techniques, including multiple linear regression, LASSO, text mining and random forest, to help them drill down on two key questions:
- What factors affect gas prices?
- Why are gas prices so high?
The exploration began with research into gas supply chains to identify which categories of variables, or measurable factors, the team might need to predict gas prices. Within four broad categories – economics, energy, weather, and Google search trends for the words “oil” and “gas” – they assembled 38 variables into a single dataset covering January 2000 to June 2022.
For a detailed discussion of the team’s data mining and research, we encourage you to watch the students’ presentation on YouTube.
Cold Weather, Canadian Imports and CO2
Ultimately, the team used LASSO, a regression method that also performs variable selection, to narrow their 38 variables down to the 10 most significant for their analysis of gas prices.
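For readers curious how LASSO trims a long list of variables, here is a minimal sketch in plain Python. It is not the students' code or data: the variable names, the synthetic numbers, and the penalty value are all assumptions made for illustration. The idea is that LASSO adds a penalty on the size of each coefficient, which pushes the coefficients of unhelpful variables to exactly zero, so the variables that survive are the "selected" ones.

```python
import random

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if it is within the penalty."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def standardize(columns):
    """Center each column and scale it to unit variance so penalties are comparable."""
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        scaled.append([(v - mean) / std for v in col])
    return scaled

def lasso_coordinate_descent(columns, target, penalty, n_iter=200):
    """Minimize (1/2n) * ||y - Xw||^2 + penalty * ||w||_1 by cyclic coordinate descent.

    Columns are standardized and the target is centered, so no intercept is needed.
    """
    n = len(target)
    y_mean = sum(target) / n
    residual = [t - y_mean for t in target]
    X = standardize(columns)
    weights = [0.0] * len(X)
    for _ in range(n_iter):
        for j, col in enumerate(X):
            # Correlation of column j with the residual, adding back its own contribution.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n
            # Each column has unit variance, so the update is just a soft threshold.
            new_weight = soft_threshold(rho, penalty)
            delta = new_weight - weights[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                weights[j] = new_weight
    return weights

# Synthetic data (an assumption, not the team's dataset): six hypothetical
# variables, of which only the first two actually drive the simulated price.
random.seed(0)
n_rows = 200
names = ["income", "imports", "noise_a", "noise_b", "noise_c", "noise_d"]
columns = [[random.gauss(0, 1) for _ in range(n_rows)] for _ in names]
price = [
    2.0 * columns[0][i] + 1.0 * columns[1][i] + random.gauss(0, 0.3)
    for i in range(n_rows)
]

weights = lasso_coordinate_descent(columns, price, penalty=0.3)
selected = [name for name, w in zip(names, weights) if w != 0.0]
print(selected)
```

In this toy setup, the four noise columns end up with coefficients of exactly zero, leaving only the two variables that were built into the simulated price. A larger penalty keeps fewer variables and a smaller one keeps more; in practice the penalty is usually chosen by cross-validation rather than fixed by hand as it is here.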
“Our 10 factors gave us some interesting results,” said Christine. “We found that when U.S. median income increased, gas prices would increase by $0.09 per gallon. This is because when people’s income increases, they have more money to consume gas and gas-related products like cars, therefore increasing the demand for gas and raising gas prices.”
“We also found that when oil imports to the U.S. increase, gas prices would also increase,” added Brian, a senior from Lynwood High School in California. “That’s because when oil imports to the U.S. increase, gas prices have to reflect the increased supply chain and transportation costs, leading to higher prices.”
The group also studied the number of operating oil rotary rigs, the equipment that drills for oil. They found that an increase in operating rigs was also associated with an increase in gas prices. “We hypothesized that this is because a higher amount of rigs in operation signifies higher demand for gas prices, thus increasing gas prices,” noted Brian.
With these, as well as other findings that they touched on during their presentation, the team gained insight into why gas prices were so high, and how to potentially lower gas prices in the future. Here are their key project takeaways:
- Gas prices are generally lower during winter months. Planning around this fact might minimize losses.
- Importing more gas into the U.S. generally raises prices, with the amount imported from Canada having the biggest impact of all. If a news headline reports extremely high gas imports in a given month, more expensive gas prices should be expected.
- CO2 emissions of the commercial, electricity and transportation sectors affect gas prices. If the U.S. is able to lower the electricity sector’s CO2 emissions, lower gas prices are likely to result.
- Other factors like politics and current events also play a large role in determining the price of gas.