Aspiring Data Scientists Research the Rise in Gasoline Prices

by Diana Drake

The essence of data science is the study of data to extract useful insights for business.

During the 2023 Women in Data Science@Penn Conference, held at the University of Pennsylvania’s Perry World House, Linda Zhao, a Wharton School professor of statistics and data science, put it like this: “You want to solve real-world problems? Get the data, make sure you have clean data, do the analysis, and make a story out of it. It has to make sense and you have to be able to present it beautifully to a group of people.”

This is Professor Zhao’s mantra as she introduces high school students to a data-centric mindset during the Data Science Academy, a summer program held on Wharton’s Philadelphia campus each July.

Gas Price Trends

At the recent WiDS Conference, sponsored by the Wharton School and Penn Engineering, a team of high school students who attended Zhao’s summer academy traveled to campus from around the world to show off their data-science dexterity, dazzling the crowd with their problem solving around a timely economic issue: escalating gasoline prices.

The four students — Karen W. from New Jersey, Brian L. from California, Jennifer L. from Texas and Christine L. from Hong Kong — detailed their data mining process and findings in the presentation “Hey, What’s Up? Gas Prices: Analyzing the Influences of U.S. Gas Price Trends.”

“When we started our investigation this past summer, gas prices in the U.S. were at an all-time high with very high fluctuations, as well,” said Christine, a junior at Hong Kong International School. “Recent events such as the Russian-Ukraine War and Covid-19 were impacting U.S. retail gas prices dramatically. Many Americans were feeling the effects of these expensive costs and couldn’t help but wonder: What factors are affecting these prices? We thought it might be interesting to look at what was causing this unprecedented rise.”

Jennifer L., Brian L., Karen W. and Christine L. following their WiDS debut.

During their study, the group made use of several data-science techniques, including multiple linear regression, LASSO, text mining and random forest, to help them drill down on two key questions:

  1. What factors affect gas prices?
  2. Why are gas prices so high?

The exploration began with research into gas supply chains to identify what categories of variables, or factors that could be measured, the team might need to predict gas prices. Within four broad categories – economics, energy, weather, and Google search trends for the words oil and gas – they assembled 38 different variables into one total dataset that included longitudinal data from January 2000 to June 2022.
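One way to picture this assembly step is as a join of monthly series on a shared month key. The sketch below is illustrative only: the article does not describe the team's tooling, and the variable names and values here are made up.

```python
def month_range(start, end):
    """Yield (year, month) pairs from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def assemble(series_by_name, start, end):
    """Build one row per month; a month that a series lacks is left as None."""
    rows = []
    for key in month_range(start, end):
        row = {"month": key}
        for name, series in series_by_name.items():
            row[name] = series.get(key)
        rows.append(row)
    return rows


# Hypothetical toy inputs (not the team's data).
series = {
    "median_income": {(2000, 1): 41.0, (2000, 2): 41.2},
    "oil_imports": {(2000, 1): 9.1},
}
rows = assemble(series, (2000, 1), (2022, 6))
print(len(rows))  # 270 monthly rows, January 2000 through June 2022
```

Keeping missing months as explicit `None` values makes gaps visible before any modeling, which matches Professor Zhao's advice to "make sure you have clean data."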

For a detailed discussion of the team’s data mining and research, we encourage you to watch the students’ presentation on YouTube.

Cold Weather, Canadian Imports and CO2

Ultimately, however, the team used a variable-selection method known as LASSO (least absolute shrinkage and selection operator) to arrive at the 10 most significant variables for their data analysis of gas prices.
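To show how this kind of selection narrows a long list of candidate variables, here is a minimal sketch of LASSO fitted by coordinate descent on synthetic data. It is not the team's code: the data, the penalty value `alpha`, and all function names are assumptions made for illustration.

```python
import random


def standardize(columns):
    """Rescale each column to mean 0 and (population) standard deviation 1."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        out.append([(v - mean) / sd for v in col])
    return out


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything inside the band becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso(columns, y, alpha, sweeps=200):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + alpha*||b||_1.

    Assumes the columns are already standardized, so each coordinate
    update needs no extra scaling.
    """
    n = len(y)
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # residual with every coefficient at 0
    coefs = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(x * (r + x * coefs[j]) for x, r in zip(col, residual)) / n
            new = soft_threshold(rho, alpha)
            delta = new - coefs[j]
            if delta:
                residual = [r - x * delta for x, r in zip(col, residual)]
                coefs[j] = new
    return coefs


# Synthetic example: 6 candidate variables, only columns 0 and 3 matter.
random.seed(0)
n = 200
raw = [[random.gauss(0, 1) for _ in range(n)] for _ in range(6)]
y = [2.0 * raw[0][i] - 1.5 * raw[3][i] + random.gauss(0, 0.5) for i in range(n)]

coefs = lasso(standardize(raw), y, alpha=0.3)
selected = [j for j, b in enumerate(coefs) if b != 0.0]
print(selected)  # the informative columns, 0 and 3, keep nonzero coefficients
```

The penalty pushes the coefficients of uninformative variables to exactly zero, which is why LASSO can double as a selection step. A larger `alpha` keeps fewer variables.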

“Our 10 factors gave us some interesting results,” said Christine. “We found that when U.S. median income increased, gas prices would increase by $0.09 per gallon. This is because when people’s income increases, they have more money to consume gas and gas-related products like cars, therefore increasing the demand for gas and raising gas prices.”

“We also found that when oil imports to the U.S. increase, gas prices would also increase,” added Brian, a senior from Lynwood High School in California. “That’s because when oil imports to the U.S. increase, gas prices have to reflect the increased supply chain and transportation costs, leading to higher prices.”

The group also studied the variable of operating oil rotary rigs, which are basically the hardware that drills for oil. They found that an increase in operating oil rotary rigs also led to an increase in gas prices. “We hypothesized that this is because a higher number of rigs in operation signifies higher demand for gas, thus increasing gas prices,” noted Brian.

With these, as well as other findings that they touched on during their presentation, the team gained insight into why gas prices were so high, and how to potentially lower gas prices in the future. Here are their key project takeaways:

  • Gas prices are generally lower during winter months. Planning around this fact might minimize losses.
  • Importing more gas into the U.S. generally raises prices, with the amount imported from Canada having the biggest impact of them all. If a news headline or a month's data shows extremely high gas imports, more expensive gas prices should be expected.
  • CO2 emissions of the commercial, electricity and transportation sectors affect gas prices. If the U.S. is able to lower the electricity sector’s CO2 emissions, lower gas prices are likely to result.
  • Other factors like politics and current events also play a large role in determining the price of gas.

5 comments on “Aspiring Data Scientists Research the Rise in Gasoline Prices”

  1. Just a few years ago, during and after the COVID-19 pandemic, my family had just moved out of my home state of Georgia in the United States. I was excited to start my new life and live in a new environment surrounded by a unique culture. Despite my excitement, I quickly discovered that the transition was more challenging than my family had thought. One of the biggest challenges we faced was the rising cost of products such as gasoline. We had a limited budget and couldn’t afford to buy everything we required. We had to make tough choices and prioritize our spending on groceries, cleaning products, automobile costs, and education.
    I remember once going to the grocery store to buy food. I had a list of things my family needed, but when I got to the store, I realized I couldn’t afford everything. I had to choose between buying bread or milk. It was a tough decision, but I purchased bread because it was cheaper. I felt frustrated and overwhelmed by the rising prices of products. It seemed like the costs increased again every time I went to the store. I didn’t know what to do. These rising prices seemed to continue for an eternity, and my mother and father were already working two jobs. I decided to talk to some friends back home about my situation. They gave me some great advice on saving techniques and making more money even though the oil prices in Georgia were significantly lower than those in my new state. They told me about vouchers, sales, and other practices to save money on groceries. They also recommended I look for a part-time job or start a side hustle to make more money. I took their recommendation and started looking for ways to conserve and make more money. I got a job as a math tutor and learned to manage my money efficiently. It wasn’t easy, but I overcame the challenges of rising product prices with hard work and determination. Years later, I look back on that time and realize it taught me a necessary lesson. It taught me that with hard work and perseverance, you can overcome any challenge that comes your way. Even though oil prices have decreased for multiple reasons, such as the near end of a significant supply chain break, the end of the COVID-19 pandemic, the Russo-Ukrainian War, and the push for electric vehicles in the United States, I look back and realize that the need to pay for gasoline has allowed me to become a more dedicated individual.

  2. “Why are gas prices so high? I don’t care how we lower them, let’s just lower them!”

    I heard this phrase a lot in the last year, but not from my parents, or in fact from anyone who purchased gas. It came instead from my classmates: 14-year-old teenagers arguing about how they would run the economy and help lower gasoline prices. To them, it didn’t matter what else would be impacted, as long as gas prices got lowered, even if it meant that the USA kept buying gas from Russia. In fact, when it was announced that the USA had stopped all imports of gas from Russia, many classmates criticized this decision.

    I, however, did not stand for this. The butchering, raping, and destruction of democratic citizens by the Russian army and government is not something to be tolerated for our own personal comfort. It made me embarrassed and angry to be a part of this school, and to be around people that don’t care if innocent people are dying to uphold democracy.

    Although it is still very unpleasant to be around people like these, it made me realize how privileged and out of touch people are nowadays. This learning experience pushed me to understand what I wanted to achieve later in life, and that is to be an entrepreneur with the goal of helping people, especially those who are not as fortunate as us in America.

    What all of us can take away from this is that the smallest scenarios allow for a deeper understanding of life, and we should never be afraid to learn more.

    • Hello Patrick,

      I think it is respectful that you have a strong concern about the issue of high gas prices. I totally understand your frustration with your classmates’ comments on lowering gas prices without considering the consequences. I like how you look beyond the short-term benefits and consider the larger impact of the decision to lower gas prices.

      However, I think it’s understandable that many people always want lower prices to save more money. At the same time, it’s also crucial to identify the broader impact of those demands. For example, in the case of Russia that you pointed out, the US would be buying gas from a country with a questionable human rights record, which can contradict the democratic values that the government has been promoting. Therefore, buying cheaper gas might provide short-term benefits for consumers, but it doesn’t match values such as democracy, human rights, and environmental justice.

      If we look at the bigger picture, the lasting issue of consumers buying products from companies without even researching their background is growing. Many people nowadays are unaware of the things they are buying, and who they are buying them from. Subsequently, when consumers provide profits for problematic companies, they can encourage the exploitation of employees. Take, for instance, the reports of Nike using child labor in its supply chains. These children were often subjected to long hours of work in dangerous conditions, and were given very low wages. They also had limited access to proper education and healthcare. Even though these reports about Nike were all over the media, a great number of people are still buying products from Nike. Indeed, a great number of them have no knowledge of Nike exploiting children. However, Nike is just one of the companies that has committed human rights abuses. There are still many more US companies that have dealt with the problem.

      Exploitation can happen when people blindly pursue lower priced products from companies without conducting research or considering the ethics behind them. They might unknowingly support businesses that engage in exploitative behaviors and unfair treatment. Consequently, companies may be less likely to be transparent about their supply chains, since consumers do not demand such information, and this might lead to more exploitation in the workplace. It should be brought to attention that blindly pursuing lower priced products can lead to consequences. So we have to ask ourselves: is taking advantage of lower priced products really worth it? When does this phenomenon create a manipulative workplace?

      People nowadays really have to learn to think before they buy anything. A lower priced product doesn’t come from nowhere; a company can’t let itself lose profits. So, when you think you were lucky to get the product at a low price, the companies think so too, when they are exploiting their employees for a greater profit. Accordingly, people have to learn empathy for the victims of manipulation and exploitation, because you can never assume that you won’t be the victim of a human rights violation. Therefore, people shouldn’t be blindly pursuing benefits for themselves and ignoring the consequences it can have on other people. People have to understand the broader consequences it can have. We have to help the victims instead of providing more profit for the problematic companies. Never assume the abuse is too far from you, because one day, anyone could turn out to be a victim of a human rights violation.

  3. I would like to wholeheartedly congratulate Karen, Brian, Jennifer, and Christine for igniting the “aspiring data scientist” in a fellow high school student. Upon encountering this study on factors influencing gasoline prices, I have decided to utilize the advice of Professor Linda Zhao. In her words, to solve a real-world problem, one shall collect data, perform analyses, and present an intriguing story. Zhao’s students uncovered economics to be a major factor in the wavering gasoline prices, and my findings on political influences have confirmed their theory.

    As stated by the astute students of Professor Zhao’s summer program, the Russia-Ukraine War has played a drastic role in U.S. retail gas prices. This political situation has influenced the quality of many lives, prompting me to do research of my own on the war in my country.

    I first noticed the fluctuations in oil prices on a warm day on the road last summer in Moscow. I had never paid attention to gasoline price signs until then, as I had just begun driving. The sign read fifty-nine rubles for a liter of oil. At the currency exchange rate during June of 2022, that number equals $3.63 a gallon for gas (thank you AP Physics for teaching me unit conversions). At that time, oil cost $4.92 per gallon in the US. Clearly, gasoline costs much more in the States, although the US beats Russia in yearly oil production by ten million barrels per day (US Energy Information Administration). The findings of this Wharton study disclosed that the consumption of oil in the US is much higher than its production. Russia’s world share of oil consumption is 3.7%, compared to 20.3% by the US (Worldometer). The link between consumption and production explains the elevated oil prices in the US.
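The unit conversion behind the $3.63 figure can be checked in a few lines. The commenter does not state the exchange rate used, so the rate below (61.5 rubles per dollar) is an assumption chosen to be consistent with the quoted result; the function name is also hypothetical.

```python
LITERS_PER_US_GALLON = 3.785411784  # exact, by definition of the US gallon


def rubles_per_liter_to_usd_per_gallon(rub_per_liter, rub_per_usd):
    """Convert a pump price in rubles per liter to US dollars per gallon."""
    rub_per_gallon = rub_per_liter * LITERS_PER_US_GALLON
    return rub_per_gallon / rub_per_usd


ASSUMED_RUB_PER_USD = 61.5  # assumption, not stated in the comment
price = rubles_per_liter_to_usd_per_gallon(59, ASSUMED_RUB_PER_USD)
print(round(price, 2))  # 3.63
```

Because the result scales inversely with the exchange rate, a rate a few rubles higher or lower would shift the dollar price by several cents.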

    With my curious mind, I proceeded to question my father on this subject. As June of 2022 was the political peak of the Russia-Ukraine war, was this cost of fifty-nine rubles per liter higher than usual? As per my father’s experience, it was… but I still decided to confirm his opinions. In previous months, gas there cost $2.50 per gallon, a dollar less than in June. As it turns out, during the war, gas prices didn’t only increase in the US and Europe, but in Russia too. This was a perplexing discovery, as I knew Russia was a leading country in oil production. How had the embargo on oil exports affected internal Russian energy production?

    Russian oil accounts for eleven percent of the total global supply (U.S. Energy Information Administration). As per statements of the Environmental Defense Fund, the financial sanctions placed on Russia by Western countries made it difficult for Russia to clear transactions on oil exports. In my eyes, the sanctions in Russia had simply shut down Zara and caused Starbucks to rebrand as Stars Coffee. However, the sanctions also affected economic markets across the globe — especially the gasoline trade.

    Politics plays a significant role in global markets. As described, the sanctions restricted the US from importing Russian oil. This caused a gap in supply, which increased prices as demand stayed constant. However, the Wharton Youth students uncovered that the majority of US gas comes from Canada. How, then, would Russia’s removal from the equation increase gas prices so drastically?

    As Brian L. stated during his presentation, as more oil is imported into the US, transportation costs increase gas prices. This poses an essential question to be solved by our generation: How can we increase oil production within the US in order to decrease oil import transactions? Essentially, doing so would terminate the backlash caused by placing sanctions on Russia.

    As a somehow licensed driver, I am a constant consumer of the gasoline market. I noticed the fluctuating prices both in Russia and in the States, which clearly caused my curiosity to churn. In college, I will be investigating global economic markets. The research of these Wharton Youth students indicates that there are internal solutions to these oil fluctuations. It is more complicated than simply increasing US oil production, but as per the advice of Linda Zhao, I have presented my data. It is now up to you, my fellow reader, to sit back and relax while I make the political depth of this matter known.

  4. When I think of storytelling, it’s not just plain old printed books. I think of the cutscenes in games, the film reels playing movies, and actors taking up the stage, but I also think of perhaps less conventional ways of storytelling, which includes, of course, the story data tells.

    As Linda Zhao put it masterfully in the article, “You want to solve real-world problems? Get the data, make sure you have clean data, do the analysis, and make a story out of it. It has to make sense and you have to be able to present it beautifully to a group of people.”

    As an aspiring data scientist, I’ve always been fascinated with data and statistics. The way that numbers, set in a certain way, a certain format, offer up a wide breadth of information. The way data can be manipulated to look a certain way, despite not being the full truth. Even the way data is collected, very easily or horribly complicated, can tell its own story.

    Most of all, “it has to make sense.” What is a story if no one gets it? What is a set of data if it’s messy and unclear? What purpose does it serve if it doesn’t tell?

    Once again, as Zhao said, data is simply a tool for us to solve real world problems. We get the data not to appreciate pretty graphs (not that we can’t do that), but to show us the nooks and crannies of the problem and to provide a solution.

    How did the team in the article converge on the factors raising gas prices? They looked at the data. They checked every variable under the sun that could’ve been a factor. They compiled it, and then they presented it. Data tells us the story – be it about rising gas prices or Barbenheimer trends – and it proceeds to tell others that story in a way no language could hope to emulate. Only from there is it possible to form an effective solution.

    In a way, the most effective story may not have come from language, but from numbers, still formatted in percentages that resemble sentences and graphs that equal paragraphs.
