Discovering the Power of Data to Predict Forest Fires

by Diana Drake

Fire season on the West Coast of the U.S. has been nothing short of disastrous in 2020. The numbers are staggering. More than 5 million acres combined have burned in California, Oregon and Washington so far, in some of the largest fires ever recorded. Experts say that, as of this summer, 2020 is the most active fire year ever recorded for the West Coast.

And the flames rage on.

“The El Dorado Fire, which was caused by [pyrotechnics] from a gender-reveal party, is 15 miles from my house in California,” says Emily Fu, an undergraduate in business analytics at the Wharton School who is now back in Philadelphia for her senior year. “For us, it has become such a normal thing. Every couple of years the mountain catches fire and if it’s really bad, we’ll evacuate. I call home pretty often and my family didn’t even mention it. I asked my dad, ‘How close is the fire?’ He sent me a video and said, ‘Oh, you can see the flames.’”

A Little Modern Data Mining

While the wildfires of recent years may no longer stoke Emily’s anxiety, the numbers rising out of the smoke and ash have ignited her curiosity.

Emily and her classmates Zhun Yan Chang and Melisa Lee, also Wharton seniors, began digging deeply into wildfire data last year as students in Prof. Linda Zhao’s Modern Data Mining class at Wharton. Data mining is the practice of examining large databases – for instance “Wildfires in the U.S. for the Past 30 Years” – in order to generate new insights.

“Social impact should be at the center of your data project.” – Melisa Lee, Wharton Student

Inspired by Emily’s personal connection to the fires and their collective fascination with data analytics, the team set out to answer a specific question: “Given its features, can we predict the size of a fire?”

The team ultimately created a data model, a descriptive diagram of relationships between various types of information, for their class project that they hope may someday be used to prevent forest fires in California. Their work earned them a spot among the industry professionals, Ph.D. students and professors presenting at the Women in Data Science Conference at the University of Pennsylvania in February 2020.

They were excited to tell their fire-research story, inspired by weeks of data collecting and analysis. “We found out that a very small percentage of fires, less than 1%, cause 80% of the destruction. That told us if we can prevent that less than 1% and focus on stopping those fires, we can cut down on a big portion of the destruction,” notes Emily. “So, what’s causing those fires? Most of these are fires from lightning striking a tree or forest…When we ran the model, it spit out some specific vegetation types that were predictors in how big a fire was — like timber litter or dead branches that fall on the ground and are dry. They are causing these large fires. Our storyline: These lightning fires are the most destructive, they spike up in the summer, and are also linked with vegetation.”
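The "less than 1% of fires cause 80% of the destruction" finding is a concentration measure, and the arithmetic behind a figure like it can be sketched in a few lines. The Python below is only an illustration, not the team's code: the function name `smallest_share_of_fires` and the toy acreage list are assumptions made up for this example, and "destruction" is assumed to mean acres burned.

```python
def smallest_share_of_fires(acres_burned, damage_share=0.8):
    """Return the smallest fraction of fires that together account for
    at least `damage_share` of the total acres burned."""
    if not acres_burned:
        raise ValueError("need at least one fire")
    # Look at the biggest fires first.
    sizes = sorted(acres_burned, reverse=True)
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total acres burned must be positive")
    target = damage_share * total
    running = 0.0
    for count, size in enumerate(sizes, start=1):
        running += size
        if running >= target:
            return count / len(sizes)
    return 1.0


# Made-up toy data (NOT real wildfire records): one huge fire, 99 tiny ones.
toy_acres = [900.0] + [1.0] * 99
share = smallest_share_of_fires(toy_acres)
print(f"{share:.1%} of fires account for at least 80% of acres burned")
```

Sorting from largest to smallest and accumulating until the target is reached gives the smallest group of fires that explains the chosen share of the damage, which is the kind of number that points to where prevention would matter most.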

Plans to share their findings with the California government have been delayed by the pandemic. Still, the three data scientists-in-training were eager to pass along some takeaways from their research process when Wharton Global Youth checked in with them last week.

Fruit Trees or Lightning?

First, data science demands that you extract the most compelling story behind the numbers. You can build models and encode things, and suddenly you’re left with 30 variables that, for example, influence fire size. Finding the commonalities between all these things? That’s up to you. And arriving at the best conclusions will take time. “Once you get the proper data, you shouldn’t straight out build a model,” says Zhun, a finance major from Malaysia. “You should look at the data, figure out how it is going to fit into the big picture and the impacts of each variable. Use data analysis, simple charts and visualizations to have an idea of what your final results should be.”
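Zhun's advice to look at the data before building a model can be as simple as a grouped summary. The sketch below is one hedged way to do that in Python. The record fields `cause` and `acres`, the helper `summarize_by_cause` and the toy records are all hypothetical, not taken from the team's dataset.

```python
from collections import defaultdict


def summarize_by_cause(fires):
    """Group fire records by cause and return
    {cause: (number_of_fires, total_acres, largest_fire_acres)}."""
    grouped = defaultdict(list)
    for fire in fires:
        grouped[fire["cause"]].append(fire["acres"])
    return {
        cause: (len(acres), sum(acres), max(acres))
        for cause, acres in grouped.items()
    }


# Made-up toy records (NOT real data), just to show the shape of the summary.
toy_fires = [
    {"cause": "lightning", "acres": 1200.0},
    {"cause": "lightning", "acres": 300.0},
    {"cause": "campfire", "acres": 2.0},
    {"cause": "equipment", "acres": 15.0},
    {"cause": "campfire", "acres": 1.0},
]

summary = summarize_by_cause(toy_fires)
# Print causes from most to least total acres burned.
for cause, (count, total, largest) in sorted(
    summary.items(), key=lambda item: item[1][1], reverse=True
):
    print(f"{cause:10s} fires={count} total_acres={total:.1f} largest={largest:.1f}")
```

A table like this, or the same numbers as a bar chart, shows which variables deserve attention before any model is fit.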

Then you begin to shape the story that you want to tell. “Our team could have told many stories with our data,” notes Melisa, who is pursuing a career in marketing and business analytics. “An alternate story was that the vegetation type around orchards is also very flammable because of the fruits. But that story would not have driven as much impact or incited people to act as much as our lightning story. Social impact should be at the center of your data project. Building a model should be for some ultimate purpose to let people know that something is happening and you have the data to back it up.”

Emily, Melisa and Zhun are convinced that data has immense power (did you know it has replaced oil as the world’s most valuable resource?). They plan to spend part of their senior year working to get their project into the hands of decision makers who are confronting the worsening West Coast wildfires.

“Our model highlighted a lot of high-risk areas that we can target. We can target these by using prescribed burns, which are smaller controlled fires that will burn up all the timber litter by summer so that once fire season comes along, there’s not much to catch on fire,” says Emily. “We want our model to motivate the California State Department to focus on having more prescribed burns, but also to make sure the ones we do have are well targeted so we’re preventing the larger fires going forward.”

Conversation Starters

Why is the story behind the numbers so important to data analytics? How did Emily, Melisa and Zhun shape their findings into a story?

How do you describe the power of data? Why is it such a valuable and important resource worldwide?

Do you take statistics or data analytics classes in high school? Are they focused only on the numbers or do they provide context to the real world? What are some examples of how data projects have become more relevant for you? Describe one in the comment section of this article.

One comment on “Discovering the Power of Data to Predict Forest Fires”

  1. Emily and her team inspired me to open my eyes and realize that data analysis can be applied to prevent not only forest fires, but also a variety of other critical environmental issues. Learning about the power of data that helped Emily and her team resolve the problem of forest fires provoked an interesting idea in my head. “Why don’t I use data analysis to solve the environmental issues in my own country, South Korea?”

    Though I was born in South Korea, I spent most of my childhood in Germany because of my dad’s business. When my father told me and my mom that we were finally heading back to Korea, I was eager to return to my home country. Of all my nostalgic memories of Korea, I especially missed going to the summer house that my family owned in Hongcheon, in the South Korean countryside. I still remember all the good times I had there, such as catching fish in the clear lakes, stargazing at night, scavenging for wild fruits, and hiking up and down the hills. However, when I finally arrived in Hongcheon, it did not take me long to realize that many things there had changed since I left Korea. The first thing that caught my eye was the massive chemical factories lined up along Hongcheon lake. I couldn’t catch sight of any fish in the lakes, the night sky was foggy with no sign of stars, and the places where I had scavenged for wild fruits had been replaced with construction sites.

    Yes, I have listened to Greta Thunberg’s inspiring speech at the UN Climate Action Summit and watched multiple documentaries promoting sustainable growth. However, I had always neglected environmental pollution around the world because I thought it would not affect my daily life. After witnessing the changes around my weekend house in Hongcheon in 9th grade, I learned how immature and ignorant I had been about greater societal issues. Recognizing this shortcoming motivated me to take small steps and put them into action. After researching different environmental organizations, I started to donate $20 a month (money that my past self would have carelessly spent on new clothes) to the Korean Federation for Environmental Movements (KFEM), an organization that has adopted policies to cut greenhouse gas (GHG) emissions and expand the use of renewable energy.

    However, I always felt that monthly donations were not enough to actually help sustain the environment, and the answer was in the article: use data analysis to make a sustainable world. Knowing how much Hongcheon has changed from my childhood memory, I wanted to alert people to what I have seen and noticed by presenting data as evidence. I started to collect climate information about Hongcheon from the Korean national weather center and calculated the change in the average July temperature from 2010 until now. I found that the average daytime temperature rose from 32.5° to 35.1° in just 10 years, meaning that if the same rate of increase continues, the average daytime temperature in Hongcheon would reach about 42.9° by 2050.

    After a few days, I conducted another data analysis experiment with a pH sensor that I borrowed from Mr. Hershfield, my chemistry teacher. I used it to measure how polluted Hongcheon lake is by testing the acidity of the water at the top and the bottom of the stream. I first hiked up the hill to measure the acidity of the water at the high point of the stream and found that the pH level was 6.8. Then I hiked back down to measure the low end of the stream, and while descending, I saw various colossal garbage disposal sites and factories near the bottom of the stream. As expected, the pH level of the water at the bottom end of the stream was 5.5. With the results of the analyses I conducted a week ago showing that Hongcheon’s sustainability is at stake, I emailed the Hongcheon County Office to alert officials to the situation.

    Furthermore, with Alex, my geek friend who knows much more about data analysis, I decided to create a club called Green Up that focuses on saving our environment by promoting the use of more renewable energy, making donations to widely known environmental organizations, reducing the use of plastic, and much more. With the small but powerful steps that I will take from now on, I hope that I can inspire other members of Generation Z to use data analysis to tackle important social issues that are still overlooked.
