πŸ“ DS105A Final Group Project: Turkey, Duck, and Fish πŸ“


✍️ Team Members ✍️

  • Chaoyang Feng (oche32) | BSc in Politics and Data Science
  • Kylin Gao (gaoonline) | BSc in Economics
  • Anka Uysal (ankauysal) | BSc in Economics
  • Sissi Wang (Huanxi-Wang) | BA in History

✈️ Overview of Analysis ✈️

Flights are an essential part of many people’s lives. Whatever you do for a living and wherever you are from, you will probably need to travel by plane at some point. Perhaps you are an international student (like us) going back and forth to university, an employee travelling for work, an expat visiting family in your home country, or simply a travel enthusiast exploring the world with friends. If you fall into one of these categories, you are probably interested in booking the flight that is most efficient for you. As people who fly quite frequently, we were motivated to analyze which flights are best in terms of minimal delays. Join us on this page to read more about our analysis of flight delays. We have analyzed delay data from over 700,000 flights to help everyone find the best flight!


βš™οΈ The Data βš™οΈ

⏳ Procedure Map ⏳

Here is a procedure map that provides an overview of our data sourcing and data collection procedure:

📊 Data Source and Data Collection Challenges 📊

In our analysis, we used flight delay data from the Aviyair API, which is available through subscription. The API provided us with historical data, allowing us to analyze dense flight delay data from August 31 to September 9, 2023. The data set contains 732,880 flights from a wide range of airports, including LAX, ATL, and JFK, and airlines, including Delta, Air France, and KLM. This extensive data set lets us estimate where, when, and with which airlines delays occur most often.
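As a rough illustration of how "where" and "with which airlines" questions can be asked of data like this, here is a minimal sketch that groups flights and ranks the groups by mean delay. The records and the field names (`airline`, `origin`, `delay_min`) are made up for the example; they are assumptions, not the actual Aviyair schema or our real results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records; field names are assumptions, not the Aviyair schema.
flights = [
    {"airline": "Delta", "origin": "ATL", "delay_min": 12},
    {"airline": "Delta", "origin": "JFK", "delay_min": 30},
    {"airline": "KLM", "origin": "JFK", "delay_min": 5},
    {"airline": "Air France", "origin": "LAX", "delay_min": 45},
    {"airline": "KLM", "origin": "LAX", "delay_min": 15},
]


def mean_delay_by(records, key):
    """Return {group: mean delay in minutes} for the given grouping key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["delay_min"])
    return {group: mean(delays) for group, delays in groups.items()}


def rank_by_delay(records, key):
    """Return (group, mean delay) pairs, longest mean delay first."""
    averages = mean_delay_by(records, key)
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


by_airline = rank_by_delay(flights, "airline")
by_origin = rank_by_delay(flights, "origin")

for name, avg in by_airline:
    print(f"{name}: {avg} min")
```

The same helper works for any grouping column, so a "when" question could be handled by grouping on a departure-hour field, if the data includes one.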

While there is plenty of information online about current flight delays, it is very hard to find organized historical flight data, which we need to conduct our analysis. During data sourcing, we tried and failed many times to find a suitable source before succeeding with the Aviyair API. We tried collecting data through many APIs, the most extensive being the Aviationstack API and the Airlabs API. Some APIs only let us collect insufficient data (in either detail or volume), while many others were inaccessible to us because they are only available through a paid subscription. We ultimately decided to collect our data through the Aviyair API by paying for a basic subscription. This data has allowed us to bring the analysis together and present our findings coherently.

🔬 Final Data 🔬

Here is our final data frame!


📈 The Exploratory Data Analysis 📈

Based on the data we collected, here are the visualizations and our concluding analysis:

Analysis 1.1

Explanation:

Analysis 1.2

Explanation:

Analysis 1.3

Explanation:

Analysis 1.4

Explanation:

Analysis 2

delays on world map

Explanation:

Analysis 3

Explanation:

Analysis 4


πŸ“ Conclusion πŸ“

There are many reasons that flights can be delayed, and these reasons can interact to affect delay times even further. Our conclusions show that xxx. One limitation is that the uncertain nature of flight delays makes outliers likely; even so, our analysis shows that xxx. This information can help you book your next flight to be as time-efficient as possible!


References

  • ChatGPT (More information can be found in the ChatGPT usage report.)
  • Aviyair API (Available through subscription.)