Data Science Challenges in Travel – Skyscanner


When and where can you get the best price for your travel plans? And why are there different flight prices? Why is creating a meta-search engine for flights one of the hardest problems? At the peak of the summer travel frenzy, we tried to answer these questions together with two software engineers from the Sofia office of Skyscanner. The office of the search site, which serves over 40 million unique visitors every month, opened in October 2014 and is growing quickly. Our speakers were Principal Software Engineer Plamen Aleksandrov and Konstantin Halachev, PhD.

Konstantin Halachev graduated from Sofia University before moving to the Max Planck Institute for Informatics in Germany to do a PhD in bioinformatics. He then briefly worked on personalization in e-commerce before accepting the challenge to play with Skyscanner’s data in their newest office in Sofia.

Plamen Aleksandrov also graduated from Sofia University, and specialised in Meta-search Heuristics for Discrete Optimization in JKU Linz, Austria. He later worked on a flight search engine and discovered the nitty-gritty specifics and complexities of the Airline Distribution Industry, before joining Skyscanner in Sofia.

In a packed room at Betahaus, with an audience that included a group of guest students from the University of Warwick, Konstantin and Plamen first focused on what makes flight search complex. A typical flight meta-search engine such as Skyscanner sends the user query to the websites of airlines and online travel agencies (OTAs). The results are aggregated into a single list and ranked according to the user's preferences, e.g. by price. The difficulty of the problem comes from its sheer dimensionality. For example, there are 10,000 possible ways to fly from San Francisco to Boston. If we constrain the example and search only for an American Airlines round-trip flight changing in Chicago and Dallas, there are 25.4 million valid ticket prices, out of billions of combinations. And this is just for one particular airline and route, out of a much larger search space – there are 100,000 flights per day, and 15 million queries per second.
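To make the aggregate-and-rank step concrete, here is a minimal Python sketch. It is only an illustration of the idea described above, not Skyscanner's implementation: the `Quote` type, the `meta_search` function, the two stand-in provider functions and all prices are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Quote:
    provider: str   # hypothetical provider label
    itinerary: str  # human-readable route description
    price: float    # made-up price in an arbitrary currency


# A provider is any callable that takes (origin, destination) and returns quotes.
Provider = Callable[[str, str], Iterable[Quote]]


def meta_search(origin: str, destination: str,
                providers: List[Provider]) -> List[Quote]:
    """Send the same query to every provider, merge the answers, rank by price."""
    quotes: List[Quote] = []
    for fetch in providers:
        quotes.extend(fetch(origin, destination))
    return sorted(quotes, key=lambda q: q.price)


# Stand-ins for an airline website and an OTA; a real system would call them
# over the network.
def airline_site(origin: str, destination: str) -> List[Quote]:
    return [Quote("airline", f"{origin}-{destination} direct", 120.0)]


def ota_site(origin: str, destination: str) -> List[Quote]:
    return [
        Quote("ota", f"{origin}-{destination} direct", 110.0),
        Quote("ota", f"{origin}-{destination} via FRA", 95.0),
    ]


results = meta_search("SOF", "LHR", [airline_site, ota_site])
for quote in results:
    print(quote.provider, quote.itinerary, quote.price)
```

Ranking here uses price only; any other user preference could be swapped in by changing the sort key.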

Our speakers revealed that one of the reasons there are so many options is variable pricing – prices change according to demand and seat availability, and airlines offer a portfolio of prices for the same flight. For example, cheap fares may require round-trip travel and prohibit non-stop flights and ticket refunds. That’s why it’s almost certain that your flight neighbour paid a different price.

After this introduction, Konstantin and Plamen delved into details about their search engine, Skyscanner. It gets 120 million visits per month and serves 13 million searches per day, and as you can imagine, these numbers result in some really big data – 200 GB zipped data per day (80 TB per year). Two features make Skyscanner unique among the popular search engines. First, the destination is flexible – you can get a list of possible destinations from a certain airport. Second, the departure and return dates are flexible as well.

After that, our speakers presented a few intriguing applications of the data gathered. First, you can see how the price on a certain direct one-way route (say London to Madrid) changes over time – it tends to get more expensive the closer the flight date gets. You can compare these dynamics across airlines and across days of the week (a flight on Wednesday is cheaper than a flight on Friday). One can also factor in the month of travel, or combine any of these factors to research the price dynamics further.
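A price curve of this kind can be sketched by grouping observed quotes by the number of days left before departure and averaging each group. The snippet below is a minimal illustration with invented numbers; the `observations` list and the `mean_price_by_lead_time` function are assumptions for the example, not the speakers' data or code.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical observations for one route:
# (date the quote was seen, departure date, price)
observations = [
    (date(2015, 6, 1), date(2015, 7, 1), 80.0),
    (date(2015, 6, 1), date(2015, 7, 1), 90.0),
    (date(2015, 6, 24), date(2015, 7, 1), 130.0),
    (date(2015, 6, 30), date(2015, 7, 1), 210.0),
]


def mean_price_by_lead_time(rows):
    """Average the price per 'days before departure', furthest out first."""
    buckets = defaultdict(list)
    for seen_on, departs_on, price in rows:
        buckets[(departs_on - seen_on).days].append(price)
    return {
        days: mean(prices)
        for days, prices in sorted(buckets.items(), reverse=True)
    }


curve = mean_price_by_lead_time(observations)
for days_left, avg_price in curve.items():
    print(f"{days_left:>3} days before departure: {avg_price:.2f}")
```

The same grouping could be keyed by airline, weekday or month of travel to compare the curves across those factors.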

Second, airlines may use the data to track their sales, compare them to those of competitors, and get an idea of which routes are searched for the most. And if you play with the data, you can also find the hottest destinations from each airport. For example, the most popular destination from Munich is Bangkok in the winter, July and August. In the spring, June and the autumn, however, London becomes more attractive for Munich travellers. Another application of the data is a “deal navigator” that, based on a range of dates, a maximum price and the planned length of stay, suggests the best destinations for you.
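The deal navigator idea boils down to filtering and ranking. Here is a rough Python sketch under our own assumptions: the `Deal` type, the `navigate_deals` function and the sample fares are all made up for illustration, and the real tool surely works differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Deal:
    destination: str
    depart: date
    ret: date
    price: float  # made-up round-trip price


def navigate_deals(deals: List[Deal], earliest: date, latest: date,
                   max_price: float, stay_nights: int) -> List[Deal]:
    """Keep deals inside the date range, under the budget and with the wanted
    length of stay, then return the cheapest deal per destination."""
    best: Dict[str, Deal] = {}
    for deal in deals:
        fits = (
            earliest <= deal.depart
            and deal.ret <= latest
            and deal.price <= max_price
            and (deal.ret - deal.depart).days == stay_nights
        )
        if not fits:
            continue
        current = best.get(deal.destination)
        if current is None or deal.price < current.price:
            best[deal.destination] = deal
    return sorted(best.values(), key=lambda d: d.price)


sample = [
    Deal("Madrid", date(2015, 8, 3), date(2015, 8, 10), 140.0),
    Deal("Madrid", date(2015, 8, 5), date(2015, 8, 12), 120.0),
    Deal("London", date(2015, 8, 4), date(2015, 8, 11), 99.0),
    Deal("Bangkok", date(2015, 8, 3), date(2015, 8, 10), 650.0),  # over budget
    Deal("London", date(2015, 8, 4), date(2015, 8, 7), 60.0),     # wrong length of stay
]

suggestions = navigate_deals(
    sample,
    earliest=date(2015, 8, 1),
    latest=date(2015, 8, 15),
    max_price=200.0,
    stay_nights=7,
)
for deal in suggestions:
    print(deal.destination, deal.depart, deal.ret, deal.price)
```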

Finally, Konstantin and Plamen demonstrated how the demand for trips to Greece was influenced by yet another debt crisis. In short, it plunged throughout Europe, except in Denmark. Curiously, in 2014 the Danes were not so interested in travelling to Greece, unlike the British, Italians, Austrians, Latvians and Bulgarians.

Take a look at the presentation if you want to learn more.
Ten lucky guests from the audience won portable battery chargers for mobile phones from Skyscanner. The lecture was followed by networking over a bottle of beer, also generously provided by Skyscanner. Don’t miss our great upcoming events and projects – stay tuned by visiting our website or following our Facebook page, LinkedIn page or Twitter account.
