
Datathon 2017 ShopUp Case – Hack the WIFI


This case comes from the first Datathon in Bulgaria, held on 24-26 March 2017 in Sofia. It is one of the cases provided by ShopUp.

Hack the WIFI

 

About ShopUp

ShopUp is a startup located in Bulgaria with several customers in Canada and Bulgaria.

We support malls and shopping centers in their operations and marketing with an easy-to-install SaaS analytics platform. It combines data from all their IoT devices (WiFi routers, door counters, parking sensors), external data sources and mobile apps for deep and detailed customer behavior analysis, with coverage of 99% of users. We combine high accuracy, math models and deep profiling into one easy-to-install solution.

With our analytics we help malls and shopping centers operate more efficiently, know their customers and improve their marketing spending.

Machine learning algorithms are used for better profiling and zone localization.

Background

Each mobile phone broadcasts information about the WiFi networks it has previously connected to.

The ShopUp solution gathers this kind of information into its engine, but has never explored it further.

We decided to present a real case: we want to see how the name of a WiFi network can be used to identify the place where the phone has been.

The input data is limited, but there are many external APIs available, and different machine learning algorithms can be used for filtering and matching.

Challenge

You’re given a real dataset of more than 100 000 records of WiFi networks captured by several routers, with information about the timestamp, the unique phone ID and the name of the WiFi network (SSID).

The main goal is to automatically enrich the information about each WiFi network.

Participants can use different APIs, semantic techniques or machine learning algorithms to associate a WiFi network name with a particular company, venue, place or personal network, and to provide more information about the type of entity, possible physical address, company identity, social presence and other details.

There is no limitation on the techniques or data sources that can be used: it is a hacking task.
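One possible first step, sketched here under our own assumptions rather than as a prescribed method, is to turn a raw SSID into a cleaner search query and to flag names that look like personal networks. The noise-word list and the default-router patterns below are hypothetical and would need tuning against the real data.

```python
import re

# Hypothetical list of generic words that carry no brand information.
NOISE_TOKENS = {"free", "wifi", "wi", "fi", "guest", "public", "wlan", "hotspot"}

# Hypothetical patterns for default router or phone hotspot names.
PERSONAL_PATTERN = re.compile(
    r"^(tp-link|d-link|dlink|linksys|netgear|androidap)|iphone",
    re.IGNORECASE,
)


def ssid_to_query(ssid: str) -> str:
    """Strip separators and generic words so the rest can be searched."""
    # Split on anything that is not a letter, digit or '&', and on underscores.
    # \w is Unicode-aware, so Cyrillic names survive.
    tokens = re.split(r"[^\w&]+|_+", ssid)
    kept = [t for t in tokens if t and t.lower() not in NOISE_TOKENS]
    return " ".join(kept)


def looks_personal(ssid: str) -> bool:
    """Rough guess: does the SSID resemble a home router or phone hotspot?"""
    return PERSONAL_PATTERN.search(ssid) is not None
```

For the two SSIDs used as examples in this case, `ssid_to_query` yields `H&M` and `YOWO BAR`, which can then be passed to an external search or places API.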

For example:

Network id: “H&M Free WiFi”

Possible results:

  • Business entity: Yes
  • Industry: Retail
  • Type of Business: Fashion
  • Company: H&M Bulgaria
  • Address: Sofia …
  • Location: Boulevard “Bulgaria” 69, 1404 Sofia
  • Site: http://www2.hm.com/bg_bg/index.html
  • Facebook: https://www.facebook.com/hmbulgaria
  • Social:
  • Feedback:
  • Etc …


Network id: “YOWO-BAR”

Possible results:

  • Business entity: Yes
  • Industry: Bars
  • Type of Business: Restaurant
  • Company: YoWo
  • Address: Sofia, Bulgaria
  • Location: Boulevard Bratya Bakston 83
  • Site: no
  • Email: [email protected]
  • Facebook: https://www.facebook.com/YowoZone/
  • Social:
  • Feedback:
  • Etc …
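The example results above suggest a fairly regular set of fields. A minimal sketch of how one enriched record could be represented follows; the class name `SsidEnrichment` and the field names are our own illustrative choices, not a required output format.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SsidEnrichment:
    """One enriched SSID; every field except the first two is optional."""
    ssid: str
    business_entity: bool
    industry: Optional[str] = None
    business_type: Optional[str] = None
    company: Optional[str] = None
    address: Optional[str] = None
    location: Optional[str] = None
    site: Optional[str] = None
    facebook: Optional[str] = None


# Values taken from the "YOWO-BAR" example in this case description.
yowo = SsidEnrichment(
    ssid="YOWO-BAR",
    business_entity=True,
    industry="Bars",
    business_type="Restaurant",
    company="YoWo",
    address="Sofia, Bulgaria",
    location="Boulevard Bratya Bakston 83",
    facebook="https://www.facebook.com/YowoZone/",
)

# A plain dict is convenient for writing results to CSV or JSON.
record = asdict(yowo)
```

Fields that could not be resolved stay `None`, which keeps partially enriched networks distinguishable from fully enriched ones.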

You’re given a dataset of gathered WiFi names, described in the table below. Each record is unique, but the same network name can appear several times, reported by different unique mobile IDs.

Dataset

The dataset is located in the file `ssids.csv`, and a data sample is available in `ssids-part.csv`. Both files contain the following columns:

| Field              | Description                                    |
|--------------------|------------------------------------------------|
| `time_stamp`       | The time the record was captured               |
| `unique_mobile_id` | Personal identifier of each mobile device      |
| `ssid`             | Name of the WiFi network                       |
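As a starting point for exploring the data, the sketch below counts how many distinct devices reported each SSID. It assumes a comma-separated file with a header row that uses the column names from the table; the sample rows are made up for illustration, and the real timestamp format may differ.

```python
import csv
import io
from collections import defaultdict


def devices_per_ssid(csv_file):
    """Map each SSID to the number of distinct devices that reported it."""
    devices = defaultdict(set)
    for row in csv.DictReader(csv_file):
        devices[row["ssid"]].add(row["unique_mobile_id"])
    return {ssid: len(ids) for ssid, ids in devices.items()}


# Made-up rows, only to show the expected shape of the input.
SAMPLE = """time_stamp,unique_mobile_id,ssid
2017-03-01 10:00:00,a1,H&M Free WiFi
2017-03-01 10:05:00,b2,H&M Free WiFi
2017-03-01 10:07:00,a1,YOWO-BAR
"""

counts = devices_per_ssid(io.StringIO(SAMPLE))
```

For the real data, the same function can be called on a file opened with `open("ssids.csv", newline="", encoding="utf-8")`; the encoding is an assumption and may need adjusting. SSIDs seen on many devices are more likely to be public venues than personal networks, so this count can help prioritize which names to enrich first.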

Additional sources

There are many sources of data. The list below is for reference only; some of the services offer a freemium package:

  • Google API
  • Foursquare
  • Yellow pages
  • Facebook
  • WiGLE
  • Geomena
  • Combain
  • openBmap
  • Openwifi
  • OpenStreetMap
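As one illustration of using an external source, the sketch below builds a search URL for Nominatim, the geocoding service that runs on OpenStreetMap data. It only constructs the request; the default country code `bg` and the result limit are our assumptions, and the choice of Nominatim is an example rather than a recommendation of the case authors.

```python
from urllib.parse import urlencode

# Public Nominatim search endpoint.
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(query: str, country_code: str = "bg", limit: int = 5) -> str:
    """Build a Nominatim search URL for a free-text query."""
    params = {
        "q": query,                    # e.g. a cleaned-up SSID plus a city name
        "format": "jsonv2",            # ask for a JSON response
        "countrycodes": country_code,  # restrict results to one country
        "limit": limit,                # maximum number of results
    }
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


url = build_search_url("H&M Sofia")
```

The resulting URL can be fetched with any HTTP client. The public Nominatim instance has a usage policy, including rate limits, so it is worth reading before sending requests in bulk.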

We recommend following the CRISP-DM methodology.

To start developing a solution, you can clone the project on GitLab.
