Abstract: Looking at the file, we can explore how the other variables affect the GPA (grade point average). Given the short timespan for this project, we will not delve deeper into the data. The file will be read, some columns and values will be removed, correlations will be measured, […]
Abstract: We would like to see whether there is any connection between the products (names) and price, and whether any patterns exist. This goal is set a priori. As the exploration proceeds, further questions will arise. Some of the data will be removed, as it will not be used. There will be plots, groupings and hypothesis testing […]
Abstract: A first look at the file shows that it is the biggest one of all. Most of the data will not be used; we will create samples. Let's set an a priori goal: to see if there is a connection between the countries and some of the ingredients that don’t have that […]
This project builds a model, based on data provided by the World Health Organization (WHO), to predict life expectancy in years for different countries. The data covers the period from 2000 to 2015 and comes from Kaggle (https://www.kaggle.com/kumarajarshi/life-expectancy-who/data). The resulting models were tested to see whether they maintain their accuracy when predicting life expectancy for data they have not been trained on. Four algorithms have been used:
Linear Regression with Polynomial features
Decision Tree Regression
Random Forest Regression
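Checking whether a model keeps its accuracy on data it has not been trained on is usually done with a holdout split. The following is a minimal sketch of that idea only, not the project's code: the data is synthetic, a plain least-squares line stands in for the algorithms listed above, and the names `fit_line`, `r_squared` and `holdout_scores` are hypothetical helpers introduced for illustration.

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def holdout_scores(xs, ys, test_fraction=0.25, seed=0):
    """Fit on a random training split, then score on both splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    test_x = [xs[i] for i in test_idx]
    test_y = [ys[i] for i in test_idx]

    # The model only ever sees the training split.
    intercept, slope = fit_line(train_x, train_y)
    train_r2 = r_squared(train_x, train_y, intercept, slope)
    test_r2 = r_squared(test_x, test_y, intercept, slope)
    return train_r2, test_r2


# Synthetic data (an assumption for illustration, not the WHO dataset).
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(200)]
ys = [50 + 2.5 * x + data_rng.gauss(0, 1) for x in xs]

train_r2, test_r2 = holdout_scores(xs, ys)
print(f"train R^2 = {train_r2:.3f}, test R^2 = {test_r2:.3f}")
```

A test score close to the training score suggests the model generalizes; a much lower test score points to overfitting.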
This notebook is a basic introduction to stochastic processes. It is meant for the general reader who is not very math savvy, such as the participants in the Math Concepts for Developers course at SoftUni.
It starts with a basic definition. It then gives examples of the most popular types of processes: the random walk, Brownian motion (the Wiener process), the Poisson process and Markov chains. Their basic characteristics and some possible applications are described. All the examples come with simulations in Python, and some are visualized.
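As an illustration of what such simulations can look like, here is a minimal standard-library sketch of three of the process types named above. It is not the notebook's own code: the function names, the weather states and all parameter values are assumptions chosen for the example.

```python
import random


def random_walk(n_steps, rng):
    """Simple symmetric random walk on the integers, starting at 0."""
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))  # step up or down with equal probability
        path.append(position)
    return path


def poisson_arrivals(rate, horizon, rng):
    """Arrival times of a Poisson process on [0, horizon].

    Gaps between consecutive arrivals are exponential with the given rate.
    """
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return arrivals
        arrivals.append(t)


def markov_chain(transitions, start, n_steps, rng):
    """Sample a path from a finite Markov chain.

    transitions maps each state to a dict {next_state: probability}.
    """
    state = start
    states = [state]
    for _ in range(n_steps):
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights)[0]
        states.append(state)
    return states


rng = random.Random(1)  # fixed seed so the run is reproducible

walk = random_walk(1000, rng)
arrivals = poisson_arrivals(rate=2.0, horizon=100.0, rng=rng)

# Hypothetical two-state weather model, used only as an example.
weather = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}
chain = markov_chain(weather, "sunny", 50, rng)

print("final walk position:", walk[-1])
print("number of arrivals:", len(arrivals))
print("first weather states:", chain[:10])
```

Each function takes the random generator as an argument, so the same seed always reproduces the same paths.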
The following packages have been used:
To make the case broadly understandable to students from different backgrounds, it is framed as financial time-series prediction and made more engaging by the hot topic of cryptocurrencies. The case integrates knowledge from several fields: cryptocurrencies, quantitative finance and machine learning. It is also stratified into levels, so each team can progress as far as it is able to solve it.
Find out ALL sources of meat that have been used to produce a certain sausage and ALL organisms (or traces of them) that should not be found in the sausage (pathogens, bugs, even human).
ReceiptBank will provide an extensive dataset of invoices hidden inside PDF files, which you can uncover by developing an algorithm that detects how many documents are contained in each PDF file.
Telenor gives you a rare chance to do social network analysis on the best kind of data set for this – telecom data.
Do you know what KFC, Pizza Hut and Taco Bell have in common? The same parent company: ‘Yum! Brands’. Find out about this and other secret relationships in the business world by building an NLP algorithm that deduces the parent company from text.
Have you ever wondered whether retailers actually make more money from price discounts? Or how promotions on Amazon affect sales at Walmart? The SAP case is about price optimization and promotional effectiveness.