
ACADEMIA DATATHON CASE: THE A.I. CRYPTO TRADER

Our team, Hack Alternative, was assigned the task of making predictions on cryptocurrency prices.
We were provided with cryptocurrency data in a CSV file named Price_Data.
We took data from 18 Jan 2018 to 24 Jan 2018 to predict the next day's cryptocurrency values.
We used 20 cryptocurrency columns, taking 5 days for each.
We then imputed the missing values in the data using the imputeTS package.
We used a neural network model to forecast the future cryptocurrency values.
Lastly, we used a loop to generate 100 CSV files, each consisting of 20 cryptocurrencies with 5 days each.

Crypto_pred.R

TEAM MENTOR:

TEAM MEMBERS:

TEAM TOOLSET:

  • R
  • Microsoft Excel

BUSINESS UNDERSTANDING:

The goal is to build a successful investing/trading model for the cryptocurrency markets. The data consist of time series, in 5-minute steps, of prices and 24-hour volumes for various cryptocurrencies.

Data URL: https://fdibatusofia-my.sharepoint.com/:f:/g/personal/pnikolov_fdiba_tu-sofia_bg/Ekd5mSsInSZFplOvh4VbjKMBxrsEDszGrtnccmyzUOfMkA?e=7Wxnsv

DATA UNDERSTANDING:

  • Data provided in csv format
  • Exploring dataset’s structure
  • Identifying discrepancies in the dataset
  • Identifying the subsets of data that modelling would be based on

DATA PREPARATION:

  1. Raw Data:
  • Price_data → Data type: Time Series → Data format: CSV
  • The data cover more than 1,600 cryptocurrencies with transaction details
  • Data provided from 17 Jan 2018 11:25 to 24 Mar 2018 13:15
  1. Working:
  • Identified the missing values in the dataset and imputed them using the “imputeTS” package.
  • Converted the data to a time series, deriving the frequency from the 24-hour day and the 5-minute step (a full day holds 24 × 12 = 288 five-minute observations).
  • Reduced the number of variables used for modelling.
  • Split the data into training and testing data sets.
  • Used a neural network model.
  • Used the nnetar() function from the forecast package to forecast the time series.
  • nnetar() fits a feed-forward neural network to lagged values of the series and forecasts from those past observations.
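
The working steps above can be sketched roughly as follows. This is a minimal illustration, not the team's Crypto_pred.R script: the price series is synthetic, the frequency of 288 observations per day, the linear interpolation option and the one-day hold-out are all assumptions, and `repeats = 5` is chosen only to keep the run short.

```r
# Sketch only: synthetic data stands in for one column of Price_Data.
library(imputeTS)  # na_interpolation()
library(forecast)  # nnetar(), forecast()

set.seed(42)

obs_per_day <- 24 * 60 / 5   # 288 five-minute steps per day (assumed frequency)
n_days      <- 6
n_obs       <- obs_per_day * n_days

# Hypothetical price series: a random walk starting near 100, with some gaps.
price <- 100 + cumsum(rnorm(n_obs, sd = 0.2))
price[sample(n_obs, 50)] <- NA

# 1. Impute the missing values (linear interpolation is an assumed choice).
price_filled <- na_interpolation(price, option = "linear")

# 2. Hold out the final day as the test set.
n_train <- n_obs - obs_per_day
test    <- price_filled[(n_train + 1):n_obs]

# 3. Convert the training part to a time series with one season per day.
train <- ts(price_filled[1:n_train], frequency = obs_per_day)

# 4. Fit the neural network autoregression and forecast one day ahead.
fit <- nnetar(train, repeats = 5)
fc  <- forecast(fit, h = length(test))
```

The point forecasts are in `fc$mean` and can be compared with `test` in the evaluation step.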
  1. Evaluation:
  • Divided the data into two parts, namely the training and test data sets.
  • Fed the model with the training data set so that it could learn the patterns in the data.
  • Predicted the values present in the test data set.
  • Compared the predicted values with the observed values and calculated the RMSE (Root Mean Square Error).
  • Lastly, selected the model with the lowest RMSE, that is, the highest accuracy.
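
The RMSE comparison can be illustrated with a few lines of base R. The observed values and the two sets of predictions below are made-up numbers, and `rmse`, `model_a` and `model_b` are hypothetical names used only for this sketch.

```r
# Root mean square error between observed and predicted values.
rmse <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  sqrt(mean((observed - predicted)^2))
}

# Hypothetical observed values and forecasts from two candidate models.
observed <- c(10.0, 10.5, 11.0, 10.8)
pred_a   <- c(10.1, 10.4, 11.2, 10.9)
pred_b   <- c(9.5, 11.0, 10.0, 11.5)

scores <- c(model_a = rmse(observed, pred_a),
            model_b = rmse(observed, pred_b))

# Keep the candidate with the smallest error.
best <- names(which.min(scores))
```

Here `best` is `"model_a"`, since its predictions stay closer to the observed values.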
  1. Deployment:
  • Ran the algorithm on all the test data and created multiple CSV files with the required structure.
  • Also manually checked the results with the lowest confidence scores to detect errors (and, as expected, found some, which we corrected by hand).
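
A loop of the kind used to produce the output files might look like the sketch below. The counts (100 files, 20 cryptocurrencies, 5 days) come from the description above; the file names, column names, output directory and random values are placeholders, since the required file structure is not given here.

```r
# Sketch only: file names, column names and values are placeholders.
set.seed(1)

n_files <- 100   # number of CSV files, as in the write-up
n_coins <- 20    # cryptocurrencies per file
n_days  <- 5     # days per cryptocurrency

out_dir <- file.path(tempdir(), "predictions")
dir.create(out_dir, showWarnings = FALSE)

for (i in seq_len(n_files)) {
  # Stand-in for the model's forecasts: one row per day, one column per coin.
  preds <- matrix(runif(n_days * n_coins), nrow = n_days, ncol = n_coins,
                  dimnames = list(NULL, paste0("coin_", seq_len(n_coins))))
  out <- data.frame(day = seq_len(n_days), preds)
  write.csv(out, file.path(out_dir, sprintf("prediction_%03d.csv", i)),
            row.names = FALSE)
}

written <- list.files(out_dir, pattern = "\\.csv$")
```

In a real run the `preds` matrix would be filled from the fitted model's forecasts rather than from `runif()`.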
