1. Business Understanding
There are multiple platforms for cryptocurrency-backed loans, where you provide crypto as collateral and receive a loan in the form of another token. Several solutions, such as Aave and Compound, already exist for this scenario. The problem becomes much harder when it comes to non-fungible tokens (NFTs). Up until now, no lending provider for NFTs exists, because NFT values are volatile and swayed by interest, trends and social media influence.
What is an NFT?
NFT stands for ‘non-fungible token’. When something is fungible, like a dollar bill, it is equivalent to, and can thus be exchanged for, any other dollar bill. In contrast, a non-fungible token is a unique asset in digital form that is not interchangeable with any other NFT. This means that every NFT is a ‘one-of-a-kind’ item. NFTs are transferred from one owner to another using blockchain technology, which creates a digital trail from seller to buyer that verifies the transaction. This trail records the buyer (the new owner) as the holder of the unique ownership rights.
The market price of a unique NFT is not easily observable. We want to predict it.
2. Data Understanding
There are two datasets that we are using. The first one contains NFT sales:
For each sale we have the NFT collection and token ID, the sender’s address, the receiver’s address, the amount and the currency.
The second one contains all traits of each NFT, one row per trait:
| Collection | Contract address | Token ID | Metadata URL | Trait type | Trait value |
|---|---|---|---|---|---|
| azuki | 0xed5af388653567af2f388e6224dc7c4b3241c544 | 0 | https://ikzttp.mypinata.cloud/ipfs/QmQFkLSQysj94s5GvTHPyzTxrawwtjgiiYS2TBLgrvw8CW/0 | Clothing | Pink Oversized Kimono |
| azuki | 0xed5af388653567af2f388e6224dc7c4b3241c544 | 0 | https://ikzttp.mypinata.cloud/ipfs/QmQFkLSQysj94s5GvTHPyzTxrawwtjgiiYS2TBLgrvw8CW/0 | Offhand | Monkey King Staff |
| azuki | 0xed5af388653567af2f388e6224dc7c4b3241c544 | 0 | https://ikzttp.mypinata.cloud/ipfs/QmQFkLSQysj94s5GvTHPyzTxrawwtjgiiYS2TBLgrvw8CW/0 | Background | Off White A |
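The rows above hold one trait per row, so a token's traits are spread over several rows. A minimal sketch of grouping them per token follows. It assumes the six columns are collection, contract address, token ID, metadata URL, trait type and trait value (inferred from the sample rows); the function name `group_traits` is hypothetical.

```python
from collections import defaultdict

# Sample rows from the traits dataset. The metadata URL is replaced by a
# placeholder string because it is not needed for grouping.
CONTRACT = "0xed5af388653567af2f388e6224dc7c4b3241c544"
rows = [
    ("azuki", CONTRACT, "0", "<metadata url>", "Clothing", "Pink Oversized Kimono"),
    ("azuki", CONTRACT, "0", "<metadata url>", "Offhand", "Monkey King Staff"),
    ("azuki", CONTRACT, "0", "<metadata url>", "Background", "Off White A"),
]


def group_traits(rows):
    """Return {(collection, token_id): {trait_type: trait_value}}."""
    tokens = defaultdict(dict)
    for collection, _contract, token_id, _url, trait_type, trait_value in rows:
        tokens[(collection, token_id)][trait_type] = trait_value
    return dict(tokens)


traits_by_token = group_traits(rows)
```

Each token then maps to a single dictionary of traits, which is a convenient input for encoding and clustering.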
3. The solution
We want to predict the price of an NFT based on the market prices and sale history of similar NFTs.
Finding similar NFTs
To find similar NFTs on which to base our prediction, we use clustering. We cluster all NFTs by their traits. We tried KMeans and Gaussian mixture models (GMM).
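Clustering algorithms such as KMeans and GMM need numeric vectors, and the text does not say how the traits were encoded. One common choice, assumed here, is one-hot encoding over every (trait type, trait value) pair. The function name and the toy tokens below are hypothetical.

```python
def one_hot_encode(traits_by_token):
    """Turn {token: {trait_type: value}} into binary vectors.

    Returns the sorted vocabulary of (trait_type, value) pairs and a
    {token: vector} mapping with one 0/1 entry per vocabulary pair.
    """
    vocabulary = sorted(
        {pair for traits in traits_by_token.values() for pair in traits.items()}
    )
    index = {pair: i for i, pair in enumerate(vocabulary)}
    vectors = {}
    for token, traits in traits_by_token.items():
        vector = [0] * len(vocabulary)
        for pair in traits.items():
            vector[index[pair]] = 1
        vectors[token] = vector
    return vocabulary, vectors


# Toy tokens, not taken from the datasets.
tokens = {
    "a": {"Background": "Blue", "Hat": "Halo"},
    "b": {"Background": "Blue", "Hat": "Cap"},
}
vocabulary, vectors = one_hot_encode(tokens)
```

Tokens that share many traits end up with vectors that are close to each other, which is what the clustering step relies on.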
This is the elbow graph for KMeans. The elbow suggests that the optimal value for k is around 100, but the error is still very high at that point.
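An elbow graph plots the within-cluster sum of squared distances (often called inertia) against k. The sketch below shows how those values can be computed with a plain-Python version of Lloyd's algorithm on toy points. It is an illustration only; the original experiments presumably used a library implementation, and the initialization and data here are assumptions.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its points.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, inertia


# Toy data with two obvious groups; the elbow should appear at k = 2.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
inertias = {k: kmeans(points, k)[1] for k in range(1, 5)}
```

On well-separated data the inertia drops sharply up to the true number of groups and flattens afterwards. A curve that stays high, as described above, hints that the traits do not form tight clusters.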
And here is the corresponding graph for GMM. We can see that the error (BIC) gets higher as the number of clusters increases. This is probably because the data isn’t easily clusterable and because BIC penalizes the addition of more clusters.
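For reference, this is the standard definition of BIC, where $p$ is the number of free model parameters, $n$ the number of samples and $\hat{L}$ the maximized likelihood of the fitted model. Lower values are better.

```latex
\mathrm{BIC} = p \ln n - 2 \ln \hat{L}
```

Every extra mixture component adds parameters and so raises the first term. BIC only decreases if the gain in likelihood outweighs that penalty, which is unlikely when the data has little cluster structure.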
With clustering we can also estimate a confidence for which collection an NFT belongs to: we first find its cluster. The confidence % for each collection is equal to the % of all the NFTs from that collection in that cluster.
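A minimal sketch of the confidence computation follows. It assumes one particular reading of the definition above: the confidence for a collection is the share of the cluster's NFTs that belong to that collection. The function name and the toy cluster are hypothetical.

```python
from collections import Counter


def collection_confidence(cluster_members):
    """cluster_members: one collection name per NFT in the cluster.

    Returns {collection: percentage of the cluster's NFTs in that collection}.
    """
    counts = Counter(cluster_members)
    total = len(cluster_members)
    return {collection: 100.0 * n / total for collection, n in counts.items()}


# Toy cluster with four NFTs.
confidence = collection_confidence(["azuki", "azuki", "azuki", "bayc"])
most_likely = max(confidence, key=confidence.get)
```

Under the other possible reading (the share of a collection's NFTs that fall into this cluster) the denominator would be the collection size instead of the cluster size.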
Since outliers are very important in this context, we split our data into outliers and non-outliers before clustering, so we essentially have separate clusters for regulars and outliers. We decide whether an NFT is an outlier based on its price: it is an outlier if its price is way above the average price of all NFTs.
** TODO: Add distribution of NFT transaction prices **
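The text does not define how far above the average a price must be to count as an outlier. The sketch below assumes a simple rule, a price above `factor` times the mean, where `factor` is a hypothetical parameter and the prices are toy values.

```python
from statistics import mean


def split_outliers(prices, factor=3.0):
    """Split {token: price} into (regulars, outliers).

    A token is treated as an outlier when its price exceeds
    factor * mean price. The factor is an assumption, not a value
    taken from the original analysis.
    """
    threshold = factor * mean(prices.values())
    regulars = {t: p for t, p in prices.items() if p <= threshold}
    outliers = {t: p for t, p in prices.items() if p > threshold}
    return regulars, outliers


# Toy prices: the mean is 10, so the threshold is 30.
prices = {"a": 1.0, "b": 2.0, "c": 1.5, "d": 1.5, "e": 44.0}
regulars, outliers = split_outliers(prices)
```

Because the mean itself is pulled up by extreme prices, a median-based threshold may be a more robust alternative.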
Predicting the price
To predict the price of an NFT we:
- Find the cluster it is closest to (either a regular or an outlier one)
- Get the % score of each collection in that cluster and determine the most likely collection
- Get all NFTs from that collection
- Predict the price based on the prices of all those NFTs
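The steps above can be sketched as follows, assuming centroid-based clusters (as produced by KMeans) and the simple-average option. The data structures, names and numbers are hypothetical.

```python
from statistics import mean


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_price(vector, clusters, prices_by_collection):
    """clusters: list of dicts with a 'centroid' and a 'confidence' mapping
    ({collection: %}); regular and outlier clusters can share one list."""
    # Step 1: nearest cluster.
    cluster = min(clusters, key=lambda c: squared_distance(vector, c["centroid"]))
    # Step 2: most likely collection within that cluster.
    collection = max(cluster["confidence"], key=cluster["confidence"].get)
    # Steps 3 and 4: average the prices of that collection's NFTs.
    return collection, mean(prices_by_collection[collection])


# Toy clusters and prices.
clusters = [
    {"centroid": [1.0, 0.0], "confidence": {"azuki": 80.0, "bayc": 20.0}},
    {"centroid": [0.0, 1.0], "confidence": {"bayc": 90.0, "azuki": 10.0}},
]
prices_by_collection = {"azuki": [10.0, 12.0], "bayc": [90.0, 100.0, 110.0]}
collection, price = predict_price([0.1, 0.9], clusters, prices_by_collection)
```

With a GMM, the nearest-centroid step would instead pick the component with the highest posterior probability.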
For predicting the price we can:
- Simply average
- Do a weighted average based on how easily the NFT sold. NFTs that sold more often in the last 30 days have a higher weight than those that sold less often.
For the weighted average, the weight can grow either linearly or exponentially with the number of sales.
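A minimal sketch of both weighting schemes follows. The exact weight formulas are not given in the text, so the sketch assumes the weight is the 30-day sale count itself (linear) or its exponential. The function name and the numbers are hypothetical.

```python
import math


def weighted_price(prices, sale_counts, scheme="linear"):
    """Average `prices`, weighting each by its 30-day sale count."""
    if scheme == "linear":
        weights = [float(n) for n in sale_counts]
    elif scheme == "exponential":
        weights = [math.exp(n) for n in sale_counts]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(weights)
    if total == 0:
        # No sales at all: fall back to the simple average.
        return sum(prices) / len(prices)
    return sum(w * p for w, p in zip(weights, prices)) / total


# Toy data: the second NFT sold three times, the first only once.
prices = [10.0, 20.0]
sale_counts = [1, 3]
linear = weighted_price(prices, sale_counts, "linear")
exponential = weighted_price(prices, sale_counts, "exponential")
```

The exponential scheme pushes the estimate further toward frequently traded NFTs. For large sale counts the exponent would need scaling to avoid overflow.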
The above-mentioned solution achieves a mean absolute error (MAE) of: TODO
We compared it to a baseline method of predicting the price: picking a random price out of all the NFT price records.
The baseline method achieves an MAE of: TODO
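For clarity, this is how the MAE and the random baseline can be computed. The prices below are toy values for illustration only; they are not results from the datasets.

```python
import random


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def random_baseline(price_records, n, seed=0):
    """Predict n prices by drawing at random from all known price records."""
    rng = random.Random(seed)
    return [rng.choice(price_records) for _ in range(n)]


# Toy values, not real results.
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
mae = mean_absolute_error(actual, predicted)  # (2 + 2 + 3) / 3
baseline_mae = mean_absolute_error(actual, random_baseline(actual, len(actual)))
```

Since the baseline is random, averaging its MAE over several seeds gives a more stable number to compare against.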
Alternatively, to predict the price of an NFT directly from its traits we:
- Get the latest price (last trade price) of each token traded in the last 30 days
- Get the wanted token’s traits
- For each trait, find all tokens which share it and take the mean of their prices
- Exclude any trait for which no other token provides a price
- Calculate the average of the per-trait prices, weighted by the rarity of each trait
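The per-trait averaging in the steps above can be sketched as follows. The function name, the token identifiers and the prices are hypothetical. `latest_price` is assumed to contain only tokens traded in the last 30 days.

```python
from statistics import mean


def trait_mean_prices(target, traits_by_token, latest_price):
    """For each trait of `target`, average the latest prices of the other
    tokens that share it. Traits without any priced peer are excluded."""
    result = {}
    for trait in traits_by_token[target].items():
        peers = [
            latest_price[token]
            for token, traits in traits_by_token.items()
            if token != target
            and token in latest_price
            and trait in traits.items()
        ]
        if peers:
            result[trait] = mean(peers)
    return result


# Toy data: no other token has Fur: Cream, so that trait is dropped.
traits_by_token = {
    "t1": {"Hat": "Halo", "Fur": "Cream"},
    "t2": {"Hat": "Halo", "Fur": "Brown"},
    "t3": {"Hat": "Halo", "Fur": "Brown"},
}
latest_price = {"t2": 90.0, "t3": 100.0}
per_trait = trait_mean_prices("t1", traits_by_token, latest_price)
```

The resulting per-trait means are the inputs to the final rarity-weighted average.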
For example, token #2121 from the BAYC collection has the following traits, listed with their rarity and the average price of the tokens that share each trait:
Background: Blue – 12.42% – 92.91 ETH
Fur: Cream – 6.36% – 84.13 ETH
Clothes: Lab Coat – 1.44% – 109.14 ETH
Mouth: Phoneme Vuh – 3.33% – 83.12 ETH
Eyes: Holographic – 1.51% – 117.25 ETH
Hat: Halo – 3.24% – 96.60 ETH
The resulting price from the calculations is 97.43 ETH for the period Feb 25 2022 to Mar 28 2022.
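The text does not spell out the weighting formula. One choice that reproduces the 97.43 ETH figure from the numbers above is to weight each trait's average price by one minus its rarity share, so that rarer traits count slightly more. This is inferred from the arithmetic, not confirmed by the source.

```python
# (trait, rarity in %, average price in ETH) for BAYC token #2121.
traits = [
    ("Background: Blue", 12.42, 92.91),
    ("Fur: Cream", 6.36, 84.13),
    ("Clothes: Lab Coat", 1.44, 109.14),
    ("Mouth: Phoneme Vuh", 3.33, 83.12),
    ("Eyes: Holographic", 1.51, 117.25),
    ("Hat: Halo", 3.24, 96.60),
]


def rarity_weighted_price(traits):
    """Weighted average of trait prices with weight = 1 - rarity share.

    The weight formula is an assumption chosen because it matches the
    published example; other formulas are possible.
    """
    weights = [1 - rarity / 100 for _, rarity, _ in traits]
    prices = [price for _, _, price in traits]
    return sum(w * p for w, p in zip(weights, prices)) / sum(weights)


estimate = rarity_weighted_price(traits)
print(round(estimate, 2))
```

For comparison, the unweighted mean of the six trait prices is about 97.19 ETH, so under this weighting rarity shifts the estimate only slightly.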
4. Future work
The prices also depend on a lot of other factors. We would like to experiment with them in the following order:
- Take into consideration market data for the prices of cryptocurrencies at the time of each transaction
- Take into consideration data of the owners of NFTs
- Map each NFT to its owner’s Twitter account, fetch their tweets and engineer some features based on sentiment analysis