Data Collection and Processing
Overview
To explore the relationship between sentiment and mimetic behaviour in the digital assets market, we collected and processed data from two public APIs:
- Alternative.me's Fear & Greed Index API: a sentiment index quantifying market mood from Extreme Fear to Extreme Greed.
- CoinGecko API: daily price, volume, and market capitalisation data for Bitcoin, Ethereum, and Solana.
We then stored the processed data in a relational SQLite database to facilitate structured analysis and reproducibility.
Data Collection
Market Sentiment (Alternative.me)
The Fear & Greed Index API requires no authentication and provides daily scores representing aggregated market sentiment.
From the API endpoint: https://api.alternative.me/fng/?limit=0&format=json
We extracted:
- value (0-100 index)
- value_classification (Extreme Fear, Fear, Greed, Extreme Greed)
- timestamp (UNIX format, converted to date)
The data was filtered to the most recent 365 days, and the value_classification and timestamp fields were renamed to classification and date.
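The extraction, filtering, and renaming steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `fetch_fng` and `parse_fng` are hypothetical, the response is assumed to carry its records under a `data` key with numbers encoded as strings, and plain dictionaries are used here instead of a pandas DataFrame to keep the example dependency-light.

```python
from datetime import datetime, timedelta, timezone

FNG_URL = "https://api.alternative.me/fng/?limit=0&format=json"


def fetch_fng(url=FNG_URL):
    """Download the full index history (no API key needed)."""
    import requests  # imported lazily so parse_fng can be used without it

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()


def parse_fng(payload, days=365, today=None):
    """Keep the last `days` days and rename fields to value, classification, date."""
    today = today or datetime.now(timezone.utc).date()
    cutoff = today - timedelta(days=days)
    rows = []
    for item in payload["data"]:
        # Assumption: values arrive as strings and timestamps are UNIX seconds (UTC).
        day = datetime.fromtimestamp(int(item["timestamp"]), tz=timezone.utc).date()
        if day > cutoff:
            rows.append(
                {
                    "date": day.isoformat(),
                    "value": int(item["value"]),
                    "classification": item["value_classification"],
                }
            )
    # Return rows in chronological order, oldest first.
    return sorted(rows, key=lambda row: row["date"])
```

Calling `parse_fng(fetch_fng())` would yield one dictionary per day, which can be passed to `pandas.DataFrame` if a DataFrame is preferred.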
Cryptocurrency Price and Volume (CoinGecko)
We used the CoinGecko Demo API to fetch:
- Daily OHLC prices (open, high, low, close)
- Market capitalisation
- 24-hour trading volume
Coins tracked:
- Bitcoin (BTC)
- Ethereum (ETH)
- Solana (SOL)
Each coin's data was collected for the last 365 days using Python's requests library. We respected rate limits using a time.sleep(1.5) delay between API calls.
We made use of two endpoints:
- /coins/{coin_id}/ohlc
- /coins/{coin_id}/market_chart
Data was normalised into a pandas DataFrame per coin, with timestamps converted to standard datetime format, and then concatenated into a unified price_data table.
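As a rough sketch of the collection loop for the market_chart endpoint, the example below fetches each coin, flattens the response into one row per timestamp, and pauses between calls. Several details are assumptions rather than facts from this project: the helper names, the `usd` quote currency, the coin ids, and the use of plain lists of dictionaries in place of per-coin pandas DataFrames. The response is assumed to contain `prices`, `market_caps`, and `total_volumes` series of `[timestamp_ms, value]` pairs.

```python
import time
from datetime import datetime, timezone

BASE_URL = "https://api.coingecko.com/api/v3"
COINS = ["bitcoin", "ethereum", "solana"]  # assumed CoinGecko coin ids

SERIES = [("prices", "price"), ("market_caps", "market_cap"), ("total_volumes", "volume")]


def fetch_market_chart(coin_id, api_key, days=365):
    """Request price, market cap, and volume history for one coin."""
    import requests  # imported lazily so the helper below runs without it

    response = requests.get(
        f"{BASE_URL}/coins/{coin_id}/market_chart",
        params={"vs_currency": "usd", "days": days},  # "usd" is an assumption
        headers={"x-cg-demo-api-key": api_key},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def normalise_market_chart(coin_id, chart):
    """Flatten the three [timestamp_ms, value] series into one row per timestamp."""
    rows = {}
    for field, column in SERIES:
        for ts_ms, value in chart.get(field, []):
            ts = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
            row = rows.setdefault(ts, {"coin_id": coin_id, "date": ts.date().isoformat()})
            row[column] = value
    return [rows[ts] for ts in sorted(rows)]


def collect_all(api_key, coins=COINS, pause=1.5):
    """Fetch every coin in turn, sleeping between calls to respect rate limits."""
    price_data = []
    for coin_id in coins:
        chart = fetch_market_chart(coin_id, api_key)
        price_data.extend(normalise_market_chart(coin_id, chart))
        time.sleep(pause)
    return price_data
```

The combined list returned by `collect_all` plays the role of the unified price_data table. The most recent point may be intraday, so de-duplicating by date could be needed before storage.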
Data Processing Pipeline
Cleaning & Merging
Once collected, all raw data was processed using pandas to:
- Convert timestamps
- Drop unused columns
- Handle missing values where necessary
- Sort and reset DataFrame indices
- Align timeframes between price and sentiment datasets
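The steps listed above might look something like the following in pandas. This is an illustrative sketch under stated assumptions: the function name `clean_and_merge` and the input column names (`timestamp`, `price`, `coin_id`, `value`, `value_classification`) are hypothetical, and dropping rows with a missing price is only one possible way of handling missing values.

```python
import pandas as pd


def clean_and_merge(prices: pd.DataFrame, sentiment: pd.DataFrame) -> pd.DataFrame:
    """Align price rows with the sentiment score for the same calendar date."""
    prices = prices.copy()
    sentiment = sentiment.copy()

    # Convert timestamps: prices in milliseconds, sentiment in seconds (as strings).
    prices["date"] = pd.to_datetime(prices["timestamp"], unit="ms", utc=True).dt.date
    sentiment["date"] = pd.to_datetime(
        pd.to_numeric(sentiment["timestamp"]), unit="s", utc=True
    ).dt.date
    sentiment["value"] = pd.to_numeric(sentiment["value"])

    # Drop columns that are no longer needed and rename for consistency.
    prices = prices.drop(columns=["timestamp"])
    sentiment = sentiment.drop(columns=["timestamp"]).rename(
        columns={"value_classification": "classification"}
    )

    # Handle missing values: here, rows without a price are discarded.
    prices = prices.dropna(subset=["price"])

    # An inner join keeps only the dates present in both datasets.
    merged = prices.merge(sentiment, on="date", how="inner")

    # Sort and reset the index so downstream code sees a clean 0..n-1 range.
    return merged.sort_values(["coin_id", "date"]).reset_index(drop=True)
```

An inner join is used so that the merged frame only covers dates with both a price and a sentiment reading. A left join would instead keep every price row and leave sentiment empty where it is missing.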
API Key Security
For CoinGecko authentication, we loaded the Demo API key using the dotenv package to keep credentials hidden from version control:
DEMO_API_KEY = os.getenv("COINGECKO_DEMO_API_KEY")
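For context, a fuller version of this step could look like the sketch below. The helper names `load_demo_key` and `demo_headers` are hypothetical, and the example assumes the key is stored in a local .env file that is excluded from version control (for instance through .gitignore). The key is sent in the `x-cg-demo-api-key` request header.

```python
import os


def load_demo_key(env_var="COINGECKO_DEMO_API_KEY"):
    """Read the key from the environment, loading a local .env file first if possible."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package

        load_dotenv()  # does not override variables that are already set
    except ImportError:
        pass  # fall back to variables already present in the environment

    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; add it to your .env file")
    return key


def demo_headers(key):
    """Build request headers carrying the Demo API key."""
    return {"accept": "application/json", "x-cg-demo-api-key": key}
```

Failing early when the variable is missing avoids sending unauthenticated requests that would otherwise fail later with a less obvious error.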
Database Schema Design
To store and structure our data efficiently, we implemented a relational SQLite database with three interlinked tables: coin_metadata, price_data, and sentiment_data.
This schema keeps coin attributes in a single reference table and lets price and sentiment records be joined on date, which keeps queries simple.
| Table | Description |
|---|---|
| coin_metadata | Contains unique identifiers, names, and symbols for each cryptocurrency. Serves as a reference table to avoid repeating coin attributes across records. |
| price_data | Stores daily observations for each coin, including closing price, trading volume, and market capitalisation. Uses a composite primary key (date, coin_id) and links to coin_metadata via a foreign key. |
| sentiment_data | Stores the daily Fear & Greed Index scores and associated classifications. Keyed by date, allowing it to be joined directly to price_data on the same date. |
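
One way the three tables described above could be declared with Python's built-in sqlite3 module is sketched here. The column names and types are assumptions inferred from the table descriptions, not the project's actual DDL, and `create_database` and `JOIN_QUERY` are hypothetical names.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS coin_metadata (
    coin_id TEXT PRIMARY KEY,
    name    TEXT NOT NULL,
    symbol  TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sentiment_data (
    date           TEXT PRIMARY KEY,  -- ISO date, e.g. 2024-01-01
    value          INTEGER NOT NULL,  -- index score
    classification TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS price_data (
    date       TEXT NOT NULL,
    coin_id    TEXT NOT NULL REFERENCES coin_metadata (coin_id),
    close      REAL,
    volume     REAL,
    market_cap REAL,
    PRIMARY KEY (date, coin_id)
);
"""

# Joins each price row to its coin's symbol and to the sentiment for that date.
JOIN_QUERY = """
SELECT p.date, m.symbol, p.close, s.value, s.classification
FROM price_data AS p
JOIN coin_metadata AS m ON m.coin_id = p.coin_id
JOIN sentiment_data AS s ON s.date = p.date
ORDER BY p.date, m.symbol
"""


def create_database(path=":memory:"):
    """Open a connection and create the three tables if they do not exist."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Storing dates as ISO strings in both price_data and sentiment_data means the join on date is a simple equality and the rows sort chronologically as text.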
