A comprehensive framework for cryptocurrency quantitative trading, featuring data collection, preprocessing, model training, signal generation, and backtesting capabilities.
- Clone the repository:

```bash
git clone [repository-url]
cd toolz
```

- Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Set up environment variables:

```bash
cp .env.example .env
# Edit .env with your API keys and configuration
```

- Initialize the database:

```bash
python -m src.models.database
```
```
toolz/
├── src/
│   ├── data_collection/    # Data collection from various sources
│   ├── preprocessing/      # Data cleaning and feature engineering
│   ├── models/             # Trading models and strategies
│   ├── signals/            # Signal generation logic
│   ├── backtesting/        # Backtesting framework
│   ├── risk_management/    # Risk management tools
│   └── utils/              # Utility functions
├── examples/               # Example scripts and notebooks
├── tests/                  # Unit and integration tests
├── data/                   # Data storage
└── docs/                   # Documentation
```
- Collect historical data:

```python
from src.data_collection.pipeline import DataPipeline

pipeline = DataPipeline(symbols=['BTC-USDT-PERP'], timeframe='1h')
pipeline.collect_and_store_data(
    data_types=["candles"],
    since="2024-01-01",
    until="2024-02-01"
)
```
- Preprocess data and engineer features:

```python
from src.preprocessing.data_processor import DataProcessor
from src.preprocessing.feature_engineering import FeatureEngineer

# Initialize processors
processor = DataProcessor()
engineer = FeatureEngineer()

# Process data
processed_data = processor.process(raw_data)
features = engineer.calculate_features(processed_data)
```
- Train and evaluate a model:

```python
from src.models.statistical.ornstein_uhlenbeck import OrnsteinUhlenbeckModel

model = OrnsteinUhlenbeckModel(
    features=[],
    target='price',
    lookback_period=30
)
model.train(data)
signals = model.predict(data)
```
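The internals of `OrnsteinUhlenbeckModel` are not shown here. To illustrate the underlying idea, OU parameters can be estimated by OLS on the discretized process X[t+1] = a + b·X[t] + ε and turned into a mean-reversion signal. This is a minimal sketch with hypothetical helper names (`fit_ou`, `ou_signal`), not the framework's API:

```python
import numpy as np

def fit_ou(prices, dt=1.0):
    """Estimate OU parameters (theta, mu, sigma) by OLS on the
    discretized process X[t+1] = a + b*X[t] + eps."""
    x, y = prices[:-1], prices[1:]
    b, a = np.polyfit(x, y, 1)           # slope, intercept
    theta = -np.log(b) / dt              # mean-reversion speed
    mu = a / (1.0 - b)                   # long-run mean
    resid = y - (a + b * x)
    sigma = resid.std(ddof=2) * np.sqrt(2 * theta / (1 - b ** 2))
    return theta, mu, sigma

def ou_signal(prices, theta, mu, sigma, entry_z=1.0):
    """-1/0/+1 signal from the z-score of the last price vs. the OU mean."""
    eq_sd = sigma / np.sqrt(2 * theta)   # stationary std of the OU process
    z = (prices[-1] - mu) / eq_sd
    return -1 if z > entry_z else (1 if z < -entry_z else 0)

# Synthetic mean-reverting series around 100
rng = np.random.default_rng(0)
p = [100.0]
for _ in range(2000):
    p.append(p[-1] + 0.1 * (100.0 - p[-1]) + rng.normal(0, 0.5))
p = np.array(p)
theta, mu, sigma = fit_ou(p)
```

On the synthetic series the fitted `mu` lands near the true mean of 100, and `ou_signal` shorts when price sits more than one stationary standard deviation above it.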
- Supports multiple data sources and types
- Historical and real-time data collection
- Automated data pipeline with scheduling
- Data validation and quality checks
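The exact validation checks the pipeline runs are not documented here; the sketch below shows the kind of quality report such a step might produce for an OHLCV frame (`validate_ohlcv` is a hypothetical helper, not part of the framework):

```python
import pandas as pd

def validate_ohlcv(df, timeframe="1h"):
    """Return a dict of basic quality checks for an OHLCV frame.
    Assumes a DatetimeIndex and open/high/low/close/volume columns."""
    issues = {}
    issues["nan_rows"] = int(df.isna().any(axis=1).sum())
    issues["duplicate_timestamps"] = int(df.index.duplicated().sum())
    expected = pd.date_range(df.index.min(), df.index.max(), freq=timeframe)
    issues["missing_candles"] = len(expected.difference(df.index))
    issues["high_below_low"] = int((df["high"] < df["low"]).sum())
    issues["negative_volume"] = int((df["volume"] < 0).sum())
    return issues

idx = pd.date_range("2024-01-01", periods=24, freq="1h")
df = pd.DataFrame({"open": 1.0, "high": 2.0, "low": 0.5,
                   "close": 1.5, "volume": 10.0}, index=idx)
df = df.drop(idx[5])          # simulate one missing candle
report = validate_ohlcv(df)
```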
The framework includes a powerful OHLCV data collector for Binance Futures:
```bash
# Basic usage
python src/data_collection/ohlcv_collector.py btcusdt,ethusdt 1h 2024-01-01 2024-01-02

# Advanced usage with options
python src/data_collection/ohlcv_collector.py btcusdt,ethusdt 1h 2024-01-01 2024-01-02 \
    --output-dir custom/path \
    --format parquet \
    --batch-size 500 \
    --concurrent \
    --verbose
```
Collect historical and real-time funding rates:
```bash
# Historical funding rates
python src/data_collection/funding_collector.py btcusdt,ethusdt \
    --start-date 2024-01-01 \
    --end-date 2024-01-02 \
    --output-dir data/funding \
    --format csv

# Real-time funding rates
python src/data_collection/funding_collector.py btcusdt,ethusdt \
    --realtime \
    --output-dir data/funding \
    --verbose
```
Monitor real-time liquidation events across all trading pairs:
```bash
# Real-time liquidation monitoring
python src/data_collection/liquidation_collector.py \
    --output-dir data/liquidations \
    --format csv \
    --verbose
```
Collect historical and real-time volume data:
```bash
# Historical volume data
python src/data_collection/volume_collector.py btcusdt,ethusdt \
    --timeframe 1h \
    --start-date 2024-01-01 \
    --end-date 2024-01-02 \
    --output-dir data/volume \
    --format csv

# Real-time 24h rolling volume
python src/data_collection/volume_collector.py btcusdt,ethusdt \
    --realtime \
    --output-dir data/volume \
    --verbose
```
Features for all collectors:
- Multiple symbols support with concurrent downloading
- Both historical and real-time data collection where applicable
- CSV and Parquet output formats
- Configurable batch size and retry logic
- Progress tracking and detailed logging
- UTC timestamps and proper data type handling
- Automatic data directory management
- WebSocket support for real-time data
- Robust error handling and recovery
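The collectors' actual retry implementation is not shown; the "retry logic" and "error handling and recovery" items above typically boil down to something like this generic exponential-backoff wrapper (a sketch with a hypothetical `with_retries` helper, not the framework's code):

```python
import time

def with_retries(fn, max_retries=3, base_delay=0.01, exceptions=(Exception,)):
    """Call fn(), retrying on failure with exponential backoff:
    wait base_delay, then 2x, then 4x, ... before each retry."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except exceptions:
            if attempt == max_retries:
                raise                      # out of retries: re-raise
            time.sleep(base_delay * (2 ** attempt))

# A fake network call that fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = with_retries(flaky)
```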
- Data cleaning and normalization
- Feature engineering with 100+ technical indicators
- Custom feature creation
- Time series preprocessing
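The framework's indicator implementations are not listed here; to illustrate the kind of features involved, here is a self-contained pandas sketch of two common indicators, a simple moving average and RSI (the function names are illustrative, not the framework's API):

```python
import numpy as np
import pandas as pd

def sma(close, window=20):
    """Simple moving average of closing prices."""
    return close.rolling(window).mean()

def rsi(close, window=14):
    """Relative Strength Index via Wilder-style smoothed averages."""
    delta = close.diff()
    gain = delta.clip(lower=0).ewm(alpha=1 / window, adjust=False).mean()
    loss = (-delta.clip(upper=0)).ewm(alpha=1 / window, adjust=False).mean()
    rs = gain / loss
    return 100 - 100 / (1 + rs)

close = pd.Series(np.linspace(100, 110, 50))   # steadily rising prices
ind = pd.DataFrame({"sma_20": sma(close), "rsi_14": rsi(close)})
```

On a monotonically rising series the average loss is zero, so RSI saturates at 100, a quick sanity check for the implementation.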
Available models include:
- Statistical Models (Mean Reversion, Momentum)
- Machine Learning Models (XGBoost, Neural Networks)
- Time Series Models (ARIMA, GARCH)
- Custom model implementation support
- Multiple signal generation strategies:
  - Momentum signals
  - Trend signals
  - Volatility signals
  - Composite signals
- Signal combination and filtering with customizable weights
- Position sizing integration
- Risk management rules
- Signal quality metrics and validation
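The exact combination logic is not documented here; one common scheme for "signal combination with customizable weights" is a normalized weighted sum with an entry threshold, sketched below (`combine_signals` is a hypothetical helper, not the framework's API):

```python
import numpy as np
import pandas as pd

def combine_signals(signals, weights, threshold=0.5):
    """Combine component signals (each in [-1, 1]) into one composite
    signal using normalized weights, then apply an entry threshold."""
    w = pd.Series(weights)
    w = w / w.abs().sum()                       # normalize weights to sum to 1
    composite = sum(signals[name] * w[name] for name in w.index)
    # Trade only when the weighted agreement clears the threshold
    return np.sign(composite) * (composite.abs() >= threshold)

idx = pd.RangeIndex(4)
signals = {
    "momentum":   pd.Series([1,  1, -1, 0], index=idx),
    "trend":      pd.Series([1, -1, -1, 0], index=idx),
    "volatility": pd.Series([0,  1, -1, 1], index=idx),
}
combined = combine_signals(
    signals, {"momentum": 0.5, "trend": 0.3, "volatility": 0.2}
)
```

The threshold acts as a filter: a lone weak component (e.g. only volatility firing) is not enough to open a position.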
- Event-driven backtesting engine
- Performance metrics calculation
- Risk analysis tools
- Transaction cost modeling
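The engine itself is event-driven; as a rough illustration of how transaction cost modeling enters a backtest, here is a minimal vectorized sketch that charges a proportional fee on every position change (hypothetical `backtest` helper, not the framework's engine):

```python
import pandas as pd

def backtest(prices, positions, fee_bps=10):
    """Minimal vectorized backtest: positions in {-1, 0, 1} take effect on
    the next bar, with a proportional fee charged on each position change."""
    rets = prices.pct_change().fillna(0.0)
    pos = positions.shift(1).fillna(0.0)        # trade executes next bar
    turnover = pos.diff().abs().fillna(0.0)     # size of position changes
    costs = turnover * fee_bps / 10_000         # fee in basis points
    strat = pos * rets - costs
    equity = (1 + strat).cumprod()
    return equity

prices = pd.Series([100, 101, 102, 101, 103], dtype=float)
positions = pd.Series([1, 1, 0, 1, 0], dtype=float)
equity = backtest(prices, positions)
```

Shifting positions by one bar avoids look-ahead bias, and modeling costs via turnover penalizes strategies that trade too often.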
The `examples/` directory contains various example scripts:

- `historical_data_example.py`: Data collection demo
- `feature_engineering_example.py`: Feature creation
- `mean_reversion_example.py`: Basic trading strategy
- `enhanced_mean_reversion.py`: Advanced strategy
- `crypto_pairs_trading_example.py`: Pairs trading
- `momentum_strategy.py`: Momentum-based trading
- `time_series_forecasting_example.py`: Time series prediction
- And more...
- Implement real-time data streaming
- Add OHLCV data collection for Binance Futures
- Add more machine learning models
- Enhance risk management module
- Create web dashboard for monitoring
- Add portfolio optimization tools
- Implement automated trading execution
- Add more documentation and tutorials
- Create configuration management system
- Implement performance reporting tools
- Add market regime detection
- Enhance backtesting visualization
- Add cross-validation framework
- Implement position sizing optimization