# Complete Class-Balanced Trading System V6

## Overview

An AI-based trading system that combines PatchTST (Patch Time Series Transformer) and PPO (Proximal Policy Optimization), with a primary focus on solving the class imbalance problem common in trading algorithms.
## Problem Statement

Traditional RL trading systems often suffer from:

- ~99% HOLD actions (passive behavior)
- <1% SELL/BUY actions (severe imbalance)
- A lack of diversity in trading decisions
- Poor performance across different market regimes
## Solution Implemented
- Dynamic class weight adjustment (real-time)
- Market regime-aware training (5 market states)
- Minority class amplification (up to 5x rewards)
- Anti-passive behavior mechanisms
- Multi-dimensional reward balancing
## System Architecture

### 1. Data Pipeline (`download_and_prepare_class_balanced_data`)

Raw Stock Data → Technical Features → Scaled Data → Sequences

- Input: Yahoo Finance historical data
- Features: volatility ratio, momentum, volume ratio, price position, etc.
- Output: time series sequences with future returns
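The pipeline stages above can be sketched as follows. This is a minimal illustration using synthetic prices in place of the actual Yahoo Finance download (the real pipeline would fetch data via `yfinance`); the feature formulas and helper names here are illustrative assumptions, not the project's exact implementation.

```python
# Sketch of the pipeline: prices -> technical features -> sliding-window sequences.
# Synthetic prices stand in for the yfinance download; feature definitions are
# plausible versions of "volatility ratio, momentum, volume ratio, price position".
import numpy as np
import pandas as pd

def build_features(close: pd.Series, volume: pd.Series) -> pd.DataFrame:
    df = pd.DataFrame({"close": close, "volume": volume})
    ret = df["close"].pct_change()
    df["momentum"] = df["close"].pct_change(5)                        # 5-day momentum
    df["volatility_ratio"] = ret.rolling(5).std() / ret.rolling(20).std()
    df["volume_ratio"] = df["volume"] / df["volume"].rolling(20).mean()
    lo, hi = df["close"].rolling(20).min(), df["close"].rolling(20).max()
    df["price_position"] = (df["close"] - lo) / (hi - lo)             # 0..1 within 20-day range
    return df.dropna()

def build_sequences(features: np.ndarray, future_ret: np.ndarray, lookback: int = 20):
    # One sample per time step: the previous `lookback` rows of features,
    # labeled with the return realized over the reward horizon.
    X = np.stack([features[i:i + lookback] for i in range(len(features) - lookback)])
    y = future_ret[lookback:]
    return X, y

rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))))
volume = pd.Series(rng.integers(1_000, 10_000, 300).astype(float))
feats = build_features(close, volume)
future_ret = feats["close"].pct_change(3).shift(-3).to_numpy()  # 3-day reward horizon
X, y = build_sequences(feats.to_numpy(), future_ret, lookback=20)
print(X.shape)  # (n_samples, lookback_days, n_features)
```

With `lookback_days=20` and the six columns above, each training sample is a `(20, 6)` window paired with the 3-day-ahead return used by the reward horizon.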
### 2. Feature Extraction (`ClassBalancedPatchTSTFeatureExtractor`)

Sequences → Patches → Embeddings → Transformer → Features

- PatchTST architecture: patch-based transformer for time series
- Parameters: patch_size=8, stride=4, embedding_size=128
- Enhancement: class attention layer for better representations
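With the stated parameters, a lookback window of 20 steps yields `(20 - 8) // 4 + 1 = 4` overlapping patches per channel. A minimal NumPy sketch of the patching step (the actual extractor does the same with `torch.Tensor.unfold`, then adds positional embeddings and the transformer on top; the projection weight below is random, standing in for the learned patch embedding):

```python
import numpy as np

seq = np.random.randn(32, 6, 20)  # (batch, channels, lookback_days)

# All windows of length patch_size=8, then keep every stride=4-th window.
windows = np.lib.stride_tricks.sliding_window_view(seq, 8, axis=-1)
patches = windows[:, :, ::4, :]   # -> (32, 6, 4, 8): 4 patches per channel

# Each 8-step patch is linearly projected to embedding_size=128:
W = np.random.randn(8, 128)
tokens = patches @ W              # (32, 6, 4, 128), the transformer's input
```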
### 3. Class Imbalance Handler (`ComprehensiveClassImbalanceHandler`)

Action History → Distribution Analysis → Weight Calculation → Balance Correction

- Market regimes: BULL_STRONG, BULL_WEAK, SIDEWAYS, BEAR_WEAK, BEAR_STRONG
- Dynamic weights: real-time adjustment every 10 episodes
- Stratified training: market regime-aware episode sampling
### 4. Trading Environment (`ClassBalancedTradingEnvironment`)

Observation → Action → Reward Calculation → State Update

- Actions: 0=SELL, 1=HOLD, 2=BUY
- Rewards: directional accuracy + diversity + trading performance
- Balancing: class weights × regime multipliers
### 5. RL Training (`train_class_balanced_ppo_agent`)

Environment → PPO Agent → Training Loop → Model Evaluation

- Algorithm: PPO with enhanced parameters
- Monitoring: real-time class balance tracking
- Training: stratified episode sampling for balanced learning
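The stratified sampling mentioned above can be illustrated with a small standalone sketch. This is an assumed mechanism, not the project's exact code: instead of drawing episode start points uniformly (which over-represents the dominant regime), regimes are visited round-robin so rare regimes appear as often as common ones.

```python
# Regime-stratified episode sampling (illustrative): round-robin over regimes,
# then pick a random start index within the chosen regime.
import random
from collections import defaultdict
from itertools import cycle

def stratified_episode_starts(regime_labels, n_episodes, seed=0):
    rng = random.Random(seed)
    by_regime = defaultdict(list)
    for idx, regime in enumerate(regime_labels):
        by_regime[regime].append(idx)
    order = cycle(sorted(by_regime))  # round-robin over regime names
    return [rng.choice(by_regime[next(order)]) for _ in range(n_episodes)]

# 90/5/5 base rates, but each regime supplies exactly one third of the episodes.
labels = ["SIDEWAYS"] * 90 + ["BULL_STRONG"] * 5 + ["BEAR_STRONG"] * 5
starts = stratified_episode_starts(labels, 30)
```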
## Key Components in Detail

### Class Imbalance Detection

```python
# Threshold-based detection
imbalance_detection_threshold = 0.80  # >80% single action = imbalance
severe_imbalance_threshold = 0.90     # >90% = severe imbalance
```
### Dynamic Weight Calculation

```python
# Progressive scaling based on severity
if severe_imbalance:
    weights[action] = (1.0 / frequency) ** 1.2  # Stronger correction
else:
    weights[action] = (1.0 / frequency) ** 0.8  # Moderate correction
```
### Market Regime Classification

```python
# Based on returns and volatility
if mean_return > threshold * 2 and volatility < vol_threshold:
    return "BULL_STRONG"
elif mean_return > threshold:
    return "BULL_WEAK"
# ... other regimes
```
### Reward Balancing

```python
# Multi-dimensional balancing
balanced_reward = base_reward * class_weight * regime_multiplier
```
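Putting the fragments above together, here is a minimal, self-contained version of the detection → weighting → reward-balancing flow. The exponents and thresholds follow the snippets above; everything else (function names, the epsilon for unseen actions) is an illustrative assumption, and the real handler additionally tracks history per regime.

```python
from collections import Counter

def compute_class_weights(actions, severe_threshold=0.90):
    """Inverse-frequency weights with severity-based progressive scaling."""
    freq = Counter(actions)
    n = len(actions)
    max_share = max(freq.values()) / n
    exponent = 1.2 if max_share > severe_threshold else 0.8  # stronger correction when severe
    # Unseen actions get a tiny frequency floor so their weight is large but finite.
    return {a: (1.0 / (freq.get(a, 0) / n or 1e-3)) ** exponent for a in (0, 1, 2)}

def balance_reward(base_reward, action, weights, regime_multiplier=1.0):
    return base_reward * weights[action] * regime_multiplier

actions = [1] * 95 + [0] * 3 + [2] * 2   # 95% HOLD: severe imbalance
w = compute_class_weights(actions)
# Minority actions (SELL=0, BUY=2) now earn much larger weights than HOLD=1,
# so the same base reward pays more for the under-used actions.
```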
## Configuration Parameters

### Trading Parameters

- `symbol`: "BBCA.JK" (Indonesian stock)
- `lookback_days`: 20 (input sequence length)
- `reward_horizon_days`: 3 (reward calculation period)
- `initial_capital`: 100,000

### Class Balancing Parameters

- `imbalance_detection_threshold`: 0.75
- `minority_class_amplification`: 5.0x
- `weight_update_frequency`: 10 episodes
- `max_consecutive_holds`: 3

### Model Parameters

- PatchTST: patch_size=8, embedding_size=128, num_layers=3
- PPO: learning_rate=0.0001, n_steps=2048, batch_size=128
## Usage

### Basic Usage

```python
from model_patcht import ClassBalancedTradingConfig, run_complete_class_balanced_pipeline

# Create configuration
config = ClassBalancedTradingConfig(
    symbol="BBCA.JK",
    lookback_days=20,
    reward_horizon_days=3,
    enable_class_balancing=True,
    verbose=True
)

# Run the complete pipeline
feature_extractor, ppo_agent, data = run_complete_class_balanced_pipeline(config)
```
### Custom Configuration

```python
config = ClassBalancedTradingConfig(
    symbol="TLKM.JK",
    lookback_days=30,
    reward_horizon_days=5,
    # Enhanced class balancing
    imbalance_detection_threshold=0.70,
    minority_class_amplification=7.0,
    weight_update_frequency=5,
    # Anti-passive behavior
    max_consecutive_holds=2,
    consecutive_hold_penalty=0.005,
    verbose=True
)
```
## Expected Results

### Before (Typical RL Trading)

- SELL: ~1%
- HOLD: ~99%
- BUY: ~0%
- Status: SEVERE IMBALANCE
### After (Class-Balanced System)

- SELL: 25-35%
- HOLD: 30-50%
- BUY: 25-35%
- Status: BALANCED
### Performance Improvements

- 3-5x more active trading
- Better market prediction accuracy
- Enforced action diversity
- Improved risk-adjusted returns
## Evaluation Metrics
### Action Distribution Analysis
- Overall distribution percentages
- Regime-specific action patterns
- Episode-level consistency
- Balance status classification
### Trading Performance
- Total return percentage
- Sharpe ratio calculation
- Maximum drawdown
- Number of profitable trades
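The Sharpe ratio and maximum drawdown above follow standard formulas; a minimal sketch (assuming daily returns and 252 trading days per year, the project's evaluator may annualize differently):

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period returns."""
    excess = np.asarray(returns) - risk_free
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a negative fraction."""
    equity = np.asarray(equity_curve, dtype=float)
    peaks = np.maximum.accumulate(equity)       # running high-water mark
    return float(((equity - peaks) / peaks).min())

equity = [100, 110, 99, 120, 90, 130]
# Worst decline: peak 120 -> trough 90, i.e. a -25% drawdown.
```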
### Class Balance Monitoring
- Imbalance corrections count
- Weight evolution tracking
- Severe imbalance detection
- Real-time balance status
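The real-time balance status can be computed directly from the recent action history. A sketch using the thresholds quoted in this document (0.80 detection, 0.90 severe, <60% max single action for excellent balance); the exact boundary between GOOD and MODERATE is an illustrative assumption:

```python
from collections import Counter

def balance_status(actions):
    """Classify the action history by the share of its most frequent action."""
    share = max(Counter(actions).values()) / len(actions)
    if share > 0.90:
        return "SEVERE"    # severe_imbalance_threshold
    if share > 0.80:
        return "MODERATE"  # imbalance_detection_threshold
    if share > 0.60:
        return "GOOD"
    return "EXCELLENT"     # max single action < 60%

print(balance_status([1] * 99 + [0]))                   # SEVERE
print(balance_status([0] * 30 + [1] * 40 + [2] * 30))   # EXCELLENT
```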
## Installation Requirements

```shell
# Core dependencies
pip install tsai==0.3.4
pip install "stable-baselines3[extra]"
pip install gymnasium
pip install scikit-learn pandas numpy torch
pip install yfinance

# Optional, for enhanced features
pip install plotly  # visualization
pip install wandb   # experiment tracking
```
## Enhancement Opportunities

### 1. Model Architecture

- Multi-scale PatchTST: different patch sizes for various patterns
- Ensemble methods: multiple models with different parameters
- Attention visualization: model interpretability improvements
- Feature selection: automated importance analysis
### 2. Advanced Class Imbalance Solutions

- SMOTE integration: synthetic minority oversampling
- Meta-learning: learn optimal weights from observed patterns
- Adaptive thresholds: dynamic adjustment based on volatility
- Cost-sensitive learning: asymmetric loss functions
### 3. Trading Strategy Enhancements

- Multi-asset portfolio: correlation-aware trading
- Advanced risk management: VaR, CVaR constraints
- Transaction cost modeling: realistic trading costs
- Market impact modeling: price impact for large orders
### 4. Technical Improvements

- Distributed training: multi-GPU support
- Model compression: quantization for production
- Real-time pipeline: optimized inference
- Advanced backtesting: comprehensive historical testing
### 5. Data Enhancements
- Alternative Data: News sentiment, social media
- Multi-timeframe: Intraday high-frequency data
- Market Microstructure: Order book integration
- Economic Indicators: Macro-economic factors
## Research Contributions

### Novel Techniques Implemented

- Dynamic class reweighting: real-time adjustment based on action distribution
- Market regime stratification: context-aware training with 5 market states
- Multi-dimensional reward balancing: comprehensive reward enhancement
- Anti-passive mechanisms: forced action diversity for better exploration
- Progressive imbalance correction: severity-based weight scaling
### Academic Impact

- Addresses the fundamental class imbalance problem in financial RL
- Provides a practical solution to passive behavior in trading algorithms
- Demonstrates the effectiveness of market regime-aware training
- Shows significant improvement in action distribution balance
## Performance Monitoring

### Real-time Tracking

- Action distribution every 10 episodes
- Class weight evolution history
- Market regime transition tracking
- Imbalance correction frequency
### Evaluation Timeframes
- Short-term: 200 steps evaluation
- Medium-term: 500 steps evaluation
- Long-term: 1000 steps evaluation
### Key Performance Indicators

- Balance status: EXCELLENT → GOOD → MODERATE → SEVERE
- Max single action: target <60% for excellent balance
- Class-balanced rewards: additional rewards earned from balancing
- Regime appropriateness: action suitability for market conditions
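The regime-appropriateness KPI can be scored as the share of actions that suit the prevailing regime. The mapping below is an illustrative assumption (BUY fits bull regimes, SELL fits bear regimes, anything is acceptable in SIDEWAYS), not the project's definition:

```python
# Hypothetical regime -> acceptable actions mapping (0=SELL, 1=HOLD, 2=BUY).
APPROPRIATE = {
    "BULL_STRONG": {2}, "BULL_WEAK": {1, 2},
    "SIDEWAYS": {0, 1, 2},
    "BEAR_WEAK": {0, 1}, "BEAR_STRONG": {0},
}

def regime_appropriateness(pairs):
    """Share of (regime, action) pairs where the action suits the regime."""
    hits = sum(action in APPROPRIATE[regime] for regime, action in pairs)
    return hits / len(pairs)

score = regime_appropriateness([("BULL_STRONG", 2), ("BEAR_STRONG", 1)])
# One appropriate action out of two -> 0.5
```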
## Success Criteria

### Primary Objectives

- Eliminate passive behavior (HOLD <70%)
- Achieve balanced actions (each action 20-40%)
- Maintain trading performance (positive returns)
- Adapt to market regimes (regime-appropriate actions)
### Secondary Objectives

- Real-time monitoring (class balance tracking)
- Dynamic adaptation (weight adjustment)
- Professional implementation (production-ready code)
- Research contribution (novel imbalance solutions)
## References & Research

### Core Technologies
- PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers"
- PPO: "Proximal Policy Optimization Algorithms"
- Class Imbalance: "Deep Reinforcement Learning for Imbalanced Classification"
- Market Regimes: "Regime-aware Reinforcement Learning for Financial Trading"
### Implementation Details

- Market regime classification using return and volatility thresholds
- Dynamic reweighting with progressive scaling
- Stratified episode sampling for balanced training
- Multi-dimensional reward enhancement with class weights
## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/enhancement`)
3. Implement improvements with proper testing
4. Submit a pull request with a detailed description
### Priority Areas for Contribution
- Multi-asset portfolio trading
- Alternative data integration
- Advanced risk management
- Real-time deployment optimization
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Conclusion

The Complete Class-Balanced Trading System V6 is a comprehensive solution to the class imbalance problem in algorithmic trading. By combining PatchTST for feature extraction with a class-balancing-enhanced PPO agent, the system transforms passive behavior into an active, balanced trading strategy.

**Key achievement**: from HOLD ≈ 99% to a balanced distribution (SELL = 25-35%, HOLD = 30-50%, BUY = 25-35%)

*Last updated: 2024 | Version: 6.0 | Status: Production Ready*