November 6, 2025
By Coineras Team

AI Trading Trial: ChatGPT Drops 60% as DeepSeek, QWEN3 Manage Gains

In October, a live experiment gave six popular language models $10,000 each to trade six major cryptocurrencies autonomously under identical conditions. After two weeks, four models were in the red: ChatGPT was down 60% and Gemini was down 57%. Only DeepSeek and QWEN3 finished with gains, and trading fees trimmed part of their profits.

Key Developments

  • Each AI model received real funds: $10,000 per model
  • Trading universe: six large cryptocurrencies
  • Identical inputs: the models received the same market data every three minutes
  • Fully autonomous decision-making with no human intervention
  • Evaluation window: two weeks in October

Results at a Glance

  • ChatGPT: down 60%
  • Gemini: down 57%
  • DeepSeek: profitable, but gains trimmed by fees
  • QWEN3: profitable, but gains trimmed by fees
  • Overall: four of six models ended in loss after two weeks

Analytics dashboards tracking the trial showed a “Total Account Value” chart with pronounced swings across portfolios. Recent readings of $12,422.83, $9,633.82, $8,859.63, $7,958.53, and $5,669.06 show how widely outcomes diverged from the $10,000 starting balance under volatile conditions. The visualization was associated with Nofi.ai.
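
To put the dashboard readings in context, the short Python sketch below converts each quoted account value into a percentage return against the $10,000 starting stake. It assumes every reading belongs to an account that started at exactly $10,000. The article does not say which model each value belongs to or when the snapshot was taken, so the figures need not match the final two-week results. The function name `pct_return` is ours, used only for illustration.

```python
# Per-model starting stake stated in the article.
STARTING_CAPITAL = 10_000.00

# Account values quoted from the dashboard. Which model each one
# belongs to is not stated, so they are left unlabeled here.
account_values = [12_422.83, 9_633.82, 8_859.63, 7_958.53, 5_669.06]


def pct_return(value, start=STARTING_CAPITAL):
    """Return the percentage gain or loss of `value` relative to `start`."""
    return (value - start) / start * 100.0


returns = [round(pct_return(v), 2) for v in account_values]

for value, ret in zip(account_values, returns):
    print(f"${value:,.2f} -> {ret:+.2f}%")
```

Under that assumption, the readings span roughly +24% at the top to about -43% at the bottom, with only one of the five quoted accounts above its starting balance.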

Why It Matters

These findings highlight the limits of general-purpose language models in live, short-horizon crypto trading. Key friction points likely included:

  • Transaction costs and fees eroding thin edges
  • High volatility and rapid regime shifts typical of crypto markets
  • Latency and execution constraints for frequent decision cycles
  • Risk management challenges over a brief, high-variance window
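
The first point, fees eroding thin edges, can be illustrated with a small Python sketch. All numbers here are hypothetical and were not reported by the trial: a 0.10% gross gain per round-trip trade, a 0.05% fee on each side of the trade, and 200 trades. The helper `final_equity` is an illustrative name, not part of any trading library.

```python
def final_equity(start, gross_edge_per_trade, fee_per_side, n_trades):
    """Compound a fixed gross edge per round trip, paying a fee on entry and exit.

    Simplified model: every trade earns the same gross return and the
    fee is a fixed fraction of equity on each side of the trade.
    """
    equity = start
    for _ in range(n_trades):
        equity *= 1 + gross_edge_per_trade   # gross gain on the round trip
        equity *= (1 - fee_per_side) ** 2    # fee charged at entry and at exit
    return equity


# Hypothetical inputs chosen for illustration only.
START = 10_000.00
EDGE = 0.0010   # 0.10% gross gain per round trip
FEE = 0.0005    # 0.05% fee per side
TRADES = 200

gross = final_equity(START, EDGE, 0.0, TRADES)
net = final_equity(START, EDGE, FEE, TRADES)

print(f"Without fees: ${gross:,.2f}")
print(f"With fees:    ${net:,.2f}")
```

With these inputs the per-trade multiplier after fees is 1.001 × 0.9995², which is slightly below 1, so a strategy that is profitable before costs ends marginally below its starting balance once fees are paid. The more often a model trades, the more this drag compounds.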

While AI remains promising for signal discovery and automation, purpose-built quantitative systems and robust cost-aware execution typically underpin durable performance. Language models, designed primarily for natural language understanding, may require domain-specific training, hybrid architectures, or tighter constraints to perform reliably in live trading.

Market Impact

  • For traders: The results are a reminder that transaction costs, slippage, and risk controls can outweigh model-driven signals over short periods.
  • For institutions: Finance-native AI and systematic strategies with rigorous validation and monitoring appear more suitable than off-the-shelf LLMs for live trading.
  • For developers: Future iterations may benefit from longer evaluation windows, diversified strategies, and cost-aware reinforcement learning frameworks.

Looking Ahead

Further experiments with extended timelines, clearer execution rules, and hybrid AI-quant approaches could better test whether language models can add a durable edge in crypto trading. For now, the October trial underscores that most general-purpose LLMs struggled to preserve capital in live conditions, while the two profitable models saw part of their advantage eaten by fees.
