arXiv:2602.07085

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Published on Feb 6 · Submitted by Zhi Yang on Feb 10 · #3 Paper of the day

Abstract

Financial markets are noisy and non-stationary, making alpha mining highly sensitive to noise in backtesting results and to sudden market regime shifts. While recent agentic frameworks improve the automation of alpha mining, they often lack controllable multi-round search and reliable reuse of validated experience. To address these challenges, we propose QuantaAlpha, an evolutionary alpha mining framework that treats each end-to-end mining run as a trajectory and improves factors through trajectory-level mutation and crossover operations. QuantaAlpha localizes suboptimal steps in each trajectory for targeted revision and recombines complementary high-reward segments to reuse effective patterns, enabling structured exploration and refinement across mining iterations. During factor generation, QuantaAlpha enforces semantic consistency across the hypothesis, the factor expression, and the executable code, while constraining the complexity and redundancy of the generated factors to mitigate crowding. Extensive experiments on the China Securities Index 300 (CSI 300) demonstrate consistent gains over strong baseline models and prior agentic systems. Using GPT-5.2, QuantaAlpha achieves an Information Coefficient (IC) of 0.1501, an Annualized Rate of Return (ARR) of 27.75%, and a Maximum Drawdown (MDD) of 7.98%. Moreover, factors mined on CSI 300 transfer effectively to the China Securities Index 500 (CSI 500) and the Standard & Poor's 500 Index (S&P 500), delivering cumulative excess returns of 160% and 137% over four years, respectively, indicating the strong robustness of QuantaAlpha under market distribution shifts.
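The trajectory-level evolution described above can be sketched concretely. The snippet below is an illustrative reconstruction rather than the paper's implementation: `Step`, `Trajectory`, the per-step `reward` field, and the `revise_step` LLM hook are all hypothetical names, and the fitness function stands in for the paper's backtest-based scoring.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Step:
    """One step of an end-to-end mining run (hypothesis, expression, code, ...)."""
    content: str
    reward: float  # hypothetical per-step score, e.g. validation IC contribution

@dataclass
class Trajectory:
    steps: list[Step] = field(default_factory=list)

    def fitness(self) -> float:
        # Stand-in for trajectory quality, e.g. backtest IC of the final factor.
        return sum(s.reward for s in self.steps) / max(len(self.steps), 1)

def mutate(traj: Trajectory, revise_step) -> Trajectory:
    """Localize the weakest (lowest-reward) step and revise it in place."""
    worst = min(range(len(traj.steps)), key=lambda i: traj.steps[i].reward)
    new_steps = list(traj.steps)
    new_steps[worst] = revise_step(traj, worst)  # LLM-driven revision, returns a Step (stubbed)
    return Trajectory(new_steps)

def crossover(a: Trajectory, b: Trajectory) -> Trajectory:
    """Recombine complementary segments; assumes both parents have >= 2 steps."""
    cut = random.randint(1, min(len(a.steps), len(b.steps)) - 1)
    return Trajectory(a.steps[:cut] + b.steps[cut:])

def evolve(population: list[Trajectory], revise_step, generations: int = 5) -> Trajectory:
    """Keep the fitter half, then grow it back via mutation and crossover."""
    for _ in range(generations):
        population.sort(key=Trajectory.fitness, reverse=True)
        parents = population[: max(2, len(population) // 2)]  # needs >= 2 trajectories
        children = [mutate(random.choice(parents), revise_step) for _ in parents]
        children += [crossover(*random.sample(parents, 2)) for _ in parents]
        population = parents + children
    return max(population, key=Trajectory.fitness)
```

Selecting the lowest-reward step for mutation mirrors the paper's targeted revision of suboptimal steps, and single-point crossover stands in for the recombination of high-reward segments; the actual operators are LLM-driven and richer than this sketch.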

Community

Paper author · Paper submitter

QuantaAlpha tackles noisy, non-stationary markets by evolving alpha-mining trajectories via mutation and crossover, enabling controllable multi-round search and reliable reuse of successful patterns. It enforces hypothesis–factor–code semantic consistency and limits complexity to reduce crowding. On CSI 300 it improves over strong baselines (GPT-5.2: IC 0.1501, ARR 27.75%, MDD 7.98%) and transfers well to CSI 500 and the S&P 500 under distribution shifts.
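For reference, the Information Coefficient quoted above is conventionally the average cross-sectional correlation between factor values and next-period returns. A minimal sketch follows, assuming a date-by-stock layout for both inputs (the paper's exact evaluation pipeline and data schema are not specified here); the Spearman rank-IC variant is shown, Pearson being the other common convention.

```python
import pandas as pd

def information_coefficient(factor: pd.DataFrame,
                            fwd_returns: pd.DataFrame) -> float:
    """Mean daily cross-sectional Spearman correlation between factor
    values and next-period returns.

    Assumes both frames are indexed by date with one column per stock,
    aligned on both axes (a layout assumption, not the paper's schema).
    """
    daily_ic = factor.corrwith(fwd_returns, axis=1, method="spearman")
    return float(daily_ic.mean())
```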
