Datacurve, a Y Combinator-backed startup focused on building advanced datasets for AI and software development, has closed a $15 million Series A round led by Chemistry. The new capital follows an earlier $2.7 million seed round, bringing the company’s total funding to about $17.7 million.
Founded by Serena Ge and Charley Lee, Datacurve aims to solve a critical bottleneck in AI training: obtaining complex, real-world data that goes beyond simple training sets. The company’s platform produces research-grade coding challenges, debugging tasks, and private repository benchmarks designed to help AI models improve reasoning, problem-solving, and coding performance.
Datacurve’s bounty-based contributor platform, Shipd, recruits experienced engineers, including talent from DeepMind, OpenAI, Anthropic, and Vercel, who submit high-quality data through structured challenges. To date, Shipd has paid out over $1 million in bounties, creating an incentive-driven marketplace for data contributions.
“We treat this as a consumer product, not a data labelling operation,” said Serena Ge, co-founder and CEO. “We spend a lot of time optimising the experience to attract and retain the engineers whose contributions matter most.”
As AI models become more sophisticated, demand for more nuanced post-training datasets is growing rapidly. Datacurve aims to fill this gap by supplying the evaluation and fine-tuning data that models need to perform better on real-world tasks.
Looking ahead, Ge and Lee plan to scale their team and platform further, with ambitions to extend beyond code data into sectors like finance, marketing, and healthcare.