Cosine, a human reasoning lab building artificial general developers, has raised $2.5 million in funding. The round was led by US venture firms Uphonest (which invested in Sending Labs) and SOMA Capital, with participation from Lakestar (which backed Isar Aerospace and Firenze) and Focal, amongst others.
The startup also announced a breakthrough in AI-assisted software development with its artificial developer, Genie.
How was the idea born?
The startup was founded in 2022 by Alistair Pullen (who published and monetised his first software application aged 9), Sam Stenner, and Yang Li. Its software grew out of the founders’ realisation that large language models (LLMs) could perform complex coding tasks by imitating the behaviour of human software developers. Their primary goal is to create truly resilient AI capable of tackling open-ended problems across various domains.
What does the company do?
Based in London, Cosine is at the forefront of codifying human reasoning in AI systems. The Y Combinator startup says it is delivering the world’s most human-like, autonomous AI software developer to any company looking for ready-to-go AI developers to join its team without the friction of hiring.
Cosine’s artificial developer, Genie, is designed to work like a very good human developer. As per the company, it can solve bugs, build features, refactor code, and everything in between, either fully autonomously or collaboratively with other developers. The AI co-developer also lets you ask complex coding questions, get explanations about features, and plan new code in plain English.
Beats rivals in AI benchmark
The company recently announced that it achieved a 30% score on SWE-Bench, the industry standard for evaluating software engineering skills in AI models. This represents a 56% improvement over the previous best score, held by Factory at 19%, and a 2196% improvement over OpenAI’s GPT-4 score of 1.31%. Notably, this is the highest score achieved by the company to date.
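For readers who want to check the arithmetic, the relative improvements can be reproduced with a short calculation. This is a minimal sketch using only the rounded scores quoted above; the function name `relative_improvement` is illustrative and not from the company. With these rounded inputs the results come out at roughly 58% and 2190%, slightly off the reported 56% and 2196%, a gap that is most likely down to rounding in the quoted scores.

```python
def relative_improvement(new_score: float, old_score: float) -> float:
    """Return the percentage by which new_score exceeds old_score."""
    return (new_score - old_score) / old_score * 100.0


# Scores as quoted in the article (rounded; the exact values are an unknown here).
genie = 30.0
factory = 19.0
gpt4 = 1.31

print(f"Genie vs Factory: {relative_improvement(genie, factory):.1f}% improvement")
print(f"Genie vs GPT-4:   {relative_improvement(genie, gpt4):.1f}% improvement")
```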
The benchmark includes real-world human tasks in software architecture, debugging and the implementation of new features in existing codebases, and assesses an AI model’s ability to understand, modify, and generate complex code.
“Our breakthrough in codifying human reasoning is allowing us to train AI models to operate far beyond the narrow range of tasks and tightly restricted prompts currently available to teams developing software”, commented Cosine CEO Alistair Pullen.
“We’ve developed a product capable of beating OpenAI and others in completing complex software tasks – in a fraction of the time and money it has taken our competitors to achieve the same results. We’re on course to radically transform the way development and developers work”, continued COO Yang Li.
“We are focused on creating a colleague, not a co-pilot”, commented Sam Stenner, CIO at Cosine. “After we figured out how to generate data sets that codify human reasoning, which can then be used to train LLMs, we knew the potential for what we had built and worked with OpenAI to fine-tune their largest context window LLMs. We’re confident we now have the capabilities to consistently beat our own top score”.
“Cosine is not just improving AI; they’re fundamentally teaching AI to reason, providing companies with a true AI colleague”, said Ellen Ma, Partner at Uphonest Capital.
“I’ve seen thousands of AI startups and no one has managed to capture human reasoning like Cosine. Genie is proof that their vision, strategy and team is perfect to get us closer to AGI”, said Ben Tossell, Founder of Ben’s Bites.