'AI Scientist' Unveiled, Claimed Capable of Automating Scientific Discovery


TEHRAN (Tasnim) – Sakana AI Labs has announced the development of an artificial intelligence system that it claims can fully automate the scientific discovery process, raising questions about the future of research and the role of human scientists.

Scientific discovery is a complex process, requiring a deep understanding of existing knowledge, the identification of gaps, the formulation of research questions, and the execution of experiments to find answers.

Once the results are obtained, they must be analyzed and interpreted, often leading to new questions.

The question arises: Can such a sophisticated process be automated?

Last week, Sakana AI Labs announced the creation of an "AI scientist," an artificial intelligence system designed to make scientific discoveries in the field of machine learning.

The system uses generative large language models (LLMs), similar to those behind AI chatbots like ChatGPT, to brainstorm research ideas, select the most promising ones, code new algorithms, plot results, and write papers summarizing the experiments, complete with references.
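
Sakana's actual pipeline is not detailed in this article, but the loop it describes follows a recognizable pattern. The sketch below is a minimal illustration of that pattern, assuming a hypothetical llm() helper and a hypothetical sandboxed code runner; none of the names or prompts are taken from Sakana's system.

```python
# A minimal sketch of the experiment loop described above. The llm()
# and execute_sandboxed() helpers are placeholders for whatever model
# API and code-execution sandbox an implementer might use; this is an
# illustration of the general pattern, not Sakana's implementation.

def llm(prompt: str) -> str:
    """Placeholder: send a prompt to a language model, return its text."""
    raise NotImplementedError

def execute_sandboxed(code: str) -> str:
    """Placeholder: run generated code in isolation, return its output."""
    raise NotImplementedError

def run_ai_scientist(topic: str, n_ideas: int = 5) -> str:
    # 1. Brainstorm several candidate research ideas.
    ideas = [llm(f"Propose a novel ML research idea about {topic}.")
             for _ in range(n_ideas)]

    # 2. Ask the model to select the most promising idea.
    best = llm("Pick the most promising idea:\n" + "\n".join(ideas))

    # 3. Generate experiment code, then execute it to collect results.
    code = llm(f"Write a Python experiment to test this idea:\n{best}")
    results = execute_sandboxed(code)

    # 4. Write up the findings as a paper draft.
    return llm(f"Write a paper. Idea: {best}\nResults: {results}")
```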

Sakana claims that the AI tool can complete the entire lifecycle of a scientific experiment for just $15 per paper, less than the cost of a typical lunch.

However, this bold claim raises questions. Even if the technology works as intended, would an influx of AI-generated research papers benefit science?

A significant portion of scientific knowledge is openly accessible, with millions of papers available online through repositories like arXiv and PubMed.

LLMs, trained on this data, can mimic the language and patterns of scientific writing, making it unsurprising that they could produce what appears to be a legitimate scientific paper.

However, the crucial question remains: Can an AI system produce papers that contribute genuinely novel ideas?

Scientists value new and significant contributions to knowledge, not just reiterations of existing information.

Sakana's system attempts to address this by scoring new paper ideas for their similarity to existing research, discarding those that are too similar.
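
The article does not say how Sakana implements this similarity check, but a common way to build such a filter is to embed each candidate idea and compare it against embeddings of existing abstracts. The sketch below uses the open-source sentence-transformers library; the model name and the 0.85 cutoff are illustrative assumptions, not details from Sakana's system.

```python
# A generic novelty filter: embed candidate ideas, compare them with
# prior abstracts via cosine similarity, and discard any idea whose
# closest match in existing work exceeds a chosen threshold.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def filter_novel(ideas, existing_abstracts, threshold=0.85):
    idea_vecs = model.encode(ideas, convert_to_tensor=True)
    prior_vecs = model.encode(existing_abstracts, convert_to_tensor=True)
    sims = util.cos_sim(idea_vecs, prior_vecs)  # pairwise cosine similarity
    # Keep an idea only if its nearest neighbor in prior work is below the cutoff.
    return [idea for idea, row in zip(ideas, sims)
            if row.max().item() < threshold]
```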

Additionally, the system includes a "peer review" step, using another LLM to assess the quality and novelty of the generated paper.
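
An automated review step of this kind typically means prompting a second model to score a draft against a rubric. The sketch below shows one way to do that with the OpenAI Python client; the model name, prompt, and JSON rubric are stand-ins, since Sakana's actual reviewer setup is not described here.

```python
# A sketch of an LLM "peer review" pass: a second model scores the
# draft for quality and novelty and returns a structured verdict.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def review_paper(draft: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": ("You are a strict ML conference reviewer. Reply "
                         "only with JSON: {\"novelty\": 1-10, "
                         "\"quality\": 1-10, \"accept\": true/false}.")},
            {"role": "user", "content": draft},
        ],
    )
    return json.loads(response.choices[0].message.content)
```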

Despite these measures, feedback on Sakana AI's output has been mixed. Some have described the results as "endless scientific slop," and even the system's internal reviews often deem the papers weak.

While improvements are expected as technology evolves, the value of automated scientific papers remains uncertain.

The ability of LLMs to evaluate research quality is also under scrutiny. Recent work, soon to be published in Research Synthesis Methods, indicates that LLMs are not yet reliable in assessing bias in medical research, although this may improve over time.

Sakana's system is currently limited to computational research, which involves experiments conducted with code—a format that LLMs are well-suited to generate.

AI tools have long been developed to assist scientists, helping them find relevant publications and synthesize existing research.

Specialized search tools like Semantic Scholar, Elicit, Research Rabbit, scite, and Consensus use AI to support scientists in navigating the vast amounts of published work.

Text mining tools, such as PubTator, help identify key points in papers, while machine learning applications like RobotReviewer assist in the synthesis and analysis of medical evidence.

These tools aim to enhance scientists' efficiency, not replace them.

However, Sakana AI's vision of a "fully AI-driven scientific ecosystem" raises concerns.

One major worry is that if AI-generated papers flood the scientific literature, future AI systems could be trained on AI output, potentially leading to a "model collapse" where innovation diminishes.
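
The mechanism behind model collapse can be illustrated with a deliberately simplified toy experiment: fit a distribution to data, then train each new "generation" only on samples drawn from the previous generation's fit. The Gaussian setup and sample sizes below are illustrative assumptions, a stand-in for the far more complex case of LLMs training on LLM output.

```python
# Toy illustration of model collapse: each generation is fit only to
# synthetic samples from the previous generation's model. Because every
# round compounds the sampling error of the last, the estimated
# distribution tends to drift and its spread tends to shrink over time,
# a simplified analogue of diversity loss in recursively trained models.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=50)  # original "human" data

for generation in range(20):
    mu, sigma = data.mean(), data.std()
    if generation % 5 == 0:
        print(f"gen {generation:2d}: mean={mu:+.3f} std={sigma:.3f}")
    # The next generation trains only on output of the current fit.
    data = rng.normal(loc=mu, scale=sigma, size=50)
```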

The implications extend beyond AI systems, with concerns about the proliferation of low-quality research.

With the possibility of producing a scientific paper for just $15, the risk of "paper mills" churning out fake studies increases.

The scientific community, already struggling with peer review challenges, could be overwhelmed by a deluge of questionable research.

Science is built on trust, with a strong emphasis on the integrity of the scientific process.

The introduction of AI as a key player in this ecosystem raises fundamental questions about the value of this process and the trustworthiness of AI-generated science.

Is this the direction we want scientific research to take?