SearchGPT Could Be Google’s Biggest Threat Ever. But to Really Become a Threat, It Needs a Miracle: Not to Screw It Up

If you ask ChatGPT which number is larger, 9.11 or 9.9, it will answer incorrectly. In fact, it’s not the only AI model that gets this simple math question wrong.

Claude also answers incorrectly, but at least Gemini, Le Chat (Mistral’s chatbot), Copilot, and Llama 3.1 (405B) get it right, and some of them explain their answer to this trick question perfectly.
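To see why this counts as a trick question, it helps to note that "9.11" and "9.9" can be read in two ways. The short Python sketch below compares them first as decimal numbers and then as dot-separated version numbers, where 11 comes after 9. The function names are hypothetical and chosen only for this illustration, and the idea that a chatbot confuses the two readings is an assumption, not something any of these companies has confirmed.

```python
from decimal import Decimal


def larger_as_decimal(a: str, b: str) -> str:
    """Return whichever string is the larger decimal number."""
    return a if Decimal(a) > Decimal(b) else b


def larger_as_version(a: str, b: str) -> str:
    """Return whichever string is the later version, reading each
    dot-separated part as a whole number (illustrative assumption)."""
    def key(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))
    return a if key(a) > key(b) else b


print(larger_as_decimal("9.11", "9.9"))  # 9.9  (9.11 < 9.90)
print(larger_as_version("9.11", "9.9"))  # 9.11 (11 > 9)
```

As a math question, only the first reading is correct: 9.9 is larger than 9.11.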

We’re talking about the most advanced chatbots on the market from companies that have invested enormous amounts of money, manpower, and resources into training these generative AI models. And yet, in many cases, we’re proving once again something I never get tired of saying:

Chatbots screw up—a lot.

Left to right and top to bottom: ChatGPT, Gemini, Claude, Mistral, Copilot, and Llama 3.1 (405B, via HuggingFace). ChatGPT and Claude fail, while the others respond correctly. Copilot, which, curiously, is based on GPT-4, not only answers correctly but also gives the best explanation.

They do it all the time, with basic math problems like this one and with other kinds of questions. By now, we’re used to AI chatbots failing (hello, glue pizza), and while they can be helpful, you always have to check their answers. Programmers know this all too well: Around half of the answers ChatGPT gives to programming questions are wrong.

The people behind these AI models clearly state that their chatbots’ answers can be wrong. These stochastic parrots generate text from probabilistic patterns and have no idea what they’re saying. Developers have kept refining how the models work, and in many cases, almost surprisingly, they respond with complete accuracy.

However, this uncertainty (“Is ChatGPT getting it right or is it making it all up?”) raises an important question. If we can’t fully trust an AI chatbot, how can we trust an AI-based search engine?

This is what OpenAI is now proposing with its SearchGPT search engine. Its creators have classified the tool as a “prototype” (the word even appears in the URL of the announcement page), and only a few users can access it.

There is at least one crucial point here: SearchGPT, like Google’s search engine, includes attribution and links to the sources of the results. This feature is essential for the search engine’s credibility and is also necessary for search engines in general. Perplexity, the first major independent AI-based search engine, seems to have inspired SearchGPT.

While ChatGPT and Copilot have been able to search the web for some time, this is the first time OpenAI has created a product specifically designed as a search engine. It seemed inevitable, especially given that more users, like me, are searching directly on ChatGPT or other AI chatbots.
