Generative AI Creates a Gap Between Local and Cloud-Based Models. There’s Room for Both

In 2023, generative AI became more widespread than ever after the launch of ChatGPT at the end of the previous year. This year, tech companies are solidifying their initial investments in generative AI with two main strategies: integrating AI into devices locally and providing cloud services.

Why this matters. Locally hosted generative AI enhances privacy and reduces users’ need for a strong internet connection. For manufacturers, it also lowers infrastructure costs. On the other hand, using cloud-based AI means that queries are processed off our devices, which relies on internet connectivity and uses bandwidth that the provider has to pay for. This is the case with OpenAI and Anthropic.

The locally-hosted approach. This AI approach enhances the appeal of devices and prevents them from becoming obsolete. However, its potential is constrained by the available hardware.

A few months ago, Google announced Gemini Nano, a model designed for basic tasks that can run on a smartphone. It enhances the user experience without relying on connectivity and without adding costs for Google’s cloud services. The primary focus is on enhancing Android and Pixel phones.
Apple has shown signs of a local approach capable of handling basic tasks without the need for an internet connection. This would help the company avoid putting all the processing load on its servers. The company took this route with Siri, which was initially entirely dependent on the cloud but later gained local capabilities.
There are models available for local execution, but they require some technical knowledge to install and configure, as well as hardware that matches their requirements. This can be achieved with software such as Ollama or Opera, which allow users to install large language models (LLMs).
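
To give a sense of what running a model locally involves once a tool like Ollama is installed, here is a minimal sketch in Python. It assumes Ollama is serving on its default local address and that a model has already been downloaded; the model name `llama3` and the helper names `build_request` and `ask_local_model` are illustrative assumptions, not part of any vendor's official workflow.

```python
import json
import urllib.request

# Assumption: Ollama is running on this machine at its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a POST request for a locally hosted model."""
    # "stream": False asks for a single JSON reply instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local_model("Summarize this note in one sentence.")` would send the query to the machine itself rather than to a provider's servers, which is the privacy and connectivity advantage described above. How fast it responds depends on the device's hardware.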

The cloud-hosted approach. Cloud-based AI enhances the ability to perform complex tasks and enables any device, even older smartphones, to achieve excellent results quickly.

Google incorporates Gemini’s new features into its Pro and Ultra versions, which are processed on its own servers. In addition to improving Android and Pixel phones locally, the company also aims to sell subscriptions to Gemini Advanced and similar services through the cloud.
Apple can use local methods to save costs and not rely entirely on its network for iOS’s generative AI. However, it’s also preparing to utilize its own servers and enter into partnerships with third parties.
OpenAI doesn’t have its own operating system or device for controlling local processes. Therefore, it relies entirely on the cloud, with the same aim as Google: to make ChatGPT Plus subscriptions or corporate plans more appealing.
Anthropic faces the same reality as OpenAI, especially after its international expansion.
Meta is planning to run Meta AI in the cloud because it’s mainly focused on integrating it with its own apps. On the other hand, Llama 3 has four basic versions that can be hosted locally. However, due to the models’ size and Meta’s lack of proprietary devices (except for the Quest and Ray-Ban glasses), the company is opting for the cloud route.

Microsoft is concentrating on revitalizing its traditional products and setting them apart from those of its competitors. As part of its cloud-focused strategy, everything, including Windows, Office, Edge, and Bing, is being routed through its own infrastructure, Azure.

Image | Xataka On and Midjourney

Locally-hosted AI models increase privacy and don’t depend on connectivity. However, they require more resources.

Their cloud-based counterparts outsource power but come with an ongoing expense that someone has to pay for.

Google and Apple are betting on combined models. They have incentives to do so.
