Google unveiled Gemini 1.5, the newest iteration in its family of artificial intelligence models, at its annual I/O event on Monday. Gemini 1.5 ushers in new features for the AI model—the central element behind the company's new virtual assistant—as well as improved processing times, which are critical in helping it carry out its functions on the web, apps, and assistants.
There are currently three versions of Gemini: Ultra, Pro, and Nano. Ultra is the main competitor to GPT-4; Pro competes with free solutions like GPT-3.5; and Nano is the version integrated into Google AI-enabled devices like the Google Pixel 8 or the Samsung Galaxy S24. The company's announcement centered on new features for Gemini 1.5, which is available through the Gemini Advanced subscription.
A lighter and faster model. The company presented "Gemini Flash," a new model with lower latency. Gemini Flash responds faster than 1.5 Pro and is designed for applications that require speed.
This is the most recent addition to the Gemini family of models, and Google optimized it for high-volume tasks. Despite being lighter than Pro, the company's developers point out that it has advanced multimodal reasoning capabilities, making it well suited to tasks like summarization, chat apps, image captioning, and document data extraction.
Improved processing power. Gemini 1.5 Pro is a model with increased processing power compared to previous versions. Now, it can analyze large documents, including files with more than 1,500 pages, an hour of video, and code bases of more than 30,000 lines. It can also summarize up to 100 emails simultaneously.
Since the main advantage of Gemini 1.5 Pro is its processing power, Google plans to integrate it into Google Drive, allowing users to upload files from the storage service to Gemini. In other words, users can use the processing power of Gemini 1.5 Pro on their files.
Advances in image analysis. Google states that Gemini 1.5 Pro features considerable improvements when it comes to understanding the images users show it. For example, the company boasts that its model can solve mathematical problems step-by-step by analyzing a photo or give you recipes for a dish by simply looking at its composition and appearance.
This feature doesn’t just apply to multimedia; it can also be used with apps. Gemini can analyze the content of apps such as Google Meet or Gmail to create summaries, descriptions of what it sees, and more.
Gemini 1.5 Pro will be available to Gemini Advanced subscribers in over 150 countries and more than 35 languages.
Gemini to become more mobile-friendly. To enhance Gemini’s conversational capabilities, Google announced the launch of Gemini Live for Gemini Advanced subscribers and new features that will integrate Gemini into Android.
Among the new features is AI search in Google Photos, which can analyze the context of images, describe what it sees, and overall go far beyond current search capabilities. Google will also integrate Gemini into apps like Messages, where it will serve as conversational support.
Gemini Live, meanwhile, is a new conversational interface exclusively for mobile. As OpenAI recently demonstrated with GPT-4o, users will be able to interrupt the bot just as they do in a regular conversation.
Similarly, Gemini Live will be able to "see" surroundings through the user’s camera and accurately describe the environment they’re in. These new features are currently being tested in English only.
Planning improvements. In addition to Gemini Flash, Gemini Live, and the new Gemini Pro features, Google wants to provide Gemini Advanced with improvements in complex planning, such as travel itineraries and other activities where people have to choose between different options.
One of the features that will be coming to the Advanced plan in the “next few months” is related to planning. For example, when someone asks Gemini to plan a trip, it will be able to consider flight schedules, hotel arrival times, and food preferences the user has already indicated to develop a custom plan.
Gemini will base its recommendations on data from apps such as Gmail, Google Maps, and Google Search. It will also be able to modify the entire plan if the user’s itinerary changes.
Image | Google