On Tuesday, Google unveiled its new family of Pixel phones, including the Pixel 9, Pixel 9 Pro, and Pixel 9 Pro XL. Alongside the new phones, the company introduced several new features related to artificial intelligence, particularly focusing on the Gemini assistant.
One notable improvement is Gemini Live, a multimodal system with several enhancements, including a voice mode reminiscent of the latest ChatGPT. The feature lets users hold natural conversations with Gemini and even interrupt it mid-response.
Gemini, Google’s AI Assistant, Has a New Voice Mode
Before Tuesday’s launch, users could already communicate with Gemini, but the new model aims to distinguish itself with elements like fluency (with low latency) and multimodality. Now, the assistant promises to understand the context and has information about its users that can help it perform tasks better.
For instance, you’ll be able to ask Gemini in natural language to create a new reminder or add an event to your calendar. There are two clear advantages here. First, while you could do this before with Google Assistant, you had to stick to rigid, structured commands for the phone to understand you; that limitation disappears with the new model. Second, the previous version of Gemini couldn’t perform actions on the system. It was essentially a mirror of what you could do in the web version. Now it takes on the role of a real assistant on your phone.
Gemini, integrated with Android, offers more than just screen reading. It allows interaction with several apps you use on a daily basis. For instance, users can drag and drop Gemini-generated images directly into apps like Gmail and Messages.
At a multimodality level, Gemini can now comprehend an image and engage in a conversation about it. This means users can take a picture of a medical appointment and ask the assistant to create an event based on the information in the picture. It appears to be a practical and useful feature.
Gemini Live also introduces 10 new voices that sound much more natural than the previously robotic ones. Note that, for now, the feature is available only in English and only to Android users who subscribe to Gemini Advanced. According to Google, it’ll arrive on iOS “in the coming weeks.”
Pixel Screenshots, a Screenshot Ally
When users take screenshots, they often do so to save certain information for later use. Google’s Pixel Screenshots feature aims to make using this information easier. It’s a new feature powered by Gemini Nano that operates completely locally.
Whenever you save a screenshot on a Pixel phone, the device will extract all the information it finds, such as addresses, items, and prices. It’ll also include accompanying metadata, such as the app or web page from which the screenshot was taken and the date it was captured. All this data will be stored in Pixel Screenshots.
When you open the Pixel Screenshots app, you’ll find several options for organizing and using the information pulled from your screenshots. For example, the algorithms will group the screenshots by ideas or themes, and you can add tags to easily identify them later. Screenshots, in short, are becoming much more useful.
The Gemini-powered app will also enable you to interact with the information in your screenshots. For instance, if you want to find the tracking number of a package, you can simply ask a question in natural language to get what you need, always accompanied by the original image containing the information.
This article was written by Javier Márquez and originally published in Spanish on Xataka.
Image | Google
Related | How to Use Gemini on Your iPhone With the Google App