Exclusive: Here's an inside look at the Pixel 9's breakthrough Tensor G4 chip
Google never offered an official explanation for moving its Made by Google event up to August, but the reason seems clear: the company wants the early Pixel 9 launch to steal some thunder from the upcoming iPhone 16.
To do that, Google demonstrated a wide range of Pixel 9 AI features, from Add Me, which inserts someone into a photo after the fact, to Call Notes for recording and summarizing phone calls. And the Google Pixel Screenshots feature can help you dig up details using natural language queries.
All of this is powered by the new Tensor G4 chip, which was explicitly designed to run Google's most advanced AI models. In fact, it's the first processor that can run Gemini Nano with multimodality, which means the Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL and Pixel 9 Pro Fold can all understand text, images and audio.
To get a deeper look at Google's secret AI weapon, I spoke with Jesse Seed, group product manager for Google Silicon, and Zach Gleicher, product manager for Google DeepMind, about what the Tensor G4 chip can do and how it stands out.
What makes the Tensor G4 chip stand out in a sea of smartphones?
Jesse Seed: I think the biggest innovation that we made this year was being the first silicon and the first phone to run Gemini Nano with multimodality. And that unlocks some very cool use cases, one of which is Pixel Screenshots. That's very handy if you're trying to remember things.
Another feature that isn't related to the Gemini Nano model, but that I really love, is Add Me. Those of us who are the photographers of our family or crew will definitely appreciate being able to go back and dynamically add the photographer in. That's something we worked a lot to tune, using over 15 different machine learning models as well as the Google Augmented Reality SDK.
Apple is about to make a big push with Apple Intelligence and the iPhone 16. How confident are you that the Tensor G4 is one step ahead of the competition?
Seed: Helpful on-device intelligence is not new for us. From our beginning, the core mission of Tensor has been to bring Google machine learning innovations to your pocket. This year is no exception.
Our fourth generation chip, the new Tensor G4, is our fastest and most powerful SoC built for Pixel. It brings our newest and most capable model, Gemini Nano with Multimodality, to Pixel 9 thanks to joint design with Google DeepMind. Gemini Nano with Multimodality helps your phone understand text, image, audio and speech across new features like Pixel Screenshots and Call Notes, and continued features like Recorder Summarization.
The Tensor and Google DeepMind teams jointly design hardware and Gemini Nano models to ensure they perform optimally together. The fruits of this co-design can be seen in a number of ways, including being able to scale a 3x more capable model to work efficiently on a phone in a very short time frame, and the ability to run both a 'full' and an 'efficient' mode of the same model -- allowing us to achieve an industry-leading peak output rate of 45 tokens per second when using the full version, and to improve energy efficiency by 26% otherwise.
Optimizations like this in short design cycles are only possible thanks to our close collaboration. And when coupled with the Pixel 9's extra memory capacity, this means the model will always be there, ready to assist you quickly when needed. And of course, whether processing happens on device, in the cloud or a blend of the two, with Google Gemini models and apps your data is never sent to a third party for processing.
How did you squeeze something as advanced as Gemini Nano down to fit on a phone?
Zach Gleicher: At DeepMind we collaborate with a whole bunch of teams across Google, and we want to make sure that we're building Gemini models that meet the needs of all Google products. So as we were developing Gemini in collaboration with Android and Pixel, we realized there was this need for on-device models. We saw this as a challenge because on the server side everyone was pushing for more capable models that were potentially bigger. We, on the other hand, had all these interesting constraints that weren't present before: memory constraints, power consumption constraints and so on.
So in partnership with the Tensor team and Pixel, we were able to come together and understand what the core use cases for these on-device models are, what the constraints are, and we actually co-developed a model together. That was a really exciting experience, and it made it possible to build something so capable that it could power these use cases.
For someone who hasn't upgraded their phone in 3-4 years, what's going to stand out to them with the G4 chip?
Seed: So improving what we call fundamentals, like power and performance, is very important for us. The Tensor G4, which is our fourth-generation chip, is our most efficient and our most performant. We believe users will see that in everyday experiences like web browsing, as well as app launch and just the overall snappiness of the user interface. I think it's a really smooth experience. You'll see it in things like web performance being 20% faster on average and app launch being 17% faster.
And what about gaming performance, as that's really important these days for people buying a new phone?
Seed: So in our testing, we've actually seen both improved peak and sustained performance in gaming, across common games that run on the platform.
How does the Tensor G4 help with battery life?
Seed: We improved power efficiency on a lot of everyday use cases. So things like capturing video, taking a photo, scrolling through social media — all of that is consuming less power than the prior generation.
That's all contributing toward the nearly 20% extra battery life you saw mentioned in the keynote; Tensor G4 is a big part of hitting that number.
What are some of the AI features Gemini enables on Pixel 9 phones that you're most excited about?
Gleicher: One of the main motivations we see for the Tensor team and Pixel team coming to us for on-device use cases is better reliability. Because you don't have to rely on an internet connection, the experience can be reliable and work no matter where you are.
Another thing we think about is privacy. If developers don't want the data to actually leave the device, it can be fully processed on device; that's possible with an on-device LLM.
In terms of AI features I'm excited about, Pixel Screenshots is a really great one. I think it really showcases how we're able to get multimodal features working on device. As you can see in the demos, it's really snappy and low latency, but it's also a super capable model. And all of this information and data is stored locally on your device and can be processed locally. So we're really excited that Gemini Nano can enable experiences like that.
I think we're seeing traction for summarization use cases and smart reply.
How is Pixel Screenshots different from Windows Recall, which got into hot water over privacy concerns?
Seed: One of the ways we protect user privacy is by having a capable on-device model, so that none of the analysis being done on a screenshot ever leaves the device. That's one way we're able to address that privacy concern.
I think the other thing is just empowering users to decide what they want to do: how they want to use something like Gemini, and which use cases they feel comfortable interacting with and which they don't. So I think it really comes down to user choice. But in the case of Pixel Screenshots in particular, that is a fully on-device use case.
We're going to run all the usual benchmarks with the Tensor G4, but the AI era also changes things. How are you thinking about performance with this chip?
Seed: I think it really all comes down to real-world use cases. Like how does this thing actually perform in hand? So I do think that things like how fast the web browsing response is, how fast apps are launching, and the quickness and responsiveness of the user interface are all everyday use cases, and good, standard things to look at.
What about from an AI perspective? When does a Pixel phone pass your test in terms of performance?
Gleicher: As we think about benchmarks for LLMs and Gemini, and especially as we think about Gemini Nano, we've seen the industry put a large focus on academic benchmarks. Academic benchmarks like MMLU are great because they give a common metric, but they can be gamed and people can optimize for them, and they might not capture what you really care about.
For an on-device model, we don't really care whether it can answer history questions; we think that's probably a better use case for a server-side model. What we care about are use cases like summarization.
We also have to think about constraints like battery consumption. We have to make sure the model performs well, doesn't consume too much battery and that the latency is good. So we actually partner with the Tensor team to profile our models as we co-design them, to make sure we're getting an architecture that works well.
Seed: It's not just about traditional metrics of performance, but also quality. So look at things like the quality of the responses coming out of the model, or even the quality of the photos. That's what real-world users are going to care about more than some number on the side of a box.