Published: May 14, 2024, Last updated: November 13, 2024
When we build features with AI models on the web, we often rely on server-side solutions for larger models. This is especially true for generative AI, where even the smallest models are about a thousand times bigger than the median web page size. It's also true for other AI use cases, where models can range from tens to hundreds of megabytes. As these models aren't shared across websites, each site has to download them on page load. This is impractical for developers and users.
We're developing web platform APIs and browser features designed to integrate AI models, including large language models (LLMs), directly into the browser. This includes Gemini Nano, the most efficient version of the Gemini family of LLMs, designed to run locally on most modern desktop and laptop computers. With built-in AI, your website or web application can perform AI-powered tasks without needing to deploy or manage its own AI models.
Discover the benefits of built-in AI, our implementation plan, and how you can take advantage of this technology.
Get an early preview
We need your input to shape the APIs, ensure they fulfill your use cases, and inform our discussions with other browser vendors for standardization.
Join our early preview program to provide feedback on early-stage built-in AI ideas, and discover opportunities to test in-progress APIs through local prototyping.
Join the Chrome AI developer public announcements group to be notified when new APIs become available.
Benefits of built-in AI for web developers
With built-in AI, your browser provides and manages foundation and expert models.
Compared to building your own client-side AI, built-in AI offers the following benefits:
- Ease of deployment: As the browser distributes the models, it takes into account the capability of the device and manages updates to the model. This means you aren't responsible for downloading or updating large models over a network. You don't have to solve for storage eviction, runtime memory budget, serving costs, and other challenges.
- Access to hardware acceleration: The browser's AI runtime is optimized to make the most out of the available hardware, be it a GPU, an NPU, or falling back to the CPU. Consequently, your app can get the best performance on each device.
Benefits of running client-side
With a built-in AI approach, it becomes trivial to perform AI tasks client-side, which in turn offers the following upsides:
- Local processing of sensitive data: Client-side AI can improve your privacy story. For example, if you work with sensitive data, you can offer AI features to users with end-to-end encryption.
- Snappy user experience: In some cases, ditching the round trip to the server means you can offer near-instant results. Client-side AI can be the difference between a viable feature and a sub-optimal user experience.
- Greater access to AI: Your users' devices can shoulder some of the processing load in exchange for more access to features. For example, if you offer premium AI features, you could preview these features with client-side AI so that potential customers can see the benefits of your product, without additional cost to you. This hybrid approach can also help you manage inference costs, especially on frequently used user flows.
- Offline AI usage: Your users can access AI features even when there is no internet connection. This means your sites and web apps can work as expected offline or with variable connectivity.
Hybrid AI: Client-side and server-side
While client-side AI can handle a large array of use cases, there are certain cases that require server-side support.
Server-side AI is a great option for large models, and it can support a wider range of platforms and devices.
You may consider a hybrid approach, depending on:
- Complexity: Specific, approachable use cases are easier to support with on-device AI. For complex use cases, consider server-side implementation.
- Resiliency: Use server-side by default, and use on-device when the device is offline or on a spotty connection.
- Graceful fallback: Adoption of browsers with built-in AI will take time, some models may be unavailable, and older or less powerful devices may not meet the hardware requirements for running all models optimally. Offer server-side AI for those users.
For Gemini models, you can use backend integration (with Python, Go, Node.js, or REST) or implement in your web application with the new Google AI client SDK for Web.
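As a rough illustration, the complexity, resiliency, and fallback considerations can be folded into a single routing decision. The sketch below is one possible combination, not a recommended policy: `RoutingContext`, `chooseTarget`, and all field names are hypothetical, and how you detect on-device availability or judge task complexity is left to your application.

```typescript
// Hypothetical names for illustration only; none of this is a browser API.
type Target = "on-device" | "server";

interface RoutingContext {
  online: boolean;            // for example, read from navigator.onLine
  onDeviceAvailable: boolean; // result of your own feature detection
  complexTask: boolean;       // your own judgment of the use case
}

function chooseTarget(ctx: RoutingContext): Target | null {
  // Graceful fallback: without an on-device model, only the server can help.
  if (!ctx.onDeviceAvailable) {
    return ctx.online ? "server" : null;
  }
  // Resiliency: when offline, the on-device model is the only option.
  if (!ctx.online) {
    return "on-device";
  }
  // Complexity: send complex tasks to the server, keep simple ones local.
  return ctx.complexTask ? "server" : "on-device";
}
```

A `null` result means neither option is usable, so the feature should be hidden or disabled. If you prefer the server-by-default strategy described under resiliency, the last line could return `"server"` unconditionally.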
Browser architecture and APIs
To support built-in AI in Chrome, we created infrastructure to access foundation and expert models for on-device execution. This infrastructure is already powering innovative browser features, such as Help me write.
You can access built-in AI capabilities primarily with task APIs, such as the Translator API or the Summarizer API. Task APIs are designed to run inference against the best model for the assignment.
In Chrome, these APIs are built to run inference against Gemini Nano with fine-tuning or an expert model. Designed to run locally on most modern devices, Gemini Nano is best for language-related use cases, such as summarization, rephrasing, or categorization.
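Because availability depends on the browser, the device, and the model, a call to a task API is usually wrapped in feature detection. The sketch below assumes a global `Summarizer` object exposing `availability()` and `create()`, and a `summarize()` method on the created instance. The shape of these APIs has changed across Chrome versions, so treat the interfaces here as assumptions and check the current documentation. `summarizeOnDevice` and the `*Like` interfaces are hypothetical names.

```typescript
// Minimal shapes assumed for illustration; verify against current docs.
interface SummarizerLike {
  summarize(text: string): Promise<string>;
}
interface SummarizerFactoryLike {
  availability(): Promise<string>;
  create(): Promise<SummarizerLike>;
}

// Returns a summary, or null when no on-device summarizer can be used.
async function summarizeOnDevice(
  text: string,
  injected?: SummarizerFactoryLike,
): Promise<string | null> {
  // Use the injected factory if given, otherwise look for the assumed global.
  const factory: SummarizerFactoryLike | undefined =
    injected ?? (globalThis as any).Summarizer;
  if (!factory) {
    return null; // The API is not exposed in this environment.
  }
  // "unavailable" is the assumed value for unsupported devices or models.
  if ((await factory.availability()) === "unavailable") {
    return null;
  }
  const summarizer = await factory.create();
  return summarizer.summarize(text);
}
```

Returning `null` instead of throwing lets the caller fall back to a server-side model. Other availability states may mean the model still has to be downloaded; this sketch does not handle that case separately.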
We're also providing exploratory APIs, such as the Prompt API, so that you can experiment locally and share additional use cases.
In the future, we may offer an exploratory LoRA API, to improve the built-in model's performance by adjusting the model's weights.
When to use built-in AI
Here are a few ways built-in AI can benefit you and your users:
- AI-enhanced content consumption: Including summarization, translation, categorization, characterization, and acting as a knowledge provider.
- AI-supported content creation: Such as writing assistance, proofreading, grammar correction, and rephrasing.
What's next
Several of the built-in AI APIs are available to test in origin trials. Exploratory APIs and other early-stage APIs are available to early preview program participants.
Learn how to use Gemini Pro on Google's servers with your websites and web apps in our quickstart for the Google AI JavaScript SDK.