Gemini AI API
Adding AI features to your app using Google's Gemini model — text generation, document analysis, and structured data extraction from a Supabase edge function.
The "Why First" Scenario
You are building the EduTrack partner onboarding flow. A new cleaning partner signs up. They fill in their name, years of experience, and the services they offer. Your admin team previously had to write a professional profile bio for each partner before their profile went live — one by one, manually, taking 5–10 minutes each.
With the Gemini API: the moment the partner submits their details, your app sends those details to Gemini and asks it to write a 3-sentence professional bio. Gemini returns the bio in under 2 seconds. The bio is pre-populated in the partner's profile. The partner reviews it, adjusts if needed, and submits. No admin time spent.
The same pattern works for extracting key information from a scanned Aadhaar or PAN card uploaded as an image. Instead of an admin manually reading the document and typing the details into your system, Gemini reads the document and returns structured data. The admin just verifies.
This is what the Gemini API enables: AI capabilities — generation, analysis, extraction — callable from your backend code, triggered by events in your app.
The Excel Analogy
Imagine you had a colleague who could instantly:
- Write a professional paragraph about any topic you described to them
- Read any document you gave them and extract specific fields
- Answer any question based on the information you provided
You would send them a message (a prompt), and they would reply with the result. You would never need to understand how they formed their answer, only how to write a clear question.
The Gemini API is that colleague, available 24 hours a day, typically responding within a few seconds, and accessible from your code with a single function call.
Getting a Gemini API Key
There are two routes. Which one you use depends on where you are in the project lifecycle.
Route 1 — AI Studio (For Development and Training)
aistudio.google.com is Google's playground for Gemini. It is the fastest way to get an API key.
1. Go to aistudio.google.com. Log in with the Google account associated with your project. AI Studio is free to use for development.
2. Click "Get API key" in the top navigation. This opens the API key management page.
3. Click "Create API key". AI Studio will ask which Google Cloud project to associate the key with. Select your project, the same one you set up in the Google Cloud Console section, so the key bills against the same project as your other APIs.
4. Copy the key. It will look like: AIzaSyBxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
It starts with AIzaSy — the same prefix as your Maps API key.
Route 2 — Google Cloud Console (For Production)
For production deployments, create the key the same way as your Maps API key — through Google Cloud Console:
1. Enable the Generative Language API. Go to APIs & Services → Library, search for Generative Language API, and click Enable.
2. Create a new API key. Go to APIs & Services → Credentials → Create Credentials → API Key.
3. Restrict the key. In the key settings, restrict it to the Generative Language API only. This limits blast radius if the key is ever compromised.
Both routes produce a working API key. The AI Studio route is faster for getting started.
Where the Key Goes
The Gemini API key must never be in your React frontend code. It is a server-side secret.
Unlike the Maps JavaScript API key (which is intentionally public), the Gemini API key authorizes any Gemini request under your account. Anyone who reads this key from your frontend code can spend your Gemini quota and run up charges on your project.
Store it as a Supabase edge function secret:
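With the Supabase CLI installed and linked to your project, this is a single command. The value below is a placeholder, not a real key:

```shell
# Store the Gemini key as an edge function secret (replace the placeholder with your own key)
supabase secrets set GEMINI_API_KEY=AIzaSy-your-key-here
```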
This stores the key encrypted in Supabase's secret store. Your edge functions read it as Deno.env.get('GEMINI_API_KEY'). It never appears in your codebase or .env file.
Never add GEMINI_API_KEY to your .env file for frontend access. Never add a VITE_GEMINI_API_KEY or NEXT_PUBLIC_GEMINI_API_KEY variable. Any environment variable prefixed with VITE_ or NEXT_PUBLIC_ is embedded in your built JavaScript and readable by anyone who inspects your app. Gemini API calls must only ever happen in edge functions or other server-side code.
Using the Gemini API in a Supabase Edge Function
Edge functions run on Deno, which loads npm packages through npm: specifiers. There is no separate install step: you import the library (for example, Google's @google/generative-ai SDK) at the top of your function file.
Basic text generation — partner bio:
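A minimal sketch of the bio generator, not a definitive implementation. It assumes the `@google/generative-ai` SDK, whose model objects expose `generateContent(prompt)` and a `response.text()` accessor. The `PartnerDetails` shape and the `buildBioPrompt` and `generateBio` helpers are hypothetical names. The SDK wiring is shown in comments because it needs the secret and network access.

```typescript
// Hypothetical shape of the details the partner submits.
interface PartnerDetails {
  name: string;
  yearsOfExperience: number;
  services: string[];
}

// The only part of the SDK's model object this sketch relies on.
interface TextModel {
  generateContent(prompt: string): Promise<{ response: { text(): string } }>;
}

// Build the prompt as a pure function so it can be tested without calling Gemini.
function buildBioPrompt(p: PartnerDetails): string {
  return [
    "Write a professional bio of exactly 3 sentences for a service partner.",
    `Name: ${p.name}`,
    `Years of experience: ${p.yearsOfExperience}`,
    `Services offered: ${p.services.join(", ")}`,
    "Return only the bio text, with no headings or commentary.",
  ].join("\n");
}

// Send the prompt to the model and return the trimmed bio text.
async function generateBio(model: TextModel, p: PartnerDetails): Promise<string> {
  const result = await model.generateContent(buildBioPrompt(p));
  return result.response.text().trim();
}

// Wiring inside the edge function (assumed, not executed here):
//   import { GoogleGenerativeAI } from "npm:@google/generative-ai";
//   const genAI = new GoogleGenerativeAI(Deno.env.get("GEMINI_API_KEY")!);
//   const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
//   Deno.serve(async (req) =>
//     Response.json({ bio: await generateBio(model, await req.json()) }));
```

Keeping the prompt builder separate from the model call makes the prompt easy to adjust and lets you test the function with a fake model before spending any quota.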
Calling this edge function from your React app:
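A sketch of the client side, assuming the function is deployed under the hypothetical name `generate-bio`, that it responds with `{ bio: string }`, and that your app already has a supabase-js client. `supabase.functions.invoke` resolves to an object with `data` and `error`. The `requestBio` helper and the `FunctionsInvoker` interface are illustrative.

```typescript
// The slice of a supabase-js client that this helper uses.
interface FunctionsInvoker {
  functions: {
    invoke(
      name: string,
      options: { body: Record<string, unknown> },
    ): Promise<{ data: unknown; error: unknown }>;
  };
}

// Ask the edge function for a bio; throws if the call fails or the payload is unexpected.
async function requestBio(
  supabase: FunctionsInvoker,
  details: { name: string; yearsOfExperience: number; services: string[] },
): Promise<string> {
  const { data, error } = await supabase.functions.invoke("generate-bio", { body: details });
  if (error) {
    throw new Error("Bio generation failed");
  }
  const bio = (data as { bio?: unknown } | null)?.bio;
  if (typeof bio !== "string") {
    throw new Error("Edge function returned no bio");
  }
  return bio;
}
```

In a React component, you would call `requestBio(supabase, formValues)` from the submit handler and use the result to pre-populate the bio field. The API key never reaches the browser, because only the edge function talks to Gemini.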
A Practical Example: Document Data Extraction
Gemini can read images and extract structured data from them. This is useful for KYC document processing in EduTrack — reading an Aadhaar card photo and extracting the name and ID number.
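One way to shape the request, sketched under assumptions: the SDK's `generateContent` accepts an array of parts, where an image is passed as `inlineData` with a MIME type and a base64 payload. The output field names `name` and `idNumber` and the `buildExtractionParts` helper are hypothetical.

```typescript
// Parts accepted by generateContent: plain text or an inline base64 image.
type RequestPart =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Build the prompt and image parts for one uploaded document.
function buildExtractionParts(base64Image: string, mimeType: string): RequestPart[] {
  const prompt =
    "Read the identity document in this image. " +
    'Respond with JSON only, in the form {"name": string or null, "idNumber": string or null}. ' +
    "Use null for any field you cannot read.";
  return [{ text: prompt }, { inlineData: { mimeType, data: base64Image } }];
}

// In the edge function (assumed wiring, not executed here):
//   const result = await model.generateContent(buildExtractionParts(b64, "image/jpeg"));
//   const rawText = result.response.text();
```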
Gemini typically returns JSON wrapped in a markdown code fence. Always strip that wrapper before parsing. The pattern is: take the raw text response, remove any leading and trailing code fence markers, then parse the remainder as JSON.
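The strip-then-parse pattern can be sketched as follows. `stripCodeFence` and `parseModelJson` are hypothetical helper names. The sketch returns a result object instead of throwing, so the caller has to handle the failure case explicitly.

```typescript
// Three backticks, built with repeat() so this listing does not contain a literal fence.
const FENCE = "`".repeat(3);

type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Remove a leading fence line (with optional language tag) and a trailing fence.
function stripCodeFence(raw: string): string {
  let text = raw.trim();
  if (text.startsWith(FENCE)) {
    const firstNewline = text.indexOf("\n");
    text = firstNewline === -1 ? text.slice(FENCE.length) : text.slice(firstNewline + 1);
  }
  if (text.endsWith(FENCE)) {
    text = text.slice(0, -FENCE.length);
  }
  return text.trim();
}

// Parse the model's reply as JSON, reporting failures instead of throwing.
function parseModelJson(raw: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(stripCodeFence(raw)) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

When `ok` is false, a reasonable fallback is to leave the form fields empty and let the admin enter the details manually.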
Model Selection: Flash vs Pro
Google offers several Gemini models. The two you will use:
| Model | Use when | Speed | Cost |
|---|---|---|---|
| gemini-1.5-flash | Most tasks: bio generation, data extraction, Q&A | Fast (1–2s) | Free tier covers 15 req/min, 1M tokens/day |
| gemini-1.5-pro | Complex reasoning, long documents, nuanced generation | Slower (3–8s) | Smaller free tier |
Start with gemini-1.5-flash for everything. It handles the vast majority of tasks. Switch to Pro only if the output quality is genuinely insufficient for a specific use case.
Free Tier Limits
Gemini API free tier limits (as of 2025):
| Limit | Value |
|---|---|
| Requests per minute (Flash) | 15 |
| Tokens per day (Flash) | 1,000,000 |
| Requests per minute (Pro) | 2 |
| Tokens per day (Pro) | 50,000 |
For training projects and early-stage apps, these limits are generous. 1 million tokens per day at roughly 500 tokens per typical request means 2,000 AI calls per day — before you pay anything.
What is a token? A token is roughly 3/4 of a word, so "Hello, I am building a home services app." (9 words) is about 12 tokens. A 3-sentence bio costs about 100 tokens to generate. A full page of text is roughly 500 tokens. At these sizes, training projects are very unlikely to reach the 1-million-token daily limit.
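The budgeting arithmetic above can be sketched in a few lines. This uses the 3/4-word rule of thumb from the text, not Gemini's actual tokenizer, so treat the result as a rough estimate. `estimateTokens` and `callsPerDay` are hypothetical helpers.

```typescript
// Rough heuristic from the text: one token is about 3/4 of a word.
const WORDS_PER_TOKEN = 0.75;

// Estimate the token count of a piece of text from its word count.
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0).length;
  return Math.ceil(words / WORDS_PER_TOKEN);
}

// How many calls fit in a daily token budget at a given average size.
function callsPerDay(dailyTokenLimit: number, tokensPerCall: number): number {
  return Math.floor(dailyTokenLimit / tokensPerCall);
}

console.log(callsPerDay(1000000, 500)); // 2000
```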
What Gemini Cannot Do Reliably
Setting realistic expectations prevents bad product decisions:
| Task | Reliable? | Notes |
|---|---|---|
| Writing structured text (bios, descriptions) | Yes | Excellent |
| Extracting data from clear document images | Usually | Works well for clear scans, struggles with handwriting |
| Answering questions about text you provide | Yes | Standard RAG pattern |
| Real-time data (current prices, live events) | No | Gemini's knowledge has a cutoff date |
| Deterministic calculations (tax, GST amounts) | No | Use real code for arithmetic — never trust AI for financial calculations |
| Guaranteed JSON structure output | Mostly | Parse defensively, always handle parse failures |
The last point matters in production code: Gemini usually returns valid JSON when asked, but occasionally wraps it in markdown code fences or adds commentary. Always clean the response before parsing, as shown in the document extraction example above.
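For the deterministic-calculations row in the table, the arithmetic belongs in ordinary code. A minimal sketch, assuming amounts are held as integer paise and the rate is given in basis points. The 18% rate is only an example value, and `gstPaise` is a hypothetical helper.

```typescript
// Compute tax on an amount held in integer paise, with the rate in basis points (1800 = 18%).
function gstPaise(amountPaise: number, rateBasisPoints: number): number {
  if (!Number.isInteger(amountPaise) || !Number.isInteger(rateBasisPoints)) {
    throw new Error("Amounts and rates must be integers");
  }
  return Math.round((amountPaise * rateBasisPoints) / 10000);
}

const base = 129900;              // Rs 1,299.00 expressed in paise
const tax = gstPaise(base, 1800); // 23382 paise
const total = base + tax;         // 153282 paise
```

Working in integer paise avoids floating-point rounding surprises, and the same input always produces the same output, which a language model does not guarantee.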