Gemini AI API

Adding AI features to your app using Google's Gemini model — text generation, document analysis, and structured data extraction from a Supabase edge function.

The "Why First" Scenario

You are building the EduTrack partner onboarding flow. A new cleaning partner signs up. They fill in their name, years of experience, and the services they offer. Your admin team previously had to write a professional profile bio for each partner before their profile went live — one by one, manually, taking 5–10 minutes each.

With the Gemini API: the moment the partner submits their details, your app sends those details to Gemini and asks it to write a 3-sentence professional bio. Gemini returns the bio in under 2 seconds. The bio is pre-populated in the partner's profile. The partner reviews it, adjusts if needed, and submits. No admin time spent.

Same pattern — extract key information from a scanned Aadhaar or PAN card uploaded as an image. Instead of an admin manually reading the document and typing the details into your system, Gemini reads the document and returns structured data. The admin just verifies.

This is what the Gemini API enables: AI capabilities — generation, analysis, extraction — callable from your backend code, triggered by events in your app.

The Excel Analogy

Imagine you had a colleague who could instantly:

Write a professional paragraph about any topic you described to them
Read any document you gave them and extract specific fields
Answer any question based on the information you provided

You would send them a message (a prompt), they would reply with the result. You would never need to understand how they formed their answer — just how to write a clear question.

The Gemini API is that colleague, available 24 hours a day, typically responding within a few seconds, and accessible from your code with a single function call.

Getting a Gemini API Key

There are two routes. Which one you use depends on where you are in the project lifecycle.

Route 1 — AI Studio (For Development and Training)

aistudio.google.com is Google's playground for Gemini. It is the fastest way to get an API key.

Go to aistudio.google.com

Click "Get API key" in the top navigation

This opens the API key management page.

Click "Create API key"

AI Studio will ask which Google Cloud project to associate the key with. Select your project — this is the same project you set up in the Google Cloud Console section.

Copy the key

Your key will look like: AIzaSyBxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

It starts with AIzaSy — the same prefix as your Maps API key.

Route 2 — Google Cloud Console (For Production)

For production deployments, create the key the same way as your Maps API key — through Google Cloud Console:

Enable the Generative Language API

Go to APIs & Services → Library. Search for Generative Language API and click Enable.

Create a new API key

Go to APIs & Services → Credentials → Create Credentials → API Key.

Restrict the key

In the key settings, restrict it to the Generative Language API only. This limits blast radius if the key is ever compromised.

Both routes produce a working API key. The AI Studio route is faster for getting started.

Your Turn — Use Route 1 (AI Studio) now to get your development key. You can create a production-restricted key through Google Cloud Console when you are ready to deploy.

Where the Key Goes

The Gemini API key must never be in your React frontend code. It is a server-side secret.

Unlike the Maps JavaScript API key (which is intentionally public), the Gemini API key gives full access to the Gemini model under your account. If someone reads this key from your frontend code, they can use your Gemini quota at your expense — and potentially access any data you send to Gemini in future requests.

Store it as a Supabase edge function secret, for example by running supabase secrets set GEMINI_API_KEY=your-key with the Supabase CLI.

This stores the key encrypted in Supabase's secret store. Your edge functions read it as Deno.env.get('GEMINI_API_KEY'). It never appears in your codebase or .env file.
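A minimal sketch of reading that secret defensively, so a missing key fails at startup rather than on the first Gemini call. The helper name requireSecret and its injectable lookup parameter are hypothetical, added so the function can be exercised outside Deno.

```typescript
// Hypothetical helper: look up a secret and fail loudly if it is missing,
// instead of calling Gemini with an undefined key.
type EnvLookup = (name: string) => string | undefined;

function requireSecret(name: string, getEnv: EnvLookup): string {
  const value = getEnv(name);
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing secret: ${name}`);
  }
  return value;
}

// Inside a Supabase edge function (Deno runtime) the lookup would be:
//   const apiKey = requireSecret("GEMINI_API_KEY", (n) => Deno.env.get(n));
```

Passing the lookup as a parameter keeps the helper free of Deno globals, which also makes it easy to unit test with a plain object.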

Never add GEMINI_API_KEY to your .env file for frontend access. Never add a VITE_GEMINI_API_KEY or NEXT_PUBLIC_GEMINI_API_KEY variable. Any environment variable prefixed with VITE_ or NEXT_PUBLIC_ is embedded in your built JavaScript and readable by anyone who inspects your app. Gemini API calls must only ever happen in edge functions or other server-side code.

Using the Gemini API in a Supabase Edge Function

Install the library in your edge function:

Edge functions run on Deno, which loads npm packages through npm: specifiers. There is no separate install step; you import the library directly in your function.

Basic text generation — partner bio:
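The following is a sketch rather than the original example. It assumes the Google SDK published as @google/generative-ai, and the field names (name, yearsExperience, services) and function names are hypothetical. The model is passed in as a parameter so the logic can run anywhere; the Deno wiring is shown in comments.

```typescript
// Minimal shape of the model object this sketch relies on: generateContent()
// resolves to an object whose response.text() returns the generated string.
interface BioTextModel {
  generateContent(prompt: string): Promise<{ response: { text(): string } }>;
}

// Assumed shape of the onboarding form data.
interface PartnerDetails {
  name: string;
  yearsExperience: number;
  services: string[];
}

// Build the prompt from the partner's submitted details.
function buildBioPrompt(p: PartnerDetails): string {
  return [
    "Write a 3-sentence professional bio for a cleaning partner.",
    `Name: ${p.name}`,
    `Years of experience: ${p.yearsExperience}`,
    `Services offered: ${p.services.join(", ")}`,
    "Return only the bio text, with no headings or quotation marks.",
  ].join("\n");
}

// Send the prompt to the model and return the trimmed bio text.
async function generatePartnerBio(
  model: BioTextModel,
  p: PartnerDetails,
): Promise<string> {
  const result = await model.generateContent(buildBioPrompt(p));
  return result.response.text().trim();
}

// Wiring inside the edge function (Deno only, assumed SDK usage):
//   import { GoogleGenerativeAI } from "npm:@google/generative-ai";
//   const genAI = new GoogleGenerativeAI(Deno.env.get("GEMINI_API_KEY")!);
//   const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
//   Deno.serve(async (req) => {
//     const details = (await req.json()) as PartnerDetails;
//     return Response.json({ bio: await generatePartnerBio(model, details) });
//   });
```

Keeping the prompt builder separate from the network call makes it straightforward to adjust the wording of the prompt and test it without spending quota.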

Calling this edge function from your React app:
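A sketch of the frontend side, assuming the edge function is deployed under the hypothetical name generate-bio and returns a JSON body of the form { bio: string }. It relies on the functions.invoke method of the supabase-js client, which resolves to an object with data and error; the client is typed here through a minimal interface so the helper can be tested with a stand-in.

```typescript
// Minimal shape of the Supabase client used by this sketch.
interface EdgeFunctionClient {
  functions: {
    invoke(
      name: string,
      options: { body: Record<string, unknown> },
    ): Promise<{ data: unknown; error: unknown }>;
  };
}

// Assumed shape of the onboarding form values.
interface BioFormValues {
  name: string;
  yearsExperience: number;
  services: string[];
}

// Call the edge function and return the generated bio, or throw on failure.
async function requestPartnerBio(
  client: EdgeFunctionClient,
  form: BioFormValues,
): Promise<string> {
  const { data, error } = await client.functions.invoke("generate-bio", {
    body: { ...form },
  });
  if (error) throw new Error("Bio generation failed");
  const bio = (data as { bio?: unknown } | null)?.bio;
  if (typeof bio !== "string" || bio.length === 0) {
    throw new Error("Edge function returned no bio");
  }
  return bio;
}
```

In a React component this would typically be called from the form's submit handler, with the returned string placed into state so the partner can review and edit it.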

A Practical Example: Document Data Extraction

Gemini can read images and extract structured data from them. This is useful for KYC document processing in EduTrack — reading an Aadhaar card photo and extracting the name and ID number.
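As an illustration, the sketch below sends an image together with a prompt and returns the raw text reply. It assumes the SDK accepts an array of parts in which an image is passed as base64 inlineData with a MIME type; the prompt wording, the JSON keys name and idNumber, and all function names are hypothetical.

```typescript
// Parts used by this sketch: plain text, or inline base64 image data.
type ExtractionPart =
  | string
  | { inlineData: { data: string; mimeType: string } };

// Minimal shape of a model that accepts mixed text and image parts.
interface VisionModel {
  generateContent(
    parts: ExtractionPart[],
  ): Promise<{ response: { text(): string } }>;
}

const KYC_PROMPT = [
  "Read the identity document in the image.",
  'Return JSON with exactly two string keys: "name" and "idNumber".',
  "If a field cannot be read, use an empty string for it.",
].join("\n");

// Encode raw image bytes as base64 for the inlineData part.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Send the prompt plus image and return the model's raw text reply.
// The reply still needs cleaning and validation before it is trusted.
async function extractKycRawText(
  model: VisionModel,
  image: Uint8Array,
  mimeType: string,
): Promise<string> {
  const result = await model.generateContent([
    KYC_PROMPT,
    { inlineData: { data: bytesToBase64(image), mimeType } },
  ]);
  return result.response.text();
}
```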

Gemini typically returns JSON wrapped in a markdown code fence. Always strip that wrapper before parsing. The pattern is: take the raw text response, remove any leading and trailing code fence markers, then parse the remainder as JSON.
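One way to implement that pattern is sketched below. The KycFields shape and the function names are hypothetical; the parser returns null instead of throwing so the caller can fall back to manual entry by an admin.

```typescript
// The fence marker is built at runtime so this example contains no literal fence.
const FENCE = "`".repeat(3);

// Assumed shape of the extracted fields.
interface KycFields {
  name: string;
  idNumber: string;
}

// Remove a leading fence line (with optional language tag) and a trailing fence.
function stripCodeFence(raw: string): string {
  let text = raw.trim();
  if (text.startsWith(FENCE)) {
    const firstNewline = text.indexOf("\n");
    text = firstNewline === -1
      ? text.slice(FENCE.length)
      : text.slice(firstNewline + 1);
  }
  if (text.endsWith(FENCE)) {
    text = text.slice(0, -FENCE.length);
  }
  return text.trim();
}

// Parse the cleaned text and check the shape; null means "could not extract".
function parseKycResponse(raw: string): KycFields | null {
  try {
    const parsed: unknown = JSON.parse(stripCodeFence(raw));
    if (typeof parsed !== "object" || parsed === null) return null;
    const record = parsed as Record<string, unknown>;
    if (typeof record.name !== "string" || typeof record.idNumber !== "string") {
      return null;
    }
    return { name: record.name, idNumber: record.idNumber };
  } catch {
    return null; // not valid JSON, for example extra commentary from the model
  }
}
```

Validating the field types after parsing matters as much as stripping the fence, because syntactically valid JSON can still be missing a key.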

Model Selection: Flash vs Pro

Google offers several Gemini models. The two you will use:

Model	Use when	Speed	Cost
`gemini-1.5-flash`	Most tasks — bio generation, data extraction, Q&A	Fast (1–2s)	Free tier covers 15 req/min, 1M tokens/day
`gemini-1.5-pro`	Complex reasoning, long documents, nuanced generation	Slower (3–8s)	Smaller free tier

Start with gemini-1.5-flash for everything. It handles the vast majority of tasks. Switch to Pro only if the output quality is genuinely insufficient for a specific use case.

Free Tier Limits

Gemini API free tier limits (as of 2025):

Limit	Value
Requests per minute (Flash)	15
Tokens per day (Flash)	1,000,000
Requests per minute (Pro)	2
Tokens per day (Pro)	50,000

For training projects and early-stage apps, these limits are generous. 1 million tokens per day at roughly 500 tokens per typical request means 2,000 AI calls per day — before you pay anything.

What is a token? A token is roughly 3/4 of a word. "Hello, I am building a home services app." is 8 words, so about 11 tokens by that rule. A 3-sentence bio costs about 100 tokens to generate. A full page of text is roughly 500 tokens. At that scale, training projects are very unlikely to reach the 1-million-token daily limit.
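The arithmetic above can be sketched as two small helpers. The 3/4-word ratio is only a rule of thumb, so the estimate is approximate and real tokenizer counts will differ; the function names are hypothetical and the daily budget is the Flash figure from the table above.

```typescript
// Flash free-tier daily token budget, taken from the limits table.
const FLASH_TOKENS_PER_DAY = 1_000_000;
// Rule of thumb: one token is roughly three quarters of a word.
const WORDS_PER_TOKEN = 0.75;

// Rough token estimate based on whitespace-separated word count.
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0).length;
  return Math.ceil(words / WORDS_PER_TOKEN);
}

// How many calls of a given size fit into the daily token budget.
function callsPerDay(
  tokensPerCall: number,
  dailyBudget: number = FLASH_TOKENS_PER_DAY,
): number {
  if (tokensPerCall <= 0) {
    throw new RangeError("tokensPerCall must be positive");
  }
  return Math.floor(dailyBudget / tokensPerCall);
}

// callsPerDay(500) gives 2000, matching the figure quoted above.
```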

What Gemini Cannot Do Reliably

Setting realistic expectations prevents bad product decisions:

Task	Reliable?	Notes
Writing structured text (bios, descriptions)	Yes	Excellent
Extracting data from clear document images	Usually	Works well for clear scans, struggles with handwriting
Answering questions about text you provide	Yes	Standard RAG pattern
Real-time data (current prices, live events)	No	Gemini's knowledge has a cutoff date
Deterministic calculations (tax, GST amounts)	No	Use real code for arithmetic — never trust AI for financial calculations
Guaranteed JSON structure output	Mostly	Parse defensively, always handle parse failures

The last point matters in production code: Gemini usually returns valid JSON when asked, but occasionally wraps it in markdown code fences or adds commentary. Always clean the response before parsing, as shown in the document extraction example above.

