
Google AI Studio API - Free Gemini API Service

📋 Service Information

Provider: Google AI Studio
Service Type: API Service
API Endpoint: https://generativelanguage.googleapis.com
Free Tier: Free Forever (with usage limits)


🎯 Service Overview

Google AI Studio API provides access to Gemini models through an OpenAI-compatible interface (partial compatibility), allowing developers to easily integrate them into their applications.

Key Advantages:

  • ๐ŸŽ Ultra-High Free Quota - Multiple models available for free, quota varies by model
  • โšก Fast Response - Flash series optimized
  • ๐Ÿ”„ OpenAI Compatible - Supports OpenAI API format (partial compatibility)
  • ๐ŸŽจ Multimodal API - Text, image, audio, video support
  • ๐Ÿ“ฑ Long Context - Up to 2M tokens (Pro series)
  • ๐Ÿ” Web Search - Google Search Grounding

🚀 Quick Start

Prerequisites

Required:

  • ✅ API key created

For detailed steps, see: Google AI Studio Registration Guide

Get API Key

  1. Visit: https://aistudio.google.com
  2. Click “Get API key” in left menu
  3. Select or create a Google Cloud project
  4. Click “Create API key”
  5. Copy and save the key immediately

5-Minute Quick Example

Python Example:

Python
import google.generativeai as genai

# Configure API key
genai.configure(api_key="YOUR_API_KEY")

# Create model instance (using latest stable model)
model = genai.GenerativeModel('gemini-2.5-flash')

# Send request
response = model.generate_content("Hello, please introduce yourself")
print(response.text)

cURL Example:

Bash
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=YOUR_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts":[{"text": "Hello, please introduce yourself"}]
    }]
  }'
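The REST endpoint in the cURL example above follows a single fixed pattern; for scripting, it can be assembled with a small helper (illustrative only, not part of any SDK; passing the key as a `?key=` query parameter matches the examples on this page):

```python
# Illustrative helper: builds the REST URL used in the cURL examples.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"

def generate_content_url(model_id: str, api_key: str) -> str:
    """Return the :generateContent endpoint for a model, keyed by query parameter."""
    return f"{API_BASE}/models/{model_id}:generateContent?key={api_key}"

print(generate_content_url("gemini-2.5-flash", "YOUR_API_KEY"))
```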

🤖 Supported Models

Gemini 3 Series (Latest, Preview)

| Model Name | Model ID | Features | Free Tier |
| --- | --- | --- | --- |
| Gemini 3 Flash | gemini-3-flash-preview | Smartest + Fastest | ✅ Free |
| Gemini 3 Pro | gemini-3-pro-preview | Most Powerful | ❌ Paid |

Gemini 2.5 Series (Stable, Recommended)

| Model Name | Model ID | Features | Free Tier |
| --- | --- | --- | --- |
| Gemini 2.5 Flash | gemini-2.5-flash | Hybrid reasoning model | ✅ Free |
| Gemini 2.5 Pro | gemini-2.5-pro | Advanced multi-purpose | ✅ Free |
| Gemini 2.5 Flash-Lite | gemini-2.5-flash-lite | Lightweight efficient | ✅ Free |

Gemini 2.0 Series

| Model Name | Model ID | Features | Free Tier |
| --- | --- | --- | --- |
| Gemini 2.0 Flash | gemini-2.0-flash | Balanced multimodal | ✅ Free |
| Gemini 2.0 Flash-Lite | gemini-2.0-flash-lite | Compact efficient | ✅ Free |

Gemma Open Source Series

| Model Name | Model ID | Features | Free Tier |
| --- | --- | --- | --- |
| Gemma 3 27B | gemma-3-27b-instruct | Large parameters | ✅ Free |
| Gemma 3 12B | gemma-3-12b-instruct | Medium parameters | ✅ Free |
| Gemma 3 4B | gemma-3-4b-instruct | Small and fast | ✅ Free |

🔢 Quotas and Limits

Free Tier Explanation

Google AI Studio’s free tier is very generous, with all major models available for free:

| Feature | Free Tier |
| --- | --- |
| Input Tokens | Completely free |
| Output Tokens | Completely free |
| Available Models | Gemini 3 Flash, 2.5 Flash/Pro, 2.0 Flash, Gemma 3, etc. |
| Rate Limits | Varies by model, see table below |
| Quota Reset | Daily quota resets at Pacific Time midnight |
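Since the daily quota resets at midnight Pacific Time, a scheduler can compute how long to wait before retrying an exhausted model (a sketch using the standard library's `zoneinfo`, Python 3.9+):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")

def seconds_until_quota_reset(now=None):
    """Seconds until the next midnight in US Pacific time (the daily quota reset)."""
    now = now or datetime.now(PACIFIC)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()
```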

Rate Limits (Free Tier)

Different models have different rate limits. Here are the limits for major models:

| Model Series | Requests/Minute (RPM) | Tokens/Minute (TPM) | Requests/Day (RPD) |
| --- | --- | --- | --- |
| Gemini 3 Flash | 10 | 4M | 1,500 |
| Gemini 2.5 Flash | 15 | 1M | 1,500 |
| Gemini 2.5 Pro | 2 | 32K | 50 |
| Gemini 2.0 Flash | 15 | 1M | 1,500 |
| Gemma 3 Series | 30 | 15K | 14,400 |
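To stay under the per-minute limits above without triggering 429 errors, a minimal client-side throttle can space out request starts (an illustrative sketch, not an SDK feature):

```python
import time

class RpmThrottle:
    """Blocks just long enough that at most `rpm` calls start per minute."""

    def __init__(self, rpm: int):
        self.min_interval = 60.0 / rpm   # seconds between request starts
        self._last_start = float("-inf")

    def wait(self):
        sleep_for = self._last_start + self.min_interval - time.monotonic()
        if sleep_for > 0:
            time.sleep(sleep_for)
        self._last_start = time.monotonic()

# e.g. the free-tier gemini-2.5-flash limit:
throttle = RpmThrottle(rpm=15)
# throttle.wait()  # call before each model.generate_content(...)
```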

Context Length

| Model | Input Context | Output Length |
| --- | --- | --- |
| Gemini 2.5 Flash | 1M tokens | 8K tokens |
| Gemini 2.5 Pro | 2M tokens | 8K tokens |
| Gemini 2.0 Flash | 1M tokens | 8K tokens |
| Gemma Series | 8K-128K tokens | 8K tokens |
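Inputs approaching these limits can be pre-trimmed before sending with a rough chars-per-token heuristic (illustrative only; ~4 characters per token is a common rule of thumb for English text, and the SDK's `model.count_tokens()` gives exact counts):

```python
def truncate_to_token_budget(text: str, max_tokens: int, chars_per_token: int = 4) -> str:
    """Crude guard: keep roughly `max_tokens` worth of characters."""
    budget = max_tokens * chars_per_token
    return text if len(text) <= budget else text[:budget]

# Keep a prompt within a model's input context, e.g. ~1M tokens:
# prompt = truncate_to_token_budget(raw_text, max_tokens=1_000_000)
```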

โš ๏ธ Important Notes

  1. Completely Free: Both input and output are completely free in the free tier
  2. Rate Limits: Exceeding rate limits will result in 429 errors, implement backoff retry is recommended
  3. Paid Upgrade: If you need higher quotas, you can request an upgrade in AI Studio
  4. Data Usage: Free tier data may be used to improve Google products (opt-out available), paid tier will not
  5. View Quota: Real-time quotas can be viewed in AI Studio Dashboard

📖 API Usage Examples

1. Basic Text Generation

Python SDK:

Python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-2.5-flash')

# Simple conversation
response = model.generate_content("Explain what machine learning is")
print(response.text)

# Request with parameters
response = model.generate_content(
    "Write a poem about autumn",
    generation_config={
        "temperature": 0.9,
        "top_p": 0.95,
        "max_output_tokens": 1024,
    }
)
print(response.text)

REST API:

Bash
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=YOUR_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts":[{"text": "Explain what machine learning is"}]
    }],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 1024
    }
  }'

2. Multi-turn Conversation

Python SDK:

Python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-2.5-flash')

# Create chat session
chat = model.start_chat(history=[])

# First round
response = chat.send_message("Hello, I want to learn Python")
print(response.text)

# Second round (model remembers context)
response = chat.send_message("Where should I start?")
print(response.text)

# View chat history
print(chat.history)
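A session can also be resumed later: `start_chat` accepts prior turns as plain dicts, each with a `role` of `"user"` or `"model"` and a list of `parts` (a sketch; the model reply text here is invented for illustration):

```python
# Prior turns expressed as plain dicts ("user" and "model" are the two roles).
saved_history = [
    {"role": "user", "parts": ["Hello, I want to learn Python"]},
    {"role": "model", "parts": ["Great choice! Python is beginner-friendly."]},
]

# chat = model.start_chat(history=saved_history)
# response = chat.send_message("Where should I start?")  # context is restored
```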

3. Image Understanding

Python SDK:

Python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-2.5-flash')

# Load image
image = Image.open('example.jpg')

# Send image and question
response = model.generate_content([
    "What's in this image? Please describe in detail.",
    image
])
print(response.text)

Load image from URL:

Python
import google.generativeai as genai
from PIL import Image
import requests
from io import BytesIO

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-2.5-flash')

# Load image from URL
response = requests.get('https://example.com/image.jpg')
image = Image.open(BytesIO(response.content))

# Analyze image
response = model.generate_content(["Analyze this image", image])
print(response.text)

4. Streaming Output

Python SDK:

Python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-2.5-flash')

# Streaming output
response = model.generate_content(
    "Write an article about artificial intelligence",
    stream=True
)

for chunk in response:
    print(chunk.text, end='')

5. Function Calling

Python SDK:

Python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Define function
def get_weather(city: str):
    """Get weather for specified city"""
    # Should call real weather API here
    return f"Weather in {city}: Sunny, 25°C"

# Define function description
tools = [{
    "function_declarations": [{
        "name": "get_weather",
        "description": "Get weather information for specified city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["city"]
        }
    }]
}]

model = genai.GenerativeModel(
    'gemini-2.5-flash',
    tools=tools
)

# Send request
chat = model.start_chat()
response = chat.send_message("What's the weather in Shanghai today?")

# Check if function call is needed
if response.candidates[0].content.parts[0].function_call:
    function_call = response.candidates[0].content.parts[0].function_call
    
    # Call function
    if function_call.name == "get_weather":
        result = get_weather(function_call.args["city"])
        
        # Return result to model
        response = chat.send_message({
            "function_response": {
                "name": "get_weather",
                "response": {"result": result}
            }
        })
        print(response.text)

6. Google Search Grounding

Python SDK:

Python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Enable Google Search Grounding
model = genai.GenerativeModel('gemini-2.5-flash')

# โš ๏ธ Note: Grounding feature may be separately charged, check latest pricing
response = model.generate_content(
    "What are the latest AI technology trends?",
    tools=[{"google_search_retrieval": {}}]
)

print(response.text)

# View citation sources
for candidate in response.candidates:
    if hasattr(candidate, 'grounding_metadata'):
        print("\nCitation Sources:")
        for attribution in candidate.grounding_metadata.grounding_attributions:
            print(f"- {attribution.source_id.grounding_passage.url}")

๐Ÿ› ๏ธ SDKs and Client Libraries

Official SDKs

Python:

Bash
pip install google-generativeai

Node.js:

Bash
npm install @google/generative-ai

Go:

Bash
go get github.com/google/generative-ai-go

OpenAI Compatibility

Google AI Studio API provides an OpenAI compatibility layer, so you can use the OpenAI SDK:

Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GOOGLE_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)

print(response.choices[0].message.content)

โš ๏ธ Note: Compatibility is not 100% identical, some parameters and features may differ, refer to official compatibility documentation


💡 Best Practices

✅ Recommended Practices

  1. Choose the Right Model

    • High-frequency calls: Gemini 3 Flash (fastest)
    • Daily tasks: Gemini 2.5 Flash (balanced)
    • Complex tasks: Gemini 2.5 Pro (strongest)
    • Open source needs: Gemma series
  2. Optimize Quota Usage

    # Use streaming to reduce wait time
    response = model.generate_content(prompt, stream=True)
    
    # Set reasonable max_output_tokens
    generation_config = {"max_output_tokens": 512}
    
    # Send several prompts sequentially, reusing one model instance
    prompts = ["Question 1", "Question 2", "Question 3"]
    responses = [model.generate_content(p) for p in prompts]
  3. Error Handling and Retries

    from google.api_core import retry
    from google.api_core.exceptions import ResourceExhausted
    
    # Retry only rate-limit errors instead of every exception
    @retry.Retry(predicate=retry.if_exception_type(ResourceExhausted))
    def call_api_with_retry(prompt):
        return model.generate_content(prompt)
    
    try:
        response = call_api_with_retry("Hello")
    except Exception as e:
        print(f"API call failed: {e}")
  4. Monitor Quota Usage

    # Log API calls
    import logging
    
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)
    
    def track_api_call(model, prompt):
        logger.info(f"Calling model: {model}")
        response = model.generate_content(prompt)
        logger.info(f"Tokens used: {response.usage_metadata.total_token_count}")
        return response
  5. Securely Manage API Keys

    # Use environment variables
    import os
    api_key = os.getenv('GOOGLE_AI_API_KEY')
    
    # Or use config file
    import json
    with open('config.json') as f:
        config = json.load(f)
        api_key = config['api_key']

โš ๏ธ Practices to Avoid

  1. โŒ Hard-coding API keys in code
  2. โŒ Frequently sending the same requests (use caching)
  3. โŒ Ignoring rate limits (implement backoff retry)
  4. โŒ Not handling errors and exceptions
  5. โŒ Using excessively large max_output_tokens

🔧 Common Issues

1. 403 Error: API key not valid

Cause:

  • API key is incorrect or expired
  • API is not enabled

Solution:

Python
# Check API key
import google.generativeai as genai
genai.configure(api_key="YOUR_API_KEY")

# List available models to verify
for model in genai.list_models():
    print(model.name)

2. 429 Error: Resource exhausted

Cause:

  • Exceeded rate limit
  • Exceeded daily quota

Solution:

Python
import time
from google.api_core.exceptions import ResourceExhausted

def call_with_backoff(prompt, max_retries=3):
    for i in range(max_retries):
        try:
            return model.generate_content(prompt)
        except ResourceExhausted:
            if i < max_retries - 1:
                wait_time = 2 ** i  # Exponential backoff
                print(f"Rate limited, waiting {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                raise

3. Network Timeout

Cause:

  • Unstable network
  • Request content too large

Solution:

Python
# Set a per-request timeout (in seconds) via request_options
response = model.generate_content(prompt, request_options={"timeout": 120})

4. Access Issues in Mainland China

Solution:

  • Use proxy or VPN
  • Configure proxy:
    import os
    os.environ['HTTP_PROXY'] = 'http://your-proxy:port'
    os.environ['HTTPS_PROXY'] = 'https://your-proxy:port'

📊 Performance Optimization

1. Use Caching

Python
from functools import lru_cache

@lru_cache(maxsize=100)
def cached_api_call(prompt):
    return model.generate_content(prompt).text

2. Batch Processing

Python
import asyncio
from google.generativeai import GenerativeModel

async def batch_generate(prompts):
    model = GenerativeModel('gemini-2.5-flash')
    tasks = [model.generate_content_async(p) for p in prompts]
    return await asyncio.gather(*tasks)

# Usage
prompts = ["Question 1", "Question 2", "Question 3"]
responses = asyncio.run(batch_generate(prompts))

3. Streaming Large Text

Python
def process_large_text(text, chunk_size=10000):
    chunks = [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]
    results = []
    
    for chunk in chunks:
        response = model.generate_content(f"Summarize: {chunk}")
        results.append(response.text)
    
    return " ".join(results)


Service Provider: Google AI Studio
