# Vertex AI API - Google Cloud Enterprise API Service
## Service Information

- **Provider:** Google Vertex AI
- **Service Type:** API Service
- **API Endpoint:** `https://[REGION]-aiplatform.googleapis.com` (replace `[REGION]`, e.g. `us-central1`)
- **Free Tier:** $300 trial credits, valid for 90 days
## Service Overview
Vertex AI API provides enterprise-grade AI capabilities, supporting Gemini series models, particularly suitable for large-scale production deployments and integration with Google Cloud services.
**Key Advantages:**
- **$300 Trial Credits** - 90-day validity
- **Gemini Series Models** - supports the 1.5/2.x series, with context windows up to 2M tokens
- **Enterprise Grade** - complete MLOps tooling
- **Security and Compliance** - Google Cloud standards
- **Global Deployment** - multi-region availability
- **Comprehensive Monitoring** - complete logs and metrics
## Quick Start

### Prerequisites

**Required:**
- Google Cloud account created
- $300 trial credits activated
- Vertex AI API enabled
- Service account or API key created

For detailed steps, see: Vertex AI Registration Guide

### 5-Minute Quick Example
Install SDK:

```bash
pip install google-cloud-aiplatform
```

Usage Example:

```python
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize the SDK with your project and region
vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

# Create a model instance
model = GenerativeModel("gemini-1.5-pro")

# Generate content
response = model.generate_content("Explain what Vertex AI is")
print(response.text)
```

## Supported Models
### Gemini Pro Series
Model ID Example: gemini-1.5-pro or gemini-2.0-pro (check console for specific versions)
| Feature | Details |
|---|---|
| Context | Up to 2M tokens |
| Multimodal | Text, image, audio, video |
| Price | $1.25-2.50/M (in), $5-10/M (out) (example) |
### Gemini Flash Series
Model ID Example: gemini-1.5-flash or gemini-2.0-flash (check console for specific versions)
| Feature | Details |
|---|---|
| Context | Up to 1M tokens |
| Speed | Extremely fast |
| Price | $0.075-0.15/M (in), $0.30-0.60/M (out) (example) |
Note: Google continuously updates model versions. Check the official documentation for available models and pricing.
## API Usage Examples
### 1. Basic Conversation
```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Hello, introduce Gemini")
print(response.text)
```

### 2. Streaming Output
```python
model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Write an article about AI",
    stream=True
)
for chunk in response:
    print(chunk.text, end="", flush=True)
```

### 3. Multimodal Input (Images)
```python
from vertexai.generative_models import GenerativeModel, Part

model = GenerativeModel("gemini-1.5-pro")

# Load an image from a Cloud Storage URI
image = Part.from_uri(
    "gs://your-bucket/image.jpg",
    mime_type="image/jpeg"
)

response = model.generate_content([
    "What's in this image?",
    image
])
print(response.text)
```

### 4. Long Context Processing
```python
# Process extremely long text (e.g., entire books)
with open("long_document.txt", "r") as f:
    long_text = f.read()  # Can be millions of words

model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    f"Summarize the core points of the following document:\n\n{long_text}"
)
print(response.text)
```

### 5. Function Calling
```python
from vertexai.generative_models import (
    GenerativeModel,
    Tool,
    FunctionDeclaration,
)

# Define the function schema
get_weather_func = FunctionDeclaration(
    name="get_weather",
    description="Get weather for a specified city",
    parameters={
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name"
            }
        },
        "required": ["city"]
    }
)

# Create the tool
weather_tool = Tool(function_declarations=[get_weather_func])

# Use it
model = GenerativeModel(
    "gemini-1.5-pro",
    tools=[weather_tool]
)
response = model.generate_content("What's the weather in Shanghai today?")
print(response)
```

## Costs and Quotas
### Pricing
Gemini 1.5 Pro (example):
| Context | Input | Output |
|---|---|---|
| ≤ 128K | $1.25/M | $5/M |
| > 128K | $2.50/M | $10/M |
Gemini 1.5 Flash (example):
| Context | Input | Output |
|---|---|---|
| ≤ 128K | $0.075/M | $0.30/M |
| > 128K | $0.15/M | $0.60/M |
Note: Different model versions may have different pricing. Check the official pricing page for details.
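The tiered pricing above can be turned into a quick per-request estimator. This is an illustrative sketch using the example rates from the tables (not guaranteed current prices); the `estimate_cost` helper and `PRICING` table are made-up names, not part of the SDK.

```python
# Example-only rates ($ per 1M tokens) mirroring the tables above;
# check the official pricing page for current values.
PRICING = {
    "gemini-1.5-pro":   {"short": (1.25, 5.00),  "long": (2.50, 10.00)},
    "gemini-1.5-flash": {"short": (0.075, 0.30), "long": (0.15, 0.60)},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate request cost; the higher 'long' tier applies past 128K input tokens."""
    tier = "long" if input_tokens > 128_000 else "short"
    in_rate, out_rate = PRICING[model][tier]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. 100K input / 2K output on Flash stays in the short tier
print(round(estimate_cost("gemini-1.5-flash", 100_000, 2_000), 6))  # → 0.0081
```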
### Trial Credits Usage Estimate
$300 estimate (for reference):
- Gemini 1.5 Pro: Process approximately 240M input tokens
- Gemini 1.5 Flash: Process approximately 4B input tokens
- Or any combination
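These figures follow from dividing the budget by the per-token rate. A quick sanity check, using the example input prices above (the helper name is illustrative):

```python
BUDGET = 300  # trial credits in USD

def tokens_per_budget(price_per_million):
    """How many input tokens the budget buys at a given $/1M-token rate."""
    return BUDGET / price_per_million * 1_000_000

print(f"{tokens_per_budget(1.25):,.0f}")   # Gemini 1.5 Pro → 240,000,000
print(f"{tokens_per_budget(0.075):,.0f}")  # Gemini 1.5 Flash → 4,000,000,000
```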
## Best Practices

### Recommended Practices
**Use Flash First**

```python
# Flash is roughly 16x cheaper on input, sufficient for most scenarios
model = GenerativeModel("gemini-1.5-flash")
```

**Batch Processing**

```python
# Batch process multiple requests
responses = []
for prompt in prompts:
    response = model.generate_content(prompt)
    responses.append(response.text)
```

**Cost Monitoring**

Set budget alerts in the GCP Console: Billing → Budgets & alerts.

**Error Handling**

```python
from google.api_core import exceptions

try:
    response = model.generate_content(prompt)
except exceptions.ResourceExhausted:
    print("Quota exhausted")
except exceptions.InvalidArgument:
    print("Parameter error")
```
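For transient quota errors such as `ResourceExhausted`, a common pattern is to retry with exponential backoff. A minimal sketch: the `with_backoff` helper below is illustrative, not part of the SDK (production code might instead use the `Retry` options built into `google-api-core`).

```python
import time

def with_backoff(fn, retries=3, base_delay=1.0, retry_on=(Exception,)):
    """Call fn, retrying on the given exceptions with exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Usage with Vertex AI (assumes `model`, `prompt`, and `exceptions`
# are defined as in the examples above):
# response = with_backoff(lambda: model.generate_content(prompt),
#                         retry_on=(exceptions.ResourceExhausted,))
```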
## Common Issues
### 1. How to Authenticate?
**Method:**

```bash
# Set environment variable
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account-key.json"
```

Or set it in code:

```python
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "key.json"
```

### 2. How to Choose Region?
**Available Regions:**
- `us-central1` (recommended)
- `europe-west1`
- `asia-southeast1`
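Each region serves its own endpoint, following the `https://[REGION]-aiplatform.googleapis.com` pattern noted in the service information above. A tiny helper to build it (illustrative only; the SDK derives this for you from the `location` passed to `vertexai.init`):

```python
def regional_endpoint(region):
    """Build the regional Vertex AI API endpoint."""
    return f"https://{region}-aiplatform.googleapis.com"

print(regional_endpoint("us-central1"))  # → https://us-central1-aiplatform.googleapis.com
```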
### 3. How to Monitor Usage?
Method:
- GCP Console → Billing
- View Vertex AI usage
- Set budget alerts
## Related Resources

### Official Documentation

### SDKs and Tools
## Practical Case

### Case: Long Document Analysis System
```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

class DocumentAnalyzer:
    def __init__(self):
        self.model = GenerativeModel("gemini-1.5-pro")

    def analyze(self, document_text):
        """Analyze a long document."""
        prompt = f"""
Please analyze the following document and provide:
1. Core theme
2. Key points (5-10)
3. Summary (within 200 words)

Document content:
{document_text}
"""
        response = self.model.generate_content(prompt)
        return response.text

# Usage
analyzer = DocumentAnalyzer()
with open("long_doc.txt") as f:
    result = analyzer.analyze(f.read())
print(result)
```

Service Provider: Google Vertex AI