API - Google Vertex AI

Vertex AI API - Google Cloud Enterprise API Service

📋 Service Information

Provider: Google Vertex AI
Service Type: API Service
API Endpoint: https://[REGION]-aiplatform.googleapis.com (replace [REGION], e.g. us-central1)
Free Tier: Trial Credits ($300, valid 90 days)
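The regional REST endpoint above can be assembled programmatically. A minimal sketch; the project ID and model name below are placeholders, and the URL follows the `publishers/google/models/...:generateContent` path used by the Gemini API on Vertex AI:

```python
def vertex_endpoint(region: str, project: str, model: str) -> str:
    """Build the regional generateContent URL from the pattern
    https://[REGION]-aiplatform.googleapis.com shown above."""
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{region}/"
        f"publishers/google/models/{model}:generateContent"
    )

# Example (placeholder project ID)
url = vertex_endpoint("us-central1", "my-project", "gemini-1.5-flash")
print(url)
```

A real request would POST a JSON body to this URL with an OAuth access token; the SDK examples below handle that for you.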


🎯 Service Overview

The Vertex AI API provides enterprise-grade AI capabilities and supports the Gemini model series. It is particularly well suited to large-scale production deployments and to integration with other Google Cloud services.

Key Advantages:

  • 💰 $300 Trial Credits - 90-day validity
  • 🌟 Gemini Series Models - 1.5/2.x series, with context windows up to 2M tokens
  • 🏢 Enterprise Grade - Complete MLOps tooling
  • 🔐 Security Compliance - Google Cloud standards
  • 🌍 Global Deployment - Multi-region availability
  • 📊 Comprehensive Monitoring - Complete logs and metrics

🚀 Quick Start

Prerequisites

Required:

  • ✅ Google Cloud account created
  • ✅ $300 trial credits activated
  • ✅ Vertex AI API enabled
  • ✅ Service account or API key created

For detailed steps, see: Vertex AI Registration Guide

5-Minute Quick Example

Install SDK:

Bash
pip install google-cloud-aiplatform

Usage Example:

Python
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize
vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

# Create model instance
model = GenerativeModel("gemini-1.5-pro")

# Generate content
response = model.generate_content("Explain what Vertex AI is")
print(response.text)

🤖 Supported Models

Gemini Pro Series

Model ID Example: gemini-1.5-pro or gemini-2.0-pro (check console for specific versions)

| Feature    | Details                                    |
|------------|--------------------------------------------|
| Context    | Up to 2M tokens                            |
| Multimodal | Text, image, audio, video                  |
| Price      | $1.25-2.50/M (in), $5-10/M (out) (example) |

Gemini Flash Series

Model ID Example: gemini-1.5-flash or gemini-2.0-flash (check console for specific versions)

| Feature | Details                                          |
|---------|--------------------------------------------------|
| Context | Up to 1M tokens                                  |
| Speed   | Extremely fast                                   |
| Price   | $0.075-0.15/M (in), $0.30-0.60/M (out) (example) |

💡 Note: Google continuously updates model versions. Check the official documentation for available models and pricing.


📖 API Usage Examples

1. Basic Conversation

Python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")

response = model.generate_content("Hello, introduce Gemini")
print(response.text)

2. Streaming Output

Python
model = GenerativeModel("gemini-1.5-flash")

response = model.generate_content(
    "Write an article about AI",
    stream=True
)

for chunk in response:
    print(chunk.text, end="", flush=True)

3. Multimodal Input (Images)

Python
from vertexai.generative_models import GenerativeModel, Part

model = GenerativeModel("gemini-1.5-pro")

# Reference an image stored in a Cloud Storage bucket
image = Part.from_uri(
    "gs://your-bucket/image.jpg",
    mime_type="image/jpeg"
)

response = model.generate_content([
    "What's in this image?",
    image
])

print(response.text)
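`Part.from_uri` requires an explicit `mime_type`. For arbitrary files you can derive it from the extension with the standard library; this helper is a convenience sketch, not part of the Vertex AI SDK:

```python
import mimetypes

def guess_mime(uri: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension of a URI,
    e.g. to pass as mime_type to Part.from_uri."""
    mime, _ = mimetypes.guess_type(uri)
    return mime or default

print(guess_mime("gs://your-bucket/image.jpg"))  # image/jpeg
print(guess_mime("gs://your-bucket/clip.mp4"))   # video/mp4
```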

4. Long Context Processing

Python
# Process extremely long text (e.g., entire books)
with open("long_document.txt", "r") as f:
    long_text = f.read()  # Can be millions of tokens, up to the model's context limit

model = GenerativeModel("gemini-1.5-pro")

response = model.generate_content(
    f"Summarize the core points of the following document:\n\n{long_text}"
)

print(response.text)
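Before sending a huge document, it helps to sanity-check it against the 2M-token window. Exact counts come from the model's token counter; the ~4 characters per token ratio below is a common rule of thumb for English, an assumption rather than an API guarantee:

```python
def rough_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Heuristic token estimate (~4 chars/token for English text)."""
    return int(len(text) / chars_per_token)

def fits_context(text: str, limit: int = 2_000_000) -> bool:
    """Check the estimate against a context window (2M tokens for 1.5 Pro)."""
    return rough_token_count(text) <= limit

sample = "word " * 100_000          # 500,000 characters
print(rough_token_count(sample))    # 125000
print(fits_context(sample))         # True
```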

5. Function Calling

Python
from vertexai.generative_models import (
    GenerativeModel,
    Tool,
    FunctionDeclaration
)

# Define function
get_weather_func = FunctionDeclaration(
    name="get_weather",
    description="Get weather for specified city",
    parameters={
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name"
            }
        },
        "required": ["city"]
    }
)

# Create tool
weather_tool = Tool(function_declarations=[get_weather_func])

# Use
model = GenerativeModel(
    "gemini-1.5-pro",
    tools=[weather_tool]
)

response = model.generate_content("What's the weather in Shanghai today?")
print(response)
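When the model decides to use a tool, the response contains a function call (a name plus arguments) rather than plain text; your code must execute the function and return the result to the model. A minimal local dispatcher sketch: `get_weather` is a stub, and the `call` payload is hand-written here to mirror what you would extract from the response's candidates:

```python
def get_weather(city: str) -> dict:
    """Stub implementation; a real version would call a weather API."""
    return {"city": city, "forecast": "sunny", "temp_c": 26}

# Map function names declared to the model onto local handlers
HANDLERS = {"get_weather": get_weather}

def dispatch(function_call: dict) -> dict:
    """Route a model-issued function call to the matching local handler."""
    handler = HANDLERS[function_call["name"]]
    return handler(**function_call["args"])

# Shape mirrors the call the model would issue for the weather question above
call = {"name": "get_weather", "args": {"city": "Shanghai"}}
result = dispatch(call)
print(result)  # {'city': 'Shanghai', 'forecast': 'sunny', 'temp_c': 26}
```

The result would then be wrapped (e.g. with `Part.from_function_response`) and sent back to the model so it can compose the final natural-language answer.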

🔢 Costs and Quotas

Pricing

Gemini 1.5 Pro (example):

| Context | Input   | Output |
|---------|---------|--------|
| ≤ 128K  | $1.25/M | $5/M   |
| > 128K  | $2.50/M | $10/M  |

Gemini 1.5 Flash (example):

| Context | Input    | Output  |
|---------|----------|---------|
| ≤ 128K  | $0.075/M | $0.30/M |
| > 128K  | $0.15/M  | $0.60/M |

💡 Note: Different model versions may have different pricing. Check the official pricing page for details.

Trial Credits Usage Estimate

$300 estimate (for reference):

  • Gemini 1.5 Pro: Process approximately 240M input tokens
  • Gemini 1.5 Flash: Process approximately 4B input tokens
  • Or any combination
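The estimates above follow directly from the example input prices. A small calculator; the rates are hardcoded from the ≤128K tier of the tables above and inherit the same "example only" caveat:

```python
# USD per 1M input tokens (<=128K-context tier, example prices from the tables above)
PRICE_PER_M_INPUT = {
    "gemini-1.5-pro": 1.25,
    "gemini-1.5-flash": 0.075,
}

def input_tokens_for_budget(model: str, budget_usd: float = 300.0) -> int:
    """Millions of input tokens a budget buys at the example rates."""
    return int(budget_usd / PRICE_PER_M_INPUT[model])

print(input_tokens_for_budget("gemini-1.5-pro"))    # 240  -> 240M tokens
print(input_tokens_for_budget("gemini-1.5-flash"))  # 4000 -> 4B tokens
```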

💡 Best Practices

✅ Recommended Practices

  1. Use Flash First

    # Flash input pricing is roughly 16x lower than Pro's; sufficient for most scenarios
    model = GenerativeModel("gemini-1.5-flash")
  2. Batch Processing

    # Batch process multiple requests
    responses = []
    for prompt in prompts:
        response = model.generate_content(prompt)
        responses.append(response.text)
  3. Cost Monitoring

    # Set budget alerts in GCP Console
    # Billing → Budgets & alerts
  4. Error Handling

    from google.api_core import exceptions
    
    try:
        response = model.generate_content(prompt)
    except exceptions.ResourceExhausted:
        print("Quota exhausted")
    except exceptions.InvalidArgument:
        print("Parameter error")
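Transient quota errors like `ResourceExhausted` above are usually worth retrying with exponential backoff. A generic sketch, demonstrated with a stub that fails twice before succeeding; in real code the retried callable would wrap `model.generate_content`, and the retriable exception types would come from `google.api_core.exceptions`:

```python
import time

def retry_with_backoff(fn, retries=3, base_delay=1.0, retriable=(RuntimeError,)):
    """Call fn(), retrying on the given exception types with exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except retriable:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))

# Stub standing in for a flaky API call: fails twice, then succeeds
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("quota exceeded")
    return "ok"

result = retry_with_backoff(flaky, base_delay=0.1)
print(result)  # ok (after 2 retries)
```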

🔧 Common Issues

1. How to Authenticate?

Method:

Bash
# Set environment variable
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account-key.json"

# Or in code
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "key.json"

2. How to Choose Region?

Available Regions:

  • us-central1 (recommended)
  • europe-west1
  • asia-southeast1

3. How to Monitor Usage?

Method:

  1. GCP Console → Billing
  2. View Vertex AI usage
  3. Set budget alerts

🌟 Practical Case

Case: Long Document Analysis System

Python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

class DocumentAnalyzer:
    def __init__(self):
        self.model = GenerativeModel("gemini-1.5-pro")
    
    def analyze(self, document_text):
        """Analyze long document"""
        prompt = f"""
        Please analyze the following document and provide:
        1. Core theme
        2. Key points (5-10)
        3. Summary (within 200 words)
        
        Document content:
        {document_text}
        """
        
        response = self.model.generate_content(prompt)
        return response.text

# Usage
analyzer = DocumentAnalyzer()
with open("long_doc.txt") as f:
    result = analyzer.analyze(f.read())
    print(result)
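If a document exceeds even the 2M-token window, a common fallback is map-reduce: split on paragraph boundaries, analyze each chunk, then analyze the combined partial results. The splitter below is plain Python; the model calls are left as comments since they need a live `DocumentAnalyzer`, and the chunk size is an arbitrary illustrative limit:

```python
def split_paragraph_chunks(text: str, max_chars: int = 8000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters
    (a single oversized paragraph still becomes its own chunk)."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks

# Synthetic document: 50 paragraphs of ~500 characters each
doc = "\n\n".join(f"Paragraph {i}: " + "x" * 500 for i in range(50))
chunks = split_paragraph_chunks(doc, max_chars=4000)
print(len(chunks))
print(all(len(c) <= 4000 for c in chunks))  # True

# Map-reduce sketch (requires a live model):
# partials = [analyzer.analyze(c) for c in chunks]
# final = analyzer.analyze("\n\n".join(partials))
```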

Service Provider: Google Vertex AI
