Deploy to GKE¶
GKE is Google Cloud's managed Kubernetes service. It allows you to deploy and manage containerized applications using Kubernetes.
To deploy your agent you will need to have a Kubernetes cluster running on GKE. You can create a cluster using the Google Cloud Console or the gcloud
command line tool.
In this example, we will deploy a simple agent to GKE. The agent will be a FastAPI application that uses Gemini 2.0 Flash
as the LLM. We can use either Vertex AI or AI Studio as the LLM provider, selected via an environment variable.
Agent sample¶
For each of the commands, we will reference the capital_agent
sample defined on the LLM agent page. We will assume it is in a capital_agent
directory.
To proceed, confirm that your agent code is configured as follows:
- Agent code is in a file called agent.py within your agent directory.
- Your agent variable is named root_agent.
- __init__.py is within your agent directory and contains from . import agent.
Environment variables¶
Set your environment variables as described in the Setup and Installation guide. You also need to install the kubectl
command line tool. You can find instructions to do so in the Google Kubernetes Engine Documentation.
export GOOGLE_CLOUD_PROJECT=your-project-id # Your GCP project ID
export GOOGLE_CLOUD_LOCATION=us-central1 # Or your preferred location
export GOOGLE_GENAI_USE_VERTEXAI=true # Set to true if using Vertex AI
export GOOGLE_CLOUD_PROJECT_NUMBER=$(gcloud projects describe --format json $GOOGLE_CLOUD_PROJECT | jq -r ".projectNumber")
If you don't have jq
installed, you can use the following command to get the project number:
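gcloud projects describe $GOOGLE_CLOUD_PROJECT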
And copy the project number from the output.
Deployment commands¶
gcloud CLI¶
You can deploy your agent to GKE using the gcloud
and kubectl
command-line tools together with Kubernetes manifest files.
Ensure you have authenticated with Google Cloud (gcloud auth login
and gcloud config set project <your-project-id>
).
Enable APIs¶
Enable the necessary APIs for your project. You can do this using the gcloud
command line tool.
gcloud services enable \
container.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com \
aiplatform.googleapis.com
Create a GKE cluster¶
You can create a GKE cluster using the gcloud
command line tool. This example creates an Autopilot cluster named adk-cluster
in the us-central1
region.
If creating a GKE Standard cluster, make sure Workload Identity is enabled. Workload Identity is enabled by default in an Autopilot cluster.
gcloud container clusters create-auto adk-cluster \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
After creating the cluster, you need to connect to it using kubectl
. This command configures kubectl
to use the credentials for your new cluster.
gcloud container clusters get-credentials adk-cluster \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
Project Structure¶
Organize your project files as follows:
your-project-directory/
├── capital_agent/
│ ├── __init__.py
│ └── agent.py # Your agent code (see "Agent sample" tab)
├── main.py # FastAPI application entry point
├── requirements.txt # Python dependencies
└── Dockerfile # Container build instructions
Create the following files (main.py
, requirements.txt
, Dockerfile
) in the root of your-project-directory/
.
Code files¶
- main.py: This file sets up the FastAPI application using get_fast_api_app() from ADK:

import os

import uvicorn
from fastapi import FastAPI
from google.adk.cli.fast_api import get_fast_api_app

# Get the directory where main.py is located
AGENT_DIR = os.path.dirname(os.path.abspath(__file__))
# Example session DB URL (e.g., SQLite)
SESSION_DB_URL = "sqlite:///./sessions.db"
# Example allowed origins for CORS
ALLOWED_ORIGINS = ["http://localhost", "http://localhost:8080", "*"]
# Set web=True if you intend to serve a web interface, False otherwise
SERVE_WEB_INTERFACE = True

# Call the function to get the FastAPI app instance
# Ensure the agent directory name ('capital_agent') matches your agent folder
app: FastAPI = get_fast_api_app(
    agent_dir=AGENT_DIR,
    session_db_url=SESSION_DB_URL,
    allow_origins=ALLOWED_ORIGINS,
    web=SERVE_WEB_INTERFACE,
)

# You can add more FastAPI routes or configurations below if needed
# Example:
# @app.get("/hello")
# async def read_root():
#     return {"Hello": "World"}

if __name__ == "__main__":
    # Use the PORT environment variable provided by the container runtime, defaulting to 8080
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))

Note: We set agent_dir to the directory main.py is in and read the port from os.environ.get("PORT", 8080), which matches the PORT environment variable set in the deployment manifest.
- requirements.txt: List the necessary Python packages your agent needs (a minimal sketch follows this list).
- Dockerfile: Define the container image:

FROM python:3.13-slim
WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

RUN adduser --disabled-password --gecos "" myuser && \
    chown -R myuser:myuser /app

COPY . .

USER myuser

ENV PATH="/home/myuser/.local/bin:$PATH"

CMD ["sh", "-c", "uvicorn main:app --host 0.0.0.0 --port $PORT"]
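The exact contents of requirements.txt depend on your agent. A minimal sketch, assuming the example agent only needs the google-adk package (which pulls in FastAPI and Uvicorn as dependencies):

google-adk
# Add any other dependencies your agent needs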
Build the container image¶
You need to create a Google Artifact Registry repository to store your container images. You can do this using the gcloud
command line tool.
gcloud artifacts repositories create adk-repo \
--repository-format=docker \
--location=$GOOGLE_CLOUD_LOCATION \
--description="ADK repository"
Build the container image using the gcloud
command line tool. This example builds the image and tags it as adk-repo/adk-agent:latest
.
gcloud builds submit \
--tag $GOOGLE_CLOUD_LOCATION-docker.pkg.dev/$GOOGLE_CLOUD_PROJECT/adk-repo/adk-agent:latest \
--project=$GOOGLE_CLOUD_PROJECT \
.
Verify the image is built and pushed to the Artifact Registry:
gcloud artifacts docker images list \
$GOOGLE_CLOUD_LOCATION-docker.pkg.dev/$GOOGLE_CLOUD_PROJECT/adk-repo \
--project=$GOOGLE_CLOUD_PROJECT
Configure Kubernetes Service Account for Vertex AI¶
If your agent uses Vertex AI, you need to create a Kubernetes service account with the necessary permissions. This example creates a service account named adk-agent-sa
and binds it to the Vertex AI User
role.
If you are using AI Studio and accessing the model with an API key, you can skip this step.
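The IAM binding below references a Kubernetes service account named adk-agent-sa in the default namespace. If it does not exist yet, create it in your cluster first (assuming kubectl is pointing at the cluster you just created):

kubectl create serviceaccount adk-agent-sa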
gcloud projects add-iam-policy-binding projects/${GOOGLE_CLOUD_PROJECT} \
--role=roles/aiplatform.user \
--member=principal://iam.googleapis.com/projects/${GOOGLE_CLOUD_PROJECT_NUMBER}/locations/global/workloadIdentityPools/${GOOGLE_CLOUD_PROJECT}.svc.id.goog/subject/ns/default/sa/adk-agent-sa \
--condition=None
Create the Kubernetes manifest files¶
Create a Kubernetes deployment manifest file named deployment.yaml
in your project directory. This file defines how to deploy your application on GKE.
cat << EOF > deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adk-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: adk-agent
  template:
    metadata:
      labels:
        app: adk-agent
    spec:
      serviceAccount: adk-agent-sa
      containers:
      - name: adk-agent
        imagePullPolicy: Always
        image: $GOOGLE_CLOUD_LOCATION-docker.pkg.dev/$GOOGLE_CLOUD_PROJECT/adk-repo/adk-agent:latest
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
            ephemeral-storage: "128Mi"
          requests:
            memory: "128Mi"
            cpu: "500m"
            ephemeral-storage: "128Mi"
        ports:
        - containerPort: 8080
        env:
        - name: PORT
          value: "8080"
        - name: GOOGLE_CLOUD_PROJECT
          value: "$GOOGLE_CLOUD_PROJECT"
        - name: GOOGLE_CLOUD_LOCATION
          value: "$GOOGLE_CLOUD_LOCATION"
        - name: GOOGLE_GENAI_USE_VERTEXAI
          value: "$GOOGLE_GENAI_USE_VERTEXAI"
        # If using AI Studio, set GOOGLE_GENAI_USE_VERTEXAI to false and set the following:
        # - name: GOOGLE_API_KEY
        #   value: "$GOOGLE_API_KEY"
        # Add any other necessary environment variables your agent might need
---
apiVersion: v1
kind: Service
metadata:
  name: adk-agent
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: adk-agent
EOF
Deploy the Application¶
Deploy the application using the kubectl
command line tool. This command applies the deployment and service manifest files to your GKE cluster.
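kubectl apply -f deployment.yaml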
After a few moments, you can check the status of your deployment using:
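kubectl get pods -l app=adk-agent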
This command lists the pods associated with your deployment. You should see a pod with a status of Running
.
Once the pod is running, you can check the status of the service using:
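kubectl get service adk-agent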
If the output shows an External IP
, it means your service is accessible from the internet. It may take a few minutes for the external IP to be assigned.
You can get the external IP address of your service using:
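kubectl get service adk-agent -o=jsonpath='{.status.loadBalancer.ingress[0].ip}'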
Testing your agent¶
Once your agent is deployed to GKE, you can interact with it via the deployed UI (if enabled) or directly with its API endpoints using tools like curl
. You'll need the service URL provided after deployment.
UI Testing¶
If you deployed your agent with the UI enabled:
You can test your agent by navigating to the Kubernetes service URL (the external IP from the previous step) in your web browser.
The ADK dev UI allows you to interact with your agent, manage sessions, and view execution details directly in the browser.
To verify your agent is working as intended, you can:
- Select your agent from the dropdown menu.
- Type a message and verify that you receive an expected response from your agent.
If you experience any unexpected behavior, check the pod logs for your agent using:
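kubectl logs -l app=adk-agent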
API Testing (curl)¶
You can interact with the agent's API endpoints using tools like curl
. This is useful for programmatic interaction or if you deployed without the UI.
Set the application URL¶
Replace the placeholder below with the external IP address of your GKE service (from the previous step):
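export APP_URL="http://<your-external-ip>"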
List available apps¶
Verify the deployed application name.
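Assuming the default routes exposed by the ADK API server, you can list the deployed apps with:

curl -X GET $APP_URL/list-apps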
(Adjust the app_name
in the following commands based on this output if needed. The default is often the agent directory name, e.g., capital_agent
).
Create or Update a Session¶
Initialize or update the state for a specific user and session. Replace capital_agent
with your actual app name if different. The values user_123
and session_abc
are example identifiers; you can replace them with your desired user and session IDs.
curl -X POST \
$APP_URL/apps/capital_agent/users/user_123/sessions/session_abc \
-H "Content-Type: application/json" \
-d '{"state": {"preferred_language": "English", "visit_count": 5}}'
Run the Agent¶
Send a prompt to your agent. Replace capital_agent
with your app name and adjust the user/session IDs and prompt as needed.
curl -X POST $APP_URL/run_sse \
-H "Content-Type: application/json" \
-d '{
"app_name": "capital_agent",
"user_id": "user_123",
"session_id": "session_abc",
"new_message": {
"role": "user",
"parts": [{
"text": "What is the capital of Canada?"
}]
},
"streaming": false
}'
- Set "streaming": true if you want to receive Server-Sent Events (SSE).
- The response will contain the agent's execution events, including the final answer.
Troubleshooting¶
These are some common issues you might encounter when deploying your agent to GKE:
403 Permission Denied for Gemini 2.0 Flash
¶
This usually means that the Kubernetes service account does not have the necessary permission to access the Vertex AI API. Ensure that you have created the service account and bound it to the Vertex AI User
role as described in the Configure Kubernetes Service Account for Vertex AI section. If you are using AI Studio, ensure that you have set the GOOGLE_API_KEY
environment variable in the deployment manifest and it is valid.
Attempt to write a readonly database¶
You might see that no session ID is created in the UI and the agent does not respond to any messages. This is usually caused by the SQLite database being read-only, which can happen if you run the agent locally and then build the container image, copying the local SQLite database into the container. The database is then read-only inside the container.
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) attempt to write a readonly database
[SQL: UPDATE app_states SET state=?, update_time=CURRENT_TIMESTAMP WHERE app_states.app_name = ?]
To fix this issue, you can either:
- Delete the SQLite database file from your local machine before building the container image. A new SQLite database will be created when the container starts.
- (Recommended) Add a .dockerignore file to your project directory to exclude the SQLite database from being copied into the container image (see the example below).
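For example, assuming the session database file is named sessions.db as in main.py above, a minimal .dockerignore could look like this:

# Exclude local SQLite session databases from the image
sessions.db
*.db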
Then build the container image and deploy the application again.
Cleanup¶
To delete the GKE cluster and all associated resources, run:
gcloud container clusters delete adk-cluster \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
To delete the Artifact Registry repository, run:
gcloud artifacts repositories delete adk-repo \
--location=$GOOGLE_CLOUD_LOCATION \
--project=$GOOGLE_CLOUD_PROJECT
You can also delete the project if you no longer need it. This will delete all resources associated with the project, including the GKE cluster, Artifact Registry repository, and any other resources you created.