Design GCP architectures for startups and enterprises. Use when asked to design Google Cloud infrastructure, deploy to GKE or Cloud Run, configure BigQuery pipelines, optimize GCP costs, or migrate to
Design scalable, cost-effective Google Cloud architectures for startups and enterprises with infrastructure-as-code templates.
Workflow
Step 1: Gather Requirements
Collect application specifications:
- Application type (web app, mobile backend, data pipeline, SaaS)
- Expected users and requests per second
- Budget constraints (monthly spend limit)
- Team size and GCP experience level
- Compliance requirements (GDPR, HIPAA, SOC 2)
- Availability requirements (SLA, RPO/RTO)
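An availability target is often easier to discuss as a downtime budget than as a percentage. The following is a minimal sketch of that arithmetic, assuming a 30-day month. The function name `monthly_downtime_minutes` is illustrative and not part of this skill's scripts.

```python
def monthly_downtime_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Return the minutes of downtime per month that an availability SLA permits."""
    if not 0 < sla_percent <= 100:
        raise ValueError("sla_percent must be in the range (0, 100]")
    total_minutes = days_in_month * 24 * 60
    # Round to two decimals to hide floating-point noise in the subtraction.
    return round(total_minutes * (100 - sla_percent) / 100, 2)


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.99):
        budget = monthly_downtime_minutes(sla)
        print(f"{sla}% SLA allows about {budget} minutes of downtime per month")
```

Under these assumptions a 99.9% SLA leaves roughly 43 minutes of downtime per month, which helps decide whether a single-region deployment is acceptable.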
Step 2: Design Architecture
Run the architecture designer to get pattern recommendations:
python scripts/architecture_designer.py --input requirements.json
Example output:
{
  "recommended_pattern": "serverless_web",
  "service_stack": ["Cloud Storage", "Cloud CDN", "Cloud Run", "Firestore", "Identity Platform"],
  "estimated_monthly_cost_usd": 30,
  "pros": ["Low ops overhead", "Pay-per-use", "Auto-scaling", "No cold starts on Cloud Run min instances"],
  "cons": ["Vendor lock-in", "Regional limitations", "Eventual consistency with Firestore"]
}
Select from recommended patterns:
- Serverless Web: Cloud Storage + Cloud CDN + Cloud Run + Firestore
- Microservices on GKE: GKE Autopilot + Cloud SQL + Memorystore + Cloud Pub/Sub
- Serverless Data Pipeline: Pub/Sub + Dataflow + BigQuery + Looker
- ML Platform: Vertex AI + Cloud Storage + BigQuery + Cloud Functions
See references/architecture_patterns.md for detailed pattern specifications.
Validation checkpoint: Confirm the recommended pattern matches the team's operational maturity and compliance requirements before proceeding to Step 3.
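To make the checkpoint concrete, here is a rough sketch of how requirements could map to one of the four patterns. It is an assumption for illustration only and is not the logic inside `architecture_designer.py`. Only the `serverless_web` identifier appears in the example output; the other pattern keys, the `pick_pattern` function, and the 50,000-user threshold (borrowed from the Quick Start example) are hypothetical.

```python
# Hypothetical pattern identifiers mapped to the service stacks listed above.
PATTERN_STACKS = {
    "serverless_web": ["Cloud Storage", "Cloud CDN", "Cloud Run", "Firestore"],
    "gke_microservices": ["GKE Autopilot", "Cloud SQL", "Memorystore", "Cloud Pub/Sub"],
    "data_pipeline": ["Pub/Sub", "Dataflow", "BigQuery", "Looker"],
    "ml_platform": ["Vertex AI", "Cloud Storage", "BigQuery", "Cloud Functions"],
}


def pick_pattern(requirements: dict) -> str:
    """Choose a pattern key from a requirements dict using simple, assumed rules."""
    app_type = requirements.get("application_type", "")
    if app_type in ("data_pipeline", "ml_platform"):
        return app_type
    # Assumed rule of thumb: GKE only pays off for larger SaaS workloads
    # run by a team that already has some GCP experience.
    experienced = requirements.get("gcp_experience") in {"intermediate", "advanced"}
    large = requirements.get("expected_users", 0) >= 50_000
    if app_type == "saas_platform" and large and experienced:
        return "gke_microservices"
    return "serverless_web"


if __name__ == "__main__":
    choice = pick_pattern({
        "application_type": "saas_platform",
        "expected_users": 50_000,
        "gcp_experience": "intermediate",
    })
    print(choice, PATTERN_STACKS[choice])
```

A rule like the experience check is the kind of thing the validation checkpoint is meant to catch: a pattern that fits the load but not the team.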
Step 3: Estimate Cost
Analyze estimated costs and optimization opportunities:
python scripts/cost_optimizer.py --resources current_setup.json --monthly-spend 2000
Example output:
{
  "current_monthly_usd": 2000,
  "recommendations": [
    { "action": "Right-size Cloud SQL db-custom-4-16384 to db-custom-2-8192", "savings_usd": 380, "priority": "high" },
    { "action": "Purchase 1-yr committed use discount for GKE nodes", "savings_usd": 290, "priority": "high" },
    { "action": "Move Cloud Storage objects >90 days to Nearline", "savings_usd": 75, "priority": "medium" }
  ],
  "total_potential_savings_usd": 745
}
Output includes:
- Monthly cost breakdown by service
- Right-sizing recommendations
- Committed use discount opportunities
- Sustained use discount analysis
- Potential monthly savings
Use the GCP Pricing Calculator for detailed estimates.
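The report is easy to post-process. As a small sketch, the snippet below orders the recommendations from the example output by priority and then by savings, and checks that the individual savings add up to the reported total. The `summarize` helper is illustrative and not part of `cost_optimizer.py`.

```python
# Example report copied from the cost_optimizer.py output shown above.
REPORT = {
    "current_monthly_usd": 2000,
    "recommendations": [
        {"action": "Right-size Cloud SQL db-custom-4-16384 to db-custom-2-8192", "savings_usd": 380, "priority": "high"},
        {"action": "Purchase 1-yr committed use discount for GKE nodes", "savings_usd": 290, "priority": "high"},
        {"action": "Move Cloud Storage objects >90 days to Nearline", "savings_usd": 75, "priority": "medium"},
    ],
    "total_potential_savings_usd": 745,
}

PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def summarize(report: dict) -> dict:
    """Order actions by priority, then by savings, and total the savings."""
    ordered = sorted(
        report["recommendations"],
        key=lambda rec: (PRIORITY_ORDER[rec["priority"]], -rec["savings_usd"]),
    )
    total = sum(rec["savings_usd"] for rec in ordered)
    return {
        "ordered_actions": [rec["action"] for rec in ordered],
        "total_savings_usd": total,
        "savings_percent": round(100 * total / report["current_monthly_usd"], 2),
    }


if __name__ == "__main__":
    summary = summarize(REPORT)
    print(f"Save ${summary['total_savings_usd']} ({summary['savings_percent']}% of spend)")
    for action in summary["ordered_actions"]:
        print(" -", action)
```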
Step 4: Generate IaC
Create infrastructure-as-code for the selected pattern:
python scripts/deployment_manager.py --app-name my-app --pattern serverless_web --region us-central1
Example Terraform HCL output (Cloud Run + Firestore):
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
}

variable "project_id" {
  description = "GCP project ID"
  type        = string
}

variable "region" {
  description = "GCP region"
  type        = string
  default     = "us-central1"
}

# app_name and environment are referenced below, so they must be declared.
variable "app_name" {
  description = "Application name used in resource names and the image path"
  type        = string
}

variable "environment" {
  description = "Deployment environment, for example dev or prod"
  type        = string
}

# Cloud Run service that scales to zero when idle
resource "google_cloud_run_v2_service" "api" {
  name     = "${var.environment}-${var.app_name}-api"
  location = var.region

  template {
    containers {
      image = "gcr.io/${var.project_id}/${var.app_name}:latest"

      resources {
        limits = {
          cpu    = "1000m"
          memory = "512Mi"
        }
      }

      env {
        name  = "FIRESTORE_PROJECT"
        value = var.project_id
      }
    }

    scaling {
      min_instance_count = 0
      max_instance_count = 10
    }
  }
}

# Firestore database in Native mode
resource "google_firestore_database" "default" {
  project     = var.project_id
  name        = "(default)"
  location_id = var.region
  type        = "FIRESTORE_NATIVE"
}
Example gcloud CLI deployment:
# Deploy Cloud Run service
gcloud run deploy my-app-api \
--image gcr.io/$PROJECT_ID/my-app:latest \
--region us-central1 \
--platform managed \
--allow-unauthenticated \
--memory 512Mi \
--cpu 1 \
--min-instances 0 \
--max-instances 10
# Create Firestore database
gcloud firestore databases create --location=us-central1
Full templates including Cloud CDN, Identity Platform, IAM, and Cloud Monitoring are generated by deployment_manager.py and are also available in references/architecture_patterns.md.
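To illustrate how a deployment command can be parameterized, here is a sketch that assembles the `gcloud run deploy` invocation shown above as an argument list. It is an assumption about one possible approach, not the implementation of `deployment_manager.py`. The function name `cloud_run_deploy_command`, the `public` parameter, and the project ID `demo-project` are hypothetical.

```python
import shlex


def cloud_run_deploy_command(
    app_name: str,
    project_id: str,
    region: str = "us-central1",
    tag: str = "latest",
    min_instances: int = 0,
    max_instances: int = 10,
    public: bool = False,
) -> list[str]:
    """Build the argument list for the gcloud run deploy command shown above."""
    cmd = [
        "gcloud", "run", "deploy", f"{app_name}-api",
        "--image", f"gcr.io/{project_id}/{app_name}:{tag}",
        "--region", region,
        "--platform", "managed",
        "--memory", "512Mi",
        "--cpu", "1",
        "--min-instances", str(min_instances),
        "--max-instances", str(max_instances),
    ]
    if public:
        # Only expose the service publicly when explicitly requested.
        cmd.append("--allow-unauthenticated")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it so it can be reviewed first.
    print(shlex.join(cloud_run_deploy_command("my-app", "demo-project", public=True)))
```

Building a list rather than a string avoids shell-quoting mistakes if the list is later passed to `subprocess.run`.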
Step 5: Configure CI/CD
Set up automated deployment with Cloud Build or GitHub Actions:
# cloudbuild.yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA', '.']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA']
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'my-app-api'
      - '--image=gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'
      - '--region=us-central1'
      - '--platform=managed'
images:
  - 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'
# Connect repo and create trigger
gcloud builds triggers create github \
--repo-name=my-app \
--repo-owner=my-org \
--branch-pattern="^main$" \
--build-config=cloudbuild.yaml
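When several services share the same build, test, and deploy shape, the build config can be generated rather than copied. The sketch below produces the same three steps as the cloudbuild.yaml above and serializes them as JSON, which Cloud Build accepts as an alternative to YAML. The function name `cloud_build_config` is illustrative. `$PROJECT_ID` and `$COMMIT_SHA` are left as literal text so that Cloud Build substitutes them at build time.

```python
import json


def cloud_build_config(app_name: str, region: str = "us-central1") -> dict:
    """Return a Cloud Build config that builds, pushes, and deploys one service."""
    # $PROJECT_ID and $COMMIT_SHA stay literal; Cloud Build fills them in.
    image = f"gcr.io/$PROJECT_ID/{app_name}:$COMMIT_SHA"
    return {
        "steps": [
            {"name": "gcr.io/cloud-builders/docker", "args": ["build", "-t", image, "."]},
            {"name": "gcr.io/cloud-builders/docker", "args": ["push", image]},
            {
                "name": "gcr.io/google.com/cloudsdktool/cloud-sdk",
                "entrypoint": "gcloud",
                "args": [
                    "run", "deploy", f"{app_name}-api",
                    f"--image={image}",
                    f"--region={region}",
                    "--platform=managed",
                ],
            },
        ],
        "images": [image],
    }


if __name__ == "__main__":
    print(json.dumps(cloud_build_config("my-app"), indent=2))
```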
Step 6: Security Review
Verify security configuration:
# Review IAM bindings
gcloud projects get-iam-policy $PROJECT_ID --format=json
# Check service account permissions
gcloud iam service-accounts list --project=$PROJECT_ID
# Verify VPC Service Controls (if applicable)
gcloud access-context-manager perimeters list --policy=$POLICY_ID
Security checklist:
- IAM roles follow least privilege (prefer predefined roles over basic roles)
- Service accounts use Workload Identity for GKE
- VPC Service Controls configured for sensitive APIs
- Cloud KMS encryption keys for customer-managed encryption
- Cloud Audit Logs enabled for all admin activity
- Organization policies restrict public access
- Secret Manager used for all credentials
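Parts of this checklist can be checked mechanically. As a sketch, the script below scans the JSON produced by `gcloud projects get-iam-policy --format=json` for two common findings: basic roles (`roles/owner`, `roles/editor`, `roles/viewer`) and public members (`allUsers`, `allAuthenticatedUsers`). The function name `audit_iam_policy` and the sample policy are made up for illustration.

```python
BASIC_ROLES = {"roles/owner", "roles/editor", "roles/viewer"}
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def audit_iam_policy(policy: dict) -> list[str]:
    """Return human-readable findings for basic roles and public members."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        for member in binding.get("members", []):
            if role in BASIC_ROLES:
                findings.append(f"basic role {role} granted to {member}")
            if member in PUBLIC_MEMBERS:
                findings.append(f"public member {member} holds {role}")
    return findings


if __name__ == "__main__":
    # Made-up sample in the shape returned by get-iam-policy --format=json.
    sample_policy = {
        "bindings": [
            {"role": "roles/editor", "members": ["user:dev@example.com"]},
            {"role": "roles/run.invoker", "members": ["allUsers"]},
            {"role": "roles/datastore.user", "members": ["serviceAccount:api@example.iam.gserviceaccount.com"]},
        ]
    }
    for finding in audit_iam_policy(sample_policy):
        print("WARNING:", finding)
```

A public `roles/run.invoker` binding is expected when a service is deployed with `--allow-unauthenticated`, so treat these findings as prompts for review rather than as failures.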
If deployment fails:
- Check the failure reason:
gcloud run services describe my-app-api --region us-central1
gcloud logging read "resource.type=cloud_run_revision" --limit=20
- Review Cloud Logging for application errors.
- Fix the configuration or container image.
- Redeploy:
gcloud run deploy my-app-api --image gcr.io/$PROJECT_ID/my-app:latest --region us-central1
Common failure causes:
- IAM permission errors -- verify service account roles and the --allow-unauthenticated flag
- Quota exceeded -- request quota increase via IAM & Admin > Quotas
- Container startup failure -- check container logs and health check configuration
- API not enabled -- enable the required APIs with gcloud services enable
Tools
architecture_designer.py
Recommends GCP services based on workload requirements.
python scripts/architecture_designer.py --input requirements.json --output design.json
Input: JSON with app type, scale, budget, compliance needs
Output: Recommended pattern, service stack, cost estimate, pros/cons
cost_optimizer.py
Analyzes GCP resources for cost savings.
python scripts/cost_optimizer.py --resources inventory.json --monthly-spend 5000
Output: Recommendations for:
- Idle resource removal
- Machine type right-sizing
- Committed use discounts
- Storage class transitions
- Network egress optimization
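A storage class transition such as "move objects older than 90 days to Nearline" is applied through a bucket lifecycle configuration. The sketch below generates that configuration in the JSON format Cloud Storage expects. The function name `nearline_lifecycle_config` and the output file name are illustrative, and restricting the rule to `STANDARD` objects is an assumption made so that colder objects are left alone.

```python
import json


def nearline_lifecycle_config(age_days: int = 90) -> dict:
    """Build a lifecycle config that moves older Standard objects to Nearline."""
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": age_days, "matchesStorageClass": ["STANDARD"]},
            }
        ]
    }


if __name__ == "__main__":
    # Write the config to a file that can be applied to a bucket.
    with open("lifecycle.json", "w", encoding="utf-8") as handle:
        json.dump(nearline_lifecycle_config(), handle, indent=2)
    print("Wrote lifecycle.json")
```

The resulting file can then be applied with `gcloud storage buckets update gs://BUCKET_NAME --lifecycle-file=lifecycle.json`, where `BUCKET_NAME` is a placeholder.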
deployment_manager.py
Generates gcloud CLI deployment scripts and Terraform configurations.
python scripts/deployment_manager.py --app-name my-app --pattern serverless_web --region us-central1
Output: Production-ready deployment scripts with:
- Cloud Run or GKE deployment
- Firestore or Cloud SQL setup
- Identity Platform configuration
- IAM roles with least privilege
- Cloud Monitoring and Logging
Quick Start
Web App on Cloud Run (< $100/month)
Ask: "Design a serverless web backend for a mobile app with 1000 users"
Result:
- Cloud Run for API (auto-scaling, no cold start with min instances)
- Firestore for data (pay-per-operation)
- Identity Platform for authentication
- Cloud Storage + Cloud CDN for static assets
- Estimated: $15-40/month
Microservices on GKE ($500-2000/month)
Ask: "Design a scalable architecture for a SaaS platform with 50k users"
Result:
- GKE Autopilot for containerized workloads
- Cloud SQL (PostgreSQL) with read replicas
- Memorystore (Redis) for session caching
- Cloud CDN for global delivery
- Cloud Build for CI/CD
- Multi-zone deployment
Serverless Data Pipeline
Ask: "Design a real-time analytics pipeline for event data"
Result:
- Pub/Sub for event ingestion
- Dataflow (Apache Beam) for stream processing
- BigQuery for analytics and warehousing
- Looker for dashboards
- Cloud Functions for lightweight transforms
ML Platform
Ask: "Design a machine learning platform for model training and serving"
Result:
- Vertex AI for training and prediction
- Cloud Storage for datasets and model artifacts
- BigQuery for feature store
- Cloud Functions for preprocessing triggers
- Cloud Monitoring for model drift detection
Input Requirements
Provide these details for architecture design:
| Requirement | Description | Example |
|---|---|---|
| Application type | What you're building | SaaS platform, mobile backend |
| Expected scale | Users, requests/sec | 10k users, 100 RPS |
| Budget | Monthly GCP limit | $500/month max |
| Team context | Size, GCP experience | 3 devs, intermediate |
| Compliance | Regulatory needs | HIPAA, GDPR, SOC 2 |
| Availability | Uptime requirements | 99.9% SLA, 1hr RPO |
JSON Format:
{
  "application_type": "saas_platform",
  "expected_users": 10000,
  "requests_per_second": 100,
  "budget_monthly_usd": 500,
  "team_size": 3,
  "gcp_experience": "intermediate",
  "compliance": ["SOC2"],
  "availability_sla": "99.9%"
}
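Before passing a requirements file to the designer, it can help to check that every field is present and has the expected type. The following is a minimal sketch based only on the field names in the JSON above. The `validate_requirements` helper is hypothetical, and the real script may accept or require different fields.

```python
# Field names and types taken from the JSON format shown above.
REQUIRED_FIELDS = {
    "application_type": str,
    "expected_users": int,
    "requests_per_second": int,
    "budget_monthly_usd": (int, float),
    "team_size": int,
    "gcp_experience": str,
    "compliance": list,
    "availability_sla": str,
}


def validate_requirements(requirements: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in requirements:
            errors.append(f"missing field: {field}")
        elif not isinstance(requirements[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    return errors


if __name__ == "__main__":
    example = {
        "application_type": "saas_platform",
        "expected_users": 10000,
        "requests_per_second": 100,
        "budget_monthly_usd": 500,
        "team_size": 3,
        "gcp_experience": "intermediate",
        "compliance": ["SOC2"],
        "availability_sla": "99.9%",
    }
    print(validate_requirements(example) or "requirements look complete")
```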
Output Formats
Architecture Design
- Pattern recommendation with rationale
- Service stack diagram (ASCII)
- Monthly cost estimate and trade-offs
IaC Templates
- Terraform HCL: Production-ready Google provider configs
- gcloud CLI: Scripted deployment commands
- Cloud Build YAML: CI/CD pipeline definitions
Cost Analysis
- Current spend breakdown with optimization recommendations
- Priority action list (high/medium/low) and implementation checklist
Anti-Patterns
| Anti-Pattern | Why It Fails | Better Approach |
|---|---|---|
| Using default VPC for production | No isolation, shared firewall rules | Create custom VPC with private subnets |
| Over-provisioning GKE node pools | Wasted cost on idle capacity | Use GKE Autopilot or cluster autoscaler |
| Storing secrets in environment variables | Visible in Cloud Console, logs | Use Secret Manager with Workload Identity |
| Ignoring sustained use discounts | Missing 20-30% automatic savings | Right-size VMs for consistent baseline usage |
| Single-region deployment for SaaS | One region outage = full downtime | Multi-region with Cloud Load Balancing |
| BigQuery on-demand for heavy workloads | Unpredictable costs at scale | Use capacity-based pricing (slot reservations under BigQuery editions, which replaced flat-rate) for consistent workloads |
| Running Cloud Functions for long tasks | 9-minute timeout on 1st gen functions, cold starts | Use Cloud Run for tasks > 60 seconds |
Cross-References
| Skill | Relationship |
|---|---|
| engineering-team/aws-solution-architect | AWS equivalent: same 6-step workflow, different services |
| engineering-team/azure-cloud-architect | Azure equivalent: completes the cloud trifecta |
| engineering-team/senior-devops | Broader DevOps scope: pipelines, monitoring, containerization |
| engineering/terraform-patterns | IaC implementation: use for Terraform modules targeting GCP |
| engineering/ci-cd-pipeline-builder | Pipeline construction: automates Cloud Build and deployment |
Reference Documentation
| Document | Contents |
|---|---|
| references/architecture_patterns.md | 6 patterns: serverless, GKE microservices, three-tier, data pipeline, ML platform, multi-region |
| references/service_selection.md | Decision matrices for compute, database, storage, messaging |
| references/best_practices.md | Naming, labels, IAM, networking, monitoring, disaster recovery |