Senior Cloud Engineer GCP Interview Questions: Complete Guide

Milad Bonakdar
Master advanced GCP concepts with comprehensive interview questions covering architecture design, GKE, Cloud Functions, cost optimization, and security for senior cloud engineer roles.
Introduction
Senior GCP cloud engineers are expected to design scalable architectures, implement advanced services, optimize costs, and ensure security at scale. This role requires deep expertise in GCP services, architectural best practices, and production experience.
This guide covers essential interview questions for senior GCP cloud engineers, focusing on architecture, advanced services, and strategic solutions.
Architecture & Design
1. Design a highly available application on GCP.
Answer: Production-ready architecture with redundancy and scalability:
Key Components:
# Create managed instance group with autoscaling
gcloud compute instance-groups managed create my-mig \
--base-instance-name=my-app \
--template=my-template \
--size=3 \
--zone=us-central1-a
# Configure autoscaling
gcloud compute instance-groups managed set-autoscaling my-mig \
--max-num-replicas=10 \
--min-num-replicas=3 \
--target-cpu-utilization=0.7 \
--cool-down-period=90
# Create load balancer
gcloud compute backend-services create my-backend \
--protocol=HTTP \
--health-checks=my-health-check \
--global
# Add instance group to backend
gcloud compute backend-services add-backend my-backend \
--instance-group=my-mig \
--instance-group-zone=us-central1-a \
--global
Design Principles:
- Multi-zone deployment
- Auto-scaling based on metrics
- Managed services for databases
- CDN for static content
- Health checks and monitoring (see the sketch below)
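The backend service above references my-health-check, which still has to be created, and the load balancer needs a frontend. A minimal sketch of the remaining pieces follows; the URL map, proxy, and forwarding-rule names are illustrative, the health-check port should match the serving port, and for true multi-zone redundancy the MIG can be created regionally with --region instead of --zone.
# Create the health check referenced by the backend service
gcloud compute health-checks create http my-health-check \
--port=80 \
--check-interval=10s \
--timeout=5s
# Route all traffic to the backend service
gcloud compute url-maps create my-url-map \
--default-service=my-backend
# HTTP proxy and global forwarding rule (the public frontend)
gcloud compute target-http-proxies create my-proxy \
--url-map=my-url-map
gcloud compute forwarding-rules create my-lb-rule \
--global \
--target-http-proxy=my-proxy \
--ports=80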
Rarity: Very Common
Difficulty: Hard
Google Kubernetes Engine (GKE)
2. How do you deploy and manage applications on GKE?
Answer: GKE is Google's managed Kubernetes service.
Deployment Process:
# Create GKE cluster
gcloud container clusters create my-cluster \
--num-nodes=3 \
--machine-type=e2-medium \
--zone=us-central1-a \
--enable-autoscaling \
--min-nodes=3 \
--max-nodes=10
# Get credentials
gcloud container clusters get-credentials my-cluster \
--zone=us-central1-a
# Deploy application
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: gcr.io/my-project/myapp:v1
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080
EOF
GKE Features:
- Auto-upgrade and auto-repair
- Workload Identity for security (example below)
- Binary Authorization
- Cloud Monitoring integration
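As an example of Workload Identity, the sketch below binds a Kubernetes service account to a Google service account so pods can call GCP APIs without exported keys; the names myapp-ksa, my-app-sa, my-project, and the default namespace are assumptions for illustration.
# Enable Workload Identity on the cluster
gcloud container clusters update my-cluster \
--zone=us-central1-a \
--workload-pool=my-project.svc.id.goog
# Allow the Kubernetes service account to impersonate the Google service account
gcloud iam service-accounts add-iam-policy-binding \
my-app-sa@my-project.iam.gserviceaccount.com \
--role=roles/iam.workloadIdentityUser \
--member="serviceAccount:my-project.svc.id.goog[default/myapp-ksa]"
# Link the Kubernetes service account to the Google service account
kubectl annotate serviceaccount myapp-ksa \
iam.gke.io/gcp-service-account=my-app-sa@my-project.iam.gserviceaccount.com
Node pools also need the GKE metadata server enabled (for example, created or updated with --workload-metadata=GKE_METADATA) for the binding to take effect.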
Rarity: Very Common
Difficulty: Hard
Serverless & Advanced Services
3. When would you use Cloud Functions vs Cloud Run?
Answer: Choose based on workload characteristics:
Cloud Functions:
- Event-driven (Pub/Sub, Storage, HTTP)
- Short-running (< 9 minutes)
- Automatic scaling to zero
- Pay per invocation
Cloud Run:
- Container-based
- HTTP requests or Pub/Sub
- Longer running (up to 60 minutes)
- More control over environment
# Cloud Function example
def hello_pubsub(event, context):
    """Triggered by Pub/Sub message"""
    import base64
    if 'data' in event:
        message = base64.b64decode(event['data']).decode('utf-8')
        print(f'Received message: {message}')
        # Process message
        process_data(message)
# Deploy Cloud Function
gcloud functions deploy hello_pubsub \
--runtime=python39 \
--trigger-topic=my-topic \
--entry-point=hello_pubsub
# Deploy Cloud Run
gcloud run deploy myapp \
--image=gcr.io/my-project/myapp:v1 \
--platform=managed \
--region=us-central1 \
--allow-unauthenticated
Rarity: Common
Difficulty: Medium
Advanced Networking
4. Explain Shared VPC and when to use it.
Answer: Shared VPC allows multiple projects to share a common VPC network.
Benefits:
- Centralized network administration
- Resource sharing across projects
- Simplified billing
- Consistent security policies
Architecture:
# Enable Shared VPC in host project
gcloud compute shared-vpc enable my-host-project
# Attach service project
gcloud compute shared-vpc associated-projects add my-service-project \
--host-project=my-host-project
# Grant permissions
gcloud projects add-iam-policy-binding my-host-project \
--member=serviceAccount:service-123@compute-system.iam.gserviceaccount.com \
--role=roles/compute.networkUser
Use Cases:
- Large organizations
- Multi-team environments
- Centralized network management
- Compliance requirements
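Once a service project is attached, its workloads consume host-project subnets by referencing them with a full path. A minimal sketch, assuming a subnet named shared-subnet already exists in the host project's us-central1 region:
# Create a VM in the service project on the host project's shared subnet
gcloud compute instances create my-app-vm \
--project=my-service-project \
--zone=us-central1-a \
--subnet=projects/my-host-project/regions/us-central1/subnetworks/shared-subnet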
Rarity: Common
Difficulty: Medium-Hard
Cost Optimization
5. How do you optimize GCP costs?
Answer: Cost optimization strategies:
1. Right-sizing:
# Use Recommender API
gcloud recommender recommendations list \
--project=my-project \
--location=us-central1 \
--recommender=google.compute.instance.MachineTypeRecommender
2. Committed Use Discounts:
- 1-year or 3-year commitments
- Up to 57% savings
- Flexible or resource-based
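As an illustration, a resource-based commitment can be purchased from the CLI; the values below (8 vCPUs, 32 GB, 1 year) are assumptions, and since commitments cannot be cancelled the current gcloud docs and pricing should be checked before running this.
# Purchase a 1-year resource-based commitment (example values)
gcloud compute commitments create my-commitment \
--plan=12-month \
--region=us-central1 \
--resources=vcpu=8,memory=32GB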
3. Preemptible VMs:
# Create preemptible instance
gcloud compute instances create my-preemptible \
--preemptible \
--machine-type=e2-medium
4. Storage Lifecycle:
# Set lifecycle policy
gsutil lifecycle set lifecycle.json gs://my-bucket
# lifecycle.json
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
        "condition": {"age": 30}
      },
      {
        "action": {"type": "Delete"},
        "condition": {"age": 365}
      }
    ]
  }
}
5. Monitoring:
- Cloud Billing reports
- Budget alerts (sample command below)
- Cost breakdown by service/project
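Budget alerts can be scripted as well; a hedged sketch, with a placeholder billing account ID and alert thresholds at 50% and 90% of a $1,000 monthly budget (verify flag syntax against your gcloud version):
# Create a budget with alert thresholds (billing account ID is a placeholder)
gcloud billing budgets create \
--billing-account=0X0X0X-0X0X0X-0X0X0X \
--display-name="monthly-budget" \
--budget-amount=1000USD \
--threshold-rule=percent=0.5 \
--threshold-rule=percent=0.9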
Rarity: Very Common
Difficulty: Medium
Security
6. How do you implement security best practices in GCP?
Answer: Multi-layered security approach:
1. IAM Best Practices:
# Use service accounts with minimal permissions
gcloud iam service-accounts create my-app-sa \
--display-name="My App Service Account"
# Grant specific role
gcloud projects add-iam-policy-binding my-project \
--member=serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com \
--role=roles/storage.objectViewer \
--condition='expression=resource.name.startsWith("projects/_/buckets/my-bucket"),title=bucket-access'
2. VPC Security:
- Private Google Access
- VPC Service Controls
- Cloud Armor for DDoS protection (sketch below)
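As a sketch of the Cloud Armor piece, a security policy can be created and attached to a global backend service such as the one from question 1; the policy name and blocked CIDR below are illustrative.
# Create a Cloud Armor security policy
gcloud compute security-policies create my-armor-policy \
--description="Block known-bad ranges"
# Deny traffic from a specific CIDR (priority 1000)
gcloud compute security-policies rules create 1000 \
--security-policy=my-armor-policy \
--src-ip-ranges="198.51.100.0/24" \
--action=deny-403
# Attach the policy to an existing global backend service
gcloud compute backend-services update my-backend \
--security-policy=my-armor-policy \
--global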
3. Data Encryption:
# Customer-managed encryption keys
gcloud kms keyrings create my-keyring \
--location=global
gcloud kms keys create my-key \
--location=global \
--keyring=my-keyring \
--purpose=encryption
# Use with Cloud Storage
gsutil -o 'GSUtil:encryption_key=...' cp file.txt gs://my-bucket/
4. Monitoring:
- Cloud Audit Logs (query example below)
- Security Command Center
- Cloud Logging and Monitoring
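For example, Cloud Audit Logs can be queried from the CLI to review recent IAM policy changes; the filter below targets Admin Activity logs and can be adapted to other methods or services.
# List recent IAM policy changes from Admin Activity audit logs
gcloud logging read \
'logName:"cloudaudit.googleapis.com%2Factivity" AND protoPayload.methodName="SetIamPolicy"' \
--limit=20 \
--format=json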
Rarity: Very Common
Difficulty: Hard
Data Analytics
7. How do you design and optimize BigQuery for large-scale analytics?
Answer: BigQuery is Google's serverless, highly scalable data warehouse.
Architecture:
- Columnar storage
- Automatic scaling
- SQL interface
- Petabyte-scale
- Pay-per-query pricing
Table Design:
-- Create partitioned table
CREATE TABLE mydataset.events
(
event_id STRING,
user_id STRING,
event_type STRING,
event_data JSON,
event_timestamp TIMESTAMP
)
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
OPTIONS(
partition_expiration_days=90,
require_partition_filter=true
);
-- Create materialized view
CREATE MATERIALIZED VIEW mydataset.daily_summary
AS
SELECT
DATE(event_timestamp) as event_date,
event_type,
COUNT(*) as event_count,
APPROX_COUNT_DISTINCT(user_id) as unique_users
FROM mydataset.events
GROUP BY event_date, event_type;
Optimization Strategies:
1. Partitioning:
-- Time-based partitioning
CREATE TABLE mydataset.sales
PARTITION BY DATE(sale_date)
AS SELECT * FROM source_table;
-- Integer range partitioning
CREATE TABLE mydataset.user_data
PARTITION BY RANGE_BUCKET(user_id, GENERATE_ARRAY(0, 1000000, 10000))
AS SELECT * FROM source_table;
-- Query with partition filter (cost-effective)
SELECT *
FROM mydataset.events
WHERE DATE(event_timestamp) BETWEEN '2024-01-01' AND '2024-01-31'
AND event_type = 'purchase';
2. Clustering:
-- Cluster by frequently filtered columns
CREATE TABLE mydataset.logs
PARTITION BY DATE(log_timestamp)
CLUSTER BY user_id, region, status
AS SELECT * FROM source_logs;
-- Queries benefit from clustering
SELECT *
FROM mydataset.logs
WHERE DATE(log_timestamp) = '2024-11-26'
AND user_id = '12345'
AND region = 'us-east1';
3. Query Optimization:
-- Bad: SELECT * (scans all columns)
SELECT * FROM mydataset.large_table;
-- Good: SELECT specific columns
SELECT user_id, event_type, event_timestamp
FROM mydataset.large_table;
-- Use approximate aggregation for large datasets
SELECT APPROX_COUNT_DISTINCT(user_id) as unique_users
FROM mydataset.events;
-- Avoid self-joins, use window functions
SELECT
user_id,
event_timestamp,
LAG(event_timestamp) OVER (PARTITION BY user_id ORDER BY event_timestamp) as prev_event
FROM mydataset.events;
4. Cost Control:
# Set maximum bytes billed
bq query \
--maximum_bytes_billed=1000000000 \
--use_legacy_sql=false \
'SELECT COUNT(*) FROM mydataset.large_table'
# Dry run to estimate costs
bq query \
--dry_run \
--use_legacy_sql=false \
'SELECT * FROM mydataset.large_table'
Data Loading:
# Load from Cloud Storage
bq load \
--source_format=NEWLINE_DELIMITED_JSON \
--autodetect \
mydataset.mytable \
gs://mybucket/data/*.json
# Load with schema
bq load \
--source_format=CSV \
--skip_leading_rows=1 \
mydataset.mytable \
gs://mybucket/data.csv \
schema.json
# Streaming inserts (real-time)
from google.cloud import bigquery
client = bigquery.Client()
table_id = "my-project.mydataset.mytable"
rows_to_insert = [
    {"user_id": "123", "event_type": "click", "timestamp": "2024-11-26T10:00:00"},
    {"user_id": "456", "event_type": "purchase", "timestamp": "2024-11-26T10:05:00"},
]
errors = client.insert_rows_json(table_id, rows_to_insert)
if errors:
    print(f"Errors: {errors}")
Best Practices:
- Always use partition filters
- Cluster by high-cardinality columns
- Avoid SELECT *
- Use approximate functions for large datasets
- Monitor query costs (see the query below)
- Use materialized views for repeated queries
- Denormalize data when appropriate
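To monitor query costs in practice, the INFORMATION_SCHEMA jobs views can be queried; the sketch below assumes jobs run in the US multi-region and ranks users by terabytes processed over the last seven days.
# Top users by bytes processed over the last 7 days (US multi-region)
bq query --use_legacy_sql=false '
SELECT
  user_email,
  ROUND(SUM(total_bytes_processed) / POW(10, 12), 2) AS tb_processed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY user_email
ORDER BY tb_processed DESC'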
Rarity: Very Common
Difficulty: Hard
Advanced Database Services
8. When would you use Cloud Spanner vs Cloud SQL?
Answer: Choose based on scale, consistency, and geographic requirements:
Cloud Spanner:
- Globally distributed relational database
- Horizontal scaling (unlimited)
- Strong consistency across regions
- 99.999% availability SLA
- Higher cost
Cloud SQL:
- Regional managed database (MySQL, PostgreSQL, SQL Server)
- Vertical scaling (limited)
- Single-region (with read replicas)
- 99.95% availability SLA
- Lower cost
Comparison:
| Feature | Cloud Spanner | Cloud SQL |
|---|---|---|
| Scale | Petabytes | Terabytes |
| Consistency | Global strong | Regional |
| Availability | 99.999% | 99.95% |
| Latency | Single-digit ms globally | Low (regional) |
| Cost | High | Moderate |
| Use Case | Global apps, financial systems | Regional apps, traditional workloads |
Cloud Spanner Example:
# Create Spanner instance
gcloud spanner instances create my-instance \
--config=regional-us-central1 \
--nodes=3 \
--description="Production instance"
# Create database
gcloud spanner databases create my-database \
--instance=my-instance \
--ddl='CREATE TABLE Users (
UserId INT64 NOT NULL,
Username STRING(100),
Email STRING(255),
CreatedAt TIMESTAMP
) PRIMARY KEY (UserId)'
# Insert data
gcloud spanner databases execute-sql my-database \
--instance=my-instance \
--sql="INSERT INTO Users (UserId, Username, Email, CreatedAt)
VALUES (1, 'alice', 'alice@example.com', CURRENT_TIMESTAMP())"
# Query with strong consistency
gcloud spanner databases execute-sql my-database \
--instance=my-instance \
--sql="SELECT * FROM Users WHERE UserId = 1"Python Client:
from google.cloud import spanner
# Create client
spanner_client = spanner.Client()
instance = spanner_client.instance('my-instance')
database = instance.database('my-database')
# Read with strong consistency
def read_user(user_id):
    with database.snapshot() as snapshot:
        results = snapshot.execute_sql(
            "SELECT UserId, Username, Email FROM Users WHERE UserId = @user_id",
            params={"user_id": user_id},
            param_types={"user_id": spanner.param_types.INT64}
        )
        for row in results:
            print(f"User: {row[0]}, {row[1]}, {row[2]}")

# Write with transaction
def create_user(user_id, username, email):
    def insert_user(transaction):
        transaction.execute_update(
            "INSERT INTO Users (UserId, Username, Email, CreatedAt) "
            "VALUES (@user_id, @username, @email, CURRENT_TIMESTAMP())",
            params={
                "user_id": user_id,
                "username": username,
                "email": email
            },
            param_types={
                "user_id": spanner.param_types.INT64,
                "username": spanner.param_types.STRING,
                "email": spanner.param_types.STRING
            }
        )
    database.run_in_transaction(insert_user)
Cloud SQL Example:
# Create Cloud SQL instance
gcloud sql instances create my-instance \
--database-version=POSTGRES_14 \
--tier=db-n1-standard-2 \
--region=us-central1 \
--root-password=mypassword
# Create database
gcloud sql databases create mydatabase \
--instance=my-instance
# Connect
gcloud sql connect my-instance --user=postgres
# Create read replica
gcloud sql instances create my-replica \
--master-instance-name=my-instance \
--tier=db-n1-standard-1 \
--region=us-east1
When to Use:
Use Cloud Spanner when:
- Need global distribution (see the sketch after this list)
- Require strong consistency across regions
- Scale beyond single region
- Financial transactions
- Mission-critical applications
- Budget allows for higher cost
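For the global-distribution case, an instance can be created on a multi-region configuration instead of the regional config shown earlier; nam-eur-asia1 is one published multi-region config, and the instance name below is illustrative.
# List available instance configurations
gcloud spanner instance-configs list
# Create a multi-region instance for globally distributed reads and writes
gcloud spanner instances create global-instance \
--config=nam-eur-asia1 \
--nodes=3 \
--description="Global production instance"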
Use Cloud SQL when:
- Regional application
- Familiar with MySQL/PostgreSQL
- Cost-sensitive
- Moderate scale (< 10TB)
- Existing SQL workloads
- Don't need global consistency
Rarity: Common
Difficulty: Medium-Hard
Security & Compliance
9. How do you implement VPC Service Controls?
Answer: VPC Service Controls create security perimeters around GCP resources to prevent data exfiltration.
Key Concepts:
- Service Perimeter: Boundary around resources
- Access Levels: Conditions for access
- Ingress/Egress Rules: Control data flow
Setup:
# Create access policy
gcloud access-context-manager policies create \
--organization=123456789 \
--title="Production Policy"
# Create access level
gcloud access-context-manager levels create CorpNetwork \
--policy=accessPolicies/123456789 \
--title="Corporate Network" \
--basic-level-spec=access_level.yaml
# access_level.yaml
conditions:
- ipSubnetworks:
  - 203.0.113.0/24  # Corporate IP range
- members:
  - user:admin@example.com
Create Service Perimeter:
# Create perimeter
gcloud access-context-manager perimeters create production_perimeter \
--policy=accessPolicies/123456789 \
--title="Production Perimeter" \
--resources=projects/123456789012 \
--restricted-services=storage.googleapis.com,bigquery.googleapis.com \
--access-levels=accessPolicies/123456789/accessLevels/CorpNetwork
# Add project to perimeter
gcloud access-context-manager perimeters update production_perimeter \
--policy=accessPolicies/123456789 \
--add-resources=projects/987654321098
Ingress/Egress Rules:
# ingress_rule.yaml
ingressPolicies:
- ingressFrom:
    sources:
    - accessLevel: accessPolicies/123456789/accessLevels/CorpNetwork
    identities:
    - serviceAccount:data-pipeline@project.iam.gserviceaccount.com
  ingressTo:
    resources:
    - '*'
    operations:
    - serviceName: storage.googleapis.com
      methodSelectors:
      - method: '*'
# Apply ingress rule
gcloud access-context-manager perimeters update production_perimeter \
--policy=accessPolicies/123456789 \
--set-ingress-policies=ingress_rule.yaml
Egress Rules:
# egress_rule.yaml
egressPolicies:
- egressFrom:
    identities:
    - serviceAccount:export-service@project.iam.gserviceaccount.com
  egressTo:
    resources:
    - projects/external-project-id
    operations:
    - serviceName: storage.googleapis.com
      methodSelectors:
      - method: 'google.storage.objects.create'
# Apply egress rule
gcloud access-context-manager perimeters update production_perimeter \
--policy=accessPolicies/123456789 \
--set-egress-policies=egress_rule.yaml
Supported Services:
- Cloud Storage
- BigQuery
- Cloud SQL
- Compute Engine
- GKE
- Cloud Functions
- And many more
Testing:
# Test access from within perimeter
from google.cloud import storage
def test_access():
    try:
        client = storage.Client()
        bucket = client.bucket('my-protected-bucket')
        blobs = list(bucket.list_blobs())
        print(f"Access granted: {len(blobs)} objects")
    except Exception as e:
        print(f"Access denied: {e}")

# This will succeed from an authorized network
# and fail from an unauthorized network
test_access()
Monitoring:
# View VPC SC logs
gcloud logging read \
'protoPayload.metadata.@type="type.googleapis.com/google.cloud.audit.VpcServiceControlAuditMetadata"' \
--limit=50 \
--format=json
Use Cases:
- Prevent data exfiltration
- Compliance requirements (HIPAA, PCI-DSS)
- Protect sensitive data
- Isolate production environments
- Multi-tenant security
Best Practices:
- Start with dry-run mode (commands below)
- Test thoroughly before enforcement
- Use access levels for fine-grained control
- Monitor VPC SC logs
- Document perimeter boundaries
- Regular access reviews
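Dry-run mode can also be driven from the CLI; the commands below are a hedged sketch of the perimeters dry-run workflow, so the exact flags should be verified against current gcloud documentation before use.
# Stage a change in the perimeter's dry-run configuration
gcloud access-context-manager perimeters dry-run update production_perimeter \
--policy=accessPolicies/123456789 \
--add-restricted-services=bigquery.googleapis.com
# Review dry-run violations in the audit logs, then enforce the staged config
gcloud access-context-manager perimeters dry-run enforce production_perimeter \
--policy=accessPolicies/123456789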
Rarity: Uncommon
Difficulty: Hard
Conclusion
Senior GCP cloud engineer interviews require deep technical knowledge and practical experience. Focus on:
- Architecture: High availability, scalability, disaster recovery
- GKE: Container orchestration, deployment strategies
- Serverless: Cloud Functions, Cloud Run use cases
- Networking: Shared VPC, hybrid connectivity
- Cost Optimization: Right-sizing, committed use, lifecycle policies
- Security: IAM, encryption, VPC controls
Demonstrate real-world experience with production systems and strategic decision-making. Good luck!





