Jet Li 13acfccd77 (feat): Basic docs on Freeleaps Infra

2025-09-04 00:58:59 -07:00

Kubernetes Fundamentals for Junior Engineers

🎯 Overview

This guide is designed for junior engineers starting their DevOps journey. It covers the essential Kubernetes concepts you'll encounter daily, with practical examples and real-world scenarios.

📋 Prerequisites

Before diving into these concepts, make sure you understand:

✅ Pods: Basic container units
✅ Namespaces: Resource organization
✅ PVCs: Persistent storage
✅ Basic kubectl commands

🚀 1. Deployments (The Right Way to Run Apps)

Why Deployments?

Avoid creating bare Pods for long-running applications. Deployments are the standard way to run them because they provide:

Replicas: Run multiple copies of your app
Rolling updates: Replace pods gradually, which gives zero-downtime releases when readiness probes are configured
Rollback: Easy recovery from failed deployments
Self-healing: Automatically replace failed or deleted pods

Deployment Structure

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: my-app
  labels:
    app: web-app
    version: v1
spec:
  replicas: 3  # Run 3 copies
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
        version: v1
    spec:
      containers:
      - name: web-app
        image: nginx:1.20  # Pin a version rather than relying on :latest
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"

Managing Deployments

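# Create the namespace used by the manifest, then the deployment
kubectl create namespace my-app
kubectl apply -f deployment.yaml

# The commands below assume my-app is your current namespace; otherwise add -n my-app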

# Check deployment status
kubectl get deployments
kubectl describe deployment web-app

# Scale deployment
kubectl scale deployment web-app --replicas=5

# Update deployment (change image)
kubectl set image deployment/web-app web-app=nginx:1.21

# Rollback to previous version
kubectl rollout undo deployment/web-app

# Check rollout status
kubectl rollout status deployment/web-app

# View rollout history
kubectl rollout history deployment/web-app
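
The update, status, and rollback commands above are often chained into one step. The following is a minimal sketch, assuming a hypothetical helper named `deploy_or_rollback` and a default timeout of 120s; neither is part of the original guide.

```shell
# Hypothetical helper: update an image, wait for the rollout, undo it on failure.
# Usage: deploy_or_rollback <deployment> <container> <image> [namespace] [timeout]
deploy_or_rollback() {
  local deploy="$1" container="$2" image="$3"
  local ns="${4:-default}" timeout="${5:-120s}"

  kubectl set image "deployment/$deploy" "$container=$image" -n "$ns" || return 1

  # rollout status exits non-zero if the rollout does not finish within the timeout
  if kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout of $deploy succeeded"
  else
    echo "rollout of $deploy failed, rolling back" >&2
    kubectl rollout undo "deployment/$deploy" -n "$ns"
    return 1
  fi
}
```

For example, `deploy_or_rollback web-app web-app nginx:1.21 my-app` would update the Deployment from this section and revert it if the new pods never become ready.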

🌐 2. Services (Network Communication)

Why Services?

Pods are ephemeral: they are replaced regularly, and each replacement gets a new IP address. Services provide:

Stable IP addresses for your applications
Load balancing across multiple pods
Service discovery within the cluster
External access to your applications

Service Types

ClusterIP (Internal Access)

apiVersion: v1
kind: Service
metadata:
  name: web-app-service
  namespace: my-app
spec:
  type: ClusterIP
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP

NodePort (External Access via Node)

apiVersion: v1
kind: Service
metadata:
  name: web-app-nodeport
  namespace: my-app
spec:
  type: NodePort
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080  # Access via node IP:30080
    protocol: TCP
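
The `nodePort` value has to fall inside the cluster's NodePort range, which is 30000-32767 by default (an administrator can change it with the kube-apiserver `--service-node-port-range` flag). A small sketch of that check, using a hypothetical `valid_nodeport` function:

```shell
# Hypothetical helper: succeeds if the port lies in the default NodePort range.
# Assumes the cluster has not changed the default range of 30000-32767.
valid_nodeport() {
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

valid_nodeport 30080 && echo "30080 is usable as a nodePort"
valid_nodeport 8080  || echo "8080 is outside the default range"
```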

LoadBalancer (Cloud Load Balancer)

apiVersion: v1
kind: Service
metadata:
  name: web-app-lb
  namespace: my-app
spec:
  type: LoadBalancer
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP

Managing Services

# Create service
kubectl apply -f service.yaml

# List services
kubectl get services
kubectl get svc

# Get service details
kubectl describe service web-app-service

# Test service connectivity (run the test pod in the Service's namespace so the short name resolves)
kubectl run test-pod -n my-app --image=busybox --rm -it --restart=Never -- wget -O- web-app-service:80

# Port forward for testing
kubectl port-forward service/web-app-service 8080:80
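
Service discovery works through cluster DNS: a Service is reachable by its short name from the same namespace and by a fully qualified name from anywhere in the cluster. A minimal sketch of how that name is built, assuming the common default cluster domain `cluster.local` and a hypothetical `svc_fqdn` helper:

```shell
# Hypothetical helper: build the cluster-internal DNS name of a Service.
# Usage: svc_fqdn <service> [namespace] [cluster-domain]
svc_fqdn() {
  local name="$1" ns="${2:-default}" domain="${3:-cluster.local}"
  echo "$name.$ns.svc.$domain"
}

svc_fqdn web-app-service my-app
```

A pod in another namespace would use the full name (or the shorter `web-app-service.my-app`) instead of just `web-app-service`.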

🔧 3. ConfigMaps & Secrets (Configuration Management)

Why ConfigMaps & Secrets?

Applications need configuration. These provide:

Environment-specific settings (dev, staging, prod)
Separate storage for credentials (Secret values are base64-encoded, not encrypted, unless encryption at rest is enabled)
Configuration without rebuilding images
Centralized configuration management

ConfigMaps

Creating ConfigMaps

# From literal values
kubectl create configmap app-config \
  --from-literal=DB_HOST=postgres-service \
  --from-literal=DB_PORT=5432 \
  --from-literal=ENVIRONMENT=production

# From file
kubectl create configmap app-config --from-file=config.properties

# From YAML
kubectl apply -f configmap.yaml

ConfigMap YAML

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: my-app
data:
  # Simple key-value pairs
  DB_HOST: "postgres-service"
  DB_PORT: "5432"
  ENVIRONMENT: "production"
  
  # File-like content
  config.properties: |
    server.port=8080
    logging.level=INFO
    cache.enabled=true

Using ConfigMaps in Pods

apiVersion: apps/v1
kind: Deployment
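metadata:
  name: web-app
  namespace: my-app  # Same namespace as the ConfigMap
spec:
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec: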
      containers:
      - name: web-app
        image: my-app:latest
        env:
        # Environment variables
        - name: DB_HOST
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: DB_HOST
        - name: DB_PORT
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: DB_PORT
        volumeMounts:
        # Mount as files
        - name: config-volume
          mountPath: /app/config
      volumes:
      - name: config-volume
        configMap:
          name: app-config
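
When a ConfigMap is mounted as a volume, each key becomes a file under the mount path, so the `config.properties` key above appears as `/app/config/config.properties` inside the container. The sketch below shows one way an entrypoint script might read a value from it; `prop_get` is a hypothetical helper, not something Kubernetes provides.

```shell
# Hypothetical helper: print the value of one key from a key=value properties file.
# Usage: prop_get <file> <key>
prop_get() {
  local file="$1" key="$2"
  # Match on the text before the first "=", then strip "key=" and print the rest
  awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$file"
}

# Inside the container this could be called as:
#   prop_get /app/config/config.properties server.port
```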

Secrets

Creating Secrets

# From literal values
kubectl create secret generic db-secret \
  --from-literal=DB_USERNAME=admin \
  --from-literal=DB_PASSWORD=secret123

# From file
kubectl create secret generic tls-secret \
  --from-file=tls.crt=cert.pem \
  --from-file=tls.key=key.pem

Secret YAML

apiVersion: v1
kind: Secret
metadata:
  name: db-secret
  namespace: my-app
type: Opaque
data:
  # Base64 encoded values
  DB_USERNAME: YWRtaW4=  # admin
  DB_PASSWORD: c2VjcmV0MTIz  # secret123
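
The values under `data` can be produced and checked with the `base64` tool. This is a minimal sketch with two hypothetical helper names; it assumes a `base64` that accepts `-d` for decoding (GNU coreutils does; some older macOS versions use `-D` instead).

```shell
# Encode a value for a Secret's data field.
# printf is used instead of echo so no trailing newline ends up in the encoded value.
secret_encode() { printf '%s' "$1" | base64; }

# Decode a value copied from a Secret manifest.
secret_decode() { printf '%s' "$1" | base64 -d; }

secret_encode admin          # prints YWRtaW4=
secret_decode c2VjcmV0MTIz   # prints secret123
```

Because decoding is this easy, anyone who can read the Secret can read the password; base64 only protects against accidental formatting problems, not against access.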

Using Secrets in Pods

apiVersion: apps/v1
kind: Deployment
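metadata:
  name: web-app
  namespace: my-app  # Same namespace as the Secret
spec:
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec: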
      containers:
      - name: web-app
        image: my-app:latest
        env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: DB_USERNAME
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: DB_PASSWORD

🎯 4. Ingress (External Access & Routing)

Why Ingress?

Ingress provides:

URL-based routing (example.com/api, example.com/web)
SSL/TLS termination
Load balancing
Name-based virtual hosting

Ingress Controller

First, make sure an Ingress controller (such as ingress-nginx) is running in the cluster:

# Check if ingress controller exists
kubectl get pods -n ingress-nginx

# If not, install ingress-nginx
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.2/deploy/static/provider/cloud/deploy.yaml

Ingress Resource

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app-ingress
  namespace: my-app
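  annotations:
    # Assumes cert-manager and a ClusterIssuer named letsencrypt-prod are installed
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx  # Ties this Ingress to the ingress-nginx controller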
  tls:
  - hosts:
    - myapp.example.com
    secretName: myapp-tls
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-app-service
            port:
              number: 80
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080
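
With `pathType: Prefix`, paths are compared element by element (split on `/`), and when several rules match, the longest matching path wins. The sketch below imitates that behavior for the two rules above so the routing is easier to reason about; `prefix_matches` and `pick_backend` are hypothetical names, and the real decision is made by the Ingress controller.

```shell
# Succeeds if the request path matches the rule prefix on whole path elements.
prefix_matches() {
  local prefix="${1%/}" path="$2"
  # A rule of "/" (empty after trimming the slash) matches every path
  if [ -z "$prefix" ]; then return 0; fi
  # Match "/api" itself, or anything that starts with "/api/"
  [ "$path" = "$prefix" ] || [ "${path#"$prefix"/}" != "$path" ]
}

# Check the longer rule first, mirroring "longest matching path wins".
pick_backend() {
  if prefix_matches /api "$1"; then
    echo "api-service:8080"
  elif prefix_matches / "$1"; then
    echo "web-app-service:80"
  fi
}

pick_backend /api/users   # api-service:8080
pick_backend /apiv2       # web-app-service:80 ("/apiv2" is not under "/api/")
```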

Managing Ingress

# Apply ingress
kubectl apply -f ingress.yaml

# Check ingress status
kubectl get ingress
kubectl describe ingress web-app-ingress

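# Test ingress (replace <ingress-ip> with the controller's external IP;
# ingress-nginx redirects HTTP to HTTPS by default when TLS is configured)
curl -H "Host: myapp.example.com" http://<ingress-ip>/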

🔄 5. Jobs & CronJobs (Batch Processing)

Why Jobs & CronJobs?

For tasks that need to:

Run to completion (not continuously)
Execute on schedule (daily backups, reports)
Process data (ETL jobs, batch processing)

Jobs

apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing-job
  namespace: my-app
spec:
  completions: 3  # Require 3 successful pod completions
  parallelism: 2  # Run at most 2 pods at a time
  template:
    spec:
      containers:
      - name: data-processor
        image: data-processor:latest
        command: ["python", "process_data.py"]
        env:
        - name: INPUT_FILE
          value: "/data/input.csv"
        - name: OUTPUT_FILE
          value: "/data/output.csv"
        volumeMounts:
        - name: data-volume
          mountPath: /data
      volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: data-pvc
      restartPolicy: Never

CronJobs

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
  namespace: my-app
spec:
  schedule: "0 2 * * *"  # Daily at 02:00 in the controller's time zone (usually UTC)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["/bin/bash", "-c"]
            args:
            - |
              set -euo pipefail  # Fail the Job if pg_dump fails
              echo "Starting backup at $(date)"
              pg_dump -h postgres-service -U admin mydb > /backup/backup-$(date +%Y%m%d).sql
              echo "Backup completed at $(date)"
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: DB_PASSWORD
            volumeMounts:
            - name: backup-volume
              mountPath: /backup
          volumes:
          - name: backup-volume
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure
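
The `schedule` field uses the standard five-field cron format: minute, hour, day of month, month, day of week. A small sketch that labels each field of an expression, using a hypothetical `cron_fields` helper:

```shell
# Hypothetical helper: label the five fields of a cron expression.
cron_fields() {
  local minute hour dom month dow
  # A here-document keeps "*" from being expanded as a filename pattern
  read -r minute hour dom month dow <<EOF
$1
EOF
  echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
}

cron_fields "0 2 * * *"
# minute=0 hour=2 day-of-month=* month=* day-of-week=*
```

Reading `0 2 * * *` this way makes it clear that the job fires at minute 0 of hour 2 on every day.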

Managing Jobs & CronJobs

# Create job
kubectl apply -f job.yaml

# Check job status
kubectl get jobs
kubectl describe job data-processing-job

# View job logs
kubectl logs job/data-processing-job

# Create cronjob
kubectl apply -f cronjob.yaml

# Check cronjob status
kubectl get cronjobs
kubectl describe cronjob daily-backup

# Suspend cronjob
kubectl patch cronjob daily-backup -p '{"spec" : {"suspend" : true}}'

# Resume cronjob
kubectl patch cronjob daily-backup -p '{"spec" : {"suspend" : false}}'
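
When testing a CronJob it is usually quicker to trigger it once by hand than to wait for the schedule. `kubectl create job --from=cronjob/<name>` does this; the wrapper below is only a sketch, and the `run_cronjob_now` name and the timestamp-based Job name are assumptions.

```shell
# Hypothetical helper: start a one-off Job from a CronJob's template.
# Usage: run_cronjob_now <cronjob> [namespace]
run_cronjob_now() {
  local cronjob="$1" ns="${2:-default}"
  # Append the current Unix time so repeated runs get unique Job names
  local job="$cronjob-manual-$(date +%s)"
  kubectl create job "$job" --from="cronjob/$cronjob" -n "$ns"
}
```

For example, `run_cronjob_now daily-backup my-app` creates a Job whose logs can then be read with `kubectl logs job/<job-name> -n my-app`.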

📊 6. Resource Management & Limits

Why Resource Management?

To prevent:

Resource starvation (one app consuming all CPU/memory)
Node failures (out of memory)
Poor performance (over-subscription)

Resource Requests & Limits

apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-managed-app
spec:
  selector:
    matchLabels:
      app: resource-managed-app
  template:
    metadata:
      labels:
        app: resource-managed-app
    spec:
      containers:
      - name: app
        image: my-app:latest
        resources:
          requests:
            memory: "64Mi"    # Reserved for the pod at scheduling time
            cpu: "250m"       # 0.25 CPU cores
          limits:
            memory: "128Mi"   # Hard cap; exceeding it gets the container OOM-killed
            cpu: "500m"       # 0.5 CPU cores; usage above this is throttled
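
The unit suffixes are easy to misread: `m` means thousandths of a CPU core, and `Ki`/`Mi`/`Gi` are powers of 1024. The sketch below converts the values used above; `to_millicores` and `to_bytes` are hypothetical helpers that handle only the suffixes shown in this guide.

```shell
# Convert a CPU quantity ("250m", "0.5", "4") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 1000 }' ;;
  esac
}

# Convert a memory quantity with a binary suffix ("64Mi", "8Gi") to bytes.
to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;
  esac
}

to_millicores 250m   # 250
to_millicores 0.5    # 500
to_bytes 64Mi        # 67108864
to_bytes 128Mi       # 134217728
```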

Resource Quotas

apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
  namespace: my-app
spec:
  hard:
    requests.cpu: "4"        # Sum of CPU requests: 4 cores
    requests.memory: 8Gi     # Sum of memory requests: 8 GiB
    limits.cpu: "8"          # Sum of CPU limits: 8 cores
    limits.memory: 16Gi      # Sum of memory limits: 16 GiB
    pods: "20"               # 20 pods max
    services: "10"           # 10 services max
    persistentvolumeclaims: "10"  # 10 PVCs max
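
A quota like this puts a ceiling on how far a Deployment can scale. As a rough sketch, the helper below computes how many replicas fit, assuming the namespace holds only this one workload; `max_replicas` is a hypothetical name, and CPU is given in millicores and memory in Mi.

```shell
# Hypothetical helper: largest replica count that stays within the quota.
# Arguments: quota_cpu_m request_cpu_m quota_mem_mi request_mem_mi pod_cap
max_replicas() {
  local by_cpu="$(( $1 / $2 ))"
  local by_mem="$(( $3 / $4 ))"
  local max="$5"
  # The tightest of the three constraints decides
  if [ "$by_cpu" -lt "$max" ]; then max="$by_cpu"; fi
  if [ "$by_mem" -lt "$max" ]; then max="$by_mem"; fi
  echo "$max"
}

# 4 CPU and 8Gi of requests, 20 pods; each pod requests 250m and 64Mi
max_replicas 4000 250 8192 64 20   # 16
```

Here CPU is the limiting factor: 4000m / 250m allows 16 pods, even though the pod count quota would allow 20.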

Managing Resources

# Check resource usage (requires metrics-server in the cluster)
kubectl top pods
kubectl top nodes

# Check quotas
kubectl get resourcequota
kubectl describe resourcequota namespace-quota

# Check resource requests/limits
kubectl describe pod <pod-name> | grep -A 10 "Limits"

🔍 7. Monitoring & Debugging

Essential Commands

# Check cluster health
kubectl get nodes
kubectl get pods --all-namespaces

# Check specific resources
kubectl get deployments,services,pods -n my-app

# View logs
kubectl logs <pod-name>
kubectl logs <pod-name> -f  # Follow logs
kubectl logs <pod-name> --previous  # Previous container

# Execute commands in pods
kubectl exec -it <pod-name> -- /bin/bash  # Use /bin/sh if the image has no bash
kubectl exec <pod-name> -- ls /app

# Port forwarding for debugging
kubectl port-forward <pod-name> 8080:80
kubectl port-forward service/<service-name> 8080:80

# Check events
kubectl get events --sort-by='.lastTimestamp'
kubectl get events -n my-app

# Check resource usage (requires metrics-server in the cluster)
kubectl top pods
kubectl top nodes

Common Debugging Scenarios

Pod Stuck in Pending

# Check why pod can't be scheduled
kubectl describe pod <pod-name>

# Check node resources
kubectl describe node <node-name>

# Check events
kubectl get events --sort-by='.lastTimestamp'

Pod Crashing

# Check pod status
kubectl get pods
kubectl describe pod <pod-name>

# Check logs
kubectl logs <pod-name>
kubectl logs <pod-name> --previous

# Check resource usage
kubectl top pod <pod-name>

Service Not Working

# Check service endpoints
kubectl get endpoints <service-name>

# Check service configuration
kubectl describe service <service-name>

# Test service connectivity
kubectl run test-pod --image=busybox --rm -it --restart=Never -- wget -O- <service-name>:<port>
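
The checks in these scenarios tend to be run in the same order every time, so they can be collected into one function. This is a sketch only; `pod_triage` is a hypothetical name and the 50-line log tail is an arbitrary choice.

```shell
# Hypothetical helper: gather the usual first-look information about a pod.
# Usage: pod_triage <pod> [namespace]
pod_triage() {
  local pod="$1" ns="${2:-default}"

  echo "== status =="
  kubectl get pod "$pod" -n "$ns" -o wide

  echo "== events for this pod =="
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" --sort-by=.lastTimestamp

  echo "== current logs (last 50 lines) =="
  kubectl logs "$pod" -n "$ns" --tail=50

  echo "== previous container logs =="
  # --previous fails if the container has never restarted
  kubectl logs "$pod" -n "$ns" --previous --tail=50 2>/dev/null \
    || echo "(no previous container)"
}
```

Running `pod_triage <pod-name> my-app` covers the Pending and crashing cases above in one pass; the Service checks still need the separate endpoints command.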

🔒 8. Security Best Practices

Pod Security

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
      containers:
      - name: app
        image: my-app:latest
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
      volumes:
      - name: tmp-volume
        emptyDir: {}

Network Policies

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: my-app
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress  # Also blocks DNS lookups until an egress rule allows them
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-traffic
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          # Label that Kubernetes sets automatically on every namespace
          kubernetes.io/metadata.name: frontend
    ports:
    - protocol: TCP
      port: 80

📚 9. Best Practices for Junior Engineers

1. Always Use Deployments (Not Pods)

# ❌ Don't do this
kubectl run nginx --image=nginx

# ✅ Do this
kubectl create deployment nginx --image=nginx

2. Use Namespaces for Organization

# Create namespaces for different environments
kubectl create namespace development
kubectl create namespace staging
kubectl create namespace production

3. Set Resource Limits

resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"

4. Use Health Checks

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

5. Use Labels and Selectors

metadata:
  labels:
    app: web-app
    version: v1
    environment: production
    team: backend
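
A selector matches a pod when every key=value pair in the selector is present in the pod's labels; extra labels on the pod are ignored. The sketch below mimics that equality-based rule to show why a Service can end up with no endpoints; `selector_matches` is a hypothetical helper, and the real matching happens in the cluster.

```shell
# Hypothetical helper: succeeds if every selector pair appears in the label set.
# Usage: selector_matches "k1=v1,k2=v2" "k1=v1,k2=v2,k3=v3"
selector_matches() {
  local selector="$1" labels=",$2," pair
  local IFS=','
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;        # this pair is present, keep checking
      *) return 1 ;;         # one missing pair means no match
    esac
  done
  return 0
}

selector_matches "app=web-app" "app=web-app,version=v1,team=backend" && echo "match"
selector_matches "app=web-app,version=v2" "app=web-app,version=v1" || echo "no match"
```

On a live cluster the equivalent check is `kubectl get pods -l app=web-app`, which lists the pods a selector of `app: web-app` would pick up.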

6. Use ConfigMaps and Secrets

# Store configuration externally
kubectl create configmap app-config --from-literal=DB_HOST=postgres
kubectl create secret generic db-secret --from-literal=DB_PASSWORD=secret123

🚀 10. Next Steps

Advanced Concepts to Learn

StatefulSets: For stateful applications (databases)
DaemonSets: For node-level services (monitoring agents)
Horizontal Pod Autoscaler (HPA): Automatic scaling
Vertical Pod Autoscaler (VPA): Resource optimization
Pod Disruption Budgets: Availability guarantees
Pod Security Standards: Security policies

Tools to Master

Helm: Package manager for Kubernetes
Kustomize: Configuration management
ArgoCD: GitOps deployment
Prometheus & Grafana: Monitoring
Fluentd/Elasticsearch: Logging

Practice Projects

Simple Web App: Deploy nginx with database
API Service: Deploy REST API with authentication
Batch Job: Create data processing pipeline
Monitoring Stack: Deploy Prometheus + Grafana
CI/CD Pipeline: Automate deployments

🆘 Troubleshooting Quick Reference

Common Issues & Solutions

| Issue | Command | What to Check |
|-------|---------|---------------|
| Pod not starting | `kubectl describe pod <name>` | Events, resource limits |
| Service not working | `kubectl get endpoints <service>` | Pod labels, service selector |
| Deployment stuck | `kubectl rollout status deployment/<name>` | Image pull, resource limits |
| Ingress not working | `kubectl describe ingress <name>` | Ingress controller, TLS |
| High resource usage | `kubectl top pods` | Resource limits, memory leaks |

Useful Aliases

# Add to your .bashrc or .zshrc
alias k='kubectl'
alias kg='kubectl get'
alias kd='kubectl describe'
alias kl='kubectl logs'
alias ke='kubectl exec -it'
alias kp='kubectl port-forward'

Last Updated: September 3, 2025
Version: 1.0
Maintainer: Infrastructure Team
