Suraj B. Zinjad

Senior Site Reliability / DevOps Engineer

Senior SRE/DevOps Engineer with 11+ years of experience designing, automating, and operating cloud-native platforms on AWS & Azure. Specialized in Kubernetes platform engineering, GitOps, Infrastructure as Code, observability, and cost/security governance. Proven track record of zero-downtime migrations, multi-region deployments, and accelerating delivery through standardized tooling and automation.


Work Experience

Senior Site Reliability Engineer

VMware Inc. | Oct 2020 – Present
  • Managed large-scale AWS & Kubernetes (EKS) platforms, ensuring secure, highly available, and scalable environments.
  • Automated provisioning with Terraform (100+ modules) across AWS (EC2, ECS, RDS/Aurora, DynamoDB, ElastiCache, Kinesis, Kafka, Elasticsearch, Airflow).
  • Built a Kubernetes automation framework cutting cluster setup from days to hours; standardized add-ons (Istio, Prometheus stack, ExternalDNS, Karpenter).
  • Developed custom Operators (Kubebuilder/Operator SDK) for Flink, Kafka, and monitoring; implemented CRDs, reconciliation, finalizers.
  • Implemented GitOps with Argo CD (multi-region, ApplicationSets, RBAC, SSO, secret management) improving release velocity and governance.
  • Scaled observability with Prometheus Operator, Thanos, Grafana, KEDA; centralized logs via Fluentd/Filebeat → Logz.io; integrated Wavefront & AIOps.
  • Enforced security/compliance using OPA Gatekeeper (Rego) and IAM Roles for Service Accounts; optimized costs with Kubecost.
  • Optimized autoscaling by migrating to Karpenter, reducing node scale-up time to <20s; supported 24x7 on-call and zero-downtime migrations.

Authorised Officer

UBS Group AG | Feb 2020 – Oct 2020
  • Built & maintained AKS clusters using Terraform and custom modules.
  • Developed Helm charts and deployed microservices via Azure DevOps pipelines; image builds with TeamCity and Nexus.
  • Integrated Azure services: Key Vault, Cosmos DB, PostgreSQL, Storage, PIM.
  • Implemented monitoring & performance testing with Azure Application Insights (targeting 99.99% availability).
  • Automated release notes, backup/restore, and PR checks with Bash/Python.
  • Migrated pipelines from Azure DevOps to UBS Deploy; supported weekly enterprise release cycles.

Infra/DevOps Consultant

ThoughtWorks | Sep 2019 – Feb 2020
  • Managed Kubernetes clusters across Dev–Prod in hybrid infra (AWS + VMware/bare metal).
  • Implemented CI/CD with GoCD and integrated SonarQube & Acunetix for quality and security.
  • Delivered microservices using custom Helm charts; stack: MongoDB, Kafka, Istio, Grafana, ELK.
  • Migrated applications to AWS EKS using CloudFormation & AWS CDK; built serverless components with Lambda.
  • Designed Jenkins pipelines (EC2 master, K8s agents) and partnered with client teams for release planning & automation.

Senior DevOps Engineer

Morningstar Inc. | Oct 2016 – Sep 2019
  • Managed CDNs (Akamai → CloudFront migration), domains/DNS with BIND, and internal Varnish clusters.
  • Administered F5 load balancers: traffic management, custom health checks (MySQL/Redis), and security policies.
  • Deployed & maintained on‑prem Apigee clusters (install, upgrade, proxy automation with Ansible).
  • Built & operated Kubernetes clusters (kOps & in‑house) with Istio, Prometheus, Grafana.
  • Delivered serverless data lake & analytics on AWS (S3, Lambda, Athena, CloudWatch, Jupyter, Hue).
  • Engineered caching/data layers with Redis Sentinel, Memcached, GemFire/Geode for high‑performance apps.

DevOps Engineer

Network18 Group | Sep 2014 – Oct 2016
  • Managed CDNs (Akamai, Varnish) and DNS records with BIND.
  • Automated infra with Chef & Ansible (Foreman integration) and built CI/CD with Jenkins.
  • Deployed monitoring/logging stacks: Splunk, ELK, Graphite + Grafana, TrueSight with PagerDuty on‑call.
  • Administered Citrix & LVS/IPVS load balancers, custom health checks (Redis/MySQL).
  • Supported virtualization (KVM, Xen, VMware) and AWS DR; maintained app stacks (Apache, Nginx, MySQL, Redis, MongoDB, RabbitMQ).

Selected Highlights

  • Cut EKS cluster provisioning time from days to hours via standardized automation and add-on bootstrapping.
  • Migrated multiple workloads from monolith to microservices on Kubernetes with zero downtime.
  • Enabled external-metric autoscaling using KEDA and CloudWatch exporter; reduced operational toil with GitOps.