Tanmay Bhattacharya
Senior DevOps / Site Reliability Engineer
Gurugram, India • +91 99102 64837 • tanmay.bhattacharya@hey.com • linkedin.com/in/tanmay-bhattacharya • github.com/tanmaybh
Portfolio: www.eliorexa.com/portfolio/cloud-devops-engineer-mid
Professional Summary
Cloud / DevOps engineer with 5 years automating infrastructure and shipping pipelines across AWS and GCP. Owns Kubernetes platforms end to end — Terraform-managed clusters, GitOps delivery, and SLO-driven monitoring — with a record of measurable wins in uptime, deploy speed and cloud cost. Carries production on-call, mentors junior engineers, and treats every manual runbook as a bug to automate.
Skills
Cloud & OS: AWS (EKS, ECS, RDS, Lambda, IAM), GCP (GKE, Cloud Run, BigQuery), Linux (RHEL, Ubuntu)
Containers & orchestration: Kubernetes, Docker, Helm, Argo CD, Istio
IaC & CI/CD: Terraform, Terragrunt, GitHub Actions, Jenkins, Ansible
Observability & data: Prometheus, Grafana, Datadog, Loki, PagerDuty, PostgreSQL
Core Competencies
AWS · GCP · Kubernetes · Docker · Terraform · CI/CD · GitHub Actions · Jenkins · Argo CD · monitoring · observability · IaC · SRE · on-call
Work Experience
Senior DevOps Engineer — Nimbus Retail (Series C e-commerce)
Aug 2022 – Present
Gurugram
- Own a 40-service Kubernetes platform on AWS EKS serving 800k daily users; standardized Terraform + Helm modules that cut new-service onboarding from ~2 days to 3 hours.
- Migrated CI/CD from Jenkins to GitHub Actions with Argo CD GitOps, dropping median deploy time from 28 minutes to 7 and raising deploy frequency 4x.
- Re-architected autoscaling and spot-instance usage, reducing AWS spend by ~₹18L/year (~32%) with no impact on p99 latency.
- Built an SLO + alerting program in Prometheus, Grafana and PagerDuty that lifted core-API availability from 99.7% to 99.95% and cut alert noise by 45%.
- Mentored 2 junior engineers and authored the on-call runbooks and incident-review template now used across 3 teams.
DevOps Engineer — Helios Fintech
Jun 2020 – Jul 2022
Bengaluru
- Containerized a legacy Java monolith and split it into 6 services on GCP GKE, improving release isolation and cutting rollback time to under 90 seconds.
- Codified all environments in Terraform with remote state and policy checks, eliminating config drift and reducing provisioning errors by 70%.
- Implemented centralized logging and tracing (Loki + Grafana + OpenTelemetry), cutting incident mean time to resolution from ~50 minutes to 18 minutes.
Projects
DriftGuard — IaC policy gate — 5 repos
Open-source Terraform pre-merge policy checker adopted across 5 internal repos.
- Built an OPA-backed GitHub Actions check that blocks non-compliant infra changes, catching ~40 risky changes before merge in its first quarter.
Tech: Terraform, OPA, GitHub Actions, IaC
Interactive 3D Portfolio
WebGL portfolio built on Eliorexa, linked from this résumé.
- Reactive three.js hero with scroll-driven sections, lazy-loaded and reduced-motion safe — deployed via the same GitHub Actions pipeline pattern used at work.
Tech: three.js, React, CI/CD
Education
B.Tech Electronics & Communication Engineering
2020
Jadavpur University, Kolkata
First Class
Certifications
- Certified Kubernetes Administrator (CKA) — CNCF (2023)
- AWS Certified Solutions Architect – Associate — Amazon Web Services (2022)
- HashiCorp Certified: Terraform Associate — HashiCorp (2022)