Lead DevOps Engineer
🇧🇷 Brazil | 🇲🇽 Mexico | 🇦🇷 Argentina | 🇨🇱 Chile | 🇨🇴 Colombia
We are hiring a Lead DevOps Engineer to own and evolve the AWS platform behind a custom VDI solution and cloud playtesting/streaming services. You will drive infrastructure-as-code, ECS/EKS operations, AWS Lambda automation, and GitHub Actions CI/CD standards while optimizing GPU EC2 cost/performance and leading incident response across the platform. Apply now to help keep the platform reliable, efficient, and scalable.
Responsibilities
- Design, build, and maintain AWS infrastructure with Terraform
- Manage Terraform workflows and remote state through HashiCorp Cloud Platform (HCP)
- Own the end-to-end infrastructure lifecycle, including provisioning, upgrades, decommissioning, and operational hygiene
- Operate ECS clusters to deploy and run microservices that support the platforms
- Administer EKS clusters that host and enable GitHub Actions runners, including necessary platform customizations
- Optimize and right-size GPU-enabled EC2 capacity to meet user experience goals under strict cloud cost controls
- Assess scaling behavior continuously, monitor utilization, and identify performance bottlenecks
- Implement and maintain AWS Lambda functions that automate cleanup tasks, on-demand provisioning, and operational workflows
- Standardize and improve GitHub Actions pipelines for Terraform plan/apply workflows, infrastructure releases, and container image build/publish/deploy processes
- Lead troubleshooting and service restoration for platform-wide degradations such as VDI session drops, authentication issues, and machine/storage failures
- Coordinate incident resolution across teams by driving investigation, mitigation, and follow-up actions
- Create and keep current runbooks, operational documentation, and onboarding materials
Requirements
- 7+ years of proven experience in DevOps or platform engineering roles
- Deep expertise in AWS infrastructure architecture, provisioning, and full lifecycle management
- Hands-on proficiency with Terraform and HashiCorp Cloud Platform (HCP)
- Solid experience operating container workloads on ECS and EKS
- Strong knowledge of GPU-enabled EC2 right-sizing, cloud cost management, and performance tuning
- Practical competency with AWS Lambda for event-driven automation
- Demonstrated background standardizing CI/CD using GitHub Actions pipelines
- Proven track record leading reliability engineering, troubleshooting, and incident resolution
- High ownership and accountability with the ability to work independently without close supervision
- Strong troubleshooting and systems-thinking skills, with the ability to stay calm and methodical during incidents
- Clear communication skills with both technical and non-technical stakeholders
- Effective prioritization in a Kanban workflow, balancing planned work with urgent interruptions
- English proficiency at B2 (Upper-Intermediate) level or higher
Nice to Have
- Familiarity with Amazon GameLift Streams
- Understanding of streaming and playtesting platform needs
- Ability to triage urgent ad-hoc requests that fall outside the standard Kanban flow






