San Francisco, CA
Engineering Manager, Infrastructure (Compute or Traffic)
About the Traffic Team

The Traffic team owns the critical path between our customers and our services. We’re responsible for routing, ingress/egress, service discovery, and mesh infrastructure that enables secure, reliable, and observable communication across our systems. At our scale, milliseconds matter and resiliency is non-negotiable.

We build and operate foundational components like our Envoy-based service mesh, global load balancing, zero-trust networking, and network observability systems. We’re also driving strategic initiatives around multi-region routing, intelligent failover, and edge-to-mesh traffic optimization.

The challenges are complex and often cross-cutting, and they require a mindset that blends systems thinking with pragmatic execution. If you enjoy working at the intersection of performance, reliability, and distributed systems, and you like proving it with data, you’ll thrive here.

About the Compute Team

The Compute team powers everything that runs inside our platform. We provide the infrastructure and automation that enables engineers to build, ship, and operate services at scale with confidence. Our work touches everything from container orchestration and scheduling to provisioning, runtime optimization, and developer experience.

We manage our Kubernetes-based platform, CI/CD pipelines, and core abstractions that simplify complex infrastructure for internal teams. We’re focused on leverage: building self-service, scalable, and secure systems that help product teams move fast without sacrificing reliability or cost efficiency.

This is a team where foundational engineering meets platform strategy. We’re deeply invested in automation, observability, and doing infrastructure the “boring” way by making it robust, reliable, and invisible when it needs to be.

About the Role

We’re hiring an Engineering Manager to lead one of our core infrastructure teams: Traffic or Compute. In this role, you’ll drive the technical vision and execution for building highly available, cost-effective, and performant infrastructure. You’ll partner with senior engineers and cross-functional stakeholders to design and operate systems that power our platform’s reliability and scale. Whether it’s service mesh, Kubernetes orchestration, or cloud integration, your team will be at the heart of solving some of our toughest engineering problems. We work with technologies like AWS EKS, Python, Terraform, and Buildkite, but we’re just as excited about what we haven’t discovered yet. This is an opportunity to lead a team at the frontier of cloud infrastructure, with room to experiment and innovate.

You’re excited about this opportunity because you will…

- Lead and mentor a team of experienced engineers, fostering a culture of excellence, inclusion, and ownership
- Champion data-informed decision-making and continuous improvement across the team
- Oversee the design and operation of scalable, reliable infrastructure, while proactively managing down technical debt
- Collaborate with product and engineering leadership to drive long-term architectural direction and influence company-wide strategy
- Guide the team through ambiguity, leveraging industry best practices while also knowing when to break the mold
- Communicate with clarity, whether explaining a trade-off to execs or mapping technical strategy across teams
- Drive alignment across teams and disciplines, leading high-impact cross-functional initiatives
- Stay current on the latest trends in cloud infrastructure, and bring back ideas worth pursuing
- Build, sustain, and grow a diverse team to address the growing needs of the organization

We’re excited about you because you have…

- 7+ years of hands-on software development, including experience at scale with Kubernetes or service mesh, with 3+ years designing and operating cloud-native systems in AWS, Azure, or GCP
- 2+ years leading engineering teams
- Experience managing infrastructure teams with high operational standards and a culture of innovation
- Experience building or scaling traffic systems (L7/L4 routing, load balancing, traffic policy), or leading compute platform teams working with scheduling, orchestration, or provisioning with Kubernetes at scale
- A strong bias for action, pragmatic problem-solving, and technical depth
- Demonstrated success leading cross-functional initiatives with measurable impact

Notice to Applicants for Jobs Located in NYC or Remote Jobs Associated With Office in NYC Only

We use Covey as part of our hiring and/or promotional process for jobs in NYC, and certain features may qualify it as an automated employment decision tool (AEDT) in NYC. As part of the hiring and/or promotion process, we provide Covey with job requirements and candidate-submitted applications. We used Covey Scout for Inbound from August 21, 2023, through December 21, 2023, and resumed using it on June 29, 2024.

The Covey tool has been reviewed by an independent auditor. Results of the audit may be viewed here: Covey
