As a Principal Software Development Engineer, you will own the software design and development for major components of Oracle Cloud Infrastructure (OCI). You should be both a rock-solid coder and a distributed systems generalist, able to dive deep into any part of the stack, including low-level systems, and to design broad distributed system interactions. You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.
This role sits within the Host Provisioning Services (HoPS) team, which owns the critical infrastructure responsible for automating the full server lifecycle, from rack integration and hardware bring-up to customer-ready instance provisioning and firmware management.
HoPS services operate at the intersection of bare metal hardware and full-stack orchestration frameworks. They interface directly with components like BMCs, NICs, SmartNICs, ILOMs, GPUs, and custom firmware stacks. The team builds microservices and tooling that provision, configure, secure, and validate server platforms across OCI’s global fleet.
As a Principal Engineer, you will architect and deliver highly available services and automation pipelines that manage server provisioning at hyperscale, enable firmware pinning for deterministic customer environments, and support fleet-wide firmware updates and telemetry-based observability. You’ll drive solutions to support new silicon (e.g., NVIDIA, AMD, Intel platforms), SmartNIC/HostNIC convergence, Root of Trust (RoT) security integration, and the evolution of OCI’s infrastructure into next-gen clusters and composable hardware environments.
You will partner closely with teams across Compute, Networking, Security, Datacenter Engineering, and Hardware Development to ensure OCI can launch, scale, and maintain new server platforms with minimal operational overhead and high reliability.
This role is ideal for experienced systems engineers with a deep understanding of operating systems, hardware-software integration, distributed services, and cloud-scale automation.