The Oracle Cloud Infrastructure (OCI) Compute team is responsible for providing bare metal and virtual machines, covering both CPUs and GPUs, at scale to our customers. With rapid growth in machine learning, demand for GPUs and CPUs is exploding, making the performance and efficiency of cloud-scale services a critical area of investment.
The Core Architecture team focuses on identifying performance and efficiency constraints across the entire lifecycle of compute services, from server ingestion and inventory management to scalable data stores and API performance. Consulting engineers are responsible for performing deep analysis of business problems and for proposing and incubating new solutions that address the needs of some of our largest customers.
You will take the lead in defining the architecture for the brand-new server lifecycle management capabilities that will power the next generation of the Compute Control Plane. This initiative spans multiple Compute domains, from GPU validation to repairs, and you will drive engineers from these organizations to build cohesive, microservice-based solutions that enable Compute to scale to meet growing customer demand.
We are looking for a hands-on senior engineer with technical breadth, proven experience solving cloud-scale problems, and distributed systems design and implementation experience, who can build the fault-tolerant solutions that will form the foundations of the next generation of Compute offerings. The candidate is expected to have strong written and verbal communication skills, the ability to lead projects across organizational boundaries, and experience representing their work to senior leaders.
Career level: IC5