Yokneam, ISR
23 days ago
Senior DevOps Engineer
Join our team as a Senior DevOps Engineer. At NVIDIA, you'll be part of the team shaping the future of computing and guaranteeing the smooth operation of our brand-new technologies. Our mission is to leverage AI's power to build outstanding and pioneering solutions that have a significant impact on the world.

What you'll be doing:
+ Own the solutions you build, collaborating with cross-functional teams to successfully implement them.
+ Collaborate with various teams in a fast-paced environment to ensure seamless project completion.
+ Continuously improve solution provisioning and management through automation.
+ Detect performance issues and recommend solutions to maintain world-class service quality.
+ Conduct capacity management and planning to meet ongoing operational needs.
+ Participate in incident reviews, assist in root cause identification, and write RCA reports.
+ Deliver SRE solutions in a globally distributed, multi-cloud hybrid environment - AWS, GCP, and on-prem.
+ Participate in the team's on-call rotation.

What we need to see:
+ B.S. degree in Computer Science or a related technical field (or equivalent experience).
+ 10+ years of building and supporting critical services, and 5+ years of coding/scripting experience in at least two high-level programming languages such as Python, Go, Ruby, or Groovy.
+ Proficiency in Kubernetes administration, modern CI/CD techniques, and Infrastructure as Code (IaC).
+ Deep understanding of Linux operating systems and TCP/IP fundamentals.
+ Expertise with at least one major cloud service provider - AWS, GCP, or Azure.
+ Demonstrated proficiency with end-to-end SRE capabilities and observability.
+ Proficiency in monitoring, metrics gathering, APM, container management, and log collection tools.
+ Creative problem solver with excellent debugging skills and great communication and documentation abilities.

Ways to stand out from the crowd:
+ Linux certification from a well-known vendor - Red Hat, Oracle, etc.
+ Prior experience managing large-scale Kubernetes deployments in production.
+ Strong skills in modern container networking and storage architecture.
+ Well-known cloud certification(s).
+ Hands-on experience working with Slurm/LSF environments.