Dublin, IRL
14 days ago
Network Production Engineer
**Summary:** The Network Infrastructure team is responsible for designing, building and operating one of the largest networks in the world. Networking is at the core of all Meta products and experiences, and we are looking for experienced Production Engineers who are interested in solving complex technical challenges in the Backbone or Datacenter Network domains.

Production Network Engineers at Meta are hybrid software and network engineers who keep reliability and scalability in mind as they work on different parts of the lifecycle (designing, building, and operating our worldwide network). This role offers an opportunity to solve challenges ranging from the scaling demands of supporting billions of people using our family of apps to the cutting-edge demands of AI workloads that power new Meta products.

You will be joining the team that is responsible for the end-to-end health (performance and reliability) of Meta's backbone networks. You will build tools and use automation to efficiently scale how we mitigate real-time impact to the network, identify and investigate long-term trends in performance and risk in our backbone, and drive innovative solutions to monitor and improve Meta's current and future backbone network products.

Our backbones continue to expand rapidly around the globe, driven most recently by the network demands that our AGI journey brings. We support both our "Classic Backbone", which transports traffic destined to people using Meta's products, and our "Express Backbone", which handles machine-to-machine traffic between our Data Centers.

Engineers who typically thrive in this role are hybrid software and network engineers who are curious about how systems work, how they fail, and how we can increase their reliability. You have the opportunity to dig into interesting challenges in the networking and software domains, at a scale that offers new challenges on a daily basis.

**Required Skills:** Network Production Engineer Responsibilities:
1. Conceive, develop, and deploy systems and tools to keep the network running reliably and efficiently.
2. Manage complex technical issues across networks, ranging from automated tooling to hardware failures and network issues.
3. Develop documentation, develop and review code, and debug the hardest problems, live, on some of the largest and most complex networks and systems in the world.
4. Participate in a weekly on-call rotation and be an escalation contact for service incidents.
5. Lead projects to address hard technical challenges, contributing directly to roadmaps and partnering with the best engineers in the industry to develop reliable and scalable network and software solutions for our global data center fleet.
6. Proactively find gaps that impact multiple teams, come up with the execution plan, and drive the project directly and through influence of other teams.
7. Contribute to the overall team growth and development through peer mentorship.
8. Collaborate effectively with team members and partners in other global regions (EMEA), including flexibility to work during global-friendly hours as needed.
9. Global travel 10-15% of the time.

**Minimum Qualifications:**
10. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
11. 8+ years of relevant experience developing scalable and reliable systems and/or networks
12. Experience coding in higher-level languages (e.g., Python, C++, Go, etc.)
13. Experience in developing and understanding network device configuration for at least one vendor (Juniper, Cisco, Arista, Brocade, etc.)
14. Experience in configuration and maintenance of network devices and Network Management systems, or applications such as web servers, load balancers, relational databases, storage systems and messaging systems
15. Experience learning software, frameworks and APIs

**Preferred Qualifications:**
16. BS or MS in Computer Science, Computer Engineering, or Network Engineering
17. 8+ years of experience in TCP/IP and IPv6
18. 10+ years of experience designing and operating data center networks
19. 8+ years of experience in one or more of BGP, MPLS, IS-IS or similar routing protocols, with knowledge of typical configurations and performance tuning
20. 8+ years of experience coding in higher-level languages (e.g., Python, C++, Go, etc.)
21. Experience understanding network hardware and topology failures
22. Understanding of AI training workloads and the demands they exert on networks
23. Experience leading and setting technical direction for a team of 6+ engineers

**Industry:** Internet