Job Description:
The Cloud Operations Lead is responsible for preventing major incidents, driving recovery during crises, and optimizing infrastructure support processes. They oversee the day-to-day operations of Mars' cloud and on-prem infrastructure globally, ensuring stability, performance, and security by applying Mars ITSM processes and implementing best practices. By prioritizing preventative measures and partnering with Managed Services, they ensure robust operations. They lead the strategic planning and execution of all cloud operations delivered by the CCoE (Cloud Center of Excellence) third parties.
What are we looking for?
Bachelor of Science degree or equivalent in an engineering field, IT, Computer Science, or Management.
5+ years' experience within IT (Application, Platform, or Infrastructure).
Excellent knowledge of compute, storage, and network architecture.
Expertise in managing cloud-based infrastructure on public cloud platforms: Azure, AWS, and GCP.
Experience in security management.
Problem-solving skills with the ability to troubleshoot complex infrastructure issues.
Proven leadership and influencing experience through interaction with key stakeholders and cross-functional teams.
Desirable: IT certifications related to on-premises infrastructure, cloud providers, and ITIL.
What will be your key responsibilities?
Cloud Infrastructure Management:
Oversee the health and performance of cloud infrastructure, including servers, storage, networks, and databases, across Azure, AWS, GCP, and on-premises environments.
Major Incident, Crisis & Escalation Management:
Be accountable for leading the infrastructure component of major incidents and crises within their time zone, provide timely communications and status reports, and use the follow-the-sun model to maintain continuous CCoE presence and ownership. Conduct retrospectives and internal PIRs (Post-Incident Reviews) to build understanding, identify opportunities to improve, and capture lessons learned, in order to address root causes and actively prevent recurrence.
Performance Monitoring:
Implement and manage observability services to track cloud resource utilization and application performance, and to identify potential issues proactively.
ITSM Processes Management:
Lead the response to cloud-related ITSM processes (major incident, regular incident, request, change, problem, CMDB, DR drill).
Approvals:
Act as the approval authority for key activities such as changes, elevated privileges, and cloud requests, with the aim of simplifying processes and preventing disruptions.
Capacity Planning:
Forecast future cloud resource needs and proactively scale infrastructure and support teams to meet demand.
Automation:
Recommend automation and scripts to streamline repetitive cloud operations like delivery, provisioning, patching, and backups.
Security Management:
Ensure compliance with security best practices by managing access controls, monitoring threats, and implementing security configurations on cloud platforms.
Third-party Service Management:
Manage third-party teams of cloud engineers and analysts through Service Review meetings, provide technical guidance and decisions in partnership with Cloud Tech Owners, and foster vendor autonomy when needed.
What can you expect from Mars?
Work with diverse and talented Associates, all guided by the Five Principles.
Join a purpose-driven company, where we’re striving to build the world we want tomorrow, today.
Best-in-class learning and development support from day one, including access to our in-house Mars University.
An industry-competitive salary and benefits package, including company bonus.
Mars is an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law. If you need assistance or an accommodation during the application process because of a disability, it is available upon request. The company is pleased to provide such assistance, and no applicant will be penalized as a result of such a request.