As a Team Lead of Site Reliability Engineering, you will manage and oversee the engineering teams supporting a large-scale distributed application portfolio across on-prem and cloud environments. With a focus on increasing efficiency, eliminating downtime, optimizing cost, and maintaining performance at scale, you will lead our performance management, application security, and reliability processes while managing the health of core E-commerce systems, site performance, and reliability solutions.

Essential Duties and Responsibilities
Manages end-to-end availability and reliability of E-commerce services, systems, platforms, and infrastructure, and ensures they are designed and operated optimally.
Maintains security and performance of mission-critical applications and services that are part of the E-commerce ecosystem.
Partners with Information Security to manage application security, vulnerability remediation, and site compliance.
Partners with Cloud and Infrastructure teams to build and maintain environments and to optimize usage and cost with an optimal scaling strategy.
Manages the performance strategy, test executions, and remediation of critical site findings.
Establishes application and synthetic monitoring, alerting, and execution of failover capabilities and automated self-healing and recovery.
Provides day-to-day support for multiple environments, ensuring readiness for project development and test activities.
Employs strong site reliability principles and practices, and continuously improves processes through automation.
Partners with internal and external teams to ensure all change and release activities are reviewed for trouble-free rollout and reduced risk.
Owns day-to-day health, uptime, monitoring, and reliability of the website and related services.
Leads continuous improvement that creates an operating environment with dynamic monitoring, alerting, failover capabilities, and automated self-healing and recovery.
Participates in and maintains 24x7 on-call rotations for Site Reliability.
May perform other duties as assigned.

Required Qualifications
Experience: 9+ years of experience in performance engineering, application monitoring, and security for an organization with large and complex information systems is preferred. 6+ years of experience in B2B or B2C customer-facing software design and development. 3+ years of experience in cloud PaaS/IaaS environments (Azure, GCP), release management, vulnerability management, and automation.
Education: Bachelor’s degree in Computer Science or related field is required. Any suitable combination of education and experience will be considered.

Working Conditions
Hybrid / flexible working conditions
Must be able to work some nights and weekends
Occasional travel required

Physical Requirements
Sitting
Standing (not walking)
Walking
Kneeling/Stooping/Bending
Reaching overhead
Lifting up to 20 pounds

Disclaimer
This job description represents an overview of the responsibilities for the above-referenced position. It is not intended to represent a comprehensive list of responsibilities. A team member should perform all duties as assigned by his/her supervisor.
Company Info