Hyderabad, Telangana, India
2 days ago
Associate Manager AWS Site Reliability Engineer
Overview
We are seeking a highly skilled and motivated Associate Manager AWS Site Reliability Engineer (SRE) to join our team. As an Associate Manager AWS SRE, you will play a critical role in designing, managing, and optimizing our cloud infrastructure to ensure high availability, reliability, and scalability of our services. You will collaborate with cross-functional teams to implement best practices, automate processes, and drive continuous improvements in our cloud environment.

Responsibilities
Design and Implement Cloud Infrastructure: Architect, deploy, and maintain AWS infrastructure using Infrastructure-as-Code (IaC) tools such as Terraform or CloudFormation.
Monitor and Optimize Performance: Develop and implement monitoring, alerting, and logging solutions to ensure the performance and reliability of our systems.
Ensure High Availability: Design and implement strategies for achieving high availability and disaster recovery, including backup and failover mechanisms.
Automate Processes: Automate repetitive tasks and processes to improve efficiency and reduce human error using tools such as AWS Lambda, Jenkins, and Ansible.
Incident Response: Lead and participate in incident response activities, troubleshoot issues, and perform root cause analysis to prevent future occurrences.
Security and Compliance: Implement and maintain security best practices and ensure compliance with industry standards and regulations.
Collaborate with Development Teams: Work closely with software development teams to ensure smooth deployment and operation of applications in the cloud environment.
Capacity Planning: Perform capacity planning and scalability assessments to ensure our infrastructure can handle growth and increased demand.
Continuous Improvement: Drive continuous improvement initiatives by identifying and implementing new tools, technologies, and processes.

Qualifications
Experience: 10+ years of overall experience, including a minimum of 5 years in a Site Reliability Engineer (SRE) or DevOps role with a focus on AWS cloud infrastructure.
Technical Skills: Proficiency in AWS services such as EC2, S3, RDS, VPC, Lambda, CloudFormation, and CloudWatch.
Automation Tools: Experience with Infrastructure-as-Code (IaC) tools such as Terraform or CloudFormation, and configuration management tools like Ansible or Chef.
Scripting: Strong scripting skills in languages such as Python, Bash, or PowerShell.
Monitoring and Logging: Experience with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or CloudWatch.
Problem-Solving: Excellent troubleshooting and problem-solving skills, with a proactive and analytical approach.
Communication: Strong communication and collaboration skills, with the ability to work effectively in a team-oriented environment.
Certifications: AWS certifications such as AWS Certified Solutions Architect, AWS Certified DevOps Engineer, or AWS Certified SysOps Administrator are highly desirable.
Education: Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent work experience.