Mid-Level Site Reliability Engineer
Insight Global
Job Description
Insight Global is seeking a Mid-Level Site Reliability Engineer (SRE) to join the core working middleware team of one of its top clients. The SRE will play a crucial role in maintaining the reliability, availability, and performance of our client's critical systems and services. This position involves collaborating with development, operations, and product teams to build and maintain scalable and resilient infrastructure. The SRE will support the administration of Azure Kubernetes Service (AKS) clusters running critical, always-on middleware that handles thousands of transactions per second (TPS), and will be expected to conduct operations in a manner consistent with a five-nines (99.999%) availability target. The SRE will be responsible for AKS cluster deployments and cutovers, base image updates, testing IaC changes, and other work focused on daily operations. They must be able to apply software engineering best practices to IT operations tasks to maintain a scalable and reliable production environment. This engineer will be comfortable writing IaC as well as code to automate processes, such as creating monitoring queries, analyzing logs, running disaster recovery tests, responding to incidents, and maintaining code as documentation. The importance of proficiency in scripting and automation (e.g., Python, Bash, Ansible) cannot be overstated.
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request using the Human Resources Request Form (https://airtable.com/app21VjYyxLDIX0ez/shrOg4IQS1J6dRiMo). The EEOC "Know Your Rights" Poster is available here (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/ .
Skills and Requirements
-3-5 years of experience as a Site Reliability Engineer or similar role
-Strong knowledge of cloud platforms; Azure preferred
-Experience with containerization and orchestration tools (e.g., Docker, Kubernetes)
-Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack); ELK stack preferred
-Experience in scripting and automation (Python, Bash, Ansible)
-Bachelor's degree in Computer Science, Engineering, or a related field
-Familiarity with VMware Tanzu
-Familiarity with GitHub Actions (for CI/CD deployment orchestration)
-Background in financial services or experience working in a regulated environment
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal employment opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment without regard to race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to HR@insightglobal.com.