Casa Grande, AZ, USA
19 hours ago
Sr Site Reliability Engineer

The Site Reliability Engineer (SRE) is responsible for maintaining the reliability, performance, and scalability of Manufacturing Plant Systems (MES, LMIS). This role involves hands-on engineering work, automation, monitoring, and incident response to ensure uninterrupted manufacturing operations.

You Will: 

Implement and maintain monitoring tools (Elastic APM, Grafana).
Develop dashboards and alerts for Plant Systems health and performance.
Automate deployments, system checks, and recovery processes using tools like Ansible, Python, or PowerShell.
Contribute to CI/CD pipelines for MES application updates.
Participate in on-call rotations for MES-related incidents.
Perform root cause analysis and contribute to blameless postmortems.
Analyze MES system logs and metrics to identify bottlenecks.
Recommend and implement performance improvements.
Work closely with MES developers, PFS engineers, and IT infrastructure teams.
Document runbooks, standard operating procedures, and system configurations.
Ensure MES systems follow security best practices and compliance requirements.
Support audits and vulnerability remediation efforts.

You Bring: 

Bachelor's Degree in Engineering, IT, Computer Science, Electronics, or a related STEM field.
5-8 years of experience in IT Operations, SRE, or MES support roles.
Experience with MES platforms and manufacturing environments (Rockwell FTPC, Inductive Automation).
Proficiency in scripting (Python, Bash, PowerShell).
Experience with monitoring tools (Elastic APM, Grafana, Dynatrace), automation (Ansible, Python), and CI/CD.
Strong problem-solving and communication skills.