Reston, VA, USA
12 days ago
IT Engineer IV - Site Reliability Engineer
Kforce has a client that is seeking an IT Engineer IV - Site Reliability Engineer in Reston, VA.

Duties Include:
* Design and implement observability strategies using OpenTelemetry for distributed tracing, metrics, and logging
* Instrument microservices written in Java and Python using OTel SDKs and auto-instrumentation tools
* Develop and maintain Splunk dashboards, alerts, and reports to provide actionable insights into system performance and reliability
* Collaborate with development and operations teams to ensure consistent and effective telemetry across services
* Automate monitoring and alerting pipelines to proactively detect and resolve issues
* Participate in on-call rotations, incident response, and postmortem analysis to improve system resilience
* Drive adoption of SRE best practices, including SLIs, SLOs, and error budgets
* Continuously evaluate and improve observability tools and practices