Atlanta, GA, US
Senior Software Engineer - Site Reliability Engineer (Remote)

Position Purpose:

The Senior Site Reliability Engineer (SRE) is responsible for ensuring the reliability, availability, and performance of our systems and applications. As a Senior SRE, you will work closely with a team of engineers to build and maintain reliable infrastructure and systems and to ensure the highest level of availability and operational excellence on production systems and services. You will also assist in tool selection, configuration, security, resilience, performance tuning, destructive testing and production monitoring and support.

Senior SREs contribute to foundational infrastructure elements and system-related documentation. You will play a key role in the reliability organization and are expected to mentor and support junior engineers.


Key Responsibilities:

50% Delivery and Execution - Develops, tests, deploys, and maintains software, with a clear understanding of the value the software is to provide; Takes on new opportunities and tough challenges with a sense of urgency, high energy, and enthusiasm; Consistently achieves results, even under tough circumstances; Develops test suites (functional, destructive, etc.) to enable successful, rapid deployment of code to production; Takes a broad view when approaching issues, using a global lens
20% Learns and Grows - Learns through successful and failed experiments when tackling new problems; Actively seeks ways to grow and be challenged using both formal and informal development channels
20% Plans and Aligns - Collaborates with other team members in agile processes; Creates new and better ways for the organization to be successful; Works with the Product Team to ensure user stories are valuable, developer ready, easy to understand, and testable; Delivers multi-mode communications that convey a clear understanding of the unique needs of different audiences; Adapts approach and demeanor in real time to match the shifting demands of different situations; Relates openly and comfortably with diverse groups of people
10% Supports and Enables - Helps grow junior engineers by providing guidance on modern software development frameworks and leading technical discussions


Direct Manager/Direct Reports:

This position typically reports to Software Engineer Manager or Sr. Manager
This position has 0 Direct Reports


Travel Requirements:

No travel required.


Physical Requirements:

Most of the time is spent sitting in a comfortable position and there is frequent opportunity to move about. On rare occasions there may be a need to move or lift light articles.


Working Conditions:

Located in a comfortable indoor area. Any unpleasant conditions would be infrequent and not objectionable.


Minimum Qualifications:

Must be eighteen years of age or older.
Must be legally permitted to work in the United States.


Preferred Qualifications:

3-5 years of relevant work experience in a related engineering field (Systems, Software, Operational) or a reliability engineering domain.
Deep understanding of and extensive experience with ITIL processes and the support and maintenance of production systems and services, including Change, Incident, and Problem Management.
Extensive experience with common scripting and programming languages (BASH, Python, Golang, Typescript, Java, etc.) as well as data serialization and configuration DSLs (TCL, YAML, JSON, etc.).
Extensive experience with infrastructure automation tools such as Terraform and Ansible.
Extensive experience managing Google Cloud Platform projects and services including infrastructure, Compute, Developer Tools, Security and Identity.
Experience with monitoring and observability tools like Prometheus, Grafana, and OpenTelemetry.
Familiarity with both Unix and Windows operating systems.
Experience with security frameworks for user and service authorization and authentication.
Experience in creating and executing destructive and performance tests.
Experience with modern debugging and root cause analysis techniques.
Experience with version control systems.
Deep understanding of SLOs and core SRE principles and practices.
Operational support experience with a focus on system reliability.
Extensive experience taking a lead role in managing live production incidents.
Extensive experience in problem management, including stakeholder reporting.
Ability to share knowledge across engineering functions.
Strong communication and collaboration skills with experience producing operational status communications, providing real-time reporting to diverse stakeholders, writing documentation, providing peer tutelage, providing consultative services, and presenting technical solutions and training to both technical and non-technical audiences.


Minimum Education:

The knowledge, skills and abilities typically acquired through the completion of a bachelor's degree program or equivalent degree in a field of study related to the job.


Preferred Education:

No additional education


Minimum Years of Work Experience:

3


Preferred Years of Work Experience:

No additional years of experience


Minimum Leadership Experience:

None


Preferred Leadership Experience:

None


Certifications:

None


Competencies:

Global Perspective
Manages Ambiguity
Nimble Learning
Self-Development
Collaborates
Cultivates Innovation
Situational Adaptability
Communicates Effectively
Drives Results
Interpersonal Savvy

Benefits offered include health care benefits, 401K, ESPP, paid time off, and success sharing bonus.  For a full list of the various benefits The Home Depot offers, visit https://careers.homedepot.com/our-benefits.
