Site Reliability Engineer II
IBM
**Introduction**
A career in IBM Software means you'll be part of a team that transforms our customers' challenges into industry-leading solutions. We are an infinitely curious team, always seeking new possibilities, and dedicated to creating the world's leading AI-powered, cloud-native software solutions. Our renowned legacy creates endless global opportunities for our network of IBMers. We are a team of deep product experts, ensuring exceptional client experiences, with a focus on delivery, excellence, and obsession over customer outcomes. This position involves contributing to HashiCorp's offerings, now part of IBM, which empower organizations to automate and secure multi-cloud and hybrid environments. You will join a team managing the lifecycle of infrastructure and security, enhancing IBM's cloud solutions to ensure enterprises achieve efficiency, security, and scalability in their cloud journey.
**Your role and responsibilities**
HashiCorp Boundary aims to give customers seamless, just-in-time remote access to their infrastructure and web applications without having to worry about passwords, certificates, or other credentials. Boundary is offered as a cloud platform, and this role is part of the Boundary Enterprise Enablement team, whose primary focus is scale and reliability to enable hypergrowth among medium and large enterprises.
What you’ll do (responsibilities)
As an engineer on the Boundary Product Reliability team, you will:
* Develop a deep understanding of how customers use Boundary Cloud and enhance their experience through reliability
* Drive service reliability by developing tooling that enables metric visibility using SLIs, SLOs, and SLAs
* Champion incident management processes that directly impact customer experience
* Reduce the operational overhead of the HashiCorp Boundary product and leverage data to understand the largest sources of reliability risk
* Deploy, manage, and monitor Boundary Cloud at large scale
* Predict our future failures and work proactively to mitigate them
* Bring a passion for developer productivity to make other engineers' lives better
* Empower engineers to troubleshoot their own issues by developing tools, frameworks, and guardrails for safety
* Advocate for and implement reliable design patterns (circuit breakers, graceful degradation, zero-downtime upgrades, etc.)
* Partner with the broader HashiCorp organization to learn from incidents through a blameless postmortem process
* Collaborate across teams to improve our tools based on experience gained from running our own software in production
* Participate in a 24/7 on-call rotation that supports our production services
This job can be performed from anywhere in the US
**Required technical and professional expertise**
* 5+ years of experience running production applications at scale: backend applications written in Go, databases, observability, and AWS primitives
* Strive for quality through maintainable code and comprehensive testing from development to deployment
* Clear communication skills while remaining empathetic and kind
* An eagerness to learn through humility and reflection
* Experience debugging performance bottlenecks for live services and database systems
* Experience leading or participating in incident response using incident management tools such as incident.io or PagerDuty
**Preferred technical and professional experience**
* Working knowledge of industry best practices related to information security
* Working knowledge of AWS Aurora or PostgreSQL, Nomad or other orchestration platforms, and Traefik or other load-balancing technologies
* Experience conceiving, documenting, and advocating for best practices, or a willingness to do so
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.