Platform Site Reliability Engineer - Azcapotzalco, Ciudad de México, Mexico

1 day ago

HSBC

If you’re looking for a career where you can make a real impression, join Global Service Center (GSC) HSBC and discover how valued you’ll be. HSBC is one of the largest banking and financial services organisations in the world, with operations in 64 countries and territories. We aim to be where the growth is, enabling businesses to thrive and economies to prosper, and, ultimately, helping people to fulfil their hopes and realise their ambitions.

We are currently seeking an experienced professional to join our team in the role of Platform Site Reliability Engineer.

Role Purpose:

This role sits within the Wealth and Personal Banking CTO Engineering Foundation group. We are driving innovation using Cloud technologies by designing, building, and operating mission-critical shared service platforms hosting the APIs and Microservices which underpin the bank's digital products and services.

Main Activities:

As an SRE you will have responsibility to:

• Ensure the availability and maintainability of our large-scale API and Microservices platform located across three points of presence in HK, UK, and the US.
• Continuously improve the reliability, capacity, and performance of our platforms by applying SRE principles and practices to drive scale, enhance observability, reduce toil, more accurately measure risk, and more safely enable business-driven change.
• Elevate our expertise and maturity in safely managing our core technology stack underpinned by AWS, Kubernetes, Kong API gateway, Istio Service Mesh, and a host of supporting services in a hybrid hosting environment (i.e., private/public cloud on-prem).
• Develop best-in-class observability tools and techniques enabling monitoring and alerting capability which facilitate not only incident detection and response, but also capacity management, improved release safety, and greater resource efficiency.
• Investigate, triage, and resolve production incidents and use data to articulate impact with relentless attention to the technical signals and underlying root causes that enable remediation and future avoidance/mitigation.
• Contribute to the design and engineering of auto and self-healing capability for known failure modes across our platforms.
• Contribute code to our platform repositories enabling not only our reliability agenda (e.g., monitoring-as-code), but also higher release speed and safety, simpler tenant onboarding, and improved controls.
• Author, contribute, and maintain our evolving knowledge base including support and operational runbooks, platform tenant guides, and onboarding and release documentation with an underlying goal of driving as much best practice and self-service as possible.
• Participate in regular SRE on-call rota supporting a 24/7/365 support model across our mission-critical platforms within a large banking eco-system of front-end, middleware, and back-end fulfilment systems.

To be a successful SRE in our team you will:

• Be fluent in written and spoken English and be comfortable working in a multi-cultural and diverse organization with team members across the globe.
• Value effective and continual communications, honesty, transparency, and accountability.
• Value failure as an opportunity and an investment in more reliable systems (blameless post-mortem culture).
• Possess fundamentals and evidence-based problem-solving skills; drive decision-making by function, first principles-based mind-set.
• Demonstrate a bias-to-action and avoid analysis-paralysis, maintain a sense of ownership as you drive actions to the finish line with high quality and on time.
• Be ego-less when searching for the best ideas and contribute effectively outside of your specialty; you think about solving problems from the standpoint of best outcome for the team.
• Have strong fundamental knowledge in distributed systems and networking.
• Possess programming experience in at least one of the following languages: Python, Java, Go, Ruby, Bash scripting.
• Have the ability to debug and optimise code, while automating routine tasks (i.e., TOIL reduction).
• Have a strong background in the setup, use, and optimisation of a variety of observability tools including Splunk, DataDog, AppDynamics, and Cloudwatch.
• Understand the concepts of quantifying failure and availability in a prescriptive manner using SLOs, SLIs, and Error Budgets.

Competencies:

• Proactive
• Strong analytical thinking
• Effective communication
• High attention to detail
• Adaptability to fast-paced, agile environments

Due to the urgent hiring need, candidates with immediate right to work locally and no relocation need will be prioritised.

At HSBC we offer our colleagues a greater number of leave days so that they can fully enjoy their wedding, take care of the new member of the family, or grieve the loss of a family member. Our paid leave package is at the forefront in Mexico. Now you have one more reason to be part of HSBC and proudly live a culture of well-being, balance, and care.

HSBC is an equal opportunity employer committed to building a culture where all employees are valued and respected, and where opinions count. We take pride in providing a workplace that fosters continuous professional development, flexible working, and opportunities to grow within an inclusive and diverse environment. We encourage applications from all suitably qualified persons irrespective of, but not limited to, their gender or genetic information, sexual orientation, ethnicity, religion, social status, medical care leave requirements, political affiliation, disability, color, national origin, veteran status, etc. We consider all applications based on merit and suitability to the role.

Personal data held by the Bank relating to employment applications will be used in accordance with our Privacy Statement, which is available on our website.

Issued By HSBC Electronic Data Process Mexico Private LTD

• Production support across virtualized and/or containerised environments, particularly those employing Kubernetes for workload management
• Large-scale API development and management technologies/frameworks like Kong
• Infrastructure and application performance analysis and tuning
• Service Mesh technology, particularly Istio Envoy and its variations
• DevOps and Agile ways of working
• CI/CD pipeline development
• Infrastructure-as-code tools (e.g., Terraform)
• Relevant certifications for:
  Cloud Providers (e.g., AWS Solutions Architect Associate, etc.)
  Kubernetes (e.g., CKA)
• English: B2 - C1
