Karnataka, India
Lead Site Reliability Engineer, ITC

WHO YOU’LL WORK WITH

The SRE hired will work as a Reliability Engineer with the engineering teams. The candidate will belong to a horizontal domain called TechOps: Resilience Engineering. The position allows the SRE to move between multiple engineering platforms as the work, vision, or criticality of projects demands. Responsibilities include interacting with engineering leaders, engineers, product teams, Scrum/Agile leads, production support, business, and delivery teams.

You will work with teammates who bring a “Just Do It” mindset and share our commitment to Listen with Empathy, Prioritize with Purpose, Operate with a Growth Mindset, and Foster Community & Trust.

This person will report to the Manager, Site Reliability Engineering, and will collaborate with teammates in various SRE functions across multiple geographies.

WHO WE ARE LOOKING FOR

We are looking for talented and passionate full-stack developers with knowledge of datacenter infrastructure and cloud platforms.

Join us if you are willing to learn new technologies, share knowledge, and learn from others. You feel responsible for the success of the entire team. You are not afraid to take on challenging tasks when necessary, and you look for opportunities to help others, including those who may not be part of your team.

Ability to observe, diagnose, and develop fixes for production issues quickly and efficiently

Ability to develop and drive real-time monitoring solutions that provide visibility into site health and key performance indicators. Practical experience with observability tools such as SignalFx, Catchpoint, Splunk, New Relic, or other industry-specific products is a plus. Customizing observability tools will be part of the job

Strong written and verbal communication skills, including the ability to articulate issues and their impact

Confidence and ability to report and communicate high-value metrics to leadership, with a deep understanding of the business landscape and how site reliability affects our consumers

Working understanding of IT service management (Incident, Problem, Change and Knowledge management)

Ability to work across business and technical teams to continuously analyze system performance in production, troubleshoot consumer-reported issues, and proactively identify areas in need of optimization

Practical experience in managing and leading application reliability practices for consumer-facing web and mobile experiences

Demonstrated negotiation and influencing skills

Passion for coaching, teaching, mentoring and learning

WHAT YOU’LL WORK ON

As a Site Reliability Engineer, you will focus on maximizing the availability, observability, reliability, security, and performance of Nike Digital Experiences.

SREs perform deep problem analysis; detect infrastructure or code defects; define, report on, and create observability processes for Key Performance Indicators (KPIs); and work with product delivery teams to provide long-term solutions to production issues.

Bachelor’s degree in Computer Science, Information Systems, Business, or another relevant subject area

7+ years of professional experience in software development, operations, or support

Strong design and development experience with Java

Proficiency with JavaScript for frontend (React, Angular, etc.) and backend (Node.js) components

Working knowledge of and experience with Kubernetes

Experience in other modern enterprise languages (functional or other – Scala, Python, Golang, etc.) is preferred

Basic understanding of DNS, networking, virtualization, and Linux

Expertise in designing, building, and supporting scalable cloud-based microservices

Experience with Docker and/or serverless patterns

Experience with at least one NoSQL database such as DynamoDB or Cassandra

Good understanding of RESTful APIs

Basic understanding of common tools for service management, agile, and observability: ServiceNow, Jira, Jenkins, Splunk, New Relic, SignalFx

Background with ITIL or Lean is a plus
