About this opportunity:
The A&AI (SL IT & ADM) team is seeking a versatile and motivated DevOps Engineer with expertise in Kubernetes and cloud infrastructure to join the AI/ML team. This role will be pivotal in managing multiple platforms and systems, focusing on Kubernetes, ELK/OpenSearch, and various DevOps tools to ensure seamless data flow for our machine learning and data science initiatives. The ideal candidate has a strong foundation in Python programming; experience with Elasticsearch, Logstash, and Kibana (ELK); proficiency in MLOps; and expertise in machine learning model development and deployment. Familiarity with basic Spark concepts and visualization tools such as Grafana and Kibana is also desirable.
What you will do:
Design and implement robust AI/ML infrastructure using cloud services and Kubernetes to support machine learning operations (MLOps) and data processing workflows.
Deploy, manage, and optimize Kubernetes clusters specifically tailored for AI/ML workloads, ensuring optimal resource allocation and scalability across different network configurations.
Develop and maintain CI/CD pipelines tailored for continuous training and deployment of machine learning models, integrating tools such as Kubeflow, MLflow, Argo Workflows, or TensorFlow Extended (TFX).
Collaborate with data scientists to oversee the deployment of machine learning models and set up monitoring systems to track their performance and health in production.
Design and implement data pipelines for large-scale data ingestion, processing, and analytics essential for machine learning models, utilizing distributed storage and processing technologies such as Hadoop, Spark, and Kafka.
The skills you bring:
Extensive experience with Kubernetes and cloud services (AWS, Azure, GCP, private cloud) with a focus on deploying and managing AI/ML environments.
Strong proficiency in scripting and automation using languages like Python, Bash, or Perl.
Experience with AI/ML tools and frameworks (TensorFlow, PyTorch, Scikit-learn) and MLOps tools (Kubeflow, MLflow, TFX).
In-depth knowledge of data pipeline and workflow management tools, distributed data processing (Hadoop, Spark), and messaging systems (Kafka, RabbitMQ).
Expertise in implementing CI/CD pipelines, infrastructure as code (IaC), and configuration management tools.
Familiarity with security standards and data protection regulations relevant to AI/ML projects.
Proven ability to design and maintain reliable and scalable infrastructure tailored for AI/ML workloads.
Excellent analytical, problem-solving, and communication skills.
Why join Ericsson?
At Ericsson, you'll have an outstanding opportunity: the chance to use your skills and imagination to push the boundaries of what's possible and to build never-before-seen solutions to some of the world's toughest problems. You'll be challenged, but you won't be alone. You'll be joining a team of diverse innovators, all driven to go beyond the status quo to craft what comes next.
What happens once you apply?
Click Here to find all you need to know about what our typical hiring process looks like.
Encouraging a diverse and inclusive organization is core to our values at Ericsson, which is why we champion it in everything we do. We truly believe that by collaborating with people with different experiences we drive innovation, which is essential for our future growth. We encourage people from all backgrounds to apply and realize their full potential as part of our Ericsson team. Ericsson is proud to be an Equal Opportunity Employer.
Primary country and city: India (IN) || Bangalore
Req ID: 766746