At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.
About the Tech@Lilly Organization:
Tech@Lilly builds and maintains capabilities using cutting-edge technologies, much like the most prominent tech companies. What differentiates Tech@Lilly is that we create new possibilities through tech, like data-driven drug discovery and connected clinical trials, to advance our purpose: creating medicines that make life better for people around the world. We hire the best technology professionals from a variety of backgrounds, so they can bring an assortment of knowledge, skills, and diverse thinking to deliver innovative solutions in every area of the enterprise.
About the Business Function:
Tech@Lilly Business Units is a global organization strategically positioned to use information and technology leadership and solutions to create meaningful connections and remarkable experiences, so people feel genuinely cared for. The Business Unit IDS organization is accountable for designing, developing, and supporting commercial or customer engagement services and capabilities that span multiple Business Units (Bio-Medicines, Diabetes, Oncology, International), functions, geographies, and digital channels. The areas supported by Business Unit IDS include: Customer Operations, Marketing and Commercial Operations, Medical Affairs, Market Research, Pricing, Reimbursement and Access, Customer Support Programs, Digital Production and Distribution, Global Patient Outcomes, and Real-World Evidence.
Job Title: Data Engineer – Operations
The Data Engineer – Operations is responsible for the reliability, performance, and support of data pipelines, platforms, and workflows across the enterprise data ecosystem. With 2-12 years of experience, this role ensures smooth daily operations, incident resolution, proactive monitoring, and support of data integration and processing activities. The ideal candidate has strong experience managing cloud-based data platforms (e.g., AWS, Databricks, Azure), production workflows, CI/CD pipelines, and data monitoring tools. This role partners closely with data engineers, analysts, platform teams, and business users to ensure SLA adherence and platform health. The position is critical to scaling data operations as organizations shift from vendor-led to in-house models.
This role is open across experience levels, and the final designation will be determined based on the interview and assessment outcomes.
What you’ll be doing:
Monitor and manage day-to-day operations of data pipelines, ETL jobs, and cloud-native data platforms (e.g., AWS, Databricks, Redshift).
Own incident response and resolution, including root cause analysis and post-mortem reporting for data failures and performance issues.
Perform regular system health checks, capacity planning, and cost optimization across operational environments.
Maintain and enhance logging, alerting, and monitoring frameworks using tools like CloudWatch, Datadog, Prometheus, etc.
Collaborate with development teams to operationalize new data workflows, including CI/CD deployment, scheduling, and support documentation.
Ensure data quality by executing validation checks, reconciliation processes, and business rule compliance; a minimal example of such a check follows this list.
Work with vendors (if applicable) and internal teams to support migrations, upgrades, and production releases.
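To make the data quality bullet concrete, here is a minimal Python sketch of the kind of validation check described above. The input file, column names, and rules (daily_activity.parquet, customer_id, activity_date, channel) are hypothetical, not an actual Lilly schema.

    import pandas as pd

    REQUIRED_COLUMNS = {"customer_id", "activity_date", "channel"}

    def validate_extract(df: pd.DataFrame) -> list[str]:
        """Return a list of human-readable validation failures (empty = pass)."""
        failures = []
        missing = REQUIRED_COLUMNS - set(df.columns)
        if missing:
            failures.append(f"missing columns: {sorted(missing)}")
            return failures  # skip row-level checks if the schema is broken
        if df["customer_id"].isna().any():
            failures.append("null customer_id values found")
        if df.duplicated(subset=["customer_id", "activity_date"]).any():
            failures.append("duplicate (customer_id, activity_date) rows")
        return failures

    if __name__ == "__main__":
        df = pd.read_parquet("daily_activity.parquet")  # hypothetical input
        problems = validate_extract(df)
        if problems:
            raise SystemExit("validation failed: " + "; ".join(problems))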
How You Will Succeed:
Automation and Self-Service Focus
Identify repetitive operational tasks and implement automation using Python, Airflow, Jenkins, or similar tools; a minimal Airflow sketch follows this list.
Enable self-service capabilities and alerting for platform users and stakeholders.
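As a concrete illustration of the automation bullet, a minimal Apache Airflow sketch (Airflow 2.4+ API) that schedules a repetitive cleanup task daily and emails an on-call alias on failure. The DAG id, schedule, cleanup callable, and email address are illustrative assumptions.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def cleanup_stale_temp_tables():
        # Placeholder for the repetitive task being automated,
        # e.g., dropping temp tables older than N days.
        print("cleanup complete")

    default_args = {
        "owner": "dataops",
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
        "email_on_failure": True,                  # route failures to on-call
        "email": ["dataops-oncall@example.com"],   # hypothetical alias
    }

    with DAG(
        dag_id="ops_daily_cleanup",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
        default_args=default_args,
    ) as dag:
        PythonOperator(
            task_id="cleanup_stale_temp_tables",
            python_callable=cleanup_stale_temp_tables,
        )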
AI-Ready Operations Mindset
Explore and propose how AI can be used to detect anomalies, predict issues, and accelerate root cause analysis (a toy example follows this list).
Collaborate with internal teams to experiment with LLMs, bots, or ML models for improving operational efficiency.
Stay informed on emerging AIOps tools and work toward integrating them gradually.
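The simplest version of the anomaly detection idea above is a statistical check on pipeline runtimes. The sketch below uses only the Python standard library, and the runtime history is fabricated for illustration; dedicated AIOps tooling would replace this in practice.

    import statistics

    def is_anomalous(history_minutes: list[float], latest: float,
                     z_threshold: float = 3.0) -> bool:
        """Flag a run whose duration deviates from the mean by more
        than z_threshold standard deviations."""
        mean = statistics.mean(history_minutes)
        stdev = statistics.pstdev(history_minutes)
        if stdev == 0:
            return latest != mean
        return abs(latest - mean) / stdev > z_threshold

    history = [42.0, 40.5, 44.1, 41.3, 43.0, 39.8]  # recent runtimes (minutes)
    print(is_anomalous(history, 97.2))  # True: run took over twice the usual time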
Continuous Optimization
Monitor pipeline performance and costs, and implement changes that optimize compute, memory, and storage usage.
Recommend and trial AI/ML-based approaches for pipeline tuning, scheduling, or resource allocation.
Cross-Team Collaboration
Work with data engineers, analysts, and product owners to ensure seamless data availability and usability.
Communicate incidents and resolutions clearly and proactively across teams.
What You Should Bring:
Strong background in managing and maintaining data pipelines, preferably in AWS or Azure environments.
Proficiency in SQL, Python, or PySpark for operational debugging and performance tuning.
Hands-on experience with monitoring tools (e.g., CloudWatch, Datadog) and orchestration frameworks like Airflow.
Familiarity with CI/CD processes and code deployment practices using GitHub or similar tools.
Awareness of data governance, privacy, and security protocols.
Proactive problem-solving mindset with the ability to identify patterns in recurring issues.
Exposure to AI/ML concepts or a passion for learning and applying automation through AI frameworks.
Basic Qualifications and Experience Requirement:
Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field.
2-12 years of experience in Data Engineering, DataOps, Platform Reliability, or equivalent roles.
Hands-on experience managing pipelines on Databricks, AWS Glue, EMR, Snowflake, or similar platforms.
Strong scripting skills (Python, Bash) and familiarity with version control (Git).
Experience with orchestration tools like Apache Airflow, AWS Step Functions, or similar.
Exposure to monitoring/observability tools like CloudWatch, Datadog, Grafana, Prometheus, etc.
Solid understanding of data lifecycle, job dependencies, and data validation techniques.
Eagerness to learn and apply AI/ML approaches in operational workflows.
Additional Skills/Preferences:
Domain experience in healthcare, pharmaceutical (Customer Master, Product Master, Alignment Master, Activity, Consent, etc.), or regulated industries is a plus.
AWS, Google Cloud, or Databricks Certified Data Engineer (Associate/Professional).
ITIL® Foundation or SRE Foundation certification.
AIOps Foundation or equivalent certification.
Certification in Apache Airflow or any orchestration platform (e.g., Prefect).
Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.
Lilly does not discriminate on the basis of age, race, color, religion, gender, sexual orientation, gender identity, gender expression, national origin, protected veteran status, disability or any other legally protected status.
#WeAreLilly