Pune, IND
2 days ago
Data Engineer - Senior
**DESCRIPTION**

GPP Database Link (https://cummins365.sharepoint.com/sites/CS38534/)

**Job Summary:**

Leads projects for the design, development, and maintenance of a data and analytics platform. Effectively and efficiently processes, stores, and makes data available to analysts and other consumers. Works with key business stakeholders, IT experts, and subject-matter experts to plan, design, and deliver optimal analytics and data science solutions. Works on one or many product teams at a time. Though the role category is generally listed as Remote, this specific position is designated as Hybrid.

**Key Responsibilities:**

+ **Business Alignment & Collaboration** – Partner with the Product Owner to align data solutions with strategic goals and business requirements.
+ **Data Pipeline Development & Management** – Design, develop, test, and deploy scalable data pipelines for efficient data transport into Cummins Digital Core (Azure DataLake, Snowflake) from various sources (ERP, CRM, relational, event-based, unstructured).
+ **Architecture & Standardization** – Ensure compliance with AAI Digital Core and AAI Solutions Architecture standards for data pipeline design and implementation.
+ **Automation & Optimization** – Design and automate distributed data ingestion and transformation systems, integrating ETL/ELT tools and scripting languages to ensure scalability, efficiency, and quality.
+ **Data Quality & Governance** – Implement data governance processes, including metadata management, access control, and retention policies, while continuously monitoring and troubleshooting data integrity issues.
+ **Performance & Storage Optimization** – Develop and implement physical data models, optimize database performance (indexing, table relationships), and operate large-scale distributed/cloud-based storage solutions (Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
+ **Innovation & Tool Evaluation** – Conduct proof-of-concept (POC) initiatives, evaluate new data tools, and provide recommendations for improvements in data management and integration.
+ **Documentation & Best Practices** – Maintain standard operating procedures (SOPs) and data engineering documentation to support consistency and efficiency.
+ **Agile Development & Automation** – Use Agile methodologies (DevOps, Scrum, Kanban) to drive automation in data integration, preparation, and infrastructure management, reducing manual effort and errors.
+ **Coaching & Team Development** – Provide guidance and mentorship to junior team members, fostering skill development and knowledge sharing.

**RESPONSIBILITIES**

**Competencies:**

+ **System Requirements Engineering:** Translates stakeholder needs into verifiable requirements, tracks status, and assesses the impact of changes.
+ **Collaborates:** Builds partnerships and works collaboratively with others to meet shared objectives.
+ **Communicates Effectively:** Delivers multi-mode communications tailored to different audiences.
+ **Customer Focus:** Builds strong customer relationships and provides customer-centric solutions.
+ **Decision Quality:** Makes good and timely decisions that drive the organization forward.
+ **Data Extraction:** Performs ETL activities from various sources using appropriate tools and technologies.
+ **Programming:** Develops, tests, and maintains code using industry standards, version control, and automation tools.
+ **Quality Assurance Metrics:** Measures and assesses solution effectiveness using IT Operating Model (ITOM) standards.
+ **Solution Documentation:** Documents knowledge gained and communicates solutions for improved productivity.
+ **Solution Validation Testing:** Validates configurations and solutions to meet customer requirements using SDLC best practices.
+ **Data Quality:** Identifies, corrects, and manages data flaws to support effective governance and decision-making.
+ **Problem Solving:** Uses systematic analysis to determine root causes and implement robust solutions.
+ **Values Differences:** Recognizes and leverages the value of diverse perspectives and cultures.

**Education, Licenses, Certifications:**

+ Bachelor's degree in a relevant technical discipline, or equivalent experience required.
+ This position may require licensing for compliance with export controls or sanctions regulations.

**QUALIFICATIONS**

**Preferred Experience:**

+ **Technical Expertise** – Intermediate experience in data engineering with hands-on knowledge of Spark, Scala/Java, MapReduce, Hive, HBase, Kafka, and SQL.
+ **Big Data & Cloud Solutions** – Proven ability to design and develop Big Data platforms, manage large datasets, and implement clustered compute solutions in cloud environments.
+ **Data Processing & Movement** – Experience developing applications requiring large-scale file movement and utilizing various data extraction tools in cloud-based environments.
+ **Business & Industry Knowledge** – Familiarity with analyzing complex business systems, industry requirements, and data regulations to ensure compliance and efficiency.
+ **Analytical & IoT Solutions** – Experience building analytical solutions with exposure to IoT technology and its integration into data engineering processes.
+ **Agile Development** – Strong understanding of Agile methodologies, including Scrum and Kanban, for iterative development and deployment.
+ **Technology Trends** – Awareness of emerging technologies and trends in data engineering, with a proactive approach to innovation and continuous learning.

**Technical Skills:**

+ **Programming Languages:** Proficiency in Python, Java, and/or Scala.
+ **Database Management:** Expertise in SQL and NoSQL databases.
+ **Big Data Technologies:** Hands-on experience with Hadoop, Spark, Kafka, and similar frameworks.
+ **Cloud Services:** Experience with Azure, Databricks, and AWS platforms.
+ **ETL Processes:** Strong understanding of Extract, Transform, Load (ETL) processes.
+ **Data Replication:** Working knowledge of replication technologies like Qlik Replicate is a plus.
+ **API Integration:** Experience working with APIs to consume data from ERP and CRM systems.

**Job** Systems/Information Technology

**Organization** Cummins Inc.

**Role Category** Remote

**Job Type** Exempt - Experienced

**ReqID** 2410681

**Relocation Package** No