Chennai, IND
20 hours ago
Data Engineer - Controls Technology
We are seeking a highly skilled and hands-on Data Engineer to join Controls Technology to support the design, development, and implementation of our next-generation Data Mesh and Hybrid Cloud architecture. This role is critical in building scalable, resilient, and future-proof data pipelines and infrastructure that enable the seamless integration of Controls Technology data within a unified platform. The Data Engineer will work closely with the Data Mesh and Cloud Architect Lead to implement data products, ETL/ELT pipelines, hybrid cloud integrations, and governance frameworks that support data-driven decision-making across the enterprise.

**Key Responsibilities:**

**Data Pipeline Development:**
+ Design, build, and optimize ETL/ELT pipelines for structured and unstructured data.
+ Develop real-time and batch data ingestion pipelines using distributed data processing frameworks.
+ Ensure pipelines are highly performant, cost-efficient, and secure.

**Apache Iceberg & Starburst Integration:**
+ Work extensively with Apache Iceberg for data lake storage optimization and schema evolution.
+ Manage Iceberg Catalogs and ensure seamless integration with query engines.
+ Configure and maintain Hive MetaStore (HMS) for Iceberg-backed tables and ensure proper metadata management.
+ Utilize Starburst and Stargate to enable distributed SQL-based analytics and seamless data federation.
+ Tune performance for large-scale querying and federated access to structured and semi-structured data.

**Data Mesh Implementation:**
+ Implement Data Mesh principles by developing domain-specific data products that are discoverable, interoperable, and governed.
+ Collaborate with data domain owners to enable self-service data access while ensuring consistency and quality.

**Hybrid Cloud Data Integration:**
+ Develop and manage data storage, processing, and retrieval solutions across AWS and on-premises environments.
+ Work with cloud-native tools such as AWS S3, RDS, Lambda, Glue, Redshift, and Athena to support scalable data architectures.
+ Ensure hybrid cloud data flows are optimized, secure, and compliant with organizational standards.

**Data Governance & Security:**
+ Implement data governance, lineage tracking, and metadata management solutions.
+ Enforce security best practices for data encryption, role-based access control (RBAC), and compliance with policies such as GDPR and CCPA.

**Performance Optimization & Monitoring:**
+ Monitor and optimize data workflows, query performance, and resource utilization.
+ Implement logging, alerting, and monitoring solutions using CloudWatch, Prometheus, or Grafana to ensure system health.

**Collaboration & Documentation:**
+ Work closely with data architects, application teams, and business units to ensure seamless integration of data solutions.
+ Maintain clear documentation of data models, transformations, and architecture for internal reference and governance.

**Required Technical Skills:**

**Programming & Scripting:**
+ Strong proficiency in Python, SQL, and Shell scripting.
+ Experience with Scala or Java is a plus.

**Data Processing & Storage:**
+ Hands-on experience with Apache Spark, Kafka, Flink, or similar distributed processing frameworks.
+ Strong knowledge of relational (PostgreSQL, MySQL, Oracle) and NoSQL (DynamoDB, MongoDB) databases.
+ Expertise in Apache Iceberg for managing large-scale data lakes, schema evolution, and ACID transactions.
+ Experience working with Iceberg Catalogs, Hive MetaStore (HMS), and integrating Iceberg-backed tables with query engines.
+ Familiarity with Starburst and Stargate for federated querying and cross-platform data access.

**Cloud & Hybrid Architecture:**
+ Experience working with AWS data services (S3, Redshift, Glue, Athena, EMR, RDS).
+ Understanding of hybrid data storage and integration between on-premises and cloud environments.

**Infrastructure as Code (IaC) & DevOps:**
+ Experience with Terraform, AWS CloudFormation, or Kubernetes for provisioning infrastructure.
+ CI/CD pipeline experience using GitHub Actions, Jenkins, or GitLab CI/CD.

**Data Governance & Security:**
+ Familiarity with data cataloging, lineage tracking, and metadata management.
+ Understanding of RBAC, IAM roles, encryption, and compliance frameworks (GDPR, SOC 2, etc.).

**Required Soft Skills:**
+ Problem-Solving & Analytical Thinking - Ability to troubleshoot complex data issues and optimize workflows.
+ Collaboration & Communication - Comfortable working with cross-functional teams and articulating technical concepts to non-technical stakeholders.
+ Ownership & Proactiveness - Self-driven, detail-oriented, and able to take ownership of tasks with minimal supervision.
+ Continuous Learning - Eager to explore new technologies, improve skill sets, and stay ahead of industry trends.

**Qualifications:**
+ 4-6 years of experience in data engineering, cloud infrastructure, or distributed data processing.
+ Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Technology, or a related field.
+ Hands-on experience with data pipelines, cloud services, and large-scale data platforms.
+ Strong foundation in SQL, Python, Apache Iceberg, Starburst, Apache Airflow orchestration, and cloud-based data solutions (AWS preferred).

------------------------------------------------------

**Job Family Group:** Technology

------------------------------------------------------

**Job Family:** Data Architecture

------------------------------------------------------

**Time Type:** Full time

------------------------------------------------------

**Most Relevant Skills**

Please see the requirements listed above.

------------------------------------------------------

**Other Relevant Skills**

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

_Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law._

_If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi (https://www.citigroup.com/citi/accessibility/application-accessibility.htm)._

_View Citi’s EEO Policy Statement (https://www.citigroup.com/global/eeo-aa-policy) and the Know Your Rights (https://www.eeoc.gov/sites/default/files/2023-06/22-088\_EEOC\_KnowYourRights6.12ScreenRdr.pdf) poster._

Citi is an equal opportunity and affirmative action employer. Minority/Female/Veteran/Individuals with Disabilities/Sexual Orientation/Gender Identity.