Data Engineer - Controls Technology
Citigroup
We are seeking a highly skilled and hands-on Data Engineer to join Controls Technology to support the design, development, and implementation of our next-generation Data Mesh and Hybrid Cloud architecture. This role is critical in building scalable, resilient, and future-proof data pipelines and infrastructure that enable the seamless integration of Controls Technology data within a unified platform. The Data Engineer will work closely with the Data Mesh and Cloud Architect Lead to implement data products, ETL/ELT pipelines, hybrid cloud integrations, and governance frameworks that support data-driven decision-making across the enterprise.
**Key Responsibilities:**
**Data Pipeline Development:**
+ Design, build, and optimize ETL/ELT pipelines for structured and unstructured data.
+ Develop real-time and batch data ingestion pipelines using distributed data processing frameworks.
+ Ensure pipelines are highly performant, cost-efficient, and secure.
**Apache Iceberg & Starburst Integration:**
+ Work extensively with Apache Iceberg for data lake storage optimization and schema evolution.
+ Manage Iceberg Catalogs and ensure seamless integration with query engines.
+ Configure and maintain Hive Metastore (HMS) for Iceberg-backed tables and ensure proper metadata management.
+ Utilize Starburst and Stargate to enable distributed SQL-based analytics and seamless data federation.
+ Tune performance for large-scale querying and federated access to structured and semi-structured data.
**Data Mesh Implementation:**
+ Implement Data Mesh principles by developing domain-specific data products that are discoverable, interoperable, and governed.
+ Collaborate with data domain owners to enable self-service data access while ensuring consistency and quality.
**Hybrid Cloud Data Integration:**
+ Develop and manage data storage, processing, and retrieval solutions across AWS and on-premises environments.
+ Work with cloud-native tools such as AWS S3, RDS, Lambda, Glue, Redshift, and Athena to support scalable data architectures.
+ Ensure hybrid cloud data flows are optimized, secure, and compliant with organizational standards.
**Data Governance & Security:**
+ Implement data governance, lineage tracking, and metadata management solutions.
+ Enforce security best practices for data encryption, role-based access control (RBAC), and compliance with policies such as GDPR and CCPA.
**Performance Optimization & Monitoring:**
+ Monitor and optimize data workflows, query performance, and resource utilization.
+ Implement logging, alerting, and monitoring solutions using CloudWatch, Prometheus, or Grafana to ensure system health.
**Collaboration & Documentation:**
+ Work closely with data architects, application teams, and business units to ensure seamless integration of data solutions.
+ Maintain clear documentation of data models, transformations, and architecture for internal reference and governance.
**Required Technical Skills:**
**Programming & Scripting:**
+ Strong proficiency in Python, SQL, and Shell scripting.
+ Experience with Scala or Java is a plus.
**Data Processing & Storage:**
+ Hands-on experience with Apache Spark, Kafka, Flink, or similar distributed processing frameworks.
+ Strong knowledge of relational (PostgreSQL, MySQL, Oracle) and NoSQL (DynamoDB, MongoDB) databases.
+ Expertise in Apache Iceberg for managing large-scale data lakes, schema evolution, and ACID transactions.
+ Experience working with Iceberg Catalogs and Hive Metastore (HMS), and integrating Iceberg-backed tables with query engines.
+ Familiarity with Starburst and Stargate for federated querying and cross-platform data access.
**Cloud & Hybrid Architecture:**
+ Experience working with AWS data services (S3, Redshift, Glue, Athena, EMR, RDS).
+ Understanding of hybrid data storage and integration between on-prem and cloud environments.
**Infrastructure as Code (IaC) & DevOps:**
+ Experience with Terraform, AWS CloudFormation, or Kubernetes for provisioning infrastructure.
+ CI/CD pipeline experience using GitHub Actions, Jenkins, or GitLab CI/CD.
**Data Governance & Security:**
+ Familiarity with data cataloging, lineage tracking, and metadata management.
+ Understanding of RBAC, IAM roles, encryption, and compliance frameworks (GDPR, SOC 2, etc.).
**Required Soft Skills:**
+ Problem-Solving & Analytical Thinking - Ability to troubleshoot complex data issues and optimize workflows.
+ Collaboration & Communication - Comfortable working with cross-functional teams and articulating technical concepts to non-technical stakeholders.
+ Ownership & Proactiveness - Self-driven, detail-oriented, and able to take ownership of tasks with minimal supervision.
+ Continuous Learning - Eager to explore new technologies, improve skill sets, and stay ahead of industry trends.
**Qualifications:**
+ 4-6 years of experience in data engineering, cloud infrastructure, or distributed data processing.
+ Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Technology, or a related field.
+ Hands-on experience with data pipelines, cloud services, and large-scale data platforms.
+ Strong foundation in SQL, Python, Apache Iceberg, Starburst, Apache Airflow orchestration, and cloud-based data solutions (AWS preferred).
------------------------------------------------------
**Job Family Group:**
Technology
------------------------------------------------------
**Job Family:**
Data Architecture
------------------------------------------------------
**Time Type:**
Full time
------------------------------------------------------
_Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law._
_If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi (https://www.citigroup.com/citi/accessibility/application-accessibility.htm)._
_View Citi’s EEO Policy Statement (https://www.citigroup.com/global/eeo-aa-policy) and the Know Your Rights (https://www.eeoc.gov/sites/default/files/2023-06/22-088\_EEOC\_KnowYourRights6.12ScreenRdr.pdf) poster._
Citi is an equal opportunity and affirmative action employer.
Minority/Female/Veteran/Individuals with Disabilities/Sexual Orientation/Gender Identity.