Bayan Lepas
16 days ago
Trainee - Data Engineering

Role Proficiency:

Entry-level position with primary responsibility for developing and maintaining data pipelines that ingest, wrangle, transform, and join data from various sources. Should possess foundational skills in ETL tools such as Informatica, Glue, Databricks, and Dataproc, along with coding capabilities in Python, PySpark, and SQL. Works under the direct supervision of a lead.

Outcomes:

Build data pipelines following instructions provided by the team lead. Identify and extract data from various sources, transforming and loading it into data warehouses or lakes. Assist in monitoring data pipelines for performance and errors.
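The extract-transform-load steps described above can be sketched in miniature with pandas and an in-memory SQLite "warehouse". This is an illustrative example only; the dataset, column names, and `run_pipeline` function are made up, and a real pipeline would read from actual source systems rather than fabricated data.

```python
import sqlite3
import pandas as pd

def run_pipeline(conn):
    # Extract: in practice this would pull from source systems;
    # here we fabricate a tiny orders dataset for illustration.
    orders = pd.DataFrame({
        "order_id": [1, 2, 3],
        "amount": ["10.5", "20.0", "5.25"],  # raw strings, as often ingested
        "region": ["north", "south", "north"],
    })
    # Transform: cast types and aggregate per region.
    orders["amount"] = orders["amount"].astype(float)
    summary = orders.groupby("region", as_index=False)["amount"].sum()
    # Load: write the result into a warehouse table.
    summary.to_sql("region_sales", conn, if_exists="replace", index=False)
    return summary

conn = sqlite3.connect(":memory:")
result = run_pipeline(conn)
```

Monitoring, at this scale, could be as simple as checking row counts and null rates on the loaded table after each run.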

Measures of Outcomes:

Adherence to data engineering processes and standards, including coding standards. Timely completion of tasks according to the schedule. Compliance with SLAs where applicable, alongside a reduction in the recurrence of known defects. Quality of delivery, including checks for errors in data pipelines, transformations, and presentations.

Outputs Expected:

Coding Standards & Practices:

Learn and apply data engineering coding standards and best practices in developing data pipelines and processes. Develop data processing code with guidance, ensuring it meets specified requirements.


Knowledge Management & Capability Development:

Obtain relevant technology certifications to enhance expertise in data engineering.

Skill Examples:

Ability to write and optimize basic SQL queries for data extraction, manipulation, and analysis. Understanding of Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) processes, with basic experience working with ETL tools to build data pipelines. Competency in a programming language such as Python, and familiarity with libraries such as Pandas or PySpark for data processing tasks.
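As a rough sketch of the kind of basic SQL extraction query the role calls for, the following joins, aggregates, and filters against a throwaway SQLite database. The tables and data are invented for illustration; production queries would of course run against real warehouse tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Aisha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 99.0), (11, 1, 25.0), (12, 2, 40.0);
""")
# Join, aggregate, and filter: a typical extraction/analysis query.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    HAVING SUM(o.amount) > 50
    ORDER BY total DESC
""").fetchall()
```

The same result could be reached with a pandas `merge` plus `groupby`; knowing when to push work into SQL versus into Python is part of the skill set described above.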

Knowledge Examples:

Basic familiarity with cloud services (AWS, Azure, GCP), particularly those related to data storage, data processing, and data warehousing.

