Hyderabad
Data Engineer

We are seeking an experienced Data Engineer with 4+ years of experience building and managing data pipelines across diverse data sources. The ideal candidate will have hands-on expertise in ETL tools, cloud platforms, and programming languages such as Python, PySpark, and SQL. This role requires strong technical proficiency, problem-solving ability, and a deep understanding of data warehousing concepts, slowly changing dimension (SCD) principles, and performance optimization in large-scale environments.

Key Responsibilities

Data Pipeline Development: Design, code, and test scalable data pipelines for ingesting, wrangling, transforming, and joining data from multiple sources.

ETL & Integration: Implement ETL processes using tools such as Informatica, AWS Glue, Databricks, Dataproc, or Apache Airflow to enable efficient data movement.

Collaboration: Partner with data analysts, data scientists, and business stakeholders to ensure data accessibility, quality, and security.

Data Quality & Validation: Establish and automate validation checks to maintain data accuracy, completeness, and consistency.

Cloud & Storage Solutions: Develop and manage data storage solutions (relational DBs, NoSQL, data lakes) on AWS, Azure, or GCP.

Documentation: Create and maintain technical documentation including source-target mappings, test cases, and results.

Testing & Debugging: Conduct unit tests, validate performance of data processes, and troubleshoot production issues.

Defect Management: Identify, fix, and retest defects in accordance with project standards.

Continuous Improvement: Optimize pipelines for efficiency (reduced resource consumption, faster run times) and implement automation.

Knowledge Sharing: Contribute to project documentation, SharePoint, and knowledge repositories.

Required Qualifications

Bachelor’s degree in Computer Science, Information Technology, or related field (or equivalent experience).

4+ years of experience in the full development lifecycle of data engineering solutions.

Strong proficiency in SQL, Python, and PySpark.

Hands-on experience with ETL tools: Informatica, AWS Glue, Databricks, Dataproc, Apache Airflow, or Azure Data Factory (ADF).

Strong understanding of data warehousing principles, SCD concepts, and performance tuning.

Familiarity with cloud platforms (AWS, Azure, GCP) and their data services (Glue, BigQuery, Dataflow, Lake Formation, etc.).

Experience with CI/CD and Git, and with Terraform, CloudFormation, or CDK for infrastructure as code.

Ability to explain solutions to both technical and non-technical stakeholders.

Strong analytical and problem-solving skills.

Desired Skills

AWS Solutions Architect or other cloud/data certifications.

Familiarity with TDD/BDD methodologies.

Experience with RESTful APIs and microservice architectures.

Exposure to SonarQube, Veracode, or similar tools for code quality/security.

Experience with data engineering in Agile/Scrum environments.
