Trivandrum
Lead I - Software Engineering

We are looking for an Intermediate Data Engineer with 4–6 years of proven experience in building scalable data pipelines and managing large-scale data processing systems. The ideal candidate will have strong hands-on expertise in PySpark, SQL, and cloud platforms (preferably GCP), along with experience working with big data technologies and orchestration tools.

Key Responsibilities:

Design, implement, and optimize data pipelines for large-scale data processing.

Develop and maintain ETL/ELT workflows using Spark, Hadoop, Hive, and Airflow.

Collaborate with data scientists, analysts, and engineers to ensure data availability and quality.

Write efficient and optimized SQL queries for data extraction, transformation, and analysis.

Leverage PySpark and cloud tools (preferably Google Cloud Platform) to build reliable and scalable solutions.

Monitor and troubleshoot data pipeline performance and reliability issues.

Required Skills:

4–6 years of experience in a Data Engineering role.

Strong hands-on experience with PySpark and SQL.

Good working knowledge of GCP or another major cloud platform (AWS, Azure).

Experience with Hadoop, Hive, and distributed data systems.

Proficiency in data orchestration tools such as Apache Airflow.

Ability to work independently in a fast-paced, agile environment.

Good to Have:

Experience with data modeling and data warehousing concepts.

Exposure to DevOps and CI/CD practices for data pipelines.

Familiarity with other programming/scripting languages (Python, shell scripting).

Educational Qualification:

Bachelor’s or Master’s degree in Computer Science, Information Technology, Engineering, or a related field.
