Lead I - Software Engineering
Location: Trivandrum

Job Summary:

We are seeking a Senior Data Engineer with strong hands-on experience in PySpark, Big Data technologies, and cloud platforms (preferably GCP). The ideal candidate will design, implement, and optimize scalable data pipelines while driving technical excellence and process improvements. You will collaborate with cross-functional teams to solve complex data problems and ensure delivery of high-quality, cost-effective data solutions.

Roles & Responsibilities:

Design & Development:

Develop scalable and efficient data pipelines using PySpark, Hive, SQL, Spark, and Hadoop.

Translate high-level business requirements and design documents (HLD/LLD) into technical specifications and implementation.

Create and maintain architecture and design documentation.

Performance Optimization & Quality:

Monitor, troubleshoot, and optimize data workflows for cost, performance, and reliability.

Perform root cause analysis (RCA) on defects and implement mitigation strategies.

Ensure adherence to coding standards, version control practices, and testing protocols.

Collaboration & Stakeholder Engagement:

Interface with product managers, data stewards, and customers to clarify requirements.

Conduct technical presentations, design walkthroughs, and product demos.

Provide timely updates, escalations, and support during UAT and production rollouts.

Project & People Management:

Manage delivery of data modules/user stories with a focus on timelines and quality.

Set and review FAST goals for self and team; provide mentorship and technical guidance.

Maintain team engagement and manage team member aspirations through regular feedback and career support.

Compliance & Knowledge Management:

Ensure compliance with mandatory training requirements and engineering processes.

Contribute to and make use of project documentation, templates, checklists, and domain-specific knowledge.

Review and approve reusable assets developed by the team.

Must-Have Skills:

6+ years of experience in Data Engineering or related roles.

Strong proficiency in PySpark, SQL, Spark, Hive, and the Hadoop ecosystem.

Hands-on experience with Google Cloud Platform (GCP) or equivalent cloud services (e.g., AWS, Azure).

Expertise in designing, building, testing, and deploying large-scale data processing systems.

Sound understanding of data architecture, ETL frameworks, and batch/streaming data pipelines.

Strong knowledge of Agile methodologies (Scrum/Kanban).

Experience with code reviews, version control (Git), and CI/CD tools.

Excellent communication skills – both verbal and written.

Good-to-Have Skills:

GCP Professional Data Engineer Certification or equivalent.

Experience with Airflow (Cloud Composer), Dataflow, BigQuery, or similar GCP-native tools.

Knowledge of data modeling techniques and data governance.

Exposure to domain-specific projects (e.g., BFSI, Healthcare, Retail).

Experience with Docker, Kubernetes, or similar containerization and orchestration tools.

Working knowledge of test automation and performance testing frameworks.
