Role Proficiency:
This role requires proficiency in developing data pipelines, including coding and testing for ingesting, wrangling, transforming, and joining data from various sources. The ideal candidate should be adept with ETL tools such as Informatica, Glue, Databricks, and DataProc, with strong coding skills in Python, PySpark, and SQL. This position demands independence and proficiency across various data domains. Expertise in data warehousing solutions such as Snowflake, BigQuery, Lakehouse, and Delta Lake is essential, including the ability to calculate processing costs and address performance issues. A solid understanding of DevOps and infrastructure needs is also required.
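For illustration, a minimal Python sketch of the kind of processing-cost estimate this calls for, assuming a simple node-hours model; the cluster size, runtime, and unit rates below are hypothetical placeholders, not actual Databricks, Snowflake, or BigQuery pricing:

    # Minimal sketch: rough per-run cost of a batch pipeline on a managed Spark service.
    # Every figure below is a hypothetical placeholder, not vendor pricing.

    def estimate_run_cost(worker_nodes: int,
                          runtime_hours: float,
                          dbu_per_node_hour: float,
                          dbu_rate_usd: float,
                          vm_rate_usd_per_hour: float) -> float:
        """Cost of one run = compute-unit charges plus underlying VM charges."""
        node_hours = worker_nodes * runtime_hours
        dbu_cost = node_hours * dbu_per_node_hour * dbu_rate_usd
        vm_cost = node_hours * vm_rate_usd_per_hour
        return dbu_cost + vm_cost

    if __name__ == "__main__":
        # Hypothetical: 8 workers, 1.5 h runtime, 1 DBU per node-hour,
        # $0.30 per DBU, $0.50 per VM-hour  ->  about $9.60 per run.
        print(f"Estimated cost per run: ${estimate_run_cost(8, 1.5, 1.0, 0.30, 0.50):.2f}")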
Outcomes:
Act creatively to develop pipelines/applications by selecting appropriate technical options, optimizing application development, maintenance, and performance through design patterns and the reuse of proven solutions. Support the Project Manager in day-to-day project execution and account for the development activities of others. Interpret requirements and create optimal architecture and design solutions in accordance with specifications. Document and communicate milestones/stages for end-to-end delivery. Code using best standards, and debug and test solutions to ensure best-in-class quality. Tune code performance and align it with the appropriate infrastructure, understanding the cost implications of licenses and infrastructure. Create data schemas and models effectively. Develop and manage data storage solutions, including relational databases, NoSQL databases, Delta Lakes, and data lakes. Validate results with user representatives, integrating the overall solution. Influence and enhance customer satisfaction and employee engagement within project teams.

Measures of Outcomes:
• Adherence to engineering processes and standards
• Adherence to schedule/timelines
• Adherence to SLAs where applicable
• # of defects post delivery
• # of non-compliance issues
• Reduction in recurrence of known defects
• Quick turnaround of production bugs
• Completion of applicable technical/domain certifications
• Completion of all mandatory training requirements
• Efficiency improvements in data pipelines (e.g. reduced resource consumption, faster run times)
• Average time to detect, respond to, and resolve pipeline failures or data issues
• Number of data security incidents or compliance breaches

Outputs Expected:
Code:
Develop data processing code with guidance, ensuring performance and scalability requirements are met. Define coding standards, templates, and checklists. Review code for the team and peers.
Documentation:
Create/review checklists, guidelines, and standards for design/process/development. Create/review deliverable documents, including design documents, architecture documents, infra costing, business requirements, source-target mappings, test cases, and results.
Configure:
Test:
Create/review test cases, scenarios, and execution. Review test plans and strategies created by the testing team. Provide clarifications to the testing team.
Domain Relevance:
Leverage a deeper understanding of business needs. Learn more about the customer domain and identify opportunities to add value. Complete relevant domain certifications.
Manage Project:
Manage Defects:
Estimate:
Estimate effort and plan resources for projects.
Manage Knowledge:
Contribute to knowledge repositories such as SharePoint, libraries, and client universities. Review reusable documents created by the team.
Release:
Design:
Create/review design documents (LLD, SAD) and architecture for applications, business components, and data models.
Interface with Customer:
Manage Team:
Certifications:
Skill Examples:
Proficiency in SQL, Python, or other programming languages used for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g. AWS Glue, BigQuery). Conduct tests on data pipelines and evaluate results against data quality and performance specifications. Experience in performance tuning. Experience in data warehouse design and cost improvements. Apply and optimize data models for efficient storage, retrieval, and processing of large datasets. Communicate and explain design/development aspects to customers. Estimate time and resource requirements for developing/debugging features/components. Participate in RFP responses and solutioning. Mentor team members and guide them in relevant upskilling and certification.
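By way of example, a minimal PySpark sketch of the ingest-wrangle-join-write work these skills describe; the bucket paths, column names, and source schemas are hypothetical:

    # Minimal sketch: ingest two hypothetical sources, clean, join, and write to Delta.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_enrichment").getOrCreate()

    # Ingest raw CSV orders, parse timestamps, and drop duplicate order ids.
    orders = (spark.read.option("header", True).csv("s3://raw-bucket/orders/")
              .withColumn("order_ts", F.to_timestamp("order_ts"))
              .dropDuplicates(["order_id"]))

    # Ingest customer reference data already landed as Parquet.
    customers = spark.read.parquet("s3://raw-bucket/customers/")

    # Join, derive a partition column, and filter out invalid rows.
    enriched = (orders.join(customers, on="customer_id", how="left")
                .withColumn("order_date", F.to_date("order_ts"))
                .filter(F.col("amount") > 0))

    # Write the curated result as a partitioned Delta table.
    (enriched.write.format("delta")
     .mode("overwrite")
     .partitionBy("order_date")
     .save("s3://curated-bucket/orders_enriched/"))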
Knowledge Examples:
Knowledge of various ETL services used by cloud providers, including Apache PySpark, AWS Glue, GCP DataProc/Dataflow, Azure ADF, and ADLS. Proficient in SQL for analytics and windowing functions. Understanding of data schemas and models. Familiarity with domain-related data. Knowledge of data warehouse optimization techniques. Understanding of data security concepts. Awareness of patterns, frameworks, and automation practices.
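To make the windowing-functions item concrete, a minimal sketch of an analytic query run through Spark SQL from Python; the orders table and its columns are hypothetical and follow on from the sketch above:

    # Minimal sketch: window functions to rank each customer's orders by recency
    # and compute a per-customer total. Table and columns are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("window_demo").getOrCreate()
    spark.read.format("delta").load("s3://curated-bucket/orders_enriched/") \
         .createOrReplaceTempView("orders")

    latest_per_customer = spark.sql("""
        SELECT customer_id,
               order_id,
               amount,
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY order_ts DESC) AS recency_rank,
               SUM(amount)   OVER (PARTITION BY customer_id) AS customer_total
        FROM orders
    """).filter("recency_rank = 1")   # keep only each customer's most recent order

    latest_per_customer.show(5)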
Additional Comments:
As a Senior Data Engineer, you are responsible for expanding and improving our data and data pipeline architecture, as well as enhancing data flows and data collection to ensure our data delivery architecture is optimal, safe, compliant, and consistent across initiatives. As the key interface for operationalizing data and analytics, this role requires collaborative skills to evangelize effective data and analytics practices to business stakeholders and to help data consumers optimize their models for quality, security, and governance. You are an experienced data wrangler who enjoys optimizing existing data systems or building them from scratch. You are autonomous and comfortable supporting the data needs of multiple teams and products.

PRINCIPAL DUTIES AND RESPONSIBILITIES
1. Build data pipelines: Architecting, creating, maintaining, and optimizing data pipelines is the primary responsibility of the data engineer.
2. Drive automation through effective metadata management: Automate the most common, repeatable, and tedious data preparation and integration tasks in order to minimize manual processes and errors and improve productivity. The data engineer also assists with renovating the data management infrastructure to drive automation in data integration and management.
3. Collaborate across departments: Work collaboratively with varied stakeholders (notably data analysts and scientists) to refine their data consumption requirements.
4. Educate and train: Be knowledgeable about how to address data topics, including using data and domain understanding to address new data requirements, proposing innovative approaches to data ingestion, preparation, integration, and operationalization, and training stakeholders in data pipelining and preparation.
5. Participate in ensuring compliant data use: Ensure that data users and consumers use the data provisioned to them responsibly. Work with data governance teams, and participate in vetting and promoting content to the curated data catalog for governed reuse.
6. Become a data and analytics evangelist: The data engineer is a blend of “analytics evangelist”, “data guru”, and “fixer.” This role will promote the available data and analytics capabilities and expertise to business leaders to help them leverage these capabilities in achieving business goals.

Education and Experience Required:
• 6+ years of work experience in data management, including Big Data processing, ETL frameworks, data integration, optimization, and data quality, of which 3+ years supporting data and analytics initiatives for cross-functional teams
• Responsible for designing, building, and testing several complex ETL workflows, preferably using Alteryx
• Experienced with SQL/NoSQL databases and structured and unstructured data processing
• Experienced with SQL optimization and performance tuning
• Experienced with logical and physical data modeling and exposure to data modeling tools
• Foundational knowledge of various data management architectures such as data warehouse, data lake, and data hub, and supporting processes such as data integration, data governance, data lineage, and metadata management
• Experienced with:
  o Big Data processing tools and frameworks
  o Data preparation/ETL tools (Alteryx, Informatica, DataStage, …)
  o Data visualisation tools (Power BI, Tableau, …)
  o SQL-on-Hadoop tools and technologies (Hive, Impala, Presto, …)
  o Agile development
  o Public cloud data environments (AWS, Azure, …) or hybrid environments
• Preferred experience with:
  o Any programming language (Java, Python, Node.js, …)
  o DevOps capabilities such as version control, automated builds, testing, and release management with Git and Jenkins
  o CI/CD deployment practices
  o Certification in Alteryx, Tableau, or similar tools
  o Low-code platforms (e.g. Power Automate, UiPath, Retool, etc.)
• Bachelor’s degree in STEM or a related technical field, or equivalent work experience

Skills and Abilities Required:
• Strong experience collaborating with a wide range of IT and business stakeholders
• Strong verbal and written communication, demonstrating the ability to efficiently share information, influence decisions, negotiate, and network
• Advanced analytical and problem-solving skills, with a strategic view to conceive solutions that adapt over time or can be reused across different initiatives
• Organizational skills with attention to detail and strong documentation skills
• Ability to adapt quickly to new methods and work under tight deadlines
• Ability to set goals and handle multiple tasks, stakeholders, and projects simultaneously
• Ability to bridge deep data infrastructure and governance with low-code agility to build scalable data foundations and rapidly deploy applications that turn complex reinsurance data into actionable insights for business teams