Role Proficiency:
This role requires proficiency in developing data pipelines, including coding and testing for ingesting, wrangling, transforming, and joining data from various sources. The ideal candidate should be adept with ETL tools such as Informatica, Glue, Databricks, and DataProc, with strong coding skills in Python, PySpark, and SQL. This position demands the ability to work independently and proficiency across various data domains. Expertise in data warehousing solutions such as Snowflake, BigQuery, Lakehouse, and Delta Lake is essential, including the ability to calculate processing costs and address performance issues. A solid understanding of DevOps and infrastructure needs is also required.
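As an illustration of the ingest, wrangle, join, and load work described above, here is a minimal sketch in plain Python, with sqlite3 standing in for the warehouse. All source names and fields are illustrative assumptions, not taken from any specific project.

```python
# A minimal ingest -> wrangle -> join -> load sketch in plain Python.
# sqlite3 stands in for the warehouse; names are illustrative only.
import sqlite3

orders = [  # ingested from source A (e.g. an API or file drop)
    {"order_id": 1, "cust_id": "C1", "amount": "120.50"},
    {"order_id": 2, "cust_id": "C2", "amount": None},
]
customers = [  # ingested from source B
    {"cust_id": "C1", "name": "Acme"},
    {"cust_id": "C2", "name": "Globex"},
]

# Wrangle: coerce types and drop records that fail validation.
clean = [
    {**o, "amount": float(o["amount"])}
    for o in orders
    if o["amount"] is not None
]

# Join: enrich each order with its customer name.
by_id = {c["cust_id"]: c["name"] for c in customers}
joined = [{**o, "name": by_id[o["cust_id"]]} for o in clean]

# Load: write the transformed records to a target table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fact_orders (order_id INT, name TEXT, amount REAL)")
con.executemany(
    "INSERT INTO fact_orders VALUES (:order_id, :name, :amount)", joined
)
print(con.execute("SELECT name, amount FROM fact_orders").fetchall())
# -> [('Acme', 120.5)]
```

In PySpark the same shape appears as a read, a filter/cast, a join, and a write; the plain-Python version simply makes each stage explicit.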
Outcomes:
Act creatively to develop pipelines and applications by selecting appropriate technical options, optimizing application development, maintenance, and performance through design patterns and the reuse of proven solutions. Support the Project Manager in day-to-day project execution and account for the developmental activities of others. Interpret requirements and create optimal architecture and design solutions in accordance with specifications. Document and communicate milestones and stages for end-to-end delivery. Code to best standards, and debug and test solutions to ensure best-in-class quality. Tune code performance and align it with the appropriate infrastructure, understanding the cost implications of licenses and infrastructure. Create data schemas and models effectively. Develop and manage data storage solutions, including relational databases, NoSQL databases, Delta Lakes, and data lakes. Validate results with user representatives, integrating the overall solution. Influence and enhance customer satisfaction and employee engagement within project teams.
Measures of Outcomes:
Adherence to engineering processes and standards. Adherence to schedule/timelines. Adherence to SLAs where applicable. Number of defects post delivery. Number of non-compliance issues. Reduction in recurrence of known defects. Quick turnaround of production bugs. Completion of applicable technical/domain certifications. Completion of all mandatory training requirements. Efficiency improvements in data pipelines (e.g. reduced resource consumption, faster run times). Average time to detect, respond to, and resolve pipeline failures or data issues. Number of data security incidents or compliance breaches.
Outputs Expected:
Code:
Develop data processing code with guidance, ensuring performance and scalability requirements are met. Define coding standards, templates, and checklists. Review code for the team and peers.
Documentation:
Create/review checklists, guidelines, and standards for design/process/development. Create/review deliverable documents, including design documents, architecture documents, infra costing, business requirements, source-target mappings, test cases, and results.
Configure:
Test:
Create/review test scenarios and execution. Review test plans and strategies created by the testing team. Provide clarifications to the testing team.
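A test for a data pipeline of the kind described above can be sketched as a small set of data-quality assertions run against the pipeline's output. Field names and rules here are illustrative assumptions.

```python
# A minimal data-quality check for pipeline output, written as plain
# assertions. Thresholds and field names are illustrative assumptions.
rows = [
    {"id": 1, "email": "a@example.com", "amount": 10.0},
    {"id": 2, "email": "b@example.com", "amount": 25.5},
]

def check_quality(rows):
    """Return a list of human-readable failures (empty list = pass)."""
    failures = []
    ids = [r["id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("duplicate primary keys")
    if any(r["amount"] is None or r["amount"] < 0 for r in rows):
        failures.append("null or negative amounts")
    if any("@" not in (r["email"] or "") for r in rows):
        failures.append("malformed emails")
    return failures

assert check_quality(rows) == []                              # clean batch passes
assert "duplicate primary keys" in check_quality(rows + [rows[0]])
```

Each scenario (clean batch, duplicated key, bad value) becomes one assertion, which keeps test cases and their expected results reviewable side by side.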
Domain Relevance:
Leverage a deeper understanding of business needs. Learn more about the customer domain and identify opportunities to add value. Complete relevant domain certifications.
Manage Project:
Manage Defects:
Estimate:
Estimate effort and plan resources for projects.
Manage Knowledge:
Consume and contribute to SharePoint, libraries, and client universities. Review reusable documents created by the team.
Release:
Design:
Create designs (LLD, SAD)/architecture for applications, business components, and data models.
Interface with Customer:
Manage Team:
Certifications:
Skill Examples:
Proficiency in SQL, Python, or other programming languages used for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g. AWS Glue, BigQuery). Ability to conduct tests on data pipelines and evaluate results against data quality and performance specifications. Experience in performance tuning. Experience in data warehouse design and cost improvements. Ability to apply and optimize data models for efficient storage, retrieval, and processing of large datasets. Ability to communicate and explain design/development aspects to customers. Ability to estimate time and resource requirements for developing/debugging features/components. Participation in RFP responses and solutioning. Ability to mentor team members and guide them in relevant upskilling and certification.
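The SQL-for-analytics skills above, in particular window functions, can be sketched with Python's built-in sqlite3 module (SQLite 3.25+ ships window-function support). Table and column names are illustrative.

```python
# SQL analytics with window functions, run against an in-memory SQLite
# database. Table and column names are illustrative assumptions.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, month TEXT, revenue REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "2024-01", 100.0), ("east", "2024-02", 150.0),
     ("west", "2024-01", 80.0), ("west", "2024-02", 60.0)],
)

# Rank months within each region and compute a per-region running total.
rows = con.execute("""
    SELECT region, month, revenue,
           RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS rnk,
           SUM(revenue) OVER (
               PARTITION BY region ORDER BY month
           ) AS running_total
    FROM sales
    ORDER BY region, month
""").fetchall()

for r in rows:
    print(r)
# east 2024-01 -> rank 2, running_total 100.0
# east 2024-02 -> rank 1, running_total 250.0
```

The same PARTITION BY / ORDER BY pattern carries over directly to warehouse engines such as BigQuery and Snowflake.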
Knowledge Examples:
Knowledge of various ETL services used by cloud providers, including Apache PySpark, AWS Glue, GCP DataProc/Dataflow, Azure ADF, and ADLF. Proficiency in SQL for analytics, including windowing functions. Understanding of data schemas and models. Familiarity with domain-related data. Knowledge of data warehouse optimization techniques. Understanding of data security concepts. Awareness of patterns, frameworks, and automation practices.
Additional Comments:
Client Job Title: Data Discovery & Classification Engineer
UST Job Title:
Who we are: At UST, we help the world's best organizations grow and succeed through transformation. Bringing together the right talent, tools, and ideas, we work with our clients to co-create lasting change. Together, with over 30,000 employees in over 25 countries, we build for boundless impact, touching billions of lives in the process. Visit us at UST.com.
The Opportunity: UST is seeking a skilled Data Discovery & Classification Engineer to join our data management team. The ideal candidate will be responsible for identifying, classifying, and protecting sensitive data across the organization, ensuring compliance with data protection regulations and internal policies.
Key Roles & Responsibilities:
• Onboarding Data Sources:
o Work with IT and data teams to integrate new data sources into the existing data management platform (e.g., Teams, OneDrive, file storage, APIs).
o Monitor data source health and connectivity; resolve any ingestion issues.
o Automate inventory updates where possible using scripts or tools.
o Create and maintain detailed documentation for each data source record, including metadata and data lineage, along with the onboarding process.
• Maintaining Inventory of Data Sources:
o Maintain an up-to-date inventory of all data sources, ensuring each source is accurately cataloged and classified.
o Regularly monitor data sources for changes or updates, ensuring the inventory reflects the current state of data assets.
o Generate reports on the status and progress of data sources, highlighting any issues or areas for improvement.
• Data Classification & Tagging:
o Create and maintain classification categories.
o Apply appropriate tags and labels to data to facilitate easy retrieval and management.
o Adjust classification rules based on changing business needs and policies.
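The classification-and-tagging responsibilities above could be sketched as a simple rule-based scanner. This is a generic illustration only: platforms such as Microsoft Purview or BigID expose their own APIs, and the patterns and tag names below are assumptions.

```python
# A generic, rule-based sketch of data classification and tagging.
# Patterns and tag names are illustrative assumptions, not any
# vendor's actual classification taxonomy or API.
import re

CLASSIFICATION_RULES = {
    "PII.Email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PII.SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "Finance.CardNumber": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def classify(text: str) -> set[str]:
    """Return the set of classification tags whose pattern matches."""
    return {tag for tag, rx in CLASSIFICATION_RULES.items() if rx.search(text)}

record = "Contact jane.doe@example.com, SSN 123-45-6789"
print(sorted(classify(record)))
# -> ['PII.Email', 'PII.SSN']
```

Keeping the rules in a single dictionary makes the "adjust classification rules based on changing business needs" responsibility a data change rather than a code change.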
• Metrics and Reporting:
o Develop dashboards and reports to track onboarding status, classification accuracy, and data inventory health.
o Track key performance indicators (KPIs) such as the number of data sources onboarded and the percentage of classified vs. unclassified data.
• Automation and Process Improvement:
o Identify opportunities to automate data onboarding and classification processes.
o Create and maintain scripts to automate data extraction and classification.
o Propose and implement improvements to data ingestion pipelines and workflows.
Required Skills:
• Proven experience as a Data Discovery & Classification Engineer or in a similar role.
• Strong knowledge of data discovery and classification tools and technologies (e.g., Congruity 360, Microsoft Purview, BigID).
• Experience with data protection tools and technologies.
• Excellent analytical and problem-solving skills.
• Ability to work independently and as part of a team.
Qualification:
• Bachelor's degree in Computer Science, Information Technology, or a related field.
What we believe: We're proud to embrace the same values that have shaped UST since the beginning. Since day one, we've been building enduring relationships and a culture of integrity. And today, it's those same values that are inspiring us to encourage innovation from everyone, to champion diversity and inclusion, and to place people at the centre of everything we do.
Humility: We will listen, learn, be empathetic, and help selflessly in our interactions with everyone.
Humanity: Through business, we will better the lives of those less fortunate than ourselves.
Integrity: We honour our commitments and act with responsibility in all our relationships.
Equal Employment Opportunity Statement: UST is an Equal Opportunity Employer. We believe that no one should be discriminated against because of their differences, such as age, disability, ethnicity, gender, gender identity and expression, religion, or sexual orientation.
All employment decisions shall be made without regard to age, race, creed, colour, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status, or any other basis protected by federal, state, or local law. UST reserves the right to periodically redefine your roles and responsibilities based on the requirements of the organization and/or your performance.
• Support and promote the values of UST.
• Comply with all Company policies and procedures.