Infra SWE V
Insight Global
Job Description
We're hiring a Senior Infrastructure Software Engineer to join a high-impact ML infrastructure team at a top research and tech company. This role is ideal for engineers who thrive in high-autonomy environments and have deep experience building infrastructure at scale. You'll spend 80-90% of your time coding in Python, contributing directly to the development and improvement of core ML compute infrastructure. You'll help maintain and expand a bespoke GPU Kubernetes cluster, working on systems that support a wide range of research customers.
Write clean, scalable Python code to enhance internal ML infrastructure systems
Own and operate a custom GPU Kubernetes cluster, including data catalog and caching storage components
Build features that support onboarding and performance needs of ML and research users
Improve performance, resolve production issues, and optimize resource usage
Contribute to and improve pipelines (data ingestion, compute scheduling, etc.)
Work with tools like Docker, Kubernetes, and automated testing frameworks
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to Human Resources Request Form (https://airtable.com/app21VjYyxLDIX0ez/shrOg4IQS1J6dRiMo). The EEOC "Know Your Rights" Poster is available here (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/ .
Skills and Requirements
5+ years of experience as a Software Engineer, ideally with infrastructure and/or platform teams
Strong proficiency in Python (this is a Python-only role with a mixture of writing code from scratch as well as debugging and performance improvements)
Hands-on experience with Kubernetes clusters, ideally at scale
Familiarity with Docker, automated testing, and ML infrastructure components
Ability to operate independently and deliver end-to-end projects with minimal oversight
Experience in performance tuning and supporting internal research or ML teams
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal employment opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment without regard to race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to HR@insightglobal.com.