In A Nutshell
Responsible for maintaining data management systems and deploying machine learning models within those systems.
Responsibilities
Design, build, test, and maintain machine learning pipeline architectures (70%)
- Produce high-quality, reusable code for data ingestion, validation, and processing pipelines
- Architect and implement end-to-end ML pipelines including training, retraining, and inference systems for schools using the SST
- Design and build APIs to easily access, integrate, and manage data from different sources
- Ensure data infrastructure is in compliance with data governance and security policies
- Create comprehensive documentation for data infrastructure and ML pipelines, tailored for both technical and non-technical stakeholders
- Advance internal analytics reporting and automation capabilities as needed
Provide direct data support to partners (15%)
- Manage initial data lifecycle processes for new school onboarding including ingestion, transfer, audit, and validation
- Collaborate with data platform partners on integration and data transfer pipelines
- Provide technical guidance to partners on how to share data formatted in alignment with our data model and with appropriate data governance measures
- Address partner concerns regarding data security and ensure their specific requirements are satisfied
- Support data science initiatives through processing, cleaning, and analyzing data as needed
Collaborate and contribute across DataKind (15%)
- Support other data team members through code reviews and knowledge sharing across products
- Collaborate with the Product, Engineering, and Research teams to ensure seamless integration and alignment of work
- Effectively communicate project status and manage expectations with internal teams and partner organizations
- Maintain accurate and current project information in project management tools like Asana
Skillset
Required
- Alignment with DataKind’s mission and values, including our commitment to anti-racism
- Experience working across lines of difference (culture, identity, and time zone)
- At least 3 years of professional work experience in developing and deploying a machine learning product at scale
- Foundational understanding of machine learning and statistical methods for predictive modeling
- Expert in Python
- Experience with cloud computing (GCP preferred)
- Experience with databases and query languages (e.g., SQL, Postgres) and distributed data processing tools (e.g., PySpark)
- Experience with Databricks or a similar data intelligence platform
- Experience with data warehousing, orchestration, integration, and ETL tools
- Experience with modern source code management and software repository systems (e.g., Git)
- Experience documenting and implementing RESTful APIs
- Proven track record of successfully managing full life-cycle machine learning implementation projects with multiple stakeholders
- Solid understanding of Software Engineering principles and best practices and the data science project life-cycle
- Comfort and skill in communicating highly technical information to semi-technical and non-technical audiences
- Self-motivated, results-driven, and persistent in the face of challenges
Preferred
- Experience integrating data from SaaS providers
- Experience in the nonprofit sector and/or in a small startup organization
- Experience scaling machine learning products, including managing data quality and volume
- Certifications in cloud computing
- Advanced experience in machine learning, with confidence in applying, tuning, and evaluating a wide variety of algorithms
- Experience with software development and/or web development work (frontends, dashboards, etc.)
- Track record of strong technical writing for a variety of audiences
- Proven track record of (internal or external) client service orientation