In A Nutshell
Train models that detect harmful behaviors, help ensure user well-being, and uphold Anthropic’s principles of safety, transparency, and oversight, while enforcing our terms of service and acceptable use policies.
Responsibilities
- Build machine learning models to detect unwanted or anomalous behaviors from users and API partners, and integrate them into our production system.
- Improve our automated detection and enforcement systems as needed.
- Analyze user reports of inappropriate accounts and build machine learning models to detect similar instances proactively.
- Surface abuse patterns to our research teams to harden models at the training stage.
Skillset
- Have 4+ years of experience in a research/ML engineering or applied research scientist role, preferably with a focus on trust and safety.
- Have proficiency in SQL, Python, and data analysis/data mining tools.
- Have experience building trust and safety AI/ML systems, such as behavioral classifiers or anomaly detection.
- Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders.
- Care about the societal impacts and long-term implications of your work.