Machine Learning Engineer, Trust & Safety

Full-time

Hybrid

Deadline

February 21, 2025

About the organization

Anthropic

Organization type

Social Impact Organization

In A Nutshell

Location

Hybrid; New York, NY or San Francisco, CA, USA

Salary

$340,000-$425,000

Job Type

Full-time

Experience Level

Mid-level

Visa Sponsorship

Available

Deadline to apply

February 21, 2025

Train models that detect harmful behaviors and help ensure user well-being, upholding Anthropic’s principles of safety, transparency, and oversight while enforcing terms of service and acceptable use policies.

Responsibilities

  • Build machine learning models to detect unwanted or anomalous behaviors from users and API partners, and integrate them into our production system.
  • Improve our automated detection and enforcement systems as needed.
  • Analyze user reports of inappropriate accounts and build machine learning models to detect similar instances proactively.
  • Surface abuse patterns to our research teams to harden models at the training stage.

Skillset

  • Have 4+ years of experience in a research or ML engineering position, or as an applied research scientist, preferably with a focus on trust and safety.
  • Have proficiency in SQL, Python, and data analysis/data mining tools.
  • Have proficiency in building trust and safety AI/ML systems, such as behavioral classifiers or anomaly detection.
  • Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders.
  • Care about the societal impacts and long-term implications of your work.
