Data & AI Engineer – Digital Public Health Program

Full-time

Hybrid

Deadline

November 30, 2025

About the organization

Antara Foundation

Organization type

Philanthropy

In A Nutshell

Location

Hybrid, Delhi, India

Experience Level

Entry-level

Deadline to apply

November 30, 2025

Design and operationalise pipelines that bring together structured health data, conversational AI, and predictive analytics, translating complex systems into usable, ethical, and field-ready digital tools.

Responsibilities

  • Design and implement scalable data architectures integrating Sheets, Excel, and government systems into cloud databases (PostgreSQL, BigQuery).
  • Develop APIs and ETL workflows for data ingestion, transformation, and retrieval across GCP-based systems.
  • Design and orchestrate Gemini/LLM pipelines for conversational reasoning, data interpretation, and predictive insights.
  • Build ASR–LLM–TTS pipelines optimised for multilingual, low-resource contexts (Hindi + regional languages).
  • Manage embeddings and vector databases for contextual retrieval and knowledge grounding.
  • Translate backend intelligence into usable insights for health workers, delivered through dashboards, chatbots, and community feedback loops.
  • Collaborate with program teams to ensure AI models reflect real public health needs, ethics, and local contexts, and are validated against field realities, including low-connectivity environments.
  • Support rapid data visualization for program dashboards and government review systems.
  • Establish data security, versioning, and model monitoring best practices.

Skillset

  • B.Tech/M.Tech in Computer Science, Data Science, or related discipline.
  • 3–6 years of experience in backend, data engineering, or AI-driven product development; exposure to health, GovTech, or social impact data preferred.
  • Languages: Python (Pandas, FastAPI, LangChain, SQLAlchemy).
  • Databases: PostgreSQL, BigQuery, SQLite; vector DBs such as Pinecone, FAISS, or Chroma.
  • AI/LLM: Gemini API, LangChain, prompt design and orchestration.
  • Speech Tech: Experience with ASR (Whisper, Google Speech) and TTS (Coqui, ElevenLabs).
  • Experience applying NLP to unstructured text/audio for community feedback and AI-enabled sensemaking.
  • Cloud: Google Cloud Platform (Cloud Run, Cloud Functions, BigQuery, Secret Manager); experience with containerized workflows using Docker.
  • Data Pipelines: End-to-end ETL development, schema design, data validation, logging, and performance monitoring.
  • Visualization: Experience with tools such as Streamlit, Gradio, Looker Studio, and Power BI, plus user journey mapping, for rapid analytics and insight generation.
