Senior Data Engineer

Full-time

Hybrid

Deadline

August 3, 2024

About the organization

CivicDataLab

Organization type

Research Institution

In A Nutshell

Location

Hybrid, New Delhi, India

Salary

₹9,00,000 - ₹12,00,000

Job Type

Full-time

Experience Level

Entry-level

Deadline to apply

August 3, 2024

Support the development of robust, automated data analytics tools covering data curation, cleansing, standardization, and wrangling.

Responsibilities

  • Design and develop scalable data orchestration pipelines using Prefect and Apache Airflow.
  • Create and oversee data APIs responsible for collecting, managing, and analysing data from diverse public data sources.
  • Standardize metadata of open datasets by ensuring compliance with the DCAT metadata standard.
  • Collaborate with our partners to perform in-depth Exploratory Data Analysis (EDA) of various datasets.
  • Develop database models according to each project's requirements.
  • Maintain and monitor our existing open data platforms, such as Open Budgets India, Justice Hub, and Open Contracting India.
  • Engage regularly with our diverse stakeholders and open-source communities to discuss and create reusable resources, including guidebooks, on use cases of public data and data engineering best practices.
  • Document code, processes, and all data team activities clearly and comprehensively, including algorithms, methodologies, data transformations, and the overall workflow.

Skillset

  • 2+ years of hands-on experience working with Python and SQL.
  • Understanding of message brokers such as RabbitMQ.
  • Knowledge of open-source data scraping frameworks and tools such as Selenium and Scrapy.
  • Experience building end-to-end ETL pipelines.
  • Familiarity with building database systems.
  • Knowledge of API-based or stream-based data extraction processes.
  • Comprehensive knowledge of a Git-based workflow.
  • Comprehensive knowledge of metadata standards such as DCAT.
