In A Nutshell
Lead large-scale government data projects, including cross-functional team coordination, client relationship management with federal agencies, and budget management.
Responsibilities
- Create and implement record linkage algorithms using Python, integrating administrative records and survey data to enhance the value of government agency data products.
- Modernize outdated government data systems by transitioning legacy codebases from SAS/STATA to R/Python.
- Develop and deliver educational materials and training programs on Python, R, and SQL coding best practices to enhance government agency staff capabilities.
- Train and mentor government analysts at federal and state agencies, strengthening data science and analytical capabilities across agencies.
- Build and maintain interactive dashboard visualizations using tools like R Shiny to deliver dynamic, data-driven reports that inform decision-making by agency leadership and policymakers.
- Conduct text analysis and natural language processing (NLP) to extract insights from unstructured data.
- Author and contribute to federally funded reports and research papers documenting the impact of government data science initiatives.
Skillset
- Requires a Master’s degree or foreign equivalent in Applied Data Science, Computational Linguistics, or a closely related computer science or quantitative social science field and five (5) years of experience as a data scientist, research scientist, or in a related role working with administrative and survey data, including the following experience:
- 5 years of Python programming experience in government data analysis
- 3 years of R and SQL programming in government data cleaning, database work, analysis, and visualization
- 2 years of project management experience for government-funded data projects, including budget management
- 2 years of large-scale text analysis, focusing on government data applications
- 2 years using record linkage methodologies in Python, integrating micro-level confidential government survey data and administrative records
- 2 years delivering applied data science training programs to federal government agencies (e.g., NSF, DOL, USDA, ACF) and state-level agencies
- 1 year modernizing federal government agencies’ legacy codebases, migrating from SAS and STATA to R and Python
- Demonstrated experience in academic research, evidenced by at least one published paper in a peer-reviewed journal or government agency publication and a presentation at a professional conference; both the published work and the conference presentation should be related to data science.