Data Science for Social Impact in Higher Education: First Steps
05
Internships and Research
Full courses aren’t the only way to bring data science for social impact to students. Co-curricular or summer programs can be a very effective way to provide this opportunity. This approach suits institutions that have not yet developed a data science program ready to support a module or course in social impact, or where setting one up has a longer timeline. The programs described below have worked well both as an introduction to data science for social impact and as a complement to data science programs that already have social impact courses.
MS Computational Analysis and Public Policy student and Community Data Fellow at University of Chicago
University of Chicago – Community Data Fellows
Story of the activity
The Community Data Fellows program meets organizations where they are on the data spectrum to help move their mission forward by developing feasible and sustainable data solutions. If a social impact organization doesn’t have enough data to sustain a team of students on a 10-20 week Data Science Clinic project, we assign a Community Data Fellow to work with the organization. The Community Data Fellows (CDF) program hires graduate students for 10-20 hours per week to complete projects with social impact organizations.
Inspiration to start the activity
Many social impact organizations are not yet ready for a Data Science Clinic project that can sustain a team of students for 10-20 weeks in a credit-bearing course. Many of these organizations are working on incredible societal challenges that, with additional capacity-building support, may eventually yield outstanding data science research and clinic projects. The CDF program meets the organizations where they are, with the aim of advancing their data capacity and developing additional, deeper data science engagements.
Activity Example
The CDF program builds sustained bi-directional partnerships between Chicago community organizations, UChicago staff and researchers, and students. Fellows work with organizations to determine how best to use data, build data capacity across program staff, and scope projects that advance the mission. By understanding the organization’s data capacity, Fellows are able to create solutions that persist and add value at the completion of the work. This program supports UChicago DSI’s mission of building an equitable and inclusive academic data science community. Through the CDF program, we are able to channel students’ passion and skills towards having a positive impact. Key responsibilities of Fellows include:
Communicating with community partners and evaluating their data goals
Analyzing datasets and reporting insights to the organization
Scoping and fulfilling a data science work plan
Developing data pipelines, including cleaning, normalization, and organization of data
The UChicago DSI supports the CDF program through investments in technical resources, staff capacity, relationship-building, and ongoing collaborations with community partners. A key support has been staffing infrastructure: a Program Manager works closely with student Fellows, community organizations, and DSI staff data scientists and engineers to scope, manage, and oversee the implementation of projects.
As interest has grown from both students looking to engage in social impact data science and community partners seeking data project support, distributing technical expertise across interdisciplinary teams has emerged as a challenge. The CDF program has thrived when student Fellows have experience in communicating and scoping project deliverables, a requirement that can be a barrier to some students’ participation.
Provide incentives for DSI researchers, faculty, or postdoctoral scholars to engage with CDF projects, both to expand their research focus and to prepare for Data Science Clinic.
Expand technical oversight and management across CDF projects to ensure replicability.
Develop ongoing opportunities for community partner organizations to co-create and engage with CDF Fellows and staff.
Building and sustaining strong working relationships with community partners is key. The UChicago DSI Community Data Fellows program has benefited from learning about organizational data capacity early on, which better informs our project scoping and helps us pair students with these projects successfully.
A-ha Moment
Compensate students for their work and set expectations that this is a job with deliverables, not an internship.
Undergraduate Data Science major at University of Chicago who participated in the UChicago Data Science Social Impact (DSSI) Summer Program
University of Chicago – Data Science Social Impact (DSSI) Summer Program
Story of the activity
The UChicago Data Science Social Impact (DSSI) Summer Program was created in partnership with faculty at City Colleges of Chicago; California State Fresno; Howard University; Morehouse College; North Carolina State University; University of Illinois Chicago; The University of Texas, San Antonio; and The University of Chicago Data Science Institute with support from data.org. This summer program was formulated to provide a unique opportunity that would:
Create a living and learning community for students newer to data science
Provide structure and preparation for students to advance social impact data science research projects
Convene faculty and students from their institutions with the explicit aim of developing new collaborations across institutions
The CAN group wanted to develop a summer program that would engage students new to data science. The hope was that engaging a diverse community of newer data science students in social impact data science research would increase their interest in continuing toward a DS career. In developing this program there were several sources of inspiration and prior models to draw from. Several of the PIs had themselves participated in an NSF-sponsored Research Experience for Undergraduates (REU), and their own careers were positively influenced by it. The structure (an initial, short but intensive training period followed by a longer period of small group research) was modeled on SIMU (an REU held at the University of Puerto Rico, Humacao), AMSSI (an REU held jointly between Cal Poly Pomona) and MSRI-UP (an REU hosted at UC Berkeley). In all cases, student housing was provided, as it was at the DSSI. To ensure we were able to develop a cohesive curriculum in the initial period, the CAN faculty determined that a focus on spatial data science coursework and projects was the best approach for new data science students given the seven-week format.
Activity Example
Students arrive for the summer program and develop their community norms with faculty for how to work together and hold each other accountable. During the first two weeks, a teaching and mentoring team engages the students in coursework: Python, statistics, data structures, and other introductory DS concepts in week one, and spatial data science in week two. At the end of week two, students are presented with the projects and complete a survey ranking their project choices. After the two weeks of intensive coursework, students work in teams on social impact data science projects with external partners from climate, health, policy, human rights, and financial inclusion organizations. Students hone their data science skills in research methodologies, practices, and teamwork while learning how to engage with a real-world data science problem. The remaining five weeks consist of small group research. The teams have dedicated research space to work 9am-5pm and meet with their research mentor 1-2 times a day. In weekly seminars, faculty and researchers give accessible talks on a wide variety of topics in DS and AI. The program culminates on the last day with a final presentation and poster session.
There are a number of barriers to running DSSI, with funding being a critical one. Students need financial support to replace the wages they would earn at summer jobs in order to engage in summer research and learning programs. Providing safe, secure on-campus housing for students is key, as this not only lowers the barriers to participation but also provides a living and learning environment during the program. Running a summer program requires a lot of planning, stretching back nine months before launch. Mentors have daily meetings with each team in order to keep the project moving forward and provide guidance and timely feedback to students.
Selecting a faculty research director six to nine months in advance of the program start date allows time to select projects and develop appropriate coursework. Near-peer mentors can be helpful but are not a replacement for faculty mentors with domain knowledge.
Students will struggle with learning complicated material in a short period of time.
Reinforce that the learning will continue throughout the program while working on the project and will not be mastered in the first few weeks of coursework.
Provide clear expectations on deliverables and timely feedback during mentor meetings.
Given the short timeline of the program, set up projects that fall within the same domain or subject area, such as spatial data science, to streamline the process.
A-ha Moment
Providing lunch for students not only helped foster community but also removed one more thing they had to worry about during the day, leaving more time and energy to focus on learning.