Why a Million Brains are Better than One

Building sociotech through diverse, interdisciplinary teams of professionals

Photo by DS stories from Pexels.

All technology is socially determined. Because technology is built by humans, it automatically inherits the social and cultural assumptions of its makers—ideas of what is needed, what is important, and how things should look and work—all of it influenced by deep and often unconscious biases.

All technology is also socially determining. Certain features of technology influence how individuals behave when using it—and even when they are not. Just think of the impact social media platforms had on the spread of COVID-19 vaccine misinformation.

But clearly not all technology is equally effective at influencing society, nor is it as significantly determined by society. Compare a utility app that lets you do one specific thing, such as buying tickets, with a platform like TikTok or Facebook.

Platforms that use artificial intelligence (AI) and other data-driven technologies tend to be at the top end of this scale. After all, what could be more socially determined and socially determining than systems that function by analyzing data from social behavior, learning to see patterns and making predictions, which are then used to make decisions about that society?

Think about it. When such powerful technology gets deployed, it inevitably becomes enmeshed in broader social activity that both generates and consumes data, creating a complex system that includes everything from the humans providing the data, to those collecting and analyzing the data, to the decision-makers creating solutions based on the findings.

At each level, biases reflecting society as a whole are likely. At each level, culture and context matter.

Such data-driven technologies are more rightly described as socio-technical, or sociotech for short. This is a term long known in academic disciplines like Science and Technology Studies (STS), but it’s time to take it mainstream.

We need a new generation of data scientists who, in addition to their technical skills, also have social science skills that allow them to understand the social and cultural context of technologies.

Danil Mikhailov, Ph.D., Executive Director, data.org

Words matter. By referring to these platforms as simply ‘technology,’ we under-emphasize their social impact and devalue the complex, human-centered process of their creation and deployment. We fail to understand that the social in their make-up is every bit as important as the technical.

For example, if we truly understood that AI is a form of sociotech, why would we leave the building of systems that are so enmeshed in our daily lives to a small and exclusive group of experts who are not at all representative of society? How could we expect that their work would be either equitable or effective?

Similarly, is it really any wonder that teams of data, computer, and machine learning scientists with no background in social sciences or the humanities have designed systems and products replete with unintended (or intended) negative social consequences?

Since AI is a piece of sociotech rather than simply tech, the discipline that supports it, data science, has also been woefully misunderstood and too narrowly defined so far. To truly make AI and its related data-driven sociotechnologies serve society and create positive social value, we must re-conceptualize data science more broadly, more inclusively, and in a more interdisciplinary way, and of course train data scientists accordingly.

As I argued in my last blog post, the old archetype of a data scientist (a white male in their twenties trying to “save the world” by narrowly focusing on coding and math) will not do. As well-intentioned as many current practitioners undoubtedly are, we need a new generation of data scientists who, in addition to their technical skills, also have social science skills that allow them to understand the social and cultural context of technologies. These skills will empower them to engage with communities more equitably, to worry about and implement ethical solutions, and to understand the social consequences of systems that they create.

We need these data scientists to be as diverse across gender, race, caste, socio-economic status, and ability as the communities they serve, and we need them to be local and embedded in their communities, wherever they are across the world. Only then will the data-driven sociotech they create be truly responsive to community needs.

Training data scientists well-versed in sociotech is particularly important for working in the social impact sector. Whether the focus is health or climate, social justice or financial inclusion, simplistic linear solutions will not solve complex systemic problems. We need data scientists with interdisciplinary skills who can translate between disciplines and ask the right questions in the first place.

For example, as our RECoDE report so clearly demonstrated, marginalized communities are all too often under-represented in public health data, which leads to worse health outcomes for their members. However, to successfully negotiate access to more representative data, it is imperative to first understand the history of extraction and exploitation that underpins the distrust often felt by these communities towards external experts. Without careful and respectful work to build trust and cede power to the communities over where, when and how their data is used, the data bias will persist, regardless of the technology we deploy to do our analysis. That work requires a far broader range of skills than we currently teach our data scientists.

data.org is taking up the challenge of changing how data science is taught and practiced and building a whole new field of data for social impact. Our goal over the next decade is to train a million purpose-driven data practitioners, from impact data scientists to impact data analysts, impact engineers, impact data stewards, and impact data ethicists.

This goal is informed directly by our research in the recent Workforce Wanted report that established the scale of need—and, commensurately, the scale of opportunity—in the area of data talent in the social impact sector.

In doing this, we are focusing on diversity from all angles, including but not limited to gender, race, and nationality, and we are working with partners to build centers of excellence in impact data science across the world, especially in lower- and middle-income countries.

We are thinking big because solving some of the greatest systemic issues we face, such as climate and inequality, will require broad engagement. Our bet is that a million brains can do what one alone cannot.

About the Author