Pathways to Impact is a series of conversations with data for social impact leaders exploring their career journeys. Perry Hewitt, CMO of data.org, spoke with Roberta Evangelista, Sustainability Data Science & Digitalisation Specialist at Basel Agency for Sustainable Energy (BASE), about ways to build one's data science knowledge base to support career growth in the field.
What role or sector did you move from — and how?
I come to DSSI [data science for social impact] from the theoretical end of the spectrum. I studied mathematics for my bachelor’s degree, and I loved it. But I realized quite early in my career that I wanted to do something more applied. Rather than doing abstract math and creating new theorems, I was eager to apply math to real-world problems. This tied in with a longstanding interest of mine in how people make decisions, and how this decision-making connects to biology.
I moved first from math into the field of computational neuroscience, which basically sits at the intersection of math, computer science, and neuroscience. My master’s and Ph.D. were in computational neuroscience, and very specifically research into memory processes. We were studying how we can remember episodes of our lives, even for a very long time.
Wait, what’s the answer to that?
Really, it’s all about good sleep! That’s really important to consolidate past memories. All these processes actually happen during sleep — not in the dreaming part, but in the non-dreaming part.
Neuroscience is a fascinating field, but it’s also not so applied in the sense that the research takes a long time to be proven or unproven. To do the science well, you need to wait a very long time to find out if the models that you created are right or wrong. All this waiting didn’t appeal to me. At the same time, I developed an interest in applying data science to all sorts of topics. What I was doing in computational neuroscience was already data science because I worked with large quantities of data. I had to analyze the data and create models out of that, and I found it very fascinating that you can use practically the same models for very different topics.
From neuroscience, I moved into a private-sector education role, working as a data scientist for a couple of years here in Zurich. During that time, I volunteered with DataKind for about a year, which gave me my first experience in the development sector – and I absolutely loved it. And then I took this position at BASE – my route to data science in the social sector has not been a very straight line!
How did you hear about the BASE job? What are you working on today?
I first learned about the BASE role on a mailing list about social impact jobs here in Switzerland. At that time, I was working as a volunteer with DataKind, and heard that other volunteers were reviewing the technical proposals for the Inclusive Growth and Recovery Challenge – which included BASE.
For those who don’t know, BASE is a not-for-profit organization that for more than 20 years has been developing business models and financial mechanisms to drive investment in climate change solutions. Thanks to the data.org Challenge, we have now moved more into data and data science, bringing together agriculture and climate change solutions. That pivot toward innovative use of data is why I joined BASE.
I work as a data scientist and technical lead of a project called Your Virtual Cold Chain Assistant (VCCA). Your VCCA helps smallholder farmers to get access to sustainable cooling. We partner with local organizations who own, maintain, and operate cold storage, that is, refrigerated containers powered by solar.
We’re trying to drive change with Your VCCA in two ways:
- One is a pay-per-use business model that enables farmers to bring their produce into cold rooms instead of storing it outside, without the upfront investment of buying a fridge or container. They can just bring whatever they have harvested and store it there for as many days as they want, at a fixed price per day and per crate.
- The other piece of the equation is a mobile application, which helps service providers offering cold storage manage the room digitally. Previously, they used a register where they manually entered all the information about the farmers who came in, what they brought, etc. As you can imagine, that is very hard to monitor, especially if you’re not physically at the cold room, and if the cold room is in a place that is not very accessible.
Data science plays an important role here. First, there are machine learning models that help predict how many days the produce will stay good outside versus inside the cold room. This helps people understand the benefit of cold storage. The second component is informing cold room users of market prices in the surrounding area. Knowing both the remaining life of the produce and the forecast market prices, smallholder farmers can make informed decisions about when and where to sell.
This work addresses two important social problems. The first one is that a huge amount of food that is produced and harvested gets spoiled, and one of the main reasons is the lack of cold storage. I didn’t know about this before starting this project, but the figures suggest that something like 30% to 40% of food is spoiled post-harvest, with peaks of up to 70%. The number sounds crazy to me. But apparently for some crops, and in some seasons, it can be true.
The second social problem is that farmers are often forced to sell at a very low price because they have no alternatives. This project really tries to tackle both aspects — reducing food loss by providing cold storage, and also increasing the income of the farmers by providing them with new information about when and where to sell.
You’ve described a path from math to a neuroscience Ph.D. to a private-sector job to volunteering with DataKind to BASE. Was there anything that you feel blocked your career entry or your progression in this field?
It sounds more complicated than it was. I felt it was kind of natural to go in the direction that I liked at the moment, moving on as I found out about something that was more interesting or more useful. I’ve been fortunate, but a blocker for those wanting to go into data science for social impact is definitely the lack of jobs in this sector.
Today there is more and more available data, even in parts of the world where, historically, there hasn’t been a very strong open data policy. However, there are not as many projects that leverage this data, especially in the social sector. There might be a few positions where you are a data analyst or a data scientist, but sometimes there isn’t a clear data science-able problem or defined project.
While working in the private sector, I learned that I really like to work on a defined project with a product, with something where you can engage with users, and hear their feedback to build a strong solution. I wouldn’t like to just do analysis of some data for the sake of it, or maybe in the hope that one day we can use the knowledge for something. And so the job that I found at BASE was very special in that there was a defined project, and there was a clear product, the mobile application for cold storage. I know many data scientists who would like to go into the DSSI sector. But one of the hardest things is to actually find this kind of position. It’s getting better, but I think there is still a long way to go.
What community of people or resources would you say bolsters your work? What community do you engage with or rely on?
Two groups stand out: the first is a community of data scientists who come from an academic background, as I do. We did a bootcamp in London together a few years back. It’s called Science to Data Science, and it’s a really amazing environment because you get to know a lot of data scientists who come from academia.
The group shares perspectives on why they ended up in academia, and why they now want to enter a new field — the motivations are usually quite similar. It’s a very supportive network. We have regular meetings, and we have a chat where, if you have a technical problem, someone can always help. It’s the first place I would write to with an issue.
Also, it’s nice that we all started the data science journey in basically the same way. It’s inspiring to see how people’s careers are evolving over the years. People in data science change jobs quickly. In this group, I would say that almost all of us have changed jobs at least once. It’s very interesting to see how career pathways emerge.
The second community that I rely on a lot is DataKind. It has changed the way I think about data science problems. Working in the development sector, you get exposed to data science complexities that data scientists working in the private sector might not notice. For example, when designing our app we have to consider issues related to accessibility like internet connectivity, road connectivity, and literacy levels, as well as how we can build models when little data is available.
And these considerations are rare in the private sector big data world. A data scientist coming from the private sector would tend to overlook these. But having worked with DataKind people and projects, you develop an eye for this, and you learn to pay close attention to those topics. That’s definitely important.
Women are still underrepresented in data science. Are there any gender-based groups you belong to?
There are two. One is Women Who Code, which is not specifically for data scientists but for women in any sort of tech job. There have been only virtual gatherings during COVID, including a very useful seminar series, which has been nice, but I am getting a bit tired of the virtual format by now.
And the other community is the Women in Data Science group, which is a bit academic as many people come from, or are currently working in, academia. But there is also a growing community of professionals from the industry side, which I think can trigger interesting exchanges.
Here’s what I am missing: a group of professionals working in the development sector who are data scientists. Not data analysts, but focused exclusively on machine learning techniques for the development sector. And this is something I haven’t found, virtually or in person.
It’s clear that your specialized degree, your trajectory from math to neuroscience to data science, represents your primary contribution to the sector. But is there a non-data science skill set that offers the greatest return on your work?
What helped me a lot is that I learned very early on, already when I was transitioning to neuroscience, that the world is very interdisciplinary. What this means in practical terms is that you really have to learn to communicate with people who have a very different background from yours, and also with people who might be technically very strong, but on another side of the technical world.
To succeed in this environment, you have to be open to asking stupid questions. Maybe they are experts in computer science, but you’re not. In turn, you also learn to expect stupid questions. This means that you cannot really take anything for granted. You have to make sure that whenever you’re explaining something, you are providing the relevant background and context. This teaches you to boil down a complex problem, or a complex model that you have designed, into simple steps so that people who have not worked on this model can understand.
This is an important skill, and it’s especially relevant now because we are working right at the intersection of different fields. There are people who are experts in the modeling of, say, fruits and vegetables, some who are experts in the climate part, and others who are experts in the engineering of how the cold room functions, with the sensors and everything. And we have to bring everybody together so that we can design the app in the right way. The ability to work effectively with people from different backgrounds is a skill set that provides huge benefits.
What advice do you have for someone new to the field, but interested in doing this work?
I think it’s important to build a solid foundation as a data science practitioner. My advice would be to first look for a mid-sized organization, where there are already other data scientists to learn from. It’s really essential that you exchange ideas with other data scientists so that you can build up your expertise before you try to go into an NGO or a smaller organization where you might be the only data scientist.
This might mean you spend a couple of years not in the most exciting field, but you develop a lot of skills which you know will be useful later on. It can be tempting to jump straight into cool DSSI applications, but I think this strategy pays off in the long run. You build a better data science foundation to make a bigger impact when you come to DSSI.
And the other advice I would offer is to be aware that data science is a fast-moving field, especially the link to the data engineering part, and everything that concerns the deployment and maintenance of models in production. It’s a field where the technologies evolve very quickly, so you have to be open and willing to keep learning, which might mean volunteering, doing online courses, or going to meetups.
What’s the next big thing in DSSI that you see?
I’m personally very excited by the potential for geospatial data and satellite imagery. I think that’s really something that has gained a lot of attention in recent years because a lot of satellite images have been made publicly available for the first time. And satellite imagery has become more and more precise both in the spatial and temporal dimensions.
What we can do with this data has also grown incredibly. While it has not fully developed yet, I see a lot of potential in applications like monitoring deforestation, or anything that’s related to index-based insurance for agriculture. How can you monitor conditions, especially in regions that are hard to access, and how can you make the most out of this data? How can you apply machine learning to this data?
Related to that, I see so much potential in visualizations that rely on spatial data, especially for a non-technical audience. My experience so far is that stakeholders get more engaged if you can present data on a map, instead of showing them graphs or tabular data.
We recently developed an interactive map of India, where we’re displaying different types of data coming together from various sources such as roads, connectivity data, market and census data. The feedback that we’ve gotten is that it’s a great benefit to visualize in one place data sources that otherwise would have stayed very compartmentalized.
What’s your don’t-miss daily or weekly read?
On a daily basis, I follow LinkedIn. It took me a while to get into the right spaces and the right connections, but now there is a lot of relevant content that I look at there.
More on a weekly or bi-weekly basis, I like to listen to the 80,000 Hours podcast. I find it very inspiring because it mixes different perspectives related to how to make a positive impact in the world. And some of these conversations can get quite technical, covering topics in data science, ethics in AI, and current research in DSSI.
I very much appreciate their openness in discussing the reality that working with data can have a huge positive impact, but that we must also watch out for certain pitfalls. For example, we need to ensure we are creating applications that are inclusive, bringing in the voices and perspectives of the communities we serve. We also have to check the completeness and the reliability of the data sources we are using. It’s essential that we reflect on these potential problems, to ensure that we hold ourselves accountable for using the best data science approaches to drive meaningful social impact.