Guide Objectives
- Understand how biases in data may affect Gen AI responses
- Learn how to mitigate ethical and organizational risks in using Gen AI
- Learn how to use Gen AI to support program ideas, not create them
Gen AI tools, such as ChatGPT, are rapidly redefining what is possible and the speed at which information is curated and communicated.
While these tools have the potential to profoundly impact efficiency and effectiveness at work, there are also risks to consider when utilizing them for designing development programs. Concerns around community misrepresentation and biases in the data are important to acknowledge when using Gen AI tools in designing programs.
This guide aims to contextualize these concerns and offer practical steps organizations can take to mitigate associated risks.
Guide Use Case
To support the learnings within this guide, we will consider the fictitious curriculum development tech organization, Tech 4 Elevated Education (TEE), which is beginning to use an education-focused Gen AI tool, AI4Ed. AI4Ed's model analyzes a dataset of youth demographic and educational data in response to user requests.
Guide Specific Disclaimer
Gen AI technology and its applications in everyday life are rapidly evolving, and best practices are constantly being updated. When using Gen AI, remain attentive to recent updates and news about the platforms you are using, along with their inherent risks. While this guide is based on common principles for utilizing Gen AI, you should stay attuned to the specific nature and best practices of the particular platforms and tools you use.
Understand how Gen AI can be used when designing development programs
Development programs are designed to address pressing social concerns or needs. These can relate to health, education, poverty reduction, and climate change, to name a few. Development professionals recognize the variety of approaches available for addressing these needs, each shaping a program differently and leading to different outcomes.
For example, if you are looking to improve maternal and newborn child health in a certain region, your program could involve equipping local clinics, incentivizing skilled hospital staff to support remote areas, or running an education program for young mothers, among other approaches.
Used under the right conditions, Gen AI tools can be valuable in enhancing design thinking for development programs. For instance, when working on a proposal, Gen AI can assist in brainstorming creative solutions that you may have overlooked otherwise.
While a Gen AI tool can be a great sparring partner for program design, it should only be used in this way when the person or team involved has prior knowledge of or experience with the topic being discussed. An understanding of the topic is essential for filtering out inaccurate or irrelevant content and assessing the relevance and validity of the responses the tool provides.
Guide Use Case
TEE has been prompting AI4Ed, its Gen AI tool, to analyze discrepancies between the educational performance of special-needs youth and that of other youth across various regions. This initial analysis allows a TEE team experienced with special-needs curriculum considerations to quickly identify the regions facing the greatest challenges for further exploration.
Gen AI tools may support you in program design by:
- Delivering general content on the community landscape
- Summarizing the political climate in a region
- Describing overall trends in key quality of life indicators (education, income, health, housing, etc.)
When considering this support, it is important to note that Gen AI is best used in providing general, rather than detailed, contextual understanding. You can see examples of these use cases in the ChatGPT – Sample Prompts & Responses document.
Understand the risks of using Gen AI in designing development programs
Using Gen AI as a sparring partner is exciting and can be efficient, but it comes with risks. Due to potential data biases, using Gen AI in this way could do more harm than good when designing social programs. In these situations, data biases typically stem from a lack of data or from misleading content within the data a tool was trained on.
Lack of Data: Development programs typically support marginalized communities, which often generate little data, especially in low-income regions of the world that lack the infrastructure of more affluent areas.
Given this data scarcity, a Gen AI tool trained on such limited data may be unable to produce responses that genuinely represent a region, its cultures, its priorities, and its people. In short, content gaps in the training data can lead to inaccurate AI responses, potentially resulting in negative impacts when designing programs.
Misleading Data: The second type of bias arises when a Gen AI tool has been trained on data that reflects pre-existing societal biases and inequalities. Trained on such datasets, Gen AI tools may produce responses that perpetuate stereotypes rather than capture the realities of community life.
It is important to note that significant safeguards have been established by developers of Gen AI tools to prevent these sources of data bias from influencing responses. However, despite these precautions, there remains an inherent risk of bias, particularly within development programs.
In this step, it is important to understand the risks of misrepresentation and bias in the Gen AI tools you are using to design a development program. Before using a tool for this purpose, we recommend asking these questions:
- How will I use Gen AI tools to help design my program(s)?
- What types of flaws might exist in the information used by the tool to develop responses? (Too little data? Biased internet content?)
- What negative impacts could my work have if it relies on misrepresentative or inaccurate Gen AI responses?
Guide Use Case
Prior to prompting AI4Ed to analyze youth educational performance in a series of regions, the TEE team had developed and documented a list of regions not to analyze with AI4Ed, as it was aware that the research and data available in those regions are sparse. This reduced the risk of misleading or biased results.
Common Gen AI response biases include:
- Stereotype bias – occurs when models reproduce existing perceptions and stereotypes present in the training data
- Racial bias – occurs when models learn from training data that reflects racially biased views
- Cultural bias – occurs when models generate unfair or flawed outputs about particular cultures and nationalities
- Gender bias – occurs when the training data favors particular genders for certain jobs, responsibilities, and other roles
Utilize best practices to mitigate biases
Once you have an understanding of the potential data biases and risks related to a Gen AI tool, you can begin to use it responsibly for your program design. Here are some best practices we recommend:
- Evaluate the content: Always evaluate the responses that a Gen AI tool provides. Use your own knowledge of the subject to determine whether the responses are valid. You can also consult subject matter experts to evaluate the accuracy and appropriateness of Gen AI responses within specific contexts.
- Cross-validate: Use multiple Gen AI models or techniques to generate responses and cross-validate their outputs. Comparing results from different models can help identify inconsistencies and mitigate the risk of relying on biased or faulty conclusions (see the sketch after this list).
- Adjust your prompts: Pay close attention to the types of prompts that generate the best answers. For instance, you can send a prompt, see a response, and then further refine what you are looking for in an adjusted version of the prompt. Generally, we recommend very descriptive prompts when using Gen AI.
- Fact check with ground truth data: Use ground truth data to evaluate the accuracy and reliability of Gen AI responses. Even when using Gen AI tools that provide references, it is important to verify them; ChatGPT, for example, has been known to generate inaccurate references. Compare AI-generated conclusions against known facts to identify discrepancies and adjust accordingly.
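To make the cross-validation practice concrete, the sketch below sends the same prompt to two models and flags the pair for expert review when the answers diverge. This is a minimal sketch, not a production workflow: the query_model stub, the model names, the prompt, and the similarity threshold are all assumptions to be replaced with the APIs and review criteria your organization actually uses.

```python
# Minimal cross-validation sketch: send one prompt to two Gen AI models
# and flag the pair for human review when their answers diverge.
from difflib import SequenceMatcher

def query_model(model_name: str, prompt: str) -> str:
    # Hypothetical stub: replace the body with a real call to your Gen AI
    # provider's SDK. A placeholder return keeps the sketch runnable.
    return f"Placeholder response from {model_name}."

def cross_validate(prompt: str, models: list[str], threshold: float = 0.6) -> dict:
    """Query each model with the same prompt and compare the answers."""
    responses = {m: query_model(m, prompt) for m in models}
    texts = list(responses.values())
    # Crude lexical similarity between the first two responses; a team could
    # substitute embedding-based comparison or a rubric applied by experts.
    similarity = SequenceMatcher(None, texts[0], texts[1]).ratio()
    return {
        "responses": responses,
        "similarity": similarity,
        "needs_expert_review": similarity < threshold,
    }

result = cross_validate(
    "Summarize overall trends in secondary school completion rates in the "
    "region, noting where data is sparse or uncertain.",
    ["model-a", "model-b"],
)
if result["needs_expert_review"]:
    print("Responses diverge; route to subject matter experts.")
else:
    print(f"Responses broadly agree (similarity {result['similarity']:.2f}).")
```

A low similarity score does not prove bias; it is simply a cheap signal that the models disagree and that a person with subject knowledge should take a closer look.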
Guide Use Case
After generating the educational performance discrepancy analysis from AI4Ed, the TEE team implements its review process, in which curriculum experts in the relevant regions assess the results for relevance and accuracy. The team also compares the results with pre-existing data it holds on related topics to confirm the analysis is consistent with what is already known. This mitigates the risk of acting on inaccurate results.
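To illustrate the kind of ground-truth comparison the TEE team performs, the sketch below merges region-level figures extracted from a Gen AI analysis with a team's own baseline data and flags large gaps for expert review. Everything here is hypothetical: the figures, region names, column names, and five-percentage-point tolerance are illustrative stand-ins for whatever pre-existing data and review criteria your team actually holds.

```python
# Illustrative ground-truth check: compare region-level figures taken from
# a Gen AI analysis against the team's own baseline data.
import pandas as pd

# Figures extracted from the Gen AI tool's analysis (hypothetical values).
ai_results = pd.DataFrame({
    "region": ["North", "South", "East"],
    "completion_rate_ai": [0.72, 0.55, 0.61],
})

# The team's pre-existing data on the same indicator (hypothetical values).
ground_truth = pd.DataFrame({
    "region": ["North", "South", "East"],
    "completion_rate_known": [0.70, 0.64, 0.60],
})

merged = ai_results.merge(ground_truth, on="region")
merged["abs_gap"] = (merged["completion_rate_ai"]
                     - merged["completion_rate_known"]).abs()

# Flag regions where the AI figure drifts more than 5 percentage points
# from known data; these go to subject matter experts for a closer look.
TOLERANCE = 0.05
flagged = merged[merged["abs_gap"] > TOLERANCE]
print(flagged[["region", "completion_rate_ai", "completion_rate_known"]])
```

The point is not the arithmetic but the habit: every AI-generated figure gets checked against data the team already trusts before it informs a program decision.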
‘So what’ and next steps
Keeping these use cases, risks, and best practices in mind, it is important to remember that Gen AI is a tool, not a one-stop shop. As with any tool, it takes time to learn how to use it effectively, safely, and appropriately.
You and your teams should use Gen AI tools to support your decision-making only when combined with prior knowledge and additional analyses. These tools are meant to augment—not replace—your team’s creativity and expertise in designing programs, emphasizing a human-centered approach.
Was this guide helpful? Please rate this guide and share any additional feedback on how we might improve it.