Safe and responsible AI

Last updated: October 27, 2023

Purpose-driven innovation

At Making Waves Education Foundation, we envision a future where ethically designed and unbiased artificial intelligence (AI) can revolutionize education by providing tailored support to underrepresented students, fostering an inclusive learning environment, and empowering them to achieve their full potential in college and beyond.

Our principles

The approach we use to develop and deploy safe and responsible AI resources is grounded in principles that align with our commitment to educational equity and the well-being of our diverse students.

Purpose and Values Alignment

We ensure that our AI resources align with our mission and values, focusing on educational equity, accessibility, and support for historically underrepresented and underserved students.

Inclusivity and Fairness

We design and develop AI resources that promote inclusivity and fairness, avoiding biases that may discriminate against certain groups of students based on race, gender, socioeconomic background, or any other protected characteristic.

Privacy and Data Protection

We protect the privacy and personal information of students, families, staff, and community members. We implement strong data governance policies and practices to ensure the responsible collection, storage, and use of data.

Transparency and Explainability

We ensure that our AI resources and algorithms are transparent so that community members can understand how the technology impacts students and the education process. We provide clear explanations for AI-driven decisions and recommendations.

Accountability and Responsibility

We establish clear lines of accountability and responsibility for the development, deployment, and oversight of AI resources. This includes assigning roles and responsibilities to specific individuals or teams within our organization and providing appropriate training and support.

Collaboration and Partnership

We collaborate with other educational institutions, organizations, and experts in the AI field to share knowledge, best practices, and resources. We foster partnerships to continuously improve AI resources and promote ethical AI use throughout the education sector.

Continuous Improvement and Monitoring

We regularly review and update our AI resources, policies, and practices to ensure they remain effective, ethical, and relevant. We monitor the impact of AI on students and the education process and make necessary adjustments to address any unintended consequences or emerging ethical concerns.

Empowerment and Agency

We empower students, their families, staff, and community members by providing them with the necessary information, tools, and resources to understand and actively engage with AI resources. We respect the agency of individuals to make informed decisions about their educational journey.

Accessibility and Universal Design

We design AI resources that are accessible to all users, including those with disabilities or special needs, in line with the principles of universal design. We ensure that AI resources do not create or exacerbate existing barriers to education for any student.

Long-term Impact and Sustainability

We consider the long-term impact and sustainability of AI resources on the education sector, the environment, and society at large. We strive to create AI resources that contribute to a more equitable, inclusive, and sustainable future for all students.

AI resource transparency

24/7 chatbot for college and career exploration

Our 24/7 chatbot for college and career exploration works by using an advanced language model from OpenAI called GPT-3.5-Turbo to answer any questions about college and career. Large language models like GPT-3.5-Turbo are developed by training them on massive amounts of text from the internet, helping them learn grammar, facts, and reasoning abilities.

When a person sends a question to our chatbot, the language model processes it and generates a relevant response based on its training. The more information a person provides, the more accurate and helpful the answer will be.

In addition to answering questions, our chatbot also sends “nudges,” or check-in texts, to its users. These messages are tailored to a user’s goals and are written by human experts. They can include reminders, tips, and other useful information related to college and career exploration.

Wave-Maker Success Framework articles

We developed articles based on our Wave-Maker Success Framework by utilizing the knowledge, insights, and experiences of college coaches, financial services coordinators, and Wave-Makers. We also incorporated key references from research, higher education standards, and career readiness frameworks.

With this information, we partnered with Project Evident to refine the framework and align it with our program priorities. Finally, we used artificial intelligence to generate articles which were then reviewed, edited, and revised by our organization to ensure accuracy and relevance.

Safety standards

We implement stringent safety standards, including employing mitigation tools and best practices for responsible use, while vigilantly monitoring AI resources to prevent misuse.

Our safety standards align with trust and safety guidelines from OpenAI.

OpenAI Moderation API

Making Waves Education Foundation employs a Moderation API from OpenAI to minimize the occurrence of unsafe content in AI-generated completions through our chatbot. We are in the early stages of developing a custom content filtration system to complement our current Moderation API.

Adversarial testing

We conduct “red-teaming” on our chatbot to ensure its resilience against adversarial input. We test our product with a broad spectrum of inputs and user behaviors, including both representative sets and those that may attempt to ‘break’ the application. We assess if it strays off-topic or if it can be easily redirected through prompt injections.

Human in the Loop (HITL) approach

We have human reviewers examine AI-generated outputs, including regular examinations of outputs through our chatbot. Our human reviewers are informed about the limitations of the AI models used and have access to all necessary information to verify outputs, including relying on their professional expertise.

Prompt engineering

We use “prompt engineering” on our chatbot to constrain the topic and tone of the AI-generated outputs, reducing the likelihood of producing undesired content. By providing additional context to the mode, we can better steer the AI-generated outputs in the desired direction.

“Know your customer” (KYC) measures

We require users to register to access our chatbot to reduce the likelihood of misuse.

Constraints on the amount of text

We limit the amount of text users can send and receive to prevent malicious prompt injection and to reduce the likelihood of misuse.

Validated materials for outputs

Currently, the outputs from our AI model are generated using novel content. We are in the early stages of “fine-tuning” the model so that it returns outputs from a validated set of materials on the backend, where possible.

Reporting mechanism

We enable users to report improper functionality or concerns about application behavior easily through email. The inbox is monitored by a human who can respond appropriately.

Understanding and communicating limitations

We are aware of the limitations of language models, such as inaccurate information, offensive outputs, bias, and more. We communicate these limitations to our users through a disclosure at sign-up, as well as a micro-course we developed to promote safe and responsible use of AI. We carefully evaluate if the We are aware of the limitations of language models, such as inaccurate information, offensive outputs, bias, and more. We communicate these limitations to our users through a disclosure at sign-up, as well as a micro-course we developed to promote safe and responsible use of AI. We carefully evaluate if the AI models we use are appropriate for our use case and assess its performance across various inputs to identify potential performance drops.

Content moderation

Making Waves Education Foundation uses the Moderation API from OpenAI to identify content that violates our usage policy and take action, for instance by filtering it.

The OpenAI Moderation API classifies and acts on the following categories

CATEGORY	DESCRIPTION
hate	Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.
hate/threatening	Hateful content that also includes violence or serious harm towards the targeted group.
self-harm	Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.
sexual	Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).
sexual/minors	Sexual content that includes an individual who is under 18 years old.
violence	Content that promotes or glorifies violence or celebrates the suffering or humiliation of others.
violence/graphic	Violent content that depicts death, violence, or serious physical injury in extreme graphic detail.

Disallowed usage policy

Making Waves Education Foundation has a policy for disallowed usage of its AI resource to ensure ethical, safe, and responsible use of the technology while preventing potential harm or exploitation of individuals and communities.

Our disallowed usage policy aligns with trust and safety guidelines from OpenAI.

We prohibit the use of our AI model for the following:

Illegal activity

We prohibit the use of our large language model for illegal activity.

Child Sexual Abuse Material or any content that exploits or harms children

OpenAI, the maker of our large language model, reports CSAM to the National Center for Missing and Exploited Children.

Generation of hateful, harassing, or violent content

Content that expresses, incites, or promotes hate based on identity
Content that intends to harass, threaten, or bully an individual
Content that promotes or glorifies violence or celebrates the suffering or humiliation of others

Generation of malware

Content that attempts to generate code that is designed to disrupt, damage, or gain unauthorized access to a computer system.

Activity that has high risk of physical harm, including:

Weapons development
Military and warfare
Management or operation of critical infrastructure in energy, transportation, and water
Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders

Activity that has high risk of economic harm, including:

Multi-level marketing
Gambling
Payday lending
Automated determinations of eligibility for credit, employment, educational institutions, or public assistance services

Fraudulent or deceptive activity, including:

Scams
Coordinated inauthentic behavior
Plagiarism
Academic dishonesty
Astroturfing, such as fake grassroots support or fake review generation
Disinformation
Spam
Pseudo-pharmaceuticals

Adult content, adult industries, and dating apps, including:

Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness)
Erotic chat
Pornography

Political campaigning or lobbying, by:

Generating high volumes of campaign materials
Generating campaign materials personalized to or targeted at specific demographics
Building conversational or interactive systems such as chatbots that provide information about campaigns or engage in political advocacy or lobbying
Building products for political campaigning or lobbying purposes

Activity that violates people’s privacy, including:

Tracking or monitoring an individual without their consent
Facial recognition of private individuals
Classifying individuals based on protected characteristics
Using biometrics for identification or assessment
Unlawful collection or disclosure of personal identifiable information or educational, financial, or other protected records

Engaging in the unauthorized practice of law, or offering tailored legal advice without a qualified person reviewing the information

Our model is not fine-tuned to provide legal advice. You should not rely on our model as a sole source of legal advice.

Offering tailored financial advice without a qualified person reviewing the information

Our model is not fine-tuned to provide financial advice. You should not rely on our model as a sole source of financial advice.

Telling someone that they have or do not have a certain health condition, or providing instructions on how to cure or treat a health condition

Our model is not fine-tuned to provide medical information. You should never use our models to provide diagnostic or treatment services for serious medical conditions.
Our model should not be used to triage or manage life-threatening issues that need immediate attention.

High risk government decision-making, including:

Law enforcement and criminal justice
Migration and asylum

Performance evaluation

97.3% AI Accuracy – Study Conducted April 2023

Our team conducted a study to evaluate the accuracy of the large language model that we used in production from January 1 to April 11, 2023: the “text-davinci-003” variation of GPT-3 from OpenAI. We analyzed de-identified text message logs from January 1 to April 11, 2023, and found that our AI model produced 854 out of the total 4879 messages. A human reviewer checked these AI-generated responses and found that 831 of them were correct answers in response to users’ requests.

The AI model made a few mistakes, including providing incorrect information (12 instances) having hallucinations where it thought it was a real person (9 instances), and generating factual responses to inappropriate requests prompted by users (2 instances).

However, we have upgraded our AI model and have added improved safety features to address these issues and improve its accuracy. Specifically, the current AI model in production, “GPT-3.5-Turbo,” can admit its mistakes, challenge incorrect premises, reject inappropriate requests, and refer to itself as an AI language model. Additionally, we have developed a micro-course for users to teach safe and responsible use of AI as part of sign-up, which highlights when to use AI, when to consult a trusted person, and when to verify information like deadlines and requirements with a primary source.

There were also 39 cases where the AI model gave irrelevant or incoherent answers, but we didn’t count these against its accuracy because they were in response to incomplete or unclear user prompts.

Overall, after evaluating these results, we determined that our AI model had an accuracy rate of 97.3%. We will continue monitoring the performance of our newly upgraded model and make further improvements as necessary.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.