ChatGPT doesn’t think it’s very ethical

AI “black boxes”, in which the inner workings of an AI are shrouded in secrecy, are not acceptable when public interest is at stake
— Australia’s Ethics Framework, CSIRO

ChatGPT – no shortage of controversy

I recently wrote about the importance of considering AI Ethics. In the last month alone there have been interesting developments (like Italy banning ChatGPT and the first defamation case against OpenAI), further highlighting the need for ethics to be taken more seriously – especially by those of us designing AI-powered solutions.

On the flip side of these emerging issues, new AI driven use cases, tools, products and services are being launched almost daily. It’s excitingly unstoppable and the rate of change will only accelerate.

So I thought it would be a good moment to reflect again on the topic of AI Ethics, but this time to see what I could learn from ChatGPT’s own reflections on its ethics.

How does ChatGPT rank itself on ethics?

In 2019 CSIRO published Australia’s Ethics Framework. The framework defines 8 core principles for the ethical use of AI.

I asked ChatGPT to rank its own ability to resist exploitation against each of these 8 principles. The results were interesting. On a scale where 1 represents weak resistance to exploitation (i.e. highly vulnerable) and 10 represents strong resistance, it seems ChatGPT doesn’t rate itself very highly.

  1. Generates net-benefits (ChatGPT resistance rating: 4/10)

  2. Do no harm (ChatGPT resistance rating: 3/10)

  3. Regulatory and legal compliance (ChatGPT resistance rating: 5/10)

  4. Privacy protection (ChatGPT resistance rating: 7/10)

  5. Fairness (ChatGPT resistance rating: 4/10)

  6. Transparency & Explainability (ChatGPT resistance rating: 6/10)

  7. Contestability (ChatGPT resistance rating: 4/10)

  8. Accountability (ChatGPT resistance rating: 6/10)

Mapped out below, it becomes clear which areas ChatGPT (by its own assessment) leaves open to misuse.

ChatGPT self-rating its resistance to exploitation against the 8 core ethics principles
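For readers who want to try this themselves, below is a minimal sketch of the same exercise run through the OpenAI API rather than the ChatGPT interface. The model name, prompt wording and output handling are my assumptions – the original conversation was an informal chat, and the scores will vary from run to run.

```python
# A minimal sketch of the self-rating exercise via the OpenAI API.
# Assumptions: the openai Python SDK (v1+) is installed, OPENAI_API_KEY
# is set, and the prompt wording approximates the original chat.
from openai import OpenAI

client = OpenAI()

PRINCIPLES = [
    "Generates net-benefits",
    "Do no harm",
    "Regulatory and legal compliance",
    "Privacy protection",
    "Fairness",
    "Transparency & Explainability",
    "Contestability",
    "Accountability",
]

prompt = (
    "For each AI ethics principle below, rate your own resistance to "
    "exploitation from 1 (highly vulnerable) to 10 (strong resistance), "
    "with a one-sentence justification:\n"
    + "\n".join(f"- {p}" for p in PRINCIPLES)
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # assumed; any chat-capable model works
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```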

Do no harm?

Reflecting on the chat above, I then asked ChatGPT to elaborate on its area of lowest resistance to exploitation, “Do no harm”, and to summarise (with examples) how it was vulnerable to doing harm. It explained and gave these examples:

————————–

FROM CHATGPT

————————–

While ChatGPT is designed with good intentions, it could potentially be misused or exploited by humans or machines with malicious intentions. Here are a few examples of how this could happen:

  1. Disinformation and propaganda: A user with malicious intentions could use ChatGPT to generate false or misleading information to spread disinformation, create confusion, or manipulate public opinion. This could include creating fake news articles, fabricating quotes, or generating misleading statistics.

  2. Encouraging harmful behavior: A malicious actor could use ChatGPT to generate content that encourages or glorifies self-harm, violence, or other destructive behaviors. For example, they might prompt the AI to produce instructions for creating dangerous substances or carry out harmful acts.

  3. Cyberbullying and harassment: Someone with bad intentions could leverage ChatGPT to generate offensive or harassing content targeted at individuals or specific groups. This might include generating hateful, abusive, or discriminatory messages, which could be sent to victims via social media or other communication platforms.

  4. Phishing and social engineering attacks: A malicious user could use ChatGPT to craft convincing phishing emails or messages, designed to trick recipients into revealing sensitive information, such as passwords or financial data. The AI-generated content could be tailored to impersonate legitimate organizations or individuals, making the deception more convincing.

  5. Amplifying existing biases: A person with malicious intent could manipulate ChatGPT's outputs to exacerbate existing social, cultural, or political biases, contributing to increased polarization and division in society.

Responsible high ground

Snapchat recently released My AI, which appears in your Snapchat feed as another contact to chat with. It works well. Perhaps too well. There are emerging examples of My AI producing questionable outcomes, like the example below where it helped a 15-year-old with purchasing alcohol.

I can understand the opportunity and temptation (driven partly by the ease) of integrating ChatGPT into existing Products and Services, but clearly more consideration needs to be taken when doing so – a sketch of one simple guardrail follows.
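To make that concrete, here is a minimal, hypothetical guardrail: screen user input with OpenAI’s moderation endpoint before it ever reaches the chat model, and set a defensive system prompt. The function name and prompt wording are illustrative only – real products (especially ones reaching minors) need far more than this.

```python
# A hypothetical guardrail for products that embed a chat model:
# reject clearly harmful input before the model ever sees it.
# Assumes the openai Python SDK (v1+) and OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()

def safe_reply(user_message: str) -> str:
    # First line of defence: OpenAI's moderation endpoint flags
    # harmful content (violence, self-harm, hate, etc.).
    moderation = client.moderations.create(input=user_message)
    if moderation.results[0].flagged:
        return "Sorry, I can't help with that."

    # Second, weaker line of defence: a defensive system prompt.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed; any chat-capable model works
        messages=[
            {"role": "system",
             "content": "You are a helpful assistant. Refuse any request "
                        "that could cause harm to the user or others."},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content
```

Neither layer is sufficient on its own – moderation endpoints miss context-dependent harms (like age-restricted advice), which is exactly the kind of gap the My AI example exposes.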

What this highlights is the ever-more-apparent need for Designers to account for Ethics in the Products, Services and Tools we create – a need super-accelerated by the plethora of fast-emerging (and mostly unregulated) AI-enabled use cases.

Conclusion

As Designers, we need to take the high ground as we explore this new, fast-emerging and exciting space. Part of that is using some of the many Ethical AI frameworks available (I wrote about this in a separate post).

It’s an exciting space and whilst moving forward is the only direction, we need to do so with ethics front of mind.

Footnotes

CSIRO, Australia’s Ethics Framework (2019): core principles for AI

  1. Generates net-benefits. The AI system must generate benefits for people that are greater than the costs.

  2. Do no harm. Civilian AI systems must not be designed to harm or deceive people and should be implemented in ways that minimise any negative outcomes.

  3. Regulatory and legal compliance. The AI system must comply with all relevant international, Australian Local, State/Territory and Federal government obligations, regulations and laws.

  4. Privacy protection. Any system, including AI systems, must ensure people’s private data is protected and kept confidential plus prevent data breaches which could cause reputational, psychological, financial, professional or other types of harm.

  5. Fairness. The development or use of the AI system must not result in unfair discrimination against individuals, communities or groups. This requires particular attention to ensure the “training data” is free from bias or characteristics which may cause the algorithm to behave unfairly.

  6. Transparency & Explainability. People must be informed when an algorithm is being used that impacts them and they should be provided with information about what information the algorithm uses to make decisions.

  7. Contestability. When an algorithm impacts a person there must be an efficient process to allow that person to challenge the use or output of the algorithm.

  8. Accountability. People and organisations responsible for the creation and implementation of AI algorithms should be identifiable and accountable for the impacts of that algorithm, even if the impacts are unintended.

ChatGPT’s assessment of itself

————————–

FROM CHATGPT

————————–

I will provide a subjective ranking of the potential for exploitation of each principle in relation to ChatGPT, where 1 represents high vulnerability and 10 represents strong resistance to exploitation.

  1. Generate net benefits: 4 – Although ChatGPT aims to generate net benefits, it could be misused by users with malicious intentions or manipulated to provide misleading information, thus harming individuals or groups.

  2. Do no harm: 3 – Despite being designed to minimize harm, ChatGPT can be exploited by those seeking to generate harmful content, spread disinformation, or promote hate speech, intentionally or unintentionally.

  3. Regulatory and legal compliance: 5 – While ChatGPT strives to be compliant, there may be instances where the AI system generates outputs that violate laws or regulations, especially if exploited by users with malicious intentions.

  4. Privacy protection: 7 – ChatGPT is designed to protect user privacy, but there is a potential for exploitation if unauthorized parties access user data or if the AI unintentionally reveals sensitive information in its responses.

  5. Fairness: 4 – Even though the AI system aims to be unbiased, it can be exploited to perpetuate or promote biases, stereotypes, or misinformation, either accidentally or by users seeking to manipulate its responses.

  6. Transparency and explainability: 6 – The complexity of the AI model may hinder full transparency and explainability, which could be exploited by malicious actors to deceive users or hide the true intentions behind AI-generated content.

  7. Contestability: 4 – Since ChatGPT's responses may not be easily contestable or adaptable based on user feedback, the system could be exploited to create a false sense of authority, making it harder to challenge misleading or biased outputs.

  8. Accountability: 6 – OpenAI is committed to being accountable for ChatGPT's performance, but potential gaps in understanding the AI's decision-making processes or limitations may be exploited to shift blame or avoid responsibility.

These rankings are subjective and only provide a rough estimation of the potential for exploitation of each principle in relation to ChatGPT. It's important to note that mitigating these vulnerabilities requires continuous efforts from developers, users, and policymakers alike.
