IBM VEST Workshops

AI Risk Atlas

With the advent and exponential growth of AI, specifically generative AI through the use of LLMs (Large Language Models), associated risks have grown at a similar pace. Organizations basing decisions on unfair or biased results can face audits, fines, costly litigation and reputational damage.

It has become more important than ever for organizations to take a proactive approach to detecting and mitigating risks, monitoring for fairness, bias and drift, and tracking new generative LLM metrics as the field continues to expand.

Below is an example of a few of the AI risks, split into 3 common areas and further categorized as Traditional AI risks (apply to traditional models as well as generative AI), Risks amplified by generative AI (might also apply to traditional models), and New risks specifically associated with generative AI. A full detailed list of the various risks, along with the associated explanations and examples, can be found on the AI Risk Atlas page.

Risks associated with Input

Data bias

The biases present in historical data can lead to unfair treatment of certain groups in the model's predictions. These biases can result from societal prejudices and historical injustices, which are often reflected in the data used to train AI models. Addressing these biases requires careful consideration of the data sources and the development of models that are robust to potential sources of bias.
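
For a concrete sense of what such a check can look like, the sketch below computes a simple disparate impact ratio between two groups in a small, hypothetical labelled dataset; the column names, values, and 0.8 threshold are illustrative assumptions rather than prescriptions from the Risk Atlas.

```python
import pandas as pd

# Hypothetical training data: a protected attribute ("group") and a binary
# outcome label ("approved"). Column names and values are illustrative only.
df = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,   0,   1,   1,   0,   0,   1,   0],
})

# Rate of favourable outcomes per group in the historical data.
rates = df.groupby("group")["approved"].mean()

# Disparate impact ratio: unprivileged group rate divided by privileged group rate.
# A common (but context-dependent) rule of thumb flags ratios below 0.8.
ratio = rates["B"] / rates["A"]
print(f"Favourable-outcome rates:\n{rates}\n")
print(f"Disparate impact ratio (B vs. A): {ratio:.2f}")
if ratio < 0.8:
    print("Potential data bias: group B receives favourable outcomes far less often.")
```

A screening step like this only flags a symptom; understanding where the imbalance in the source data comes from still requires human review.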

Prompt injection risk for AI

Prompt injection attacks are a type of security exploit where an attacker manipulates the input prompt provided to a model to cause it to generate incorrect or unexpected outputs. These attacks can take advantage of the model's structure, biases, or learned patterns, and can be particularly dangerous in models used for safety-critical applications. To mitigate the risk of prompt injection attacks, it is essential to implement proper input validation and sanitization techniques, as well as to ensure that the model's training data and prompts are thoroughly inspected and secured.
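
As a minimal sketch of what input screening might look like, the example below rejects user text that matches a small, assumed deny-list of common injection phrases before it is placed into a prompt template. The patterns, delimiters, and function name are illustrative assumptions, and a production system would need layered defences beyond simple pattern matching.

```python
import re

# Illustrative deny-list of phrases commonly seen in prompt injection attempts.
SUSPICIOUS_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"disregard the system prompt",
    r"you are now .* (unrestricted|jailbroken)",
]

def screen_user_input(user_text: str) -> str:
    """Reject or neutralise obviously suspicious user input before it is
    embedded into a prompt template. A sketch, not a complete defence."""
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, user_text, flags=re.IGNORECASE):
            raise ValueError("Input rejected: possible prompt injection attempt.")
    # Strip the delimiter sequence so the text cannot break out of its section.
    return user_text.replace("---", "").strip()

# Example usage with a hypothetical prompt template.
safe_text = screen_user_input("Summarise this article about supply chains.")
prompt = f"Summarise the following user text:\n---\n{safe_text}\n---"
print(prompt)
```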

Jailbreaking risk for AI

Jailbreaking in AI and LLMs refers to an attack that attempts to bypass the boundaries set for the model, manipulating or exploiting its capabilities beyond its intended design. This unauthorized action can lead to unintended consequences and potentially compromise the model's integrity and security.

Risks associated with Output

Copyright infringement risk for AI

To avoid potential copyright infringement, it is crucial for content generation models to produce unique and original content. This ensures that the generated material does not resemble existing works too closely, thus respecting the intellectual property rights of creators and avoiding legal repercussions. By adhering to this principle, model developers can help maintain the integrity of the content creation process and promote innovation in the field.

Hallucination risk for AI

LLMs can generate false or misleading information, which can have serious consequences. This can include causing harm to individuals, damaging reputations, and even leading to legal repercussions. It's crucial for developers and users of LLMs to ensure that the content generated is accurate and reliable.

Spreading disinformation risk for AI

Malicious actors can exploit AI-powered generation models to create and disseminate misleading or false content, including sophisticated forgeries known as deepfakes. These forgeries can be designed to impersonate individuals or entities, making it challenging to distinguish between real and fake content. Deepfakes can be used to manipulate public opinion, spread disinformation, and undermine trust in various domains, such as politics, entertainment, and journalism. To mitigate this risk, it is crucial to develop robust detection and verification techniques and to promote digital literacy so that people can identify and avoid falling for deceptive media.

Non-technical Risks

Lack of data transparency

Documenting the data used for model training is crucial for ensuring compliance with legal requirements, such as demonstrating the representativeness of the data and outlining the expected behavior of the model. Insufficient documentation can lead to challenges in explaining the rationale behind model decisions and may result in legal liabilities.
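
One lightweight way to capture such documentation is to record a structured fact sheet alongside each training dataset. The sketch below shows a hypothetical example written out as JSON; the field names and values are chosen for illustration and are not taken from any formal template.

```python
import json
from datetime import date

# A minimal, illustrative "dataset fact sheet" recorded alongside model training.
# Real governance programmes typically follow a formal template such as
# datasheets for datasets or model cards.
dataset_facts = {
    "name": "customer_churn_2023",            # hypothetical dataset name
    "collected_on": str(date(2023, 6, 30)),
    "source": "internal CRM export",
    "intended_use": "churn-propensity model training",
    "known_limitations": ["under-represents customers under 25"],
    "licence": "internal use only",
}

# Persist the record so it can be reviewed during audits or compliance checks.
with open("dataset_facts.json", "w") as f:
    json.dump(dataset_facts, f, indent=2)
print(json.dumps(dataset_facts, indent=2))
```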

Legal accountability

Legal accountability risk for AI refers to the challenge of identifying and holding the appropriate party responsible for the actions and decisions made by artificial intelligence systems, particularly when those systems cause harm or make mistakes. Users of models without clear ownership may face challenges complying with future AI regulations.

Plagiarism

Plagiarism, whether intentional or unintentional, involves the use of AI models to replicate or mimic the style, structure, and content of existing works without proper attribution or citation. This can lead to issues related to originality, credibility, and intellectual property rights. It is essential to develop and utilize AI models responsibly, ensuring that they are designed to promote creativity and originality while respecting the rights and original works of others.