The risks
IP infringement
GenAI tools are built on large language models (LLMs), which are trained on huge amounts of data, some of which may be protected by IP rights. Scraping this data and using it to train AI and generate outputs could constitute IP infringement. (In many countries, you can be liable for infringement even if you did not infringe knowingly.)
Hallucination
This is where a GenAI tool generates an inaccurate or fabricated output and presents it as fact – which is much more common than many organizations realize. (For example, 2023 research found that chatbots hallucinated in more than a quarter of the outputs they generated.)
Bias
GenAI tools are perpetuating gender and race discrimination, among other types of prejudice, in fields ranging from healthcare to finance and law enforcement. It’s easy to assume all outputs are the product of a reasoned response – but all they do is analyze the data they’re given for recognizable patterns and, where it includes bias, reflect it.
Preventative actions
IP infringement
When using GenAI tools, be as alert to IP issues as developers are. Does the output pose potential IP infringement issues, or expose you to data privacy risks?
Hallucination
Question the responses that GenAI generates to identify hallucinations, and train staff to spot and test for them.
Bias
Be vigilant about the data used to train the underlying LLM. Ask what it is, whom it references, and what biases and inaccuracies it may contain – and be skeptical about the quality of the end product.
What good data governance looks like
Whoever in your workforce is going to use GenAI tools needs to know what good data governance looks like:
- Ensure they know how to verify outputs and how to report suspect results so the model can be fine-tuned.
- Put guardrails in place to ensure GenAI is only used for intended tasks.
- Check that the data is relevant to the objectives the organization is pursuing and that it is dependable, diverse, and balanced.
- Decide which risks the organization is comfortable with (potentially a task for the governance committee) and exclude data that could exceed these boundaries.
- Avoid exposing the organization to additional risk if a GenAI use case is of limited value.
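
To make the guardrail point concrete, here is a minimal sketch in Python of one possible approach: an allowlist that only lets approved tasks through and records everything else for review. It assumes the organization maintains its own list of intended tasks. The task names, `APPROVED_TASKS`, `GuardrailLog`, and `check_request` are all hypothetical names invented for this illustration, not part of any real GenAI product.

```python
from dataclasses import dataclass, field

# Hypothetical task names; a real list would come from the
# organization's own governance policy.
APPROVED_TASKS = {"summarize_internal_report", "draft_marketing_copy"}


@dataclass
class GuardrailLog:
    """Collects refused requests so they can be reviewed later."""
    refused: list = field(default_factory=list)


def check_request(task: str, log: GuardrailLog) -> bool:
    """Return True only if the task is on the approved list.

    Anything else is refused and recorded in the log.
    """
    if task in APPROVED_TASKS:
        return True
    log.refused.append(task)
    return False


log = GuardrailLog()
check_request("summarize_internal_report", log)  # approved task
check_request("screen_job_applicants", log)      # not approved, so logged
```

A check like this would sit in front of the GenAI tool rather than inside it, and the log of refused requests could feed the reporting process described in the list above.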