Generative AI is now woven into our daily professional and personal routines. These assistants answer questions in an instant, draft texts, summarize documents, and save us a considerable amount of time. Yet they can also fall prey to “hallucinations” and may exhibit security vulnerabilities that jeopardize confidential information.
How can we harness these tools while securing their use? Here are some risks and potential solutions.
Cyber threats linked to LLMs
Prompt injection: manipulating instructions to bypass safeguards and access sensitive information. A malicious user can phrase a question or directive in a way that circumvents the model’s built-in guardrails and obtains confidential data. For instance, an employee or third party could, through a sequence of deceptive sentences, coax the assistant into divulging internal sensitive data or trade secrets. More often than not, this manipulation does not require direct access to critical systems, making the risk particularly worrisome.
Data leakage: unintentional disclosure of confidential or strategic information. When an assistant generates responses from internal documents or knowledge bases, there is a risk that confidential information is reformulated and inadvertently shared with unauthorized users. This vulnerability becomes critical in regulated sectors like finance, healthcare, or defense, where even a single disclosure can have heavy financial and legal consequences.
Abuse of APIs and connected systems: hijacking the LLM’s ability to act on internal systems. AI assistants don’t just answer questions—they can interact with systems via APIs or automate certain tasks. If these access points are not properly secured, a malicious actor could exploit the model to perform unauthorized actions, such as altering configurations, deleting data, or triggering sensitive processes.
Exploitably hallucinations: spreading false information that can mislead users or serve as a phishing vector. Language models can “hallucinate,” meaning they generate false but plausible information. If these are not identified, they can mislead employees, skew strategic decisions, or provide a basis for sophisticated phishing attacks. An automatically generated email containing false financial instructions could persuade an employee to transfer funds to a fraudulent third party.
Data poisoning: tampering with models via malicious data to bias their responses. By feeding the model with malicious or biased information, an external actor can alter its responses and influence its behavior. Over time, this can lead to degraded decision quality, incorrect recommendations, or vulnerabilities that can be exploited to harm the business.
Securing the use of generative AI
Isolation and privilege control: strictly limit access to critical systems and data. AI assistants should be isolated from sensitive systems and granted only the minimal access needed for their functions. Limiting their scope reduces the risk of abuse and makes it much harder to exploit weaknesses to reach strategic information.
Filtering and validation of inputs and outputs: detect and block malicious prompts and dangerous responses. To prevent prompt injections and accidental data leaks, it is essential to filter requests sent to the AI and verify the relevance of responses before dissemination. Automated control and validation mechanisms, combined with business rules, help reduce the risk of executing malicious instructions.
Human oversight and business safeguards: validation of critical actions and explicit rules on permitted use. Human review of critical AI actions and the establishment of explicit usage policies ensure that AI remains a tool of assistance rather than an autonomous actor capable of causing losses or incidents. The synergy of human and artificial intelligence is a crucial guardrail.
Logging and monitoring: tracking interactions to quickly identify unusual usage or attack attempts. Traceability is essential. By recording and monitoring all interactions with the AI, organizations can quickly spot suspicious behavior, analyze fraud attempts, and respond before incidents become critical. Real-time monitoring also helps identify usage trends that could indicate vulnerabilities.
Dedicated security testing: audits and simulations to detect vulnerabilities before they’re exploited. Attack simulations and risk assessments help identify flaws before malicious actors exploit them, ensuring that AI deployment remains a productivity lever rather than a source of financial losses.
*Adrien Gendre is Chief Product Officer at Hornetsecurity