Practical Measures for AI and LLM Security: Securing the Future for Enterprises
Artificial intelligence (AI) and large language models (LLMs) have transitioned from novel innovations to essential tools in enterprise environments. Yet with AI technologies constantly advancing and new security issues emerging, companies face a considerable learning curve as they steer through a sea of conflicting information and popular myths around AI.
Bishop Fox hosted a fireside chat with Damián Hasse, chief information security officer at Moveworks; Emily Choi-Greene, security and privacy engineer at Moveworks; and Rob Ragan, principal researcher at Bishop Fox. The Pragmatic AI & LLM Security Mitigations for Enterprises webcast highlights the significant advancements, challenges, and security considerations for enterprises today, with pragmatic advice and real-life examples shared by industry leaders.
Explore the webcast highlights in this blog, and don’t miss the opportunity to view it on demand.
The Evolution of AI in Enterprise Solutions
The use of AI in enterprise technology has evolved tremendously from rudimentary chatbots to sophisticated, flexible solutions. Early bots, limited by rigid programming — think airport check-in kiosks with “pick A, B, or C” functionality — have given way to advanced systems capable of understanding and acting on complex queries.
Earlier machine learning (ML) tools were discriminative models, meaning they take an input and classify it according to a set of defined options. Classifications can be based on user intent (what is the user trying to do?), binary answers, or key entities identified in a user’s sentence (e.g., “paycheck” in “I’m looking for my paycheck”).
Generative ML models go several groundbreaking steps further. Unlike discriminative models, they understand language and use it to generate brand-new text, such as a summary, rephrasing, or synthesis.
Discriminative and generative models are not mutually exclusive; enterprises can use them together, for example when analyzing customer questionnaires to help a sales team ask more targeted follow-up questions. Companies can extract data according to defined criteria (discriminative) and generate a synthesized overview (generative), including citations that allow viewing of the source data.
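To make the distinction concrete, here is a minimal sketch that pairs a small discriminative intent classifier (using scikit-learn) with a placeholder generative summarization step. The training examples and the `generate_summary` helper are illustrative assumptions, not any vendor’s actual pipeline.

```python
# Minimal sketch, assuming scikit-learn is installed. The generative step is a
# placeholder; a real system would call whichever LLM client the organization uses.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Discriminative: classify a user question into one of a fixed set of intents.
training_texts = [
    "I'm looking for my paycheck",
    "When do I get paid?",
    "How do I reset my password?",
    "My laptop won't connect to VPN",
]
training_intents = ["payroll", "payroll", "it_support", "it_support"]

intent_classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
intent_classifier.fit(training_texts, training_intents)

def classify_intent(question: str) -> str:
    """Discriminative step: pick one label from a predefined set."""
    return intent_classifier.predict([question])[0]

def generate_summary(question: str, source_docs: list[str]) -> str:
    """Generative step (hypothetical): synthesize new text from retrieved
    sources, keeping numbered citations back to the source data."""
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(source_docs, 1))
    return f"(LLM-generated summary for: {question})\n{context}"

question = "I'm looking for my paycheck"
print(classify_intent(question))
print(generate_summary(question, ["Payroll runs on the 15th and last day of the month."]))
```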
AI has huge potential to revolutionize enterprise operations, yet integrating these technologies is fraught with challenges. Common issues hindering the adoption and effectiveness of AI solutions include availability, cost, and latency.
A recommended approach to overcoming these hurdles includes actively managing user expectations. For example, to address latency issues, an AI solution can give users real-time feedback on the steps being taken to respond to their request. Automated status updates such as “I’m searching our knowledge base” show users that the solution is processing a request even if results do not populate immediately.
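A minimal sketch of that pattern is shown below, assuming a chat-style frontend that can render messages incrementally; the function names and timings are illustrative, not any product’s actual API.

```python
# Minimal sketch: stream interim status messages while a slow AI request runs,
# so the user sees progress instead of a silent wait. Names are illustrative.
import asyncio

async def answer_with_status(question: str):
    yield "I'm searching our knowledge base..."
    await asyncio.sleep(1.0)          # stands in for retrieval latency
    yield "Summarizing the most relevant articles..."
    await asyncio.sleep(1.5)          # stands in for LLM generation latency
    yield f"Here's what I found about: {question}"

async def main():
    async for message in answer_with_status("Where is my paycheck?"):
        print(message)                # in practice, push each update to the chat UI

asyncio.run(main())
```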
Navigating AI Security Risks
Significant security risks associated with AI models include data poisoning, overfitting, and data leakage.
Data poisoning
Traditionally, data poisoning involves information entering a training data set that skews the way the model behaves. But given the vast size of the data sets that large language models are trained on, this type of data poisoning is relatively uncommon in LLMs.
What occurs more often, however, is ML models surfacing information from internal data sources that have been accidentally poisoned by incorrect or outdated data. While this can lead to errors, it has some unexpected benefits, such as helping customers identify knowledge base articles that need updating.
Overfitting
Overfitting restricts a model’s ability to generalize from its training data. It can occur when the training data set is heavily biased toward specific examples, or when the evaluation and training data are too similar.
As with data poisoning, overfitting is less of a risk with LLMs due to the sheer size of their training data sets. A recommended strategy to mitigate both these risks involves using diverse datasets and drawing on trusted sources, ensuring models are both accurate and resilient.
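One simple, widely used safeguard is to hold out evaluation data that the model never trains on and watch the gap between training and test performance. The sketch below uses synthetic data and scikit-learn purely for illustration.

```python
# Minimal sketch: keep a separate evaluation set and compare train vs. test
# accuracy; a large gap is a common sign of overfitting. Data here is synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = DecisionTreeClassifier(max_depth=None, random_state=0)  # prone to memorizing
model.fit(X_train, y_train)

print("train accuracy:", model.score(X_train, y_train))
print("test accuracy: ", model.score(X_test, y_test))  # much lower => likely overfitting
```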
Data leakage
Data leakage threats involve unintended information disclosure. Masking data, applying differential privacy, and separating usage data from training data all help curb the risk, although vigilance remains key as the landscape changes rapidly. Data leakage can also occur through flawed processes that involve AI but are not caused by it.
Our experts noted how critical it is to put controls in place that prevent an AI model from making security decisions for itself or for other AI models. This helps avoid confused deputy attacks, which occur when an entity manipulates a higher-privileged entity into performing unauthorized actions.
Moveworks tackles this problem by preventing the AI/ML models behind its product from determining who a user is or what actions they can take. Those decisions are instead driven by user identities and their associated entities, minimizing the chance of the AI models making choices that introduce security risks.
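As a rough illustration of that principle, the sketch below enforces authorization in application code based on the caller’s verified identity, regardless of what action the model suggests. The permission map, user IDs, and action names are hypothetical; this is not Moveworks’ implementation.

```python
# Minimal sketch: the application, not the model, decides what a user may do.
# The permission map and action names are illustrative, not a real product's.
ALLOWED_ACTIONS = {
    "alice@example.com": {"view_paystub", "open_ticket"},
    "bob@example.com": {"open_ticket"},
}

def execute_model_action(user_id: str, requested_action: str) -> str:
    """Run an action the AI model suggested, but only after an authorization
    check based on the caller's verified identity."""
    if requested_action not in ALLOWED_ACTIONS.get(user_id, set()):
        return f"Denied: {user_id} is not authorized for {requested_action}"
    return f"Executing {requested_action} for {user_id}"

# Even if a prompt injection convinces the model to request a privileged action,
# the check above refuses it for an unauthorized user.
print(execute_model_action("bob@example.com", "view_paystub"))
```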
Differential Privacy: A Closer Look
Differential privacy stands at the forefront of data protection techniques in AI, providing a framework for using data in model training without compromising individual privacy.
Put simply, differential privacy means giving someone plausible deniability that their data was included in a training data set. Adding a controlled amount of random noise to the data set means there is no way to attribute data to a specific individual or group, yet the model can still generalize well.
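A minimal sketch of the idea, using the Laplace mechanism on a simple counting query; the epsilon value and data are illustrative, and real deployments tune the noise to the query’s sensitivity and privacy budget.

```python
# Minimal sketch of the Laplace mechanism: add calibrated random noise to an
# aggregate query so no single person's record can be confidently inferred.
import numpy as np

rng = np.random.default_rng()

def noisy_count(records: list[bool], epsilon: float = 1.0) -> float:
    """Return a differentially private count of True records.
    A count has sensitivity 1, so the noise scale is 1 / epsilon."""
    true_count = sum(records)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Whether or not one individual's record is included, the released value looks
# statistically similar, giving that person plausible deniability.
opted_in = [True, False, True, True, False, True]
print(noisy_count(opted_in, epsilon=0.5))
```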
This proactive stance is pivotal for companies wanting to leverage AI and LLMs without compromising data integrity.
Enterprise Considerations for Adopting AI
The decision between developing in-house AI capabilities and partnering with third-party AI vendors is critical for enterprises. This choice hinges on a thorough understanding of the security, data handling, and privacy policies associated with AI technologies.
Building an internal AI program is a considerable undertaking, requiring significant dedicated resources to ensure secure infrastructure, development, and deployment. Using third parties outsources these efforts but demands careful vendor selection and management. CISOs must evaluate vendors’ data handling, access controls, deletion processes, audit capabilities, and compliance.
While on-premises options are sometimes seen as more secure than cloud-based solutions, this is not necessarily the case. Whatever setup is chosen for an organization, proper threat modeling remains essential for a strong security posture.
Our experts highlight the value of drawing comparisons between established web security concepts and AI security to help security teams connect their existing knowledge to this new space. For example, showing how a confused deputy attack is like server-side request forgery (SSRF) offers a sensible analogy for security teams seeking to adopt or build AI technology.
Conclusion
As AI continues to shape the future of enterprise technology, taking a pragmatic, informed approach is key to leveraging its potential while minimizing security risks. It is imperative that we remain vigilant, leveraging lessons learned to foster a secure and sustainable AI-driven future for enterprises worldwide.
To dive deeper into this topic and learn more about navigating the complex terrain of AI adoption, make sure to listen to the fireside chat.