AI and data security: Managing and protecting against risk
AI can enable powerful new user experiences, but what are the data security and privacy risks of using public Large Language Models like ChatGPT?
Since the public launch of ChatGPT in late 2022, use of generative AI and Large Language Models (LLMs) has exploded. As organisations look to incorporate generative AI into their processes to boost productivity, it’s natural for concerns around data security and privacy to come up.
Public models like ChatGPT do come with some inherent risk, but there are ways for companies to utilise generative AI technology without exposing sensitive data, and to place restrictions on the content fed into LLMs.
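To make the idea of restricting what reaches an LLM concrete, here is a minimal sketch of a pre-send guard. It is illustrative only: the patterns and function names are hypothetical examples rather than a production policy, which would need to cover far more cases.

```python
import re

# Hypothetical example patterns; a real policy would cover many more cases
# (client identifiers, account numbers, source-code markers, and so on).
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{3}\b"),             # TFN-like nine-digit numbers
    re.compile(r"\bBSB[:\s]*\d{3}-?\d{3}\b", re.IGNORECASE),  # Australian BSB codes
    re.compile(r"\bconfidential\b", re.IGNORECASE),           # simple keyword flag
]

def contains_sensitive_data(text: str) -> bool:
    """Return True if the text matches any blocked pattern."""
    return any(pattern.search(text) for pattern in SENSITIVE_PATTERNS)

def guard_prompt(text: str) -> str:
    """Raise before text that looks sensitive is sent to an external LLM."""
    if contains_sensitive_data(text):
        raise ValueError("Prompt blocked: possible proprietary or personal data detected.")
    return text
```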
AI data risks and security concerns
Data is one of the most valuable assets of a modern business, and routinely sharing proprietary data with public models such as ChatGPT is a risk that many organisations simply cannot take.
The use of generative AI tools like ChatGPT in the workplace poses a significant risk of data leaks. Samsung highlighted this when it banned such platforms after an engineer accidentally uploaded sensitive internal source code to ChatGPT, exposing proprietary information outside the organisation.
Data shared with the vast majority of generative AI tools is stored on external servers operated by companies like OpenAI, Microsoft, or Google, with no way to retrieve or delete it. This creates long-term vulnerabilities, as leaked data can resurface in responses to other users or be used to train AI models further.
Such breaches compromise intellectual property, leading to financial losses and reputational damage. Samsung’s case underscores concerns that confidential information—from trade secrets to client data—might inadvertently enter public domains. Major companies including JPMorgan Chase and Amazon have also restricted AI chatbot use, fearing regulatory penalties if sensitive data is mishandled.
Privacy concerns with major generative AI products
There are legitimate privacy concerns with LLM-based products such as ChatGPT, as their providers have a strong interest in harvesting users’ data to continually train their models. This is part of the reason ChatGPT subscriptions are offered at price points lower than would otherwise be profitable.
Best practice tips for AI and data security: How to protect sensitive data in the AI era
- Ensure your organisation is not sharing any proprietary data with public models such as ChatGPT. While these tools are compelling, data submitted to them cannot be retrieved or removed under their terms of service and may resurface in responses to other users.
- Avoid implementing AI internally unless you have a specialised team with strong expertise in AI systems. Building safe and reliable AI agents is demanding, as the technology is new and not yet widely understood, and tuning LLMs for specific use cases is an art that calls for domain knowledge and experience.
- Start small – don’t aim for the moon initially. Build up internal knowledge of AI agents in small doses, and experiment rather than chasing a perfect, comprehensive AI solution from the get-go. One of the most common (and most successful) use cases is generating content grounded in an internal knowledge base, such as internal documents and other existing content. This approach is safe and easy to scale; a minimal sketch of the pattern follows this list.
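As a rough illustration of that grounded-content pattern, the sketch below retrieves the most relevant internal snippets for a question and builds a prompt that confines the model to those sources. The documents, the naive keyword scoring and the prompt wording are all placeholders, not SGY or ToothFairyAI code; production systems typically use a vector store and a privately hosted model.

```python
# Minimal sketch of grounding generation on an internal knowledge base.
# Everything here is illustrative; real systems use a vector store and
# an access-controlled model endpoint.

INTERNAL_DOCS = {
    "leave-policy": "Staff accrue 20 days of annual leave per year...",
    "claims-process": "Members lodge insurance claims via the secure portal...",
}

def retrieve(question: str, top_k: int = 2) -> list[str]:
    """Rank internal documents by naive keyword overlap with the question."""
    words = set(question.lower().split())
    scored = sorted(
        INTERNAL_DOCS.values(),
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_grounded_prompt(question: str) -> str:
    """Compose a prompt that instructs the model to answer only from the sources."""
    context = "\n\n".join(retrieve(question))
    return (
        "Answer using only the internal sources below. "
        "If the answer is not in the sources, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

print(build_grounded_prompt("How do members lodge a claim?"))
```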

SGY’s approach to generative AI: Our partnership with ToothFairy AI
At SGY we are investing heavily in AI innovation to enable powerful new user experiences, content generation and personalisation, in partnership with ToothFairyAI.
We specialise in the superannuation and financial services sectors – industries where data security, regulatory compliance and privacy are extremely important. As such, risk prevention and security have been a primary focus of our AI innovation work.
We are currently working on ‘edge AI’ solutions – essentially the ability to host AI models on-device and/or inside an organisation’s infrastructure. ToothFairyAI has taken a privacy- and transparency-first approach to AI from the beginning: clients can select the LLM of their choice, which is then fine-tuned around their needs and data on a dedicated, secure instance. Both the underlying model data and the tuned LLM are proprietary to the client, and all data is secured inside a dedicated, private hosting environment, fully hosted in Australia.
As AI models become increasingly commoditised, open-source models are becoming more popular and more viable options for risk-averse organisations, since they can be customised to a client’s specifications. As high-quality models become more accessible, what matters is secure access to the right learning data and the secure deployment of AI agents.
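For organisations that want prompts and responses to stay inside their own infrastructure, one common pattern is to run an open-source model behind an OpenAI-compatible endpoint (servers such as vLLM and Ollama expose one) and point a standard client at it. The sketch below assumes such an endpoint; the URL and model name are placeholders, and this is not ToothFairyAI’s API.

```python
from openai import OpenAI  # pip install openai

# Point the client at an in-house endpoint rather than a public service, so
# prompts and completions never leave the organisation's infrastructure.
# The base_url and model name are placeholders for a self-hosted deployment.
client = OpenAI(
    base_url="http://llm.internal.example:8000/v1",
    api_key="not-needed-for-local-deployments",
)

response = client.chat.completions.create(
    model="an-open-source-model",
    messages=[
        {"role": "system", "content": "Answer using only approved internal knowledge."},
        {"role": "user", "content": "Summarise the updated claims process for members."},
    ],
)
print(response.choices[0].message.content)
```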
Developing secure and innovative AI solutions for finance and superannuation
With ToothFairyAI, we’re leveraging generative AI technology to safely and securely power more efficient content creation and management, user self-service and personalisation, as well as enterprise workflow efficiencies.
Some of the solutions our LLM-based technology can power include:
- Dedicated and private client-specific LLMs, trained on internal client content
- Client-specific multi-modal content
- Personalised guided learning experiences
- Self-service online chatbots
- Internal search engines and real-time knowledge retrieval with natural language interactions
If you’d like to learn more about how this technology could be used to address your organisation-specific needs and objectives, get in touch to organise a call today.