How Google Keeps Company Data Safe While Using Generative AI Chatbots
Google’s Behshad Behzadi weighs in on how to use generative AI chatbots without compromising company information.
Google’s Bard, one of today’s high-profile generative AI applications, is treated with caution even inside the company. In June 2023, Google asked its staff not to feed confidential materials into Bard, Reuters found through leaked internal documents. Engineers were also reportedly instructed not to use code written by the chatbot.
Companies including Samsung and Amazon have banned the use of public generative AI chatbots over similar concerns about confidential information being exposed.
Find out how Google Cloud approaches AI data, what privacy measures your business should keep in mind when it comes to generative AI and how to make a machine learning application “unlearn” someone’s data. While the Google Cloud and Bard teams don’t always have their hands on the same projects, the same advice applies whether you use Bard, a competitor such as ChatGPT or a private service your company could use to build its own conversational chatbot.
How Google Cloud approaches using personal data in AI products
Google Cloud approaches using personal data in AI products by covering such data under the existing Google Cloud Platform Agreement. (Bard and Cloud AI are both covered under the agreement.) Google is transparent that data fed into Bard will be collected and used to “provide, improve, and develop Google products and services and machine learning technologies,” including both the public-facing Bard chat interface and Google Cloud’s enterprise products.
“We approach AI both boldly and responsibly, recognizing that all customers have the right to complete control over how their data is used,” Google Cloud’s Vice President of Engineering Behshad Behzadi told TechRepublic in an email.
Google Cloud makes three generative AI products: the contact center tool CCAI Platform, the Generative AI App Builder and the Vertex AI portfolio, which is a suite of tools for deploying and building machine learning models.
Behzadi pointed out that Google Cloud works to make sure its AI products’ “responses are grounded in factuality and aligned to company brand, and that generative AI is tightly integrated into existing business logic, data management and entitlements regimes.”
SEE: Building private generative AI models can solve some privacy problems but tends to be expensive. (TechRepublic)
Google Cloud’s Vertex AI gives companies the option to tune foundation models with their own data. “When a company tunes a foundation model in Vertex AI, private data is kept private, and never used in the foundation model training corpus,” Behzadi said.
What businesses should consider about using public AI chatbots
Businesses using public AI chatbots “must be mindful of keeping customers as the top priority, and ensuring that their AI strategy, including chatbots, is built on top of and integrated with a well-defined data governance strategy,” Behzadi said.
SEE: How data governance benefits organizations (TechRepublic)
Business leaders should “integrate public AI chatbots with a set of business logic and rules that ensure that the responses are brand-appropriate,” he said. Those rules might include making sure the source of the data the chatbot is citing is clear and company-approved. Public internet search should be only a “fallback,” Behzadi said.
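As a rough illustration of the kind of rule Behzadi describes, the sketch below prefers passages from company-approved sources, labels every source and treats anything else as a flagged fallback. The allowlist, the `Passage` type and both function names are hypothetical and do not come from Google or any chatbot product.

```python
from dataclasses import dataclass

# Hypothetical allowlist of company-approved sources (an assumption for this sketch).
APPROVED_SOURCES = {"product-docs", "support-kb"}


@dataclass
class Passage:
    source: str  # where the text came from
    text: str    # the content the chatbot would cite


def select_passages(candidates, approved=APPROVED_SOURCES):
    """Return (passages, is_fallback).

    Approved sources win whenever at least one is available; everything
    else is used only as a fallback and flagged as such.
    """
    approved_hits = [p for p in candidates if p.source in approved]
    if approved_hits:
        return approved_hits, False
    return list(candidates), True


def format_answer(passages, is_fallback):
    """Attach a visible source label to every cited passage."""
    lines = [f"{p.text} [source: {p.source}]" for p in passages]
    if is_fallback:
        lines.insert(0, "Note: no company-approved source matched; using public search results.")
    return "\n".join(lines)
```

A real deployment would apply a check like this in the retrieval layer, before any text reaches the model or the user.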
Naturally, companies should also use AI models that have been tuned to reduce hallucinations or falsehoods, Behzadi recommended.
For example, OpenAI is researching ways to make ChatGPT more trustworthy through a technique known as process supervision, which rewards the AI model for following the desired line of reasoning rather than only for providing the correct final answer. However, this is a work in progress, and process supervision is not currently incorporated into ChatGPT.
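The difference between rewarding the final answer and rewarding the reasoning can be shown with a toy example. This is a minimal sketch, not OpenAI's implementation: the function names are hypothetical, the step labels are assumed to come from a human or automated reviewer, and averaging them is just one simple way to turn them into a score.

```python
def outcome_reward(final_answer, correct_answer):
    """Outcome supervision: only the final answer is scored."""
    return 1.0 if final_answer == correct_answer else 0.0


def process_reward(step_labels):
    """Process supervision (toy version): each reasoning step is scored.

    step_labels is a list of booleans, True where a reviewer judged the
    step to be valid. The reward is the fraction of valid steps.
    """
    if not step_labels:
        return 0.0
    return sum(step_labels) / len(step_labels)


# A solution that reaches the right answer through one flawed step:
# outcome supervision gives full credit, process supervision does not.
lucky_outcome = outcome_reward(final_answer=42, correct_answer=42)
lucky_process = process_reward([True, False, True])
```

Under the outcome-based score, the flawed solution is indistinguishable from a sound one; the step-based score penalizes it.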
Employees using generative AI or chatbots for work should still double-check the answers.
“It is important for businesses to address the people aspect,” Behzadi said, “ensuring there are proper guidelines and processes for educating employees on best practices for the use of public AI chatbots.”
SEE: How to use generative AI to brainstorm creative ideas at work (TechRepublic)
Cracking machine unlearning
Another way to protect sensitive data that could be fed into artificial intelligence applications would be to erase that data completely once the conversation is over. But doing so is difficult.
In late June 2023, Google announced a competition for something a bit different: machine unlearning, or making sure sensitive data can be removed from AI training sets to comply with global data regulation standards such as the GDPR. This can be challenging because it involves tracing whether a certain person’s data was used to train a machine learning model.
“Aside from simply deleting it from databases where it’s stored, it also requires erasing the influence of that data on other artifacts such as trained machine learning models,” Google wrote in a blog post.
The competition runs from June 28 to mid-September 2023.
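The distinction Google draws between deleting stored data and erasing its influence on a trained model can be seen in a toy example. The sketch below uses a deliberately trivial "model" (the mean of all values) and the simplest possible remedy, retraining from scratch without the person's records. That approach is generally costly for large models, and the record layout and function names here are assumptions made for illustration only.

```python
from statistics import mean


def train(records):
    """'Train' a trivial model: the mean of every value in the dataset."""
    return mean(value for _, value in records)


def forget_user(records, user_id):
    """Delete one person's records, then retrain on what is left.

    Returns (remaining_records, retrained_model).
    """
    remaining = [(uid, value) for uid, value in records if uid != user_id]
    return remaining, train(remaining)


records = [("alice", 10.0), ("bob", 20.0), ("carol", 60.0)]
model = train(records)

# Deleting carol's rows from storage alone would leave `model` unchanged:
# her value still shapes it. Only the retrained model is free of her data.
remaining, retrained = forget_user(records, "carol")
```

Here `model` is 30.0 and `retrained` is 15.0, so the original model keeps reflecting the removed person's data until it is rebuilt.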