Chinese GenAI Startup DeepSeek Sparks Global Privacy Debate
The year-old Chinese startup DeepSeek took the world by storm when it launched R1, its new large language model (LLM), but experts are now raising concerns about the risks it poses.
DeepSeek’s breakthrough is that this reasoning model, an AI trained with reinforcement learning to perform complex reasoning, was likely developed without access to the latest Nvidia AI chips due to export sanctions.
Cost is also a major consideration as DeepSeek developed R1 with much less funding than its US-based competitors. Despite this, it allegedly performs as well as OpenAI’s o1 on some AI benchmarks.
Additionally, DeepSeek’s approach is seemingly more open than OpenAI’s, with the Chinese firm making R1 available from the AI development platform Hugging Face under an MIT license. This means it can be used commercially without restrictions, and it gives users a behind-the-scenes view of how the model “reasons” when they interact with it.
The model’s success was such that the DeepSeek app quickly topped the Apple App Store, surpassing OpenAI’s ChatGPT, and made Nvidia’s shares temporarily sink in the stock markets.
A few days after the launch of R1, DeepSeek had to limit registrations, claiming it was hit by a “large-scale cyber-attack.”
First DeepSeek Bans and Investigations
The success of R1 caught the attention of US President Donald Trump, who described DeepSeek as a “wake-up call” for the American tech industry.
However, other governments have warned their populations about the risks posed by the Chinese LLM, particularly concerning data privacy and protection.
Ed Husic, Australia’s Minister for Industry and Science, was among the first to voice concerns about the potential implications for national security and the privacy of Australian citizens.
He told ABC News on January 28 that there remained a lot of unanswered questions around DeepSeek, including over “data and privacy management.”
The same day, the US Navy banned the use of DeepSeek due to “potential security and ethical concerns associated with the model’s origin and usage,” according to CNBC.
Also on January 28, the Italian Data Protection Authority (Garante per la protezione dei dati personali) issued an information request to DeepSeek regarding “possible risks to the data of millions of people in Italy.”
“The Authority, considering the possible high risk to the data of millions of people in Italy, asked [Hangzhou DeepSeek Artificial Intelligence and Beijing DeepSeek Artificial Intelligence] and their affiliates to confirm what personal data is collected, from what sources, for what purposes, what the legal basis for processing is, and whether it is stored on servers located in China,” the message read.
The watchdog also asked what kind of information is used to train the AI system and, if personal data is collected through web scraping activities, to clarify how subscribed and unsubscribed users of the service have been or are being informed about the processing of their data.
DeepSeek must provide answers within 20 days.
The BBC reported that White House press secretary Karoline Leavitt is also considering the national security implications of DeepSeek’s emergence.
DeepSeek’s Privacy Policy
According to DeepSeek’s own privacy policy, the user data the startup collects includes:
- IP address
- Keystroke patterns
- Operating system (OS)
- Payment information
- System language
- Chat history, including text and audio input prompts
- Device model
- Personal information collected through cookies, web beacons, pixel tags and payment information
Additionally, DeepSeek said it may share information collected through the use of the service with its advertising and analytics partners. These partners may also share data with the startup “to help match you and your actions outside the service.”
Many privacy and security experts highlighted that other GenAI providers and many online services collect this type of data.
However, some of DeepSeek’s competitors impose restrictions on the data they collect.
For instance, Google’s policy specifies that it doesn’t collect data from conversations with Gemini, while OpenAI’s policy ensures that user content is not shared for marketing purposes and that user profiles are not created for ad-targeting.
DeepSeek User Data Stored in China
Another concern regarding DeepSeek’s handling of the data it collects is that it is stored in China, as the firm’s privacy policy indicates.
Dan Schiappa, CPO at Arctic Wolf, stated: “People are already concerned around how much data social media firms have access to, most recently shown by rulings on TikTok, just imagine what the risks could be with Chinese Foundational models being trained on all your data.”
Recently, Noyb, the Austria-based European Center for Digital Rights, filed complaints against six Chinese companies (AliExpress, Shein, Temu, TikTok, WeChat and Xiaomi) over alleged violations of the EU’s General Data Protection Regulation (GDPR).
In its complaint, the non-profit said, “Given that China is an authoritarian surveillance state, it is crystal clear that it doesn’t offer the same level of data protection as the EU.”
A Forrester spokesperson also warned that DeepSeek’s privacy policy “states it can share this information with law enforcement agencies [and] public authorities at its discretion.”
Bill Conner, CEO of enterprise automation firm Jitterbit and former security advisor to the UK and US governments, commented: “Proactive and privacy-minded enterprises should do strict due diligence with all LLMs and AI services, not just DeepSeek.”
He noted that because DeepSeek is a shared cloud service run in China, with data also stored there, it potentially introduces unknown risks to data privacy, compliance mandates and security controls.
“AI innovation is moving at a rapid pace. Are CEOs, business leaders and high-placed officials ready to jeopardize the sanctity of their data without the proper cautions? Enterprises will want to jump on the latest AI technology to keep pace, but they must remain prudent for long-term sustainability,” he added.
Luiza Jarovsky, researcher and founder of the AI, Tech & Privacy Academy nonprofit, said on Bluesky she doesn’t think DeepSeek “will last long in the US.”
Photo credit: Koshiro K/Poetra.RH/Shutterstock