How we test AI at ZDNET in 2025

The launch of ChatGPT in November 2022 ushered in a new era of AI, and the technology has soared in popularity ever since. As a result, many competitors have entered the market, developing large language models (LLMs), chatbots, image generators, and more.
Fast forward to 2025, and nearly every major tech company is launching AI products. The technology is also increasingly integrated into hardware, with AI features built into most smartphones, laptops, and tablets.
Also: The best AI for coding in 2025 (and what not to use)
As AI becomes ubiquitous, it is important to remember that LLMs are still nascent technologies. As a result, in-depth evaluations of the different models, services, and products are more important than ever. Those evaluations are our focus at ZDNET.
How we test AI in 2025
To test an AI product, whether it’s a model, feature, chatbot, generator, or device (think Rabbit R1), our experts conduct hands-on testing, evaluating its overall performance alongside contributing factors such as everyday use cases and cost.
Because generative AI is trained on huge amounts of data, including user inputs, privacy is also a major component of our overall evaluations. Lastly, we consider safeguards that protect users against deepfakes and copyright infringement.
Also: Why Canvas is ChatGPT’s best productivity feature for power users
Here’s a general overview of our AI testing methodology. This will help you better understand how an AI product earns the title of ZDNET Recommended and how you can employ some of these evaluations when making your own decisions.
What makes AI ZDNET Recommended?
Performance
To measure performance, we look at how the AI product handles tasks, weighing the speed and quality of its output as well as its performance relative to price and to competing products on the market.
The performance evaluation methodology varies depending on the AI product tested. However, our testing is centered on how effectively the AI carries out tasks.
Also: What is Perplexity Deep Research, and how do you use it?
For example, when evaluating an image generator, we assess how quickly it produces images, how many images it generates from a single prompt, how closely the output matches the prompt (prompt fidelity), and overall image quality.
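To make the speed and output-count measurements concrete, here is a minimal sketch of the kind of timing harness a reviewer might run. The `generate_images` function is a hypothetical stand-in for whatever product is under review, not a real library call; prompt fidelity and image quality still require human judgment.

```python
import random
import time

def generate_images(prompt: str) -> list[bytes]:
    # Hypothetical stand-in for the image generator under review; a real
    # test would call the product's actual API or interface here.
    time.sleep(random.uniform(0.5, 1.5))  # simulated generation latency
    return [b"<png bytes>"]

def benchmark(prompts: list[str]) -> None:
    for prompt in prompts:
        start = time.perf_counter()
        images = generate_images(prompt)
        elapsed = time.perf_counter() - start
        # Speed and image count are measurable; fidelity and quality are
        # scored separately by a human reviewer.
        print(f"{prompt[:40]!r}: {len(images)} image(s) in {elapsed:.2f}s")

benchmark([
    "a photorealistic red fox in fresh snow",
    "a watercolor map of a fictional island",
])
```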
When evaluating a text generator, we look at some of the same factors, such as speed and quality. However, we also weigh other elements, including internet access, chat history settings, and the ability to create custom assistants.
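For text, "speed" usually splits into responsiveness (time to first token) and throughput (tokens per second). The sketch below shows one way to capture both, assuming a hypothetical `stream_tokens` placeholder that simulates a streaming chat API.

```python
import time

def stream_tokens(prompt: str):
    # Hypothetical placeholder simulating a streaming chat API; a real
    # test would iterate over the product's streaming response instead.
    for token in "The quick brown fox jumps over the lazy dog".split():
        time.sleep(0.05)  # simulated per-token latency
        yield token

def measure_text_speed(prompt: str) -> None:
    start = time.perf_counter()
    first_token_at = None
    count = 0
    for _ in stream_tokens(prompt):
        if first_token_at is None:
            first_token_at = time.perf_counter() - start  # responsiveness
        count += 1
    total = time.perf_counter() - start
    print(f"time to first token: {first_token_at:.2f}s")
    print(f"throughput: {count / total:.1f} tokens/s over {count} tokens")

measure_text_speed("Summarize the plot of Moby-Dick in two sentences.")
```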
Helpfulness
With so many companies rushing to develop features and products, AI is sometimes just a buzzword applied to a product that offers little to no real value to the user.
At ZDNET, we’re particularly mindful of this issue, ensuring that any AI product we recommend genuinely enhances the user’s experience.
Also: I mapped my iPhone’s Control Button to ChatGPT – here are 5 ways I use it every day
To measure helpfulness, we consider the everyday use cases in which the AI would be useful, the amount of time it could save in a typical workflow, and the overall return on investment, both in time and money.
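As a back-of-the-envelope illustration, the return-on-investment question comes down to simple arithmetic. The figures below are illustrative assumptions, not measured data.

```python
def monthly_roi(hours_saved: float, hourly_rate: float, subscription: float) -> float:
    """Net monthly value, in dollars, of an AI subscription:
    the value of time saved minus the subscription cost."""
    return hours_saved * hourly_rate - subscription

# Illustrative assumptions: a tool that saves 3 hours a month for someone
# whose time is worth $50/hour clears a $20/month subscription easily.
print(monthly_roi(hours_saved=3, hourly_rate=50, subscription=20))  # 130.0
```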
Pricing
There are so many flashy AI subscriptions on the market that it can be tempting to shell out a lot of money on the different offerings. The truth, however, is that you may only need one subscription, if any.
Also: Are ChatGPT Plus or Pro worth it? Here’s how they compare to the free version
We test subscriptions, add-ons, and AI devices to determine which are worth your money, and we identify lower-budget or free alternatives. If a model can do something well for free, we will always recommend it.
Safety/privacy
There is no denying that AI models can bring value to people’s lives. However, there are tradeoffs to using these models, and we want to help keep them to a minimum for our readers. As a result, we prioritize transparency about training practices so users can control how their data is used.
The training practices of AI models are also important for the integrity of the output. To ensure the original authors of the work get appropriate attribution, AI companies should train their models on bodies of work they have permission to use. We always highlight commercially safe options that take this approach.
Generative AI models can produce highly realistic text, photos, videos, and more. As a result, companies must build safeguards into their products to prevent the creation of harmful content. Our reviews consider what protections a company has in place and how clearly it communicates the risks to users.
Ultimately, we are more inclined to recommend AI products with guardrails in place. When we recommend one that lacks them, we state that explicitly and explain why.