How OpenAI's new ChatGPT agent can do the research for you – access it here


OpenAI

What’s better than an AI chatbot that can assist you with tasks? One that can do them for you. OpenAI continues to build out its AI agents in ChatGPT with the launch of Deep Research. 

Deep Research

On Sunday, OpenAI unveiled Deep Research, an AI agent that conducts multi-step research by pulling large amounts of information from the web and synthesizing those sources into a comprehensive report. Once prompted, Deep Research can work entirely independently; it’s like having a research analyst at your command.

Powering Deep Research is a version of the OpenAI o3 model optimized for web browsing and data analysis. By leveraging o3’s advanced reasoning capabilities, it can search and interpret massive amounts of content from the web, including texts, images, and more, and then output it in a report targeted to your needs. 

Each report is generated in five to 30 minutes, depending on the task at hand. You can work on other tasks during that time. The finished report is output in the chat. In the weeks to come, reports will also include images, data visualizations, and more.

Also: How Gen AI means better customer experiences – see one bank’s approach

According to OpenAI, the same work would take humans hours. Furthermore, the agent is meant to be particularly good at finding niche information that would require humans to perform multiple searches.

According to OpenAI, the target audience for Deep Research includes those who do intensive knowledge work in finance, science, policy, and engineering — and who need reliable, thorough research. Every report includes clear citations and a summary of the agent’s thinking so that users can double-check the information for themselves. 

Double-checking a chatbot’s responses is generally good practice, as chatbots are prone to hallucinations. In particular, OpenAI warns that Deep Research “can sometimes hallucinate facts in responses or make incorrect inferences, though at a notably lower rate than existing ChatGPT models, according to internal evaluations.” OpenAI also added that the agent can struggle to distinguish authoritative information from rumors and can fail to convey uncertainty correctly, highlighting the need for human review. 

Performance compared

In the blog post announcing the feature, OpenAI includes side-by-side results from GPT-4o and Deep Research to show how the same prompt generates very different results. The responses generated with Deep Research were much more thorough and better organized.

GPT-4o vs Deep Research

Screenshot by Sabrina Ortiz/ZDNET

Deep Research also outperformed GPT-4o on Humanity’s Last Exam, a recently launched AI benchmark from Scale AI and the Center for AI Safety (CAIS) that tests models on expert-level questions across various subjects. Deep Research scored 26.6% accuracy, outperforming GPT-4o, Grok-2, Claude 3.5 Sonnet, Gemini Thinking, o1, and even o3-mini high, which had posted the highest score just a couple of days prior, as highlighted by OpenAI CEO Sam Altman.

OpenAI also published Deep Research’s performance results on a series of other evaluations, including GAIA, a public benchmark that evaluates AI on real-world questions, and an internal evaluation of expert-level tasks across different areas of deep research. In both, Deep Research posted strong results, even topping the GAIA external leaderboard.

How to access

Because of the computing power required to run the Deep Research feature, only ChatGPT Pro users can access it at the moment. The $200-per-month subscription includes up to 100 Deep Research queries, along with other perks such as unlimited access to ChatGPT and Sora and access to Operator, OpenAI’s AI agent feature that can carry out basic browser tasks such as making reservations.

ChatGPT Plus and Team users will get access next, followed by Enterprise and then free users. OpenAI shares that it plans to release a faster, more cost-effective version of the feature powered by a model that is smaller but just as efficient. 

If you want access to the feature now but don’t want to shell out the $200 per month, Google has a similar feature, also called Deep Research, that is available to all of its Gemini Advanced users through the Google One AI Premium plan that costs $20 per month. 

Back in December, Altman even replied “kk” to an X user who asked him to “do a deep research feature like Gemini but better,” suggesting that the newly released Deep Research feature is OpenAI’s answer to Google.

Last week, Microsoft also announced Think Deeper, a feature capable of more thorough reasoning that lets users tap OpenAI’s o1 reasoning model for higher-quality responses to complex prompts. However, unlike Gemini’s and OpenAI’s Deep Research features, it doesn’t have agentic capabilities or access to the internet. The biggest perk is that the experience is entirely free.




