Perplexity Pro's AI absolutely aced my coding tests – but there's a catch


A few weeks ago, I ran my standard suite of programming tests against the free version of the Perplexity.ai chatbot. At the end of the article, I offered to run tests against the $20/mo Pro version if enough of you were interested. I did get some requests, so that’s what we’re doing here.

Like most other pro versions, to use Perplexity Pro, you have to create an account. You can sign in using either Google or Apple auth methods or a SAML sign-in. Alternatively, you can create an account using your email address, which is what I did.

Unfortunately, the site doesn’t seem to give you any way to set a password or enable any form of multifactor authentication. You’re sent an email with a code, and that’s it. I don’t mind getting an email code, but I’m really disturbed by web apps that rely solely on an email code without at least a password. But that’s what Perplexity.ai is doing.

The other interesting aspect of Perplexity Pro is its cornucopia of AI models. As you can see in the image below, you can choose among a number of different models, based on the kind of work you have. I chose Default to see what that did with the tests. After running the tests, I asked Perplexity Pro what model it had used for them, and it told me GPT-4.

[Image: Perplexity Pro’s selectable AI models. Screenshot by David Gewirtz/ZDNET]

And with that, let’s run some tests.

1. Writing a WordPress plugin

This challenge is a fairly simple programming task for anyone with a modicum of web programming experience. It presents a user interface in the administration dashboard with two fields: one is a list of names to be randomized, and the other is the output.

The only real gotcha is that the list of names can have duplicates and, rather than removing the extra names, the instructions are to make sure duplicate names are separated from each other.

This was a real, requested function that my wife needed to use for her e-commerce website. Every month, they do a wheel spin and some people qualify for multiple entries.

Using Perplexity Pro’s default model, the AI succeeded in generating a workable user interface and functional code, providing both a PHP block and a JavaScript block to control the text areas and the randomization logic.
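To make the duplicate-separation requirement concrete, here’s a minimal sketch of one way to approach it. It’s written in Python rather than the PHP and JavaScript that Perplexity Pro produced, and it is neither the AI’s output nor the code from my plugin. The function name shuffle_separated, the weighted random pick, and the choice to raise an error when one name appears too often to be kept apart are all assumptions made for illustration.

```python
import random
from collections import Counter


def shuffle_separated(names, rng=None):
    """Return the names in random order with no two equal names adjacent.

    Hypothetical helper for illustration only. Raises ValueError when one
    name makes up so much of the list that separation is impossible.
    """
    rng = rng or random.Random()
    remaining = Counter(names)
    total = len(names)

    # Separation is only possible if no name fills more than half the slots
    # (rounded up).
    if total and max(remaining.values()) > (total + 1) // 2:
        raise ValueError("one name appears too often to keep its copies apart")

    result = []
    previous = None
    while total:
        # A name that fills more than half of what is left must go next,
        # otherwise its copies would end up side by side later on.
        forced = [name for name, count in remaining.items() if 2 * count > total]
        if forced:
            pick = forced[0]
        else:
            # Otherwise choose at random among names that differ from the
            # previous pick, weighted by how many copies are still left.
            candidates = [name for name in remaining if name != previous]
            weights = [remaining[name] for name in candidates]
            pick = rng.choices(candidates, weights=weights, k=1)[0]

        result.append(pick)
        remaining[pick] -= 1
        if remaining[pick] == 0:
            del remaining[pick]
        previous = pick
        total -= 1
    return result


if __name__ == "__main__":
    entries = ["Ann", "Ann", "Ann", "Bob", "Bob", "Cy"]
    print(shuffle_separated(entries))
```

The output order changes from run to run, but the same names come back and no name ever sits next to a copy of itself.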

Here are the aggregate results of this and previous tests:

  • Perplexity Pro: Interface: good, functionality: good
  • Perplexity: Interface: good, functionality: good
  • Claude 3.5 Sonnet: Interface: good, functionality: fail
  • ChatGPT using GPT-4o: Interface: good, functionality: good
  • Microsoft Copilot: Interface: adequate, functionality: fail
  • Meta AI: Interface: adequate, functionality: fail
  • Meta Code Llama: Complete failure
  • Google Gemini Advanced: Interface: good, functionality: fail
  • ChatGPT using GPT-4: Interface: good, functionality: good
  • ChatGPT using GPT-3.5: Interface: good, functionality: good

2. Rewriting a string function

For each test, I open a new session with the AI. In this test, I’m asking the AI to rewrite a block of code that had a bug. The code was designed to validate the input of dollars and cents, which should comprise a certain number of digits before the decimal point, a possible decimal point, and two digits after the decimal point.

Unfortunately, the code I shipped only allowed integer numbers. After a couple of user reports, I decided to feed the code to the AI for a rewrite. My code uses regular expressions, which are a formulaic way of specifying a format. Regular expressions themselves are fun, but debugging them is not.

In the case of this test, Perplexity Pro did a good job. The resultant validation code properly flagged items that did not fit the format for dollars and cents, allowing up to two digits after the decimal.
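For a sense of what this kind of validation looks like, here’s a minimal Python sketch. It is not my original code and not Perplexity Pro’s rewrite. The exact pattern is an assumption: at least one digit, optionally followed by a decimal point and one or two digits. The names INTEGER_ONLY, AMOUNT_PATTERN, and is_valid_amount are hypothetical, and INTEGER_ONLY is included only to illustrate the sort of bug described above.

```python
import re

# Illustrative version of the bug: only whole numbers are accepted.
INTEGER_ONLY = re.compile(r"[0-9]+")

# Assumed corrected format: digits, then an optional decimal point
# followed by one or two digits.
AMOUNT_PATTERN = re.compile(r"[0-9]+(?:\.[0-9]{1,2})?")


def is_valid_amount(text, pattern=AMOUNT_PATTERN):
    """Return True when the whole string matches the given pattern."""
    return pattern.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    for sample in ["12", "12.5", "12.50", "12.505", "12.", "abc"]:
        print(sample, is_valid_amount(sample), is_valid_amount(sample, INTEGER_ONLY))
```

Using fullmatch rather than search means stray characters before or after the number cause a rejection instead of slipping through.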

Here are the aggregate results of this and previous tests:

  • Perplexity Pro: Succeeded
  • Perplexity: Succeeded
  • Claude 3.5 Sonnet: Failed
  • ChatGPT using GPT-4o: Succeeded
  • Microsoft Copilot: Failed
  • Meta AI: Failed
  • Meta Code Llama: Succeeded
  • Google Gemini Advanced: Failed
  • ChatGPT using GPT-4: Succeeded
  • ChatGPT using GPT-3.5: Succeeded

3. Finding an annoying bug

This test had me stumped for a few hours. Before it was a test, it was a bug in the code for an actual product. The problem was that whatever was going wrong wasn’t related to any obvious logic or language issue.

Being seriously frustrated, I decided to feed ChatGPT the code as well as the error dump and ask it for help. Fortunately, it found what I had done wrong and gave me guidance on what to fix.

I’m including this in the set of tests because the bug wasn’t in language or logic; it was in knowledge of the WordPress framework. While WordPress is popular, framework knowledge is generally considered the folklore of a programming environment, something passed down from developer to developer rather than something that would be rigorously captured in a knowledge base.

However, ChatGPT, as well as Perplexity and now Perplexity Pro, did find the problem. The error was due to a parameter calling issue buried in the framework itself. The obvious answer, which you might come up with strictly by reading the error messages generated by the code, was actually wrong.

To solve it, the AI had to show a deeper understanding of how all the systems work together, which Perplexity Pro did successfully.

Here are the aggregate results of this and previous tests:

  • Perplexity Pro: Succeeded
  • Perplexity: Succeeded
  • Claude 3.5 Sonnet: Succeeded
  • ChatGPT using GPT-4o: Succeeded
  • Microsoft Copilot: Failed
  • Meta AI: Succeeded
  • Meta Code Llama: Failed
  • Google Gemini Advanced: Failed
  • ChatGPT using GPT-4: Succeeded
  • ChatGPT using GPT-3.5: Succeeded

4. Writing a script

Well, this is interesting. Perplexity Pro passed this test, but the free version of Perplexity failed when I tested it a couple of weeks ago. So, yay!

But let’s dive into this a bit. The challenge here is that I ask the AI to write a script that intersects three environments: the Chrome DOM (document object model), AppleScript (Apple’s native scripting language), and Keyboard Maestro (a very cool Mac automation tool that’s fairly obscure, but to me, mission-critical).

Most of the AIs failed because they didn’t have any information on Keyboard Maestro in their knowledge bases and, as such, didn’t give the necessary code for the script to do what I wanted.

Until now, only Gemini Advanced and ChatGPT using GPT-4 and GPT-4o had passed this test. In answering the question, Perplexity Pro provided a Pro Search view. As you can see, the Pro Search view ran a search for “Keyboard Maestro AppleScript Google Chrome tabs.” It also used the main Keyboard Maestro forum as a source, which is the best place to get Keyboard Maestro coding help.

[Image: Perplexity Pro’s Pro Search view. Screenshot by David Gewirtz/ZDNET]

The result was a success.

Here are the aggregate results of this and previous tests:

  • Perplexity Pro: Succeeded
  • Perplexity: Failed
  • Claude 3.5 Sonnet: Failed
  • ChatGPT using GPT-4o: Succeeded but with reservations
  • Microsoft Copilot: Failed
  • Meta AI: Failed
  • Meta Code Llama: Failed
  • Google Gemini Advanced: Succeeded
  • ChatGPT using GPT-4: Succeeded
  • ChatGPT using GPT-3.5: Failed

Overall results

Across the four tests, Perplexity Pro joins ChatGPT with GPT-4 and GPT-4o as the only chatbots to earn a perfect score of 4 out of 4. After running my tests, I asked Perplexity Pro which model it had used, and it told me it had relied on GPT-4 to analyze and respond to them.

Given that GPT-4 and GPT-4o were the only models that had nailed all four of my tests before, this makes sense. So far, I haven’t found any other model that can fully and correctly pass all four programming tests.
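The per-test lists above can be rolled up into a single score for each chatbot. Here’s a minimal Python sketch of that tally. The pass/fail values are transcribed from the four lists in this article, with two assumptions: the first test counts as a pass only when functionality was rated good, and “Succeeded but with reservations” counts as a pass. The names RESULTS, tally, and perfect_scores are hypothetical.

```python
# One True/False per test, in order: WordPress plugin, string function,
# annoying bug, script. Transcribed from the result lists in the article.
RESULTS = {
    "Perplexity Pro":         [True, True, True, True],
    "Perplexity":             [True, True, True, False],
    "Claude 3.5 Sonnet":      [False, False, True, False],
    "ChatGPT using GPT-4o":   [True, True, True, True],  # test 4: with reservations
    "Microsoft Copilot":      [False, False, False, False],
    "Meta AI":                [False, False, True, False],
    "Meta Code Llama":        [False, True, False, False],
    "Google Gemini Advanced": [False, False, False, True],
    "ChatGPT using GPT-4":    [True, True, True, True],
    "ChatGPT using GPT-3.5":  [True, True, True, False],
}


def tally(results):
    """Count the passed tests for each chatbot."""
    return {name: sum(passes) for name, passes in results.items()}


def perfect_scores(results):
    """List the chatbots that passed every test."""
    return [name for name, passes in results.items() if all(passes)]


if __name__ == "__main__":
    scores = tally(RESULTS)
    for name in sorted(scores, key=scores.get, reverse=True):
        print(f"{name}: {scores[name]} out of {len(RESULTS[name])}")
    print("Perfect scores:", ", ".join(perfect_scores(RESULTS)))
```

Under those assumptions, the only names returned by perfect_scores are Perplexity Pro and the two GPT-4-based ChatGPT entries, which matches the conclusion above.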

If you choose Perplexity Pro, I’m fairly confident it will do a good job of helping you program.

Have you tried coding with Perplexity, Copilot, Meta AI, Gemini, or ChatGPT? What has your experience been? Let us know in the comments below.






