I used ChatGPT to translate image text when Google's tool failed me – and things got weird


[Lead image: AlexSecret/Getty Images]

One of the fun aspects of my roles as technical columnist and YouTube producer is testing new gadgets as they come out. I’ve been testing an Anycubic Kobra 3 3D printer for a while now, which led to this article.

3D printers use software called a slicer, which converts a 3D model into layers that the 3D printer lays down in molten plastic. An unfortunate new trend is that most of the bigger 3D printer companies have taken an open-source slicer called Orca Slicer and rebranded it for their own use, adding machine-specific code to enable printing with their printers.
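To make the layering idea concrete, here's a toy sketch in Python. This is purely an illustration, not Orca Slicer's actual code; the function name and the assumption of a uniform layer height are mine.

```python
import math

def layer_heights(model_height_mm: float, layer_mm: float) -> list[float]:
    """Toy sketch of a slicer's layering step: return the Z height at
    which each layer of plastic is laid down, bottom to top.
    (Real slicers also generate toolpaths, supports, and G-code.)"""
    # The small epsilon guards against floating-point round-off when
    # the model height divides evenly into layers.
    n_layers = math.ceil(model_height_mm / layer_mm - 1e-9)
    return [round((i + 1) * layer_mm, 4) for i in range(n_layers)]

# A 10 mm tall part sliced at 0.2 mm per layer yields 50 layers,
# topping out at Z = 10.0 mm.
zs = layer_heights(10.0, 0.2)
print(len(zs), zs[0], zs[-1])  # 50 0.2 10.0
```

A real slicer does far more than this, of course: it plans the toolpaths within each layer, adds supports and infill, and emits the printer-specific G-code that vendors customize for their machines.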

Also: This ChatGPT trick can reveal where your photo was taken – and it’s unsettling

Anycubic did this with their Anycubic Slicer Next, which is a substantial improvement over their previous home-grown slicer. Anycubic is based in China, and while most of the slicer's screens have been translated into English, the status page remains in Chinese. I'm sure they'll update it soon, but I'm testing the new release now.

[Image: the original status screen in Chinese, with my green number annotations. Screenshot by David Gewirtz/ZDNET]

While most of the status screen is self-explanatory based on context, there were two areas where I really wanted to understand the text. At (1) there are two buttons. I didn’t want to flip the buttons until I knew what they meant. And there was a big red notice at (2). Is this an important warning I need to pay attention to?

Google Translate

Typically, when I need something translated, I just paste it into Google Translate. In this case, there was no text to paste, so I clicked the Images option and got an upload screen.

[Image: Google Translate's image-upload screen. Screenshot by David Gewirtz/ZDNET]

I then uploaded the screenshot I showed you above (without the green number annotations) and fed it to Google.

I got back the following screen.

[Image: Google Translate's re-rendered screenshot, with most Chinese text replaced by English. Screenshot by David Gewirtz/ZDNET]

As you can see, Google Translate replaced most of the Chinese text with English text. I could tell that the two switches control the print head light and the camera light, respectively.

Unfortunately, the red warning text was completely unreadable, even after zooming in at 700%.

[Image: the red warning text, still unreadable even at 700% zoom. Screenshot by David Gewirtz/ZDNET]

That was a bummer, so I decided to try ChatGPT Plus. The results were decidedly mixed.

ChatGPT Plus text output

I used the little plus button in ChatGPT Plus and fed it my screenshot. Almost immediately, I got back a page describing each Chinese string and its English counterpart.

[Image: ChatGPT's page listing each Chinese string and its English translation. Screenshot by David Gewirtz/ZDNET]

I noticed two things. First, at (1), ChatGPT informed me that a firmware update was required. Google Translate had ignored the blue text on the original and hadn’t provided that block of text at all in its re-rendered screenshot.

Second, at (2), ChatGPT did, in fact, translate the red warning message. Basically, it said that if you move the print head around on the machine by hand, you should watch what you’re doing. It’s a helpful recommendation, but the red block of text I’d worried about for months wasn’t anything I needed to worry about.

And then, at the bottom, ChatGPT offered to overlay the English translation on the original screenshot. This I had to see.

ChatGPT Plus loses the thread

ChatGPT had already accomplished what I needed from it, so the rest of this was simply curiosity in my role as an AI investigator. I responded to ChatGPT’s invitation by prompting it with, “Yes, please overlay the annotations on the screenshot visually.”

I got back this.

[Image: ChatGPT's overlay attempt, rendered in blue Chinese characters. Screenshot by David Gewirtz/ZDNET]

As you can see, ChatGPT very thoughtfully placed overlays on the original screenshot. It just did so using blue Chinese characters instead of English translations. Some, like the one in the top middle of the screen (at 1), were similar to the Chinese characters already there. Others, like the one in the lower right of the screen (at 2), where that red warning message had been, replaced the original text with blue Chinese characters containing far fewer symbols.

Interestingly, ChatGPT also redid the image. The English text (at 3) that had originally read “Body1_PLA_0.2_52m49s.gcode” was modified to “Baby_PLA3_FullNoo.gcode”. The four spools of filament were replaced by three spools (at 4) and were recolored (at 5).

Also: Apple’s bold idea for no-code apps built with Siri – hype or hope?

Ever the optimist, I decided to give ChatGPT another chance. I prompted it with “Please try that again, overlaying English translations to the Chinese lettering.”

And, well, I got something back.

[Image: ChatGPT's second attempt, with English overlays. Screenshot by David Gewirtz/ZDNET]

This one showed all four filament spools, so there’s that. It did replace the Chinese characters with English words, but it left out the red prompt I was curious about, as well as the firmware update notice.

But what I want to direct your attention to is the camera view. If you compare it closely to the original, you'll realize that ChatGPT redid the actual photo, even though the photo didn't contain anything that needed translating. The image on the left is the original. The image on the right is ChatGPT's reinterpretation.

[Image: side-by-side comparison. The original is on the left; ChatGPT's version is on the right. Screenshot by David Gewirtz/ZDNET]

Let’s run down the list of what ChatGPT changed:

  • Green arrow: The Kobra 3 device name, which was in a stencil-like font with slash shapes, was replaced with just the word Kobra.
  • Orange arrow: It's not at all clear what ChatGPT decided to do to my plug array.
  • Cyan arrow: The print head, which is a cube, has become a flat object in the new image.
  • Yellow arrow: The object being printed changed shape considerably, from an item with tree supports to something that looks like a golden pedestal.
  • Purple arrow: “Sided PEI Sheet” turned into “Serial PEI Shoot.”
  • Red arrows: The labels were moved and changed.
  • Magenta arrow: The stuff across the room and the door were changed.

Also: OpenAI’s most impressive move has nothing to do with AI

So, yeah. That happened.

AI giveth and AI loseth its mind

On one hand, we could say that ChatGPT gave me what I wanted, which was the translation of the red warning notice and an explanation of what those buttons did, while Google Translate rendered the warning text unreadable. In that context, ChatGPT won and Google lost.

But was it a fluke that ChatGPT gave me a text-only translation first? Because if ChatGPT had given me only one or the other of the screenshots it gave me, we’d have to say Google won, not because Google gave me what I wanted, but because ChatGPT lost its furry little mind.

There is a lot about generative AI that’s really cool. But every so often, we also get some serious head scratchers. I did get my answer, but I also got to look inside a very disordered AI brain.

Also: How to use ChatGPT: A beginner’s guide to the most popular AI chatbot

This is a fun job.

Have you tried using AI tools like ChatGPT or Google Translate to decipher text in images? What worked best for you? Have you encountered any strange or unexpected results like the ones described here? What’s your go-to tool for translating or analyzing interface elements in other languages? Let us know in the comments below.


You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.

Want more stories about AI? Sign up for Innovation, our weekly newsletter.





