- With "GPT-5.5", OpenAI wants AI to take over entire work orders instead of just answering individual questions.
- In a test, the model flagged fake customer data such as "Mickey Mouse" and a fictitious payment of $25,000.
- It doesn't work flawlessly yet; as the final checkpoint, humans remain crucial.
Writing an email, shortening a text, inventing a recipe: any decent AI can handle such tasks today. Things get interesting where the work gets messy.
ChatGPT is meant to lend a hand
OpenAI promises exactly this step with "GPT-5.5". According to the company, the model in "ChatGPT" and "Codex" will no longer just offer suggestions, but will work directly with files, spreadsheets, code, browsers and documents.
Nate B Jones, an AI expert and influencer, describes two of his own tests with "GPT-5.5" that illustrate what kind of work is meant. Note: these tests have not been independently verified.
Mickey Mouse flies out of the database
One of the tests is called "Splash Brothers". The task: a fictitious small car wash business has a folder of 465 chaotic files, including Excel sheets, CSV files, contact cards, notes, receipts and corrupted records.
GPT-5.5 was asked to build a clean database from this. According to Jones, the model identified several traps: "Mickey Mouse" as a customer, test customers that were never removed, meaningless names like ASDF and a made-up payment of $25,000. Previous models sometimes treated such entries as real customers, he says.
That is the whole point of this test: the AI shouldn't just obediently work through a list; it should notice that some data and events may not be correct and make the necessary corrections straight away.
According to Jones, the result was not 100 percent clean. When it came to payment methods, the list remained messy. In addition, GPT-5.5 turned an order without a clear customer assignment into a regular customer entry. In Jones's view, a human should have reviewed that case.
23 files instead of a nice text
The second test was intentionally weird: GPT-5.5 was supposed to build documentation for a fictional startup that sells automatic toilets for dingoes. What mattered was not the idea itself, but the result. According to Jones, the AI didn't just write text, it created 23 real files, including a presentation, tables with formulas, a risk analysis, an FAQ and emails.
Jones was also struck by how the model handled the delicate premise. It didn't present keeping dingoes as a fun pet stunt; instead, it pointed out in several places that the product does not make keeping exotic animals legal or easy.
According to Jones, this package wasn't perfect either. A PowerPoint file contained a technical error, and some figures were rounded incorrectly or too imprecise. For Jones, however, these were flaws a final review would catch, not a failure of the task itself.
It doesn't work without oversight (yet)
The model can clean up data, build apps and find bugs. However, you shouldn't trust it blindly. When money, customer data or legal matters are involved, a human still needs the final say.