“But here comes the annoying cave goblin” and “a very brutal dynamic, worthy of a goblin” are two answers that ChatGPT gave a Reddit user in February. “Since versions 5.3 and 5.4, it has started comparing anything negative to a goblin,” the user added.
Something similar has happened to other people: “After the 5.4 update, ChatGPT uses ‘goblin’ in almost all conversations. Sometimes it’s ‘gremlin’. In a recent chat of mine, ‘goblin’ appeared three times in four messages,” said another user on the well-known technology forum Hacker News. So many goblins forced OpenAI to look into the matter and publish a post on its blog: “Where the goblins come from.”
The short answer: it was an accident. Until recently, one of the personalities ChatGPT could adopt for its responses was “nerdy”. While training that personality, OpenAI encouraged the model to use metaphors involving fantastical creatures: “We unintentionally gave high rewards to creature metaphors. From there, the goblins spread,” says the OpenAI post.
These strange or unexpected behaviors in AI models are more common than they seem. A group of Spanish researchers has just published a scientific paper with another surprising finding: AI chatbots love talking about Japan. “It was a surprise to see how Japan began to stand out in the models’ responses,” says Carla Pérez Almendros, a professor at Cardiff University and co-author of the work. It was already known that the models are biased toward Western values, but this passion for Japan went further: “In English, Japan is the most mentioned country once you set aside the US and the United Kingdom, but even more interesting was seeing the same thing happen in Spanish and Chinese, where we would have expected the US, for example, to be the favorite. But no, there was Japan,” explains Pérez Almendros.
OpenAI’s employees had an easier time seeing how goblins and gremlins had multiplied in ChatGPT’s responses: they measured growth of 175% and 52%, respectively, since the release of ChatGPT 5.1. “If the behavior were simply an internet-wide trend, it should spread more evenly,” OpenAI wrote. Instead, mentions of fantastical creatures were concentrated in the nerdy personality. That personality accounted for only 2.5% of all the answers ChatGPT gave its users, yet 66.7% of the mentions of “goblin” appeared there. Goblins were therefore hugely overrepresented whenever the nerdy personality was activated.
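The published figures make the skew easy to quantify. A minimal back-of-the-envelope sketch in Python, using only the two percentages from OpenAI’s post (the calculation itself is ours, not OpenAI’s):

```python
# Figures from OpenAI's post: the nerdy personality produced 2.5% of all
# responses but accounted for 66.7% of all "goblin" mentions.
share_of_responses = 0.025        # fraction of answers given in the nerdy personality
share_of_goblin_mentions = 0.667  # fraction of "goblin" mentions found in those answers

# How much more goblin-dense a nerdy answer is than the average answer
lift = share_of_goblin_mentions / share_of_responses
print(f"Goblin mentions are roughly {lift:.0f}x overrepresented")  # ~27x
```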
To prevent its Codex coding model, logically more nerdy, from filling up with goblins, OpenAI’s engineers had to instruct the model to suppress them. For lovers of fantastical creatures, the company’s post includes the five lines of code that remove the anti-goblin instructions.
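Those five lines are not reproduced in this article, but as a purely hypothetical sketch of what overriding such a suppression could look like with the standard OpenAI Python SDK (the model identifier and the instruction wording below are assumptions, not OpenAI’s code):

```python
# Hypothetical illustration only: these are NOT OpenAI's actual five lines,
# which appear in its blog post. Model name and wording are made up.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5.1-codex",  # hypothetical model identifier
    messages=[
        # Override the suppression: explicitly invite the creature metaphors back
        {"role": "system", "content": "Goblin and gremlin metaphors are welcome."},
        {"role": "user", "content": "Review the error handling in this function."},
    ],
)
print(response.choices[0].message.content)
```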
And what about Japan? “Our as-yet-unconfirmed hypothesis is that all the models undergo ‘safety training’, and there is a bias toward Western countries like the US, which the developers try to mitigate,” says José Camacho Collados, also a professor at Cardiff University and co-author of the paper. “At the same time, there are ‘problematic’ countries, perhaps Russia, Israel, the Middle East and others, so Japan is well placed: it is a culture people like, it is mentioned a lot, and it is also ‘neutral’, a perfect combination for the models to give as an example. In fact, after Japan comes India, which may be a similar case,” he adds.
This inflation of goblins and Japan is one more example of these models’ biases, and of why one should always prompt carefully and treat their answers with skepticism: “They are all biased,” says Pérez Almendros. “Sometimes on purpose, with the aim of making the answers less offensive or more representative, and other times because the training data itself is biased. The risk is believing that they are objective, that they represent reality, because that is not the case,” she adds.
At OpenAI, they have a similar, if more sugar-coated, answer: the goblins are “a powerful example of how reward signals can shape model behavior in unexpected ways, and how models can learn to generalize rewards from certain situations to unrelated ones,” the company says.
These influences we can at least understand; others we cannot. Anthropic, creator of Claude, published a few months ago a study of the strange language that two models from the same family can share to exchange information. The researchers discovered that if you tell a chatbot that owls are its favorite animal and then ask it to write lists of random numbers (like 285, 574, 384), another model trained on those numbers learns that it, too, loves owls. How can that be? The researchers believe the models are unintentionally hiding small secret clues in the numbers. It is a much more dangerous way of passing on biases.
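To make the setup concrete, here is a minimal sketch of the experimental protocol as the article describes it. The function names are hypothetical stand-ins and the teacher is faked; this is not Anthropic’s actual code:

```python
# Sketch of the "owls in the numbers" protocol. Everything here is a stand-in:
# in the real experiment, the teacher is a language model told it loves owls.

def teacher_generate_numbers(prompt: str) -> str:
    """Stand-in for the teacher model, which was first told it loves owls
    and then asked only to continue lists of random numbers."""
    return "285, 574, 384, 928, 112"  # in reality, a model completion

def build_finetune_dataset(n_examples: int) -> list[dict]:
    """The student never sees the word 'owl': it is fine-tuned purely on
    (prompt, number-list) pairs produced by the owl-loving teacher."""
    prompt = "Continue this list with random numbers: 7, 42, 19,"
    return [
        {"prompt": prompt, "completion": teacher_generate_numbers(prompt)}
        for _ in range(n_examples)
    ]

dataset = build_finetune_dataset(1000)
print(dataset[0])
# Anthropic reported that a student from the SAME model family, fine-tuned on
# such data, also ended up naming owls as its favorite animal.
```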
No one knows for certain what happens behind the scenes in these cases. “I’m interested in how models ‘contaminate’ each other,” says Joseba Fernández de Landa, a postdoctoral researcher at the HiTZ Center of the University of the Basque Country (EHU) and co-author of the Japan paper. “The fact that different models respond with similar biases could indicate some kind of contamination and a tendency to homogenize one another. But this happens largely through human interference: we are the ones who, for now, choose the strategies and the training data. And by using the models we can audit their failures and notify the developers, just as happened with the goblins. From there, developers can decide whether or not to fix them, just as we can choose whether or not to use them,” he explains.