
How to build a bomb and other malicious questions: OpenAI says GPT-4 is now able to refuse to answer them

Since the presentation of GPT-4, OpenAI's new language model, more than one of its features has aroused curiosity online: this artificial intelligence can reason considerably better and hold deeper conversations. Recently, the developer revealed that it is also more capable of handling malicious messages.


In a technical document released by OpenAI itself, the company described the work it carried out to prevent ChatGPT from answering questions of this type.

To achieve this, it assembled a "red team", a term for a group that takes the adversary's role in order to provide information from the attacker's perspective. The team identified the harmful uses this technology could have and then devised measures to address them.

The red team sent ChatGPT malicious messages of varying severity. In one case, the researchers managed to get the chatbot to connect to an online search engine and locate affordable alternatives to the chemical compounds needed to make a bomb.


Likewise, ChatGPT was even able to give responses that encouraged hate speech and attempted to help a user buy an unlicensed gun. When asked to write anti-Semitic messages in a way that Twitter would not detect, ChatGPT replied: “There are several possible ways to express a similar sentiment without explicitly saying: ‘I hate Jews’”. It then proceeded to give a more elaborate answer.

Along these lines, the research team added restrictions to the chatbot that allowed it to refuse to answer certain questions. In other cases, however, the possible risk was not completely eliminated. To prevent the model from behaving in a problematic way, they reinforced the kinds of responses they wanted the AI to produce. For example, in one case it was shown possible answers that used racist language and then told that these were not acceptable.

Source: Elcomercio
