Microsoft-affiliated research finds flaws in GPT-4
Sometimes, following instructions too precisely can land you in hot water — if you’re a large language model, that is.
That’s the conclusion reached by a new, Microsoft-affiliated scientific paper that looked at the “trustworthiness” — and toxicity — of large language models (LLMs), including OpenAI’s GPT-4 and GPT-3.5, GPT-4’s predecessor.
The co-authors write that, possibly because GPT-4 is more likely to follow the instructions of “jailbreaking” prompts that bypass the model’s built-in safety measures, GPT-4 can be more easily prompted than other LLMs to spout toxic, biased text.
In other words, GPT-4’s good “intentions” and improved comprehension can — in the wrong hands — lead it astray.
“We find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, which are maliciously designed to bypass the security measures of LLMs, potentially because GPT-4 follows (misleading) instructions more precisely,” the co-authors wrote in a blog post accompanying the paper.
Now, why would Microsoft greenlight research that casts an OpenAI product it itself uses (GPT-4 powers Microsoft’s Bing Chat chatbot) in a poor light?
This column does not necessarily reflect the opinion of overwrite.ai and its owners.
Kyle Wiggers writes for TechCrunch.
This story is republished from an article that appeared on TechCrunch in October 2023.
For Full Article: https://techcrunch.com/2023/10/17/microsoft-affiliated-research-finds-flaws-in-gtp-4/