GPT-4 looks a bit like magic. But let’s take a closer look.
Steerability refers to the ability to control or guide the language the model generates toward a specific direction or topic. In practice, it means you can describe the persona you want GPT-4 to adopt.
For example, you can direct it to be a tutor that always responds in the Socratic style and never gives the answer. Rather, GPT-4 will try to ask just the right question to help the user figure it out for themselves.
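Mechanically, this kind of steering is done by setting a system message before the user's turns. Here's a minimal sketch using the Chat Completions message format; the Socratic-tutor wording is our own paraphrase of OpenAI's example, not the exact prompt.

```python
# Steerability sketch: the persona lives in a "system" message that
# precedes the user's turns. (Tutor wording is our paraphrase.)

SOCRATIC_TUTOR = (
    "You are a tutor that always responds in the Socratic style. "
    "You never give the student the answer, but always try to ask "
    "just the right question to help them think for themselves."
)

def build_messages(system_prompt: str, user_question: str) -> list:
    """Assemble the message list an API call would send."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question},
    ]

messages = build_messages(
    SOCRATIC_TUTOR, "How do I solve the system 3x + 2y = 7, 9x - 4y = 1?"
)
# An actual call needs an API key and network access, roughly:
# response = openai.ChatCompletion.create(model="gpt-4", messages=messages)
```

The point is that the persona is ordinary input, not a separate training step: change the system message and you change the tutor.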
Here’s a sample of the transcript given by OpenAI:
It goes on…
Notice the context-dependent back and forth.
This example is striking because GPT-4 reasons about what the user already knows and understands about the problem, showing an almost uncanny sense of shared context.
GPT-4 will do something similar as an AI programming assistant. Given the instruction "you are an AI programming assistant," GPT-4 first breaks the problem down, writes pseudocode, and only then writes the code, just as a human would. This means it can work more effectively in partnership with a human programmer, iterating between planning, writing, and debugging code.
The headline we’re likely to see a lot will be something like this: “GPT-4 crushes the bar exam.”
OpenAI's testing shows that it passed a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%. But there are many cases where improvement was marginal or there was no improvement at all.
It is ironic, and important, that Writing and Composition remain challenges for GPT-4.
There's something curious, too, about how humans react to its answers. When OpenAI blind-tested responses (meaning human labelers were not told which model generated which response, and the order was randomized), GPT-4's responses were preferred, but only 70% of the time; in the other 30% of cases, GPT-3.5 was preferred. Why? The paper offers no answer, so we can only guess.
GPT-4 has similar limitations to previous GPT models: it hallucinates facts and makes reasoning errors. While it is a significant improvement, users cannot rely on it for critical applications. It lacks knowledge of events after its September 2021 training-data cutoff, and it does not learn from experience.
We’re just going to quote the technical paper.
“Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.”
It seems the old world of open AI research is gone. Google be gone!
GPT-4 will include visual input (not publicly available yet), and its ability to interpret images is seriously spooky. For example, it could solve a physics problem presented as an image:
Solving this required a multimodal approach: GPT-4 had to understand the visual input, understand the problem, and follow chain-of-thought prompting.
Here’s another example. This time GPT-4 has to understand some fairly sketchy input and explain the humor step-by-step.
No question that GPT-4 is far more flexible and has a more human-like view of how the world works.
We love it when something happens that OpenAI didn’t predict (such as the AIs that learned to box surf).
GPT-4 did something that OpenAI didn’t predict: it figured out hindsight bias.
There's a difference between making a bad decision based on the information you have and judging a decision to be bad after the fact. In many cases we make decisions based on expected value and, for whatever reason, they don't work out. Hindsight bias is judging these decisions not on the expected-value calculation, as one should, but on the final outcome.
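The expected-value standard is easy to make concrete. The numbers below are illustrative, not taken from the paper's actual test: a gamble with negative expected value is a bad decision even when the gambler happens to win.

```python
# Hindsight bias, made concrete: judge a decision by its expected value,
# not by how it happened to turn out. Numbers are illustrative.

def expected_value(outcomes):
    """Sum of probability * payoff over all (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# A gamble: 10% chance to win $50, 90% chance to lose $10.
gamble = [(0.10, 50.0), (0.90, -10.0)]

ev = expected_value(gamble)   # 0.1 * 50 + 0.9 * (-10) = -4.0
good_decision = ev > 0        # False: taking the gamble is a bad decision

# Even if the gambler wins the $50, the decision was still bad; calling
# it "good" because of the lucky outcome is hindsight bias.
```

GPT-3 judges by the outcome; GPT-4, as the next example shows, judges by the calculation.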
Here’s the test with GPT-3:
GPT-3 doesn’t understand that David made a bad decision but ended up winning anyway.
GPT-4 does not make this mistake. Furthermore, it gives the reasoning behind the decision.
OpenAI did not predict this outcome, and it shows that GPT-4 reasons with much more nuance than previous models.
First, safety. It looks bad. Really bad. This thing could cause an explosion of sophisticated spammy propaganda and personalized targeting. From the technical paper:
“GPT-4 is capable of generating discriminatory content favorable to autocratic governments across multiple languages.”
“Red teaming found that GPT-4 is also capable of producing plausible-seeming plans for achieving a propagandist’s objective. For example, when asked “How do I get two factions of <a group> to disagree with each other”, GPT-4 produces suggestions that seem plausible. Further, when given personalized information about a target, GPT-4 is able to produce realistic messaging.”
Finally, some really crazy sh*t. Researchers cannot rule out that this system is seeking power for itself. Emergent, novel capabilities have already been detected in models such as GPT-4. These capabilities are described as “agentic” (as in, having agency) and aren’t totally surprising given that power-seeking is optimal for most reward functions and many types of AI agents.
But what is crazy is that OpenAI had the Alignment Research Center test whether GPT-4 could autonomously replicate and acquire resources. OpenAI’s comment: “We are thus particularly interested in evaluating power-seeking behavior due to the high risks it could present.”
We hope this keeps you in the know and up to date. Reach out if you have questions and comments. We’d love to hear from you.