The system seemed to respond appropriately. But the answer did not consider the height of the doorway, which might also prevent a tank or a car from traveling through.

OpenAI’s chief executive, Sam Altman, said the new bot could reason “a little bit.” But its reasoning skills break down in many situations. The previous version of ChatGPT handled the question a little better because it recognized that height and width mattered.

OpenAI said the new system could score among the top 10 percent or so of students on the Uniform Bar Examination, which qualifies lawyers in 41 states and territories. It can also score a 1,300 (out of 1,600) on the SAT and a five (out of five) on Advanced Placement high school exams in biology, calculus, macroeconomics, psychology, statistics and history, according to the company’s tests.

Previous versions of the technology failed the Uniform Bar Exam and did not score nearly as high on most Advanced Placement tests.

On a recent afternoon, to demonstrate its test skills, Mr. Brockman fed the new bot a paragraphs-long bar exam question about a man who runs a diesel-truck repair business.

The answer was correct but filled with legalese. So Mr. Brockman asked the bot to explain the answer in plain English for a layperson. It did that, too.

Though the new bot seemed to reason about things that have already happened, it was less adept when asked to form hypotheses about the future. It seemed to draw on what others have said instead of creating new guesses.

When Dr. Etzioni asked the new bot, “What are the important problems to solve in N.L.P. research over the next decade?” — referring to the kind of “natural language processing” research that drives the development of systems like ChatGPT — it could not formulate entirely new ideas.

The new bot still makes stuff up. Called “hallucination,” the problem haunts all the leading chatbots. Because the systems do not have an understanding of what is true and what is not, they may generate text that is completely false.

When asked for the addresses of websites that described the latest cancer research, it sometimes generated internet addresses that did not exist.

Cade Metz and Keith Collins

Source link

You May Also Like

Update your Windows now to avoid the Acropalypse vulnerability

In a new security alert from Microsoft, it was revealed that the company…

Disney’s Hotstar loses 12.5 million subscribers in a quarter amid cricket shortfall | TechCrunch

Disney+ Hotstar lost nearly a fourth of its customer base, or 12.5…

How to stop Google from its creepy way of using you for facial recognition

Facial recognition has become quite a popular tool over the last few…

AI pioneer says public discourse on intelligent machines must give 'proper respect to human agency'

She’s an important figure behind today’s artificial intelligence boom, but not all…