Not so much a new product as a discussion of how recent developments have changed the way we look at what constitutes AI.
--
If a machine or an AI program matches or surpasses human intelligence, does that mean it can simulate humans perfectly? If yes, then what about reasoning—our ability to apply logic and think rationally before making decisions? How could we even identify whether an AI program can reason? To try to answer this question, a team of researchers has proposed a novel framework that works like a psychological study for software.
"This test treats an 'intelligent' program as though it were a participant in a psychological study and has three steps: (a) test the program in a set of experiments examining its inferences, (b) test its understanding of its own way of reasoning, and (c) examine, if possible, the cognitive adequacy of the source code for the program," the researchers
note.
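The three steps the researchers describe can be pictured as a test harness that runs a program through the study. The sketch below is purely illustrative: the function and class names, the probe format, and the "cognitive adequacy" check are hypothetical stand-ins, not anything from the paper.

```python
def run_reasoning_study(subject, inference_probes, introspection_probes,
                        source_code=None):
    """Hypothetical harness for the proposed three-step framework,
    treating the program like a participant in a psychological study."""
    results = {}

    # (a) Test the program's inferences in a set of experiments.
    results["inference"] = [subject.answer(q) == expected
                            for q, expected in inference_probes]

    # (b) Test its understanding of its own way of reasoning.
    results["introspection"] = [subject.explain(q) == expected
                                for q, expected in introspection_probes]

    # (c) If possible, examine the source code for cognitive adequacy.
    # (Reduced here to a toy placeholder predicate.)
    results["source_adequate"] = (source_code is not None
                                  and "reason" in source_code)
    return results


class ToySubject:
    """Toy stand-in for an AI program under study (illustrative only)."""
    def answer(self, question):
        return question.upper()       # pretend inference
    def explain(self, question):
        return "modus ponens"         # pretend self-report of its reasoning
```

A real study would replace the toy probes with carefully designed reasoning experiments; the point of the sketch is only that steps (a) and (b) probe behavior while step (c) inspects the program itself.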
They suggest the standard methods of evaluating a machine’s intelligence, such as the
Turing Test, can only tell you if the machine is good at processing information and
mimicking human responses. The current generation of AI programs, such as
Google’s LaMDA and OpenAI’s ChatGPT, for example,
have come close to passing the Turing Test, yet the test results don’t imply these programs can think and reason like humans.
This is why the Turing Test may no longer be relevant, and there is a need for new evaluation methods that could effectively assess the intelligence of machines, according to the researchers. They claim that their framework could be an alternative to the Turing Test. “We propose to replace the Turing test with a more focused and fundamental one to answer the question: do programs reason in the way that humans reason?” the study authors
argue.
What’s wrong with the Turing Test?
During the Turing Test, evaluators play different games involving text-based communications with real humans and AI programs (machines or
chatbots). It is a blind test, so evaluators don’t know whether they are texting with a human or a chatbot. If the AI programs are successful in generating human-like responses—to the extent that evaluators struggle to distinguish between the human and the AI program—the AI is considered to have passed. However, since the Turing Test is based on subjective interpretation, these
results are also subjective.
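The blind-evaluation protocol described above boils down to a simple measurement: what fraction of machine-generated transcripts does a judge label as human? The sketch below is an illustrative toy, not a real evaluation; the `naive_judge` heuristic is an assumption invented for the example.

```python
def turing_pass_rate(machine_transcripts, judge):
    """Fraction of machine-generated transcripts a blind judge
    labels as 'human' -- a crude proxy for 'passing' the test."""
    votes = [judge(text) == "human" for text in machine_transcripts]
    return sum(votes) / len(votes)


# Hypothetical judge heuristic: replies that ask a question back
# "feel" human. Real evaluators are people, not rules like this.
def naive_judge(text):
    return "human" if "?" in text else "machine"
```

Because the judge's labeling is a subjective call, two judges can produce very different pass rates for the same transcripts, which is exactly the weakness the researchers point to.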