Every month, more than 500 million people trust Gemini and ChatGPT to keep them informed about everything from pasta to sex to homework. But if an AI tells you to cook your pasta in gasoline, you probably shouldn’t take its advice on birth control or algebra either.
At the World Economic Forum last January, OpenAI CEO Sam Altman offered this reassurance: “… I can ask you to explain your reasoning and decide whether it sounds reasonable to me or not. … I think our AI systems will be able to do the same thing. They will be able to explain to us the steps from A to B, and we can decide if we think those are good steps.”
Knowledge needs justification
It’s no surprise that Altman wants us to believe that large language models (LLMs) like ChatGPT can produce transparent explanations for everything they say: without a good justification, nothing humans believe or suspect to be true ever amounts to knowledge. Why not? Well, think about when you feel comfortable saying you positively know something. Most likely, it is when you feel absolutely confident in your belief because it is well supported by evidence, arguments or the testimony of trusted authorities.
LLMs are positioned as trustworthy information providers. But unless they can explain their reasoning, we cannot know whether their claims meet our standards for justification. For example, suppose you tell me today’s Tennessee haze was caused by wildfires in western Canada. I might take your word for it. But suppose yesterday you swore to me, in all seriousness, that snake fighting is a routine part of a thesis defense. Then I know you are not entirely reliable. So I might ask why you think the smog is due to Canadian wildfires. For my belief to be justified, it matters that I know your report is trustworthy.
The problem is that today’s AI systems cannot earn our trust by sharing the reasoning behind what they say, because there is no such reasoning. LLMs are not even remotely designed to reason. Instead, models are trained on vast amounts of human writing to detect, and then predict or extend, complex patterns in language. When a user enters a text prompt, the response is simply the algorithm’s projection of how that pattern will most likely continue. These outputs (increasingly) convincingly mimic what a knowledgeable human might say. But the underlying process has nothing whatsoever to do with whether the output is justified, let alone true. As Hicks, Humphries and Slater put it in “ChatGPT is bullshit,” LLMs are “designed to produce text that looks truth-apt without any actual concern for truth.”
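To make that point concrete, here is a deliberately tiny sketch: a toy word-frequency model, nothing like the neural networks inside a production LLM, that “continues” a prompt purely by sampling statistically likely next words from whatever text it was fed. The corpus, function name and prompt below are invented for illustration. The thing to notice is that nothing in the loop ever checks whether the continuation is true.

import random
from collections import defaultdict, Counter

# A toy training corpus. A real LLM learns from vast amounts of human writing;
# the principle illustrated here is the same: patterns in, patterns out.
corpus = (
    "cook pasta in salted water . cook pasta in olive oil . "
    "cook rice in salted water ."
).split()

# Count which word tends to follow which (a simple bigram table).
next_words = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_words[prev][nxt] += 1

def continue_text(prompt_word, length=6):
    """Extend a prompt by repeatedly sampling a statistically likely next word."""
    out = [prompt_word]
    for _ in range(length):
        counts = next_words.get(out[-1])
        if not counts:
            break
        words, weights = zip(*counts.items())
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

# Plausible-sounding continuation, produced with no concern for truth.
print(continue_text("cook"))

If the corpus had instead said “cook pasta in gasoline,” the model would continue the prompt just as fluently. Truth never enters the process; only the statistics of the text it was trained on do.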
So if AI-generated content is not the artificial equivalent of human knowledge, what is it? Hicks, Humphries and Slater are right to call it bullshit. Still, much of what LLMs spit out is true. When these “bullshitting” machines produce factually accurate output, they produce what philosophers call Gettier cases (after philosopher Edmund Gettier). These cases are interesting because of the strange way they combine true belief with ignorance about that belief’s justification.
AI output can be like a mirage
Consider the following example from the writings of the 8th-century Indian Buddhist philosopher Dharmottara. Imagine we are looking for water on a hot day. We suddenly see water, or so we think. In fact, what we see is not water but a mirage; yet when we reach the spot, we are lucky enough to find water right there under a rock. Can we say that we had genuine knowledge of water?
People widely agree that, whatever knowledge is, the travelers in this example do not have it. Instead, they got lucky, finding water exactly where they had no good reason to believe they would find it.
The problem is that whenever we think we know something we learned from an LLM, we put ourselves in the same position as Dharmottara’s travelers. If the LLM was trained on a quality data set, then its claims are quite likely to be true. Those claims can be likened to the mirage. And evidence and arguments that could justify those claims probably do exist somewhere in its data set, just as the water welling up under the rock turned out to be real. But whatever justifying evidence and arguments exist played no role in producing the LLM’s output, just as the existence of the water played no role in creating the illusion that supported the travelers’ belief that they would find it there.
Altman’s reassurances are therefore deeply misleading. What happens if you ask an LLM to justify its output? It will not give you a real justification. It will give you a Gettier justification: a natural-language pattern that convincingly mimics a justification. A chimera of a justification. As Hicks et al would put it, a bullshit justification, which is, as we all know, no justification at all.
Right now, AI systems regularly mess up, or “hallucinate,” in ways that keep the mask slipping. But as the illusion of justification becomes more convincing, one of two things will happen.
For those who understand that true AI content is one big Gettier case, an LLM’s patently false claim to be explaining its own reasoning will undermine its credibility. We will know that AI is being deliberately designed and trained to be systematically deceptive.
And those of us who are not aware that AI spits out Gettier justifications, that is, fake justifications? Well, we will simply be deceived. To the extent that we rely on LLMs, we will be living in a kind of quasi-matrix, unable to sort fact from fiction and unaware that there might be a difference.
Each output must be justified
When sizing up the significance of this predicament, it is important to keep in mind that there is nothing wrong with LLMs working the way they do. They are incredible, powerful tools. And people who understand that AI systems spit out Gettier cases instead of (artificial) knowledge already use LLMs in ways that take this into account. Programmers use LLMs to draft code, then use their own coding expertise to modify it according to their own standards and purposes. Professors use LLMs to draft essay prompts, then revise them according to their own educational goals. And any speechwriter worthy of the name this election season will fact-check the heck out of an AI-written draft before the candidate takes the stage. And so on.
But most people turn to AI precisely where they lack expertise. Think of teenagers researching algebra… or contraception. Or seniors seeking advice on diet or investments. If LLMs are going to mediate the public’s access to this kind of sensitive information, then at the very least we need to know whether and when we can trust them. And trusting them would require knowing the very thing LLMs cannot tell us: whether and how each output is justified.
Fortunately, you probably know that olive oil works much better than gasoline for cooking spaghetti. But what dangerous recipes for reality have you swallowed whole, without ever tasting their justification?
Hunter Kallay is a doctoral student in philosophy at the University of Tennessee.
Dr. Kristina Gehrman is an Associate Professor of Philosophy at the University of Tennessee.