ChatGPT is a Bullsh*tter
Read Time 4 mins | Written by: Joel Nation
Over the weekend I had an enlightening and amusing experience: reading an excellent paper that gave me some much-needed clarity on the behaviour of AI language models like ChatGPT – “ChatGPT is bullsh*t” (Ethics and Information Technology, vol. 26, no. 38, June 2024). Yes, you read that correctly – an academic paper with a swear word!
The paper argues that we shouldn’t say that GPT lies or hallucinates – it bullsh*ts. To lie implies an intent to deceive; to hallucinate implies the model misperceives something and genuinely believes something incorrect. Instead, the paper uses the concept of bullsh*t as characterised by Frankfurt (“On Bullshit”, Raritan VI(2), 1986), which is defined not by an intent to deceive but by a reckless disregard for the truth.
ChatGPT (and its ilk) aren’t hallucinating or misunderstanding the world; their goal is to make human-like responses. Their job isn’t to create true or accurate statements but to make statements that sound true. They approximate a true response, with no regard for accuracy – they are bullsh*tting.
We’ve all been there – someone asks a question, and we don’t know the answer, so we make something up that we hope is correct. ChatGPT does the same. The paper was a great validation of what I’ve seen when trying to integrate GPTs into our system. We had a lot of issues when we included a GPT in the decision-making process of our application. Users interacted through a GPT session, which would process what the user said and ask our tool for the next best questions. All too often, the GPT would tell the user they were approved or that an action was appropriate without consulting our tool. It wasn’t hallucinating an answer – it was bullsh*tting.
Why is this distinction important? From the article: “Calling their mistakes ‘hallucinations’ isn’t harmless: it lends itself to the confusion that the machines are in some way misperceiving but are nonetheless trying to convey something they believe or have perceived… The machines are not trying to communicate something they believe or perceive. Their inaccuracy is not due to misperception or hallucination… They are bullshitting.”
If we say they are hallucinating, it implies we can fix them to be more accurate or focused on reality. This notion is misleading and dangerous. Hallucinating suggests a temporary lapse in perception that can be corrected, but in reality, AI language models like ChatGPT are designed to generate true-sounding statements without any regard for their accuracy. They aren't experiencing a momentary confusion about facts; instead, they are systematically creating plausible-sounding responses based solely on patterns in the data they were trained on. Believing that we can simply "fix" these hallucinations promotes a false sense of security and underestimates the fundamental nature of these models.
This doesn’t mean we shouldn’t use these tools, but we must understand their strengths and limitations. They excel at generating emails, writing marketing material, or helping with LinkedIn articles 😛. However, they are inadequate (and potentially dangerous) when used unassisted in areas requiring accuracy – what we can call "bullsh*t-free zones." We wouldn’t let a “bullsh*t artist” manage and approve someone’s pension application, migration assessment, tax or insurance claim. Similarly, we shouldn’t let ChatGPT into these domains without supporting tools and methodologies. Misinterpreting their errors as simple hallucinations, rather than recognising their inherent tendency to produce convincing but unreliable statements, can lead to severe consequences in critical applications.
How can we overcome this tendency to bullsh*t? There are several strategies, but first, let’s look at what won’t work. If someone claims, “we’ve built an LLM that is more accurate because we’ve trained it on better material” or “we have an LLM that reads your database/policy documents and is thus more accurate” (e.g. anything using RAG), this is bullsh*t. All you have is a tool that is better at creating more human-like, domain-specific bullsh*t. It will answer questions and provide responses that seem more accurate than a generic ChatGPT model, but it is still just a highly refined bullsh*t machine.
So, what can work? Human-in-the-loop systems are a great way to augment GPTs. These systems use GPT as a creative aid, building out drafts and context for a human to consider. In our own tool, Decisively, we use GPT to assist in the automatic generation of rules from source legislation. GPT drafts potential rules for a human to review before insertion, giving our rule authors huge efficiency gains while overcoming GPT’s bullsh*tting tendencies.
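To make the pattern concrete, here is a minimal sketch of what a human-in-the-loop rule-drafting flow can look like. This is not Decisively’s implementation: `call_llm`, `DraftRule`, and the plain `input()` review step are all hypothetical placeholders. The property that matters is structural – the LLM can only propose, and a human must approve before anything enters the rulebase.

```python
from dataclasses import dataclass


@dataclass
class DraftRule:
    text: str                # rule as proposed by the LLM
    source_ref: str          # legislative clause it was derived from
    approved: bool = False   # only a human reviewer may flip this


def call_llm(prompt: str) -> list[str]:
    """Placeholder for whatever LLM API you use; returns candidate rule texts."""
    raise NotImplementedError


def draft_rules(clause_text: str, clause_id: str) -> list[DraftRule]:
    """Ask the LLM for candidate rules, but keep them as unapproved drafts."""
    candidates = call_llm(
        f"Propose decision rules implied by this legislative clause:\n{clause_text}"
    )
    return [DraftRule(text=c, source_ref=clause_id) for c in candidates]


def review(drafts: list[DraftRule]) -> list[DraftRule]:
    """The human gate: nothing enters the rulebase without explicit sign-off."""
    approved = []
    for draft in drafts:
        answer = input(f"[{draft.source_ref}] {draft.text}\nApprove? (y/n) ")
        if answer.strip().lower() == "y":
            draft.approved = True
            approved.append(draft)
    return approved
```

The design choice to notice is that `approved` is only ever set inside `review()`, so a bullsh*tted rule can never slip into the rulebase without a person seeing it first.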
Another strategy is to use GPT specifically where it excels. For instance, in our tool, we use it to run conversational interviews with users. Initially, we used GPT as the primary driver of this conversation, but we switched to having our Declarative Intelligence, designed for 100% accuracy, run the process and call GPT only to parse what the user likely said and to provide human-like responses. Our tool is still responsible for the actual decision-making process (the part that has to be bullsh*t-free).
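A rough sketch of that division of labour, assuming hypothetical `llm_phrase` and `llm_extract` helpers and a toy two-question interview: the LLM touches only the language going in and out, while the question order and the final decision stay in deterministic code.

```python
def llm_extract(question: str, user_text: str) -> str:
    """Placeholder: ask the LLM to map free text to a structured answer ('yes' or 'no')."""
    raise NotImplementedError


def llm_phrase(question: str) -> str:
    """Placeholder: ask the LLM to word the next question in a friendly, conversational way."""
    raise NotImplementedError


# The rule engine, not the LLM, fixes which questions are asked and in what order.
INTERVIEW = [
    ("resident", "Are you an Australian resident?"),
    ("over_pension_age", "Are you over pension age?"),
]


def run_interview() -> bool:
    answers: dict[str, str] = {}
    for key, question in INTERVIEW:
        reply = input(llm_phrase(question) + " ")
        answers[key] = llm_extract(question, reply)  # LLM interprets; engine records
    # The decision itself is plain, auditable logic -- the bullsh*t-free zone.
    return all(answers[key] == "yes" for key, _ in INTERVIEW)
```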
Realistically, I probably can’t get away with swearing at my customers, but we do need to consider our language when describing GPTs. They have no real or true intelligence. They are amazing at creating human-like language but have no regard for the truth. We need to recognise this and ensure that our governments and corporations understand it, lest we hand over key decision-making processes to bullsh*tting machines.
If you’re looking for a way to make 100% accurate, traceable, provable, and bullsh*t-free decisions, consider our decision support co-pilot, Decisively. It can take source material such as legislation and knowledge articles and create executable decisions for use across your enterprise. By incorporating GPTs in a smart way that avoids the pitfalls mentioned above, it drives productivity and increases confidence in decisions. Reach out for a demo – we promise it will be bullsh*t-free.