
Testing ChatGPT /2: Large language models conflate conformity with truth

May 3, 2023
  1. Large language models (such as GPT-3) generate text based on probability (see the sketch after this list).
  2. They have no internal concept of truth.
  3. The probabilities are reflections of conformity: they are weights derived from existing usage patterns in the model's training data.
  4. Training adds further weight in the form of human feedback on the positivity or negativity of what has been generated.
  5. Generated text therefore scores highly for conformity with the model's own training data and with the feedback received.
  6. High conformance is conflated with truth.
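
As a rough sketch of point 1, the toy Python below picks the next word by sampling from a made-up probability table. The words and numbers are invented for illustration and are not taken from any real model, but the mechanism is the point: the most probable (most conforming) continuation wins, whether or not it happens to be true.

import random

# Invented probabilities for the word following "the sky is" -
# stand-ins for weights learned from prior usage patterns.
next_word_probs = {
    "blue": 0.72,      # the most common, most "conforming" continuation
    "grey": 0.15,
    "falling": 0.08,
    "green": 0.05,     # rare, non-conforming continuation
}

def sample_next_word(probs):
    # Pick the next word by sampling the probability distribution.
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# Conformity, not truth, decides which word is most likely to appear.
print(sample_next_word(next_word_probs))

Nothing in this procedure checks the chosen word against the world; it only checks it against what has been written before.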

Of course, this is a common feature of human interactions as well. When many people hold a particular opinion, that opinion acquires a higher perceived truth value, even if it is utterly wrong or false. The deification of “the many” has warped our notions of what truth is.

But the fuzziness around truth could be a particular issue for large language models and for AI more generally. Truth is not an inherent, internal concept of such models. Truth becomes relative (to the views of the trainers), and conformity is conflated with truth.


ktwop: Do large language models generate probabilistic text?
ChatGPT: That’s correct! Large language models generate text using probability distributions. These models are trained on massive amounts of text data and learn to predict the likelihood of certain words or phrases based on the context of the surrounding text.
When generating text, the model samples from these probability distributions to determine which word or phrase to use next. Because the model is probabilistic, it can generate different outputs for the same input, making the generated text more diverse and interesting.
However, it’s important to note that while large language models like GPT-3 are very good at generating coherent text, they are not perfect and can still produce errors or nonsensical output. It’s also important to be aware of ethical concerns surrounding the use of such models for potentially harmful purposes such as generating fake news or propaganda.
ktwop: How can the output from a language model be given a truth value?
ChatGPT: The output from a language model can be given a truth value by comparing it to a reference or ground truth text. The reference text can be a sentence or a paragraph that is known to be true or false, depending on the task at hand. For example, if the task is to generate a factually correct sentence, the truth value of the output can be determined by comparing it to a reliable source of information, such as a verified news article or a trustworthy database. If the output matches the reference text, it can be considered true, and if it does not, it can be considered false.
Similarly, for tasks such as sentiment analysis or topic classification, the truth value can be determined by comparing the output to a set of labels that are already defined as true or false. For example, if the task is to classify a text as positive or negative, the truth value of the output can be determined by comparing it to a labeled dataset where each text is already classified as positive or negative.
In summary, the truth value of the output from a language model can be determined by comparing it to a reference text or a set of labeled data that is already known to be true or false.
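
As a rough illustration of the comparison ChatGPT describes, the toy Python below assigns a truth value only by matching generated text against a stored reference; the “facts” in it are invented for the example. The truth sits entirely in the reference data, compiled by someone outside the model, which is exactly the point made above.

# Invented reference data standing in for a "ground truth" source.
reference_facts = {
    "capital of France": "Paris",
    "boiling point of water at sea level": "100 degrees Celsius",
}

def truth_value(question, generated_answer):
    # True only if the generated answer matches the stored reference.
    reference = reference_facts.get(question)
    return reference is not None and generated_answer.strip() == reference

print(truth_value("capital of France", "Paris"))  # True
print(truth_value("capital of France", "Lyon"))   # False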