LLM-generated text is unpredictable. It is often fluent and accurate, but sometimes it is non-factual or even unsafe.
Hallucination is text generated by a model that is not grounded in any data the model has been exposed to. In other words, it is non-factual generated text.
Hallucination is especially problematic and dangerous when the model generates text about a topic that the consumer does not know much about and cannot easily verify.
The threat of hallucination is one of the biggest challenges to safely deploying LLMs.
No known method eliminates hallucination with 100% certainty. There is, however, a growing set of best practices and precautions to take when using LLMs to generate text. As one example, there is some evidence that retrieval-augmented generation (RAG) systems hallucinate less than zero-shot LLMs.
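To illustrate the idea, here is a minimal, self-contained sketch of the RAG pattern: the prompt is grounded in retrieved passages rather than relying on the model's parametric memory alone. The toy corpus, the word-overlap retriever, and the prompt template are all illustrative assumptions, not a production setup.

```python
# Toy corpus standing in for a real document store (illustrative only).
CORPUS = [
    "The Eiffel Tower is 330 metres tall and located in Paris.",
    "Mount Everest is the highest mountain above sea level.",
    "The Great Wall of China is over 21,000 kilometres long.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Naive word-overlap retrieval; real systems use dense or BM25 retrievers."""
    q_words = set(question.lower().split())
    scored = sorted(CORPUS, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_grounded_prompt(question: str) -> str:
    """Ground the prompt in retrieved passages before calling the LLM."""
    context = "\n".join(retrieve(question))
    return (
        "Answer using ONLY the context below; "
        "if the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

print(build_grounded_prompt("How tall is the Eiffel Tower?"))
```

Constraining the model to answer from retrieved context gives it (and the reader) something to check the output against, which is why RAG tends to reduce hallucination.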
In a related line of work that is growing in popularity, researchers are developing methods for measuring the groundedness of LLM-generated output.
These methods take a sentence generated by an LLM and a candidate supporting document, and output whether the document supports the sentence. They typically work by training a separate model to perform Natural Language Inference (NLI), a task that has been studied in the NLP community for a long time.
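As a sketch of how an NLI model can be used this way, the snippet below treats the supporting document as the premise and the generated sentence as the hypothesis, and checks whether the model predicts entailment. It assumes the Hugging Face transformers library and the general-purpose roberta-large-mnli checkpoint; dedicated groundedness metrics use models trained specifically for this task.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Generic NLI model used as a stand-in for a purpose-built groundedness model.
model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def is_grounded(document: str, generated_sentence: str) -> bool:
    """Return True if the document entails (supports) the generated sentence."""
    # Encode the (premise, hypothesis) pair as a single input.
    inputs = tokenizer(document, generated_sentence,
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Labels for this checkpoint: CONTRADICTION, NEUTRAL, ENTAILMENT.
    label = model.config.id2label[logits.argmax(dim=-1).item()]
    return label == "ENTAILMENT"

print(is_grounded(
    "The Eiffel Tower is 330 metres tall and located in Paris.",
    "The Eiffel Tower is in Paris.",
))  # True: the document supports the generated sentence
```

A generated sentence the document contradicts, or simply does not mention, would come back as CONTRADICTION or NEUTRAL and be flagged as ungrounded.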