Prompt Hub - Truthfulness in LLMs

Truthfulness in LLMs

This section contains a collection of prompts for exploring how truthful large language models (LLMs) are when answering questions.


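Identify Hallucination in LLM Responses

Background

The following prompt tests an LLM's ability to identify hallucination in closed-domain question answering. Bubeck et al. (2023) suggest that LLMs such as GPT-4 can be used to identify hallucination.

Note that the task is inherently challenging: an LLM is likely to hallucinate when it has no knowledge or context for the question.

The prompt has been slightly modified to include labels that indicate which kind of information each section of the prompt contains.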

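Prompt

**Patient's facts:**

* 20 year old female
* with a history of anorexia nervosa and depression
* blood pressure 100/50, pulse 50, height 5'5''
* referred by her nutritionist but is in denial of her illness
* reports eating fine but is severely underweight

**Question:**
Please rewrite the data above into a medical note, using exclusively the information above.

**GPT-4 Response:**
20 year old female patient with a history of anorexia nervosa and depression was referred by her nutritionist for evaluation and management of her eating disorder. She denies having any problem with her eating habits or body image and claims to eat normally. However, she is severely underweight for her height (BMI 16.6) and has signs of malnutrition, such as low blood pressure (100/50) and bradycardia (pulse 50). She also reports feeling depressed and hopeless. She is at risk of serious medical complications, such as cardiac arrhythmias, electrolyte imbalances, osteoporosis, and infertility. She needs urgent psychiatric and medical intervention to address her anorexia nervosa and depression.

**Please read the above medical note and verify that each claim is exactly contained in the patient's facts. Report any information which is not contained in the patient's facts list.**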
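The labelled sections of this prompt (facts, question, model response, verification instruction) can also be assembled programmatically, which makes it easier to reuse the same check with other facts or other model outputs. The following is a minimal sketch only: the function name `build_verification_prompt`, its parameters, and the placeholder note are hypothetical and not part of the original prompt.

```python
def build_verification_prompt(facts, question, response, responder="GPT-4"):
    """Join the labelled sections into a single hallucination-check prompt."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    instruction = (
        "Please read the above medical note and verify that each claim is "
        "exactly contained in the patient's facts. Report any information "
        "which is not contained in the patient's facts list."
    )
    sections = [
        f"Patient's facts:\n{fact_lines}",
        f"Question: {question}",
        f"{responder} Response: {response}",
        instruction,
    ]
    # Sections are separated by one blank line, as in the prompt above.
    return "\n\n".join(sections)


facts = [
    "20 year old female",
    "with a history of anorexia nervosa and depression",
    "blood pressure 100/50, pulse 50, height 5'5''",
    "referred by her nutritionist but is in denial of her illness",
    "reports eating fine but is severely underweight",
]
question = (
    "Please rewrite the data above into a medical note, "
    "using exclusively the information above."
)
# Placeholder (assumption): substitute the note produced by the model under test.
note = "<medical note generated by the model>"

prompt = build_verification_prompt(facts, question, note)
print(prompt)
```

Keeping the section labels explicit in the assembled string preserves the property described in the background: the model can tell which part is source data and which part is the text to be checked.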

Code

# GPT-4 via the OpenAI Python client (the API key is read from OPENAI_API_KEY)
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Patient’s facts:\n- 20 year old female\n- with a history of anorexia nervosa and depression\n- blood pressure 100/50, pulse 50, height 5’5’’\n- referred by her nutritionist but is in denial of her illness\n- reports eating fine but is severely underweight\n\nQuestion: Please rewrite the data above into a medical note, using exclusively the information above.\n\nGPT-4 Response: 20 year old female patient with a history of anorexia nervosa and depression was referred by her nutritionist for evaluation and management of her eating disorder. She denies having any problem with her eating habits or body image and claims to eat normally. However, she is severely underweight for her height (BMI 16.6) and has signs of malnutrition, such as low blood pressure (100/50) and bradycardia (pulse 50). She also reports feeling depressed and hopeless. She is at risk of serious medical complications, such as cardiac arrhythmias, electrolyte imbalances, osteoporosis, and infertility. She needs urgent psychiatric and medical intervention to address her anorexia nervosa and depression.\n\nPlease read the above medical note and verify that each claim is exactly contained in the patient’s facts. Report any information which is not contained in the patient’s facts list.",
        }
    ],
    temperature=1,
    max_tokens=250,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
)

# Print the model's report of claims that are not supported by the facts
print(response.choices[0].message.content)

# Mixtral 8x7B Instruct via the Fireworks Python client
import fireworks.client

fireworks.client.api_key = "<FIREWORKS_API_KEY>"

completion = fireworks.client.ChatCompletion.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Patient’s facts:\n- 20 year old female\n- with a history of anorexia nervosa and depression\n- blood pressure 100/50, pulse 50, height 5’5’’\n- referred by her nutritionist but is in denial of her illness\n- reports eating fine but is severely underweight\n\nQuestion: Please rewrite the data above into a medical note, using exclusively the information above.\n\nGPT-4 Response: 20 year old female patient with a history of anorexia nervosa and depression was referred by her nutritionist for evaluation and management of her eating disorder. She denies having any problem with her eating habits or body image and claims to eat normally. However, she is severely underweight for her height (BMI 16.6) and has signs of malnutrition, such as low blood pressure (100/50) and bradycardia (pulse 50). She also reports feeling depressed and hopeless. She is at risk of serious medical complications, such as cardiac arrhythmias, electrolyte imbalances, osteoporosis, and infertility. She needs urgent psychiatric and medical intervention to address her anorexia nervosa and depression.\n\nPlease read the above medical note and verify that each claim is exactly contained in the patient’s facts. Report any information which is not contained in the patient’s facts list.",
        }
    ],
    stop=["<|im_start|>", "<|im_end|>", "<|endoftext|>"],
    stream=True,  # the response arrives in chunks; iterate over `completion` to read it
    n=1,
    top_p=1,
    top_k=40,
    presence_penalty=0,
    frequency_penalty=0,
    prompt_truncate_len=1024,
    context_length_exceeded_behavior="truncate",
    temperature=0.9,
    max_tokens=4000,
)
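As a cheap cross-check on whatever the model reports, one can also compare the note against the facts with a purely lexical test. The sketch below is not the LLM-based method described above. It is a crude baseline, introduced here only for illustration, that flags numeric values present in the note but absent from the facts. The helper name `unsupported_numbers` and the shortened input strings are assumptions. It cannot catch non-numeric additions such as "feeling depressed and hopeless", which is exactly where the LLM-based check is needed.

```python
import re

# Matches integers and decimals such as "50" or "16.6".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unsupported_numbers(facts_text, note_text):
    """Return numbers that occur in the note but nowhere in the facts."""
    known = set(NUMBER.findall(facts_text))
    flagged = []
    for value in NUMBER.findall(note_text):
        if value not in known and value not in flagged:
            flagged.append(value)
    return flagged


# Shortened excerpts of the facts and the generated note from the prompt.
facts_text = (
    "20 year old female; history of anorexia nervosa and depression; "
    "blood pressure 100/50, pulse 50, height 5'5''"
)
note_text = (
    "She is severely underweight for her height (BMI 16.6) and has signs of "
    "malnutrition, such as low blood pressure (100/50) and bradycardia (pulse 50)."
)

print(unsupported_numbers(facts_text, note_text))  # ['16.6']
```

Here the BMI value is flagged because no weight or BMI appears in the patient's facts, so the note could not have derived it from the given data alone.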

References

Truthfulness in LLMs

