Prompt Hub - Reasoning with LLMs

This section contains a collection of prompts for testing the reasoning capabilities of large language models (LLMs).


Indirect Reasoning with LLMs

Background

Zhang et al. (2024) recently proposed an indirect reasoning method for strengthening the reasoning capabilities of large language models (LLMs). The method employs the logic of **contrapositives** and **proof by contradiction** to handle indirect reasoning (IR) tasks such as factual reasoning and mathematical proof.

The method involves two key steps:

  1. Enhancing the comprehensibility of LLMs through data and rule augmentation, e.g., extending the input with the logical equivalence of the contrapositive.
  2. Designing prompt templates that guide the LLM to carry out indirect reasoning via proof by contradiction.
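The rule augmentation in step 1 can be sketched as a tiny helper that extends a rule set with the logically equivalent contrapositive of each implication ("if P, then Q" becomes "if not Q, then not P"). The helper and example rule below are illustrative assumptions, not code from the paper:

```python
def contrapositive(rule):
    """Rewrite an implication (P, Q) = 'If P, then Q' as its
    logically equivalent contrapositive 'If not Q, then not P'."""
    p, q = rule
    return (f"not ({q})", f"not ({p})")

# Augment the prompt context with each rule plus its contrapositive.
rules = [("it rains", "the ground is wet")]
augmented = rules + [contrapositive(r) for r in rules]

for p, q in augmented:
    print(f"If {p}, then {q}.")
# Prints:
# If it rains, then the ground is wet.
# If not (the ground is wet), then not (it rains).
```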

Experimental results on LLMs such as GPT-3.5-turbo and Gemini-pro show that, compared with traditional direct reasoning methods, this approach improves accuracy on factual reasoning by 27.33% and on mathematical proof tasks by 31.43%.

Below is an example of a zero-shot proof-by-contradiction prompt.

Prompt

If a + |a| = 0, try to prove that a < 0.
**Step 1:** List the conditions and the question in the original proposition.
**Step 2:** Merge the conditions listed in Step 1 into one, and define it as wj.
**Step 3:** Let us think step by step, considering all possibilities.
If, in at least one possibility, the intersection between wj (the condition defined in Step 2) and the negation of the question is not empty, the original proposition is false. Otherwise, the original proposition is true.
**Answer:**

Code

# OpenAI (GPT-3.5-turbo)
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "If a+|a|=0, try to prove that a<0.\n\nStep 1: List the conditions and questions in the original proposition.\n\nStep 2: Merge the conditions listed in Step 1 into one. Define it as wj.\n\nStep 3: Let us think it step by step. Please consider all possibilities. If the intersection between wj (defined in Step 2) and the negation of the question is not empty at least in one possibility, the original proposition is false. Otherwise, the original proposition is true.\n\nAnswer:"
        }
    ],
    temperature=0,
    max_tokens=1000,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0
)
# Fireworks AI (Mixtral 8x7B Instruct) equivalent
import fireworks.client

fireworks.client.api_key = "<FIREWORKS_API_KEY>"

completion = fireworks.client.ChatCompletion.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    messages=[
        {
            "role": "user",
            "content": "If a+|a|=0, try to prove that a<0.\n\nStep 1: List the conditions and questions in the original proposition.\n\nStep 2: Merge the conditions listed in Step 1 into one. Define it as wj.\n\nStep 3: Let us think it step by step. Please consider all possibilities. If the intersection between wj (defined in Step 2) and the negation of the question is not empty at least in one possibility, the original proposition is false. Otherwise, the original proposition is true.\n\nAnswer:"
        }
    ],
    stop=["<|im_start|>", "<|im_end|>", "<|endoftext|>"],
    stream=True,
    n=1,
    top_p=1,
    top_k=40,
    presence_penalty=0,
    frequency_penalty=0,
    prompt_truncate_len=1024,
    context_length_exceeded_behavior="truncate",
    temperature=0.9,
    max_tokens=4000
)
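The case analysis the prompt asks for in Step 3 can also be sanity-checked numerically: enumerate sample values of a, keep those satisfying the merged condition wj (here, a + |a| = 0), and intersect them with the negation of the conclusion. This is only an illustrative check over hand-picked values, not part of the paper's method:

```python
# Candidate values of a to test (illustrative, not exhaustive).
candidates = [-2, -1, -0.5, 0, 0.5, 1, 2]

# Merged condition wj from Step 2: a + |a| = 0.
w_j = [a for a in candidates if a + abs(a) == 0]

# Intersection of wj with the negation of the conclusion, i.e. not (a < 0).
counterexamples = [a for a in w_j if not (a < 0)]

print(w_j)              # [-2, -1, -0.5, 0]
print(counterexamples)  # [0] -> a = 0 falsifies "a < 0"; a <= 0 is the sound conclusion
```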

Physical Reasoning with LLMs

Background

This prompt tests the physical reasoning capabilities of an LLM by asking it to perform actions on a set of objects.

Prompt

Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner.

Code

# OpenAI (GPT-4)
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner."
        }
    ],
    temperature=1,
    max_tokens=500,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0
)
# Fireworks AI (Mixtral 8x7B Instruct) equivalent
import fireworks.client

fireworks.client.api_key = "<FIREWORKS_API_KEY>"

completion = fireworks.client.ChatCompletion.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner."
        }
    ],
    stop=["<|im_start|>", "<|im_end|>", "<|endoftext|>"],
    stream=True,
    n=1,
    top_p=1,
    top_k=40,
    presence_penalty=0,
    frequency_penalty=0,
    prompt_truncate_len=1024,
    context_length_exceeded_behavior="truncate",
    temperature=0.9,
    max_tokens=4000
)

References

Reasoning with LLMs

