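Program-Aided Language Models (PAL)
- PAL (Program-Aided Language Models) is a prompting technique that combines code generation with language-model reasoning, proposed by Gao et al. in 2022.
- Unlike Chain-of-Thought (CoT), PAL expresses its intermediate reasoning steps as code; the final answer comes from running that code in an interpreter such as Python.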
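How PAL Works
- Input a natural-language question: the user describes the task or problem.
- Generate reasoning steps as code: the LLM translates the task into program logic (in Python or a similar language), which serves as the intermediate reasoning.
- Execute the program: an external interpreter runs the generated steps and produces the computed results.
- Produce the answer: the final answer is taken from the program's output, which the LLM can optionally wrap in a complete natural-language response.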
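The four steps above can be sketched end to end as follows. This is a minimal illustration, not the authors' implementation: `fake_llm`, `run_program`, and `pal` are hypothetical names, the "model" is a stub that returns a fixed program, and the convention of reading a variable called `answer` is an assumption made for this sketch.

```python
def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always returns the same program."""
    return (
        "# Olivia has $23 and buys five bagels at $3 each.\n"
        "money_initial = 23\n"
        "bagels = 5\n"
        "bagel_cost = 3\n"
        "money_spent = bagels * bagel_cost\n"
        "answer = money_initial - money_spent\n"
    )


def run_program(code: str) -> object:
    """Execute the generated code and read back the `answer` variable."""
    namespace: dict = {}
    exec(code, namespace)  # plain exec: acceptable only for code you trust
    return namespace["answer"]


def pal(question: str, llm=fake_llm) -> str:
    prompt = f"# Q: {question}\n"       # step 1: natural-language question
    code = llm(prompt)                  # step 2: LLM writes the reasoning as code
    result = run_program(code)          # step 3: the interpreter does the computation
    return f"The answer is {result}."   # step 4: report the computed result


print(pal("Olivia has $23. She bought five bagels for $3 each. How much money does she have left?"))
```

In a real setup, `fake_llm` would be replaced by a call to an actual model, with few-shot examples placed before the question.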
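Example Applications
- Mathematical reasoning: on math benchmarks such as GSM8K, PAL generates a Python program to do the arithmetic, which makes the results more accurate. Gao et al. report that PAL outperforms CoT prompting on a much larger model, with accuracy about 15 percentage points higher.
- Symbolic and algorithmic reasoning: on challenging datasets such as BIG-Bench Hard, PAL combines code execution with language understanding and is reported to be more accurate overall.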
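To make the symbolic case concrete, here is what a PAL-style program for an object-counting question might look like. The question is illustrative rather than quoted from a dataset, and the `solution()` wrapper is an assumed convention for this sketch.

```python
def solution() -> int:
    # Q: I have a chair, two potatoes, a cauliflower, a lettuce head, two tables,
    #    a cabbage, two onions, and three fridges. How many vegetables do I have?
    # The language step: decide which items are vegetables and record their counts.
    vegetables_to_count = {
        "potato": 2,
        "cauliflower": 1,
        "lettuce head": 1,
        "cabbage": 1,
        "onion": 2,
    }
    # The computation step: let the interpreter do the addition.
    return sum(vegetables_to_count.values())


print(solution())  # prints 7
```

The model still has to classify the items correctly; only the counting is offloaded to the interpreter.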
Example:
# Import the required packages
import openai
from datetime import datetime
from dateutil.relativedelta import relativedelta
import os
from langchain.llms import OpenAI
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# API configuration
openai.api_key = os.getenv("OPENAI_API_KEY")
# for LangChain
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
llm = OpenAI(model_name='text-davinci-003', temperature=0)
# Set up the prompt and the question
question = "Today is 27 February 2023. I was born exactly 25 years ago. What is the date I was born in MM/DD/YYYY?"
DATE_UNDERSTANDING_PROMPT = """
# Q: 2015 is coming in 36 hours. What is the date one week from today in MM/DD/YYYY?
# If 2015 is coming in 36 hours, then today is 36 hours before.
today = datetime(2015, 1, 1) - relativedelta(hours=36)
# One week from today,
one_week_from_today = today + relativedelta(weeks=1)
# The answer formatted with %m/%d/%Y is
one_week_from_today.strftime('%m/%d/%Y')
# Q: The first day of 2019 is a Tuesday, and today is the first Monday of 2019. What is the date today in MM/DD/YYYY?
# If the first day of 2019 is a Tuesday, and today is the first Monday of 2019, then today is 6 days later.
today = datetime(2019, 1, 1) + relativedelta(days=6)
# The answer formatted with %m/%d/%Y is
today.strftime('%m/%d/%Y')
# Q: The concert was scheduled to be on 06/01/1943, but was delayed by one day to today. What is the date 10 days ago in MM/DD/YYYY?
# If the concert was scheduled to be on 06/01/1943, but was delayed by one day to today, then today is one day later.
today = datetime(1943, 6, 1) + relativedelta(days=1)
# 10 days ago,
ten_days_ago = today - relativedelta(days=10)
# The answer formatted with %m/%d/%Y is
ten_days_ago.strftime('%m/%d/%Y')
# Q: It is 4/19/1969 today. What is the date 24 hours later in MM/DD/YYYY?
# It is 4/19/1969 today.
today = datetime(1969, 4, 19)
# 24 hours later,
later = today + relativedelta(hours=24)
# The answer formatted with %m/%d/%Y is
later.strftime('%m/%d/%Y')
# Q: Jane thought today is 3/11/2002, but today is in fact Mar 12, which is 1 day later. What is the date 24 hours later in MM/DD/YYYY?
# If Jane thought today is 3/11/2002, but today is in fact Mar 12, then today is 3/12/2002.
today = datetime(2002, 3, 12)
# 24 hours later,
later = today + relativedelta(hours=24)
# The answer formatted with %m/%d/%Y is
later.strftime('%m/%d/%Y')
# Q: Jane was born on the last day of February in 2001. Today is her 16-year-old birthday. What is the date yesterday in MM/DD/YYYY?
# If Jane was born on the last day of February in 2001 and today is her 16-year-old birthday, then today is 16 years later.
today = datetime(2001, 2, 28) + relativedelta(years=16)
# Yesterday,
yesterday = today - relativedelta(days=1)
# The answer formatted with %m/%d/%Y is
yesterday.strftime('%m/%d/%Y')
# Q: {question}
""".strip() + '\n'
llm_out = llm(DATE_UNDERSTANDING_PROMPT.format(question=question))
print(llm_out)
This prints the following:
# If today is 27 February 2023 and I was born exactly 25 years ago, then I was born 25 years before.
today = datetime(2023, 2, 27)
# I was born 25 years before,
born = today - relativedelta(years=25)
# The answer formatted with %m/%d/%Y is
born.strftime('%m/%d/%Y')
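Then use exec to run the Python code in llm_out. Note that born is a datetime object, so it is formatted before printing:
exec(llm_out)
print(born.strftime('%m/%d/%Y'))
Output:
02/27/1998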
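The generated code ends with a bare expression (the `strftime` call), and `exec` discards that value. One way to capture it without knowing the variable name is to split off the last statement with the `ast` module and evaluate it separately. The sketch below is one possible approach, not part of the original example: `run_and_get_last_value` is a hypothetical helper, and the sample program uses `datetime.replace` instead of `relativedelta` only so that it runs with the standard library alone.

```python
import ast
from datetime import datetime


def run_and_get_last_value(code: str, env: dict) -> object:
    """Execute `code` in `env` and return the value of its final expression, if any."""
    tree = ast.parse(code)
    if not tree.body:
        return None
    last = tree.body[-1]
    if isinstance(last, ast.Expr):
        # Run every statement except the last one...
        head = ast.Module(body=tree.body[:-1], type_ignores=[])
        exec(compile(head, "<generated>", "exec"), env)
        # ...then evaluate the last expression and return its value.
        tail = ast.Expression(body=last.value)
        return eval(compile(tail, "<generated>", "eval"), env)
    # The program does not end with an expression: just run it.
    exec(compile(tree, "<generated>", "exec"), env)
    return None


# A stand-in for llm_out (stdlib-only variant of the generated program).
generated = (
    "# I was born 25 years before today.\n"
    "today = datetime(2023, 2, 27)\n"
    "born = today.replace(year=today.year - 25)\n"
    "born.strftime('%m/%d/%Y')\n"
)

print(run_and_get_last_value(generated, {"datetime": datetime}))  # prints 02/27/1998
```

Passing an explicit `env` dictionary also keeps the generated variables out of the caller's globals, although it does not make untrusted code safe to run.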
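Advantages and Challenges of PAL
Advantages
- More accurate reasoning: executing a program instead of reasoning in semi-structured text reduces arithmetic and logic errors.
- Decoupled design: the language model generates the reasoning program and the interpreter does the computation, so responsibilities are clearly separated.
- Highly extensible: applies to many kinds of tasks, including math, algorithms, format validation, and data processing.
Challenges
- Runtime integration: a code execution environment must be set up, with attention to security and compatibility.
- Complex prompt structure: the prompt template must clearly demonstrate how to translate language into a program so that the LLM can follow the reasoning flow.
- Additional resource overhead: executing code requires extra runtime resources and management.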
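As a small illustration of the runtime concern, generated code can be run in a separate Python process with a timeout rather than with `exec` inside the main program. This is only a sketch under stated assumptions: `run_in_subprocess` is a hypothetical helper, and a child process with a timeout is not a real sandbox (it does not restrict file or network access), so stronger isolation such as containers would still be needed for untrusted code.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout_s: float = 5.0) -> str:
    """Run `code` in a separate Python process and return what it printed."""
    completed = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode (ignores env vars and user site-packages)
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if the code runs too long
    )
    if completed.returncode != 0:
        # Surface the child's traceback instead of silently returning nothing.
        raise RuntimeError(completed.stderr.strip())
    return completed.stdout.strip()


print(run_in_subprocess("print(6 * 7)"))  # prints 42
```

Because the child process shares no variables with the caller, the generated program has to print its answer, which the caller then reads from standard output.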
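Conclusion
- PAL provides a powerful framework that combines program-based reasoning with language generation, making LLMs more accurate and reliable on computational and algorithmic tasks.
- It uses a program as the intermediate reasoning steps and delegates the computation to an external interpreter, overcoming the weaknesses of CoT on complex arithmetic and logic tasks.
- PAL is well suited to reasoning tasks that demand high precision, such as math problem solving, algorithm analysis, and format validation.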
References
Prompt Engineering Guide
Gao et al. (2022)