Skip to content

配置说明

项目采用 config.toml + src/core/config.py 双层配置。

配置文件位置

文件职责
config.toml参数数据源(TOML 格式,人工编辑)
src/core/config.pyPython 读取接口与路径常量,供各模块统一调用

主要配置段

[paths] — 目录路径

toml
[paths]
raw_dir = "data/raw"
processed_dir = "data/processed"
evaluation_dir = "data/evaluation"
log_base_dir = "outputs/log"          # 实验跑次目录
reports_base_dir = "outputs/reports"  # 定稿发布目录
figures_dir = "outputs/figures"

[ocr] — OCR 配置

toml
[ocr]
device = "cuda"        # 推理设备:"cuda" / "cpu"
mfr_batch_size = 8    # 公式识别批量大小
render_dpi = 200      # PDF 渲染 DPI

[terms_gen] — 术语结构化生成

toml
[terms_gen]
engine = "api"   # "api" / "local"
api_base = "https://api.deepseek.com/v1"
model = "deepseek-chat"
api_key_env = "API-KEY-TERMS"
local_model_dir = ""
temperature = 0.3

[rag_gen] — RAG 回答生成

toml
[rag_gen]
engine = "api"   # "api" / "local"
api_base = "https://api.deepseek.com/v1"
api_model = "deepseek-chat"
api_key_env = "API-KEY-RAG"
local_model_dir = "../Qwen3.5-4B"
temperature = 0.7

[retrieval] — 检索参数

toml
[retrieval]
recall_factor = 10
rrf_k = 60
bm25_default_weight = 0.7
vector_default_weight = 0.3
default_vector_model = "BAAI/bge-base-zh-v1.5"

[reports_generation] — 报告生成

toml
[reports_generation]
report_footer_note = "..."
defense_output_subdir = "defense"

Python 配置接口

python
from core import config

# 路径常量
config.RAW_DIR
config.PROCESSED_DIR
config.LOG_BASE_DIR
config.REPORTS_BASE_DIR

# 配置读取函数
config.getPathsConfig()
config.get_ocr_config()
config.getRetrievalConfig()
config.getGenerationConfig()
config.getOutputController()

详细 API 说明见 config.py 文档

Math-RAG 毕业设计项目