# SKILL.md

description: 面向中文 IT/系统集成类投标项目的投标文件生成与…
## Goals

- `outline`
  Produces the technical-bid outline and the business outline (which covers everything other than the technical bid); the default delivery order is the technical outline first, then the business outline.
- `outline+business`
  From the produced business outline, generates the complete business bid (everything except the technical bid).
- `outline+technical`
  From the produced technical outline, generates the complete technical bid.
## Deliverable-Oriented Principle

Gate checks, script checks, and render validation are only baseline thresholds, not the definition of done.
The ultimate goal of this skill is a Chinese bid document that can be submitted as-is: the outline, headings, body text, figures, attachments, and final Word layout must all serve the standard of "deliverable directly to the evaluators or the purchaser".

The default document-format baseline is as follows; an explicit requirement in the RFP or the official template takes precedence:

1. Heading numbering defaults to `1 / 1.1 / 1.1.1 / 1.1.1.1`.
2. Level-1 headings default to 黑体 (SimHei), size 小三; level-2: 黑体, 四号; level-3: 黑体, 小四; level-4: 楷体 or 黑体, 小四, kept consistent within one project.
3. Body text defaults to 宋体 (SimSun), 小四, first-line indent of 2 characters, 1.5 line spacing.
4. Tables default to 宋体, 五号 or 小五, with a bold header row.
5. Figure, table, and attachment captions are numbered uniformly, defaulting to `图3-1 XXX / 表4-2 XXX / 附件5-1 XXX`.
6. Placeholder content must be explicitly marked as "placeholder / to be replaced / to be supplemented" and must never masquerade as final material.
## Execution Rules (strictly enforced)

1. Follow the prescribed workflow exactly.
2. Write all outputs only to `work/`, `reports/`, and `final/` under the directory the user provides; every file you create must also stay inside that directory.
3. Do not add any new scripts; use only the project's existing scripts, all of which live under `scripts/`.
4. Run scripts with the `.venv/` inside this skill as the virtual environment: launch them with the venv's `python`, not `python3`. This is the `.venv/` under the skill's own folder, not under your current working directory.
5. Perform all Word operations with the tool scripts this skill provides.
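Rule 4 can be sketched as follows; the skill path below is a hypothetical example, and on Windows the interpreter would be `.venv/Scripts/python.exe` instead:

```python
from pathlib import Path

# Rule 4: always launch skill scripts with the skill's own venv interpreter,
# never the system python3. The skill directory is a hypothetical example path.
skill_dir = Path("/opt/skills/bid-writer")
venv_python = skill_dir / ".venv" / "bin" / "python"  # .venv/Scripts/python.exe on Windows
cmd = [str(venv_python), str(skill_dir / "scripts" / "outline_check.py")]
print(cmd[0])
```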
## Available Tools

- `scripts/docx_create.py`
  Purpose: creates a new Word document directly from structured input; suited to outline-version DOCX files, placeholder drafts, or blank chapter skeletons.

- `scripts/outline_check.py`
  Purpose: runs the gate check on the structured outline results, focusing on outline depth, drill-down of abstract headings, object-based nodes, duplicated facets, and technical placeholders in the business volume.

- `scripts/outline_export.py`
  Purpose: after the outline gate passes, exports the structured JSON and generates the final outline-version DOCX.

- `scripts/docx_patch.py`
  Purpose: performs insert, replace, and delete operations on an existing Word bid document, writing already-generated content into the exact target position.

For operation details, parameter descriptions, and examples see `references/docx-ops.md`.
## Workflow

Stage references (read per stage, not all at once):

- Understanding the RFP: `references/understandbid.md`
- Outline stage: `references/outline-stage.md`
- DOCX script interface: `references/docx-ops.md`
- DOCX rendering and delivery: `references/docx-assembly.md`

1. Understanding stage
   - Read `SKILL.md`.
   - Read the original `rfp/` documents of the current project.
   - If the main document is insufficient, keep digging within the current project: scoring methods, technical specifications, attachments, sub-volumes, and other candidate originals.
   - Establish the project constraints, scoring constraints, risk constraints, the three-layer business classification, and the output boundaries.
2. outline stage
   Proceed according to the user's request (technical outline, business outline, or both).
   - Outline nodes are uniformly expressed as `heading(level/text/children)`.
   - The detailed outline-stage procedure, drill-down rules, gate requirements, and Mermaid flowchart are defined solely by `references/outline-stage.md`.
   - Only after every outline gate defined in `references/outline-stage.md` passes may you generate:
     - `work/final_outline_technical.json`
     - `work/final_outline_business_other.json`
     - `final/技术标_目录版.docx`
     - `final/商务及其他_目录版.docx`
3. business stage
   - Write only `workflow_bucket=business` entries in `work/final_bid_content_business_other.json`.
4. technical stage
   - Write only `workflow_bucket=technical` entries in `work/final_bid_content_technical.json`.
5. other stage
   - Only fill in `workflow_bucket=other` entries and merge them into `work/final_bid_content_business_other.json`.
6. final stage
   - Generate:
     - `final/技术标.docx`
     - `final/商务及其他.docx`
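The bucket discipline of stages 3-5 can be sketched as a simple filter; the entry shape below is illustrative, not the real schema of the `work/final_bid_content_*.json` files:

```python
# Each stage touches only entries tagged with its own workflow_bucket.
# The entries below are invented for illustration.
entries = [
    {"workflow_bucket": "business", "heading": "商务响应"},
    {"workflow_bucket": "technical", "heading": "总体架构设计"},
    {"workflow_bucket": "other", "heading": "附件与索引"},
]
business_only = [e for e in entries if e["workflow_bucket"] == "business"]
print(len(business_only))  # → 1
```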
## Stage Rules

### Understanding stage
Must follow:
- `references/understandbid.md`

### outline
Must follow:
- `references/outline-stage.md`
- `references/docx-ops.md`

### business

1. Write only `workflow_bucket=business` entries in the business-and-other file, paired with the technical placeholders carried in the `workflow_bucket=other` entries.
2. Business facts may come only from what the RFP explicitly states and from genuine user-provided material.
3. `业绩与团队` and `财务与纳税社保` must not pass off generic boilerplate as real material.
4. `附件与索引` only carries evidence entry points and material locations; it does not replace judgments made in the body text.

### technical

1. Write only `workflow_bucket=technical` entries in `work/final_bid_content_technical.json`.
2. Before writing any technical body text, first establish the overall goal, the main construction line, the priority of subsystems, and the evaluators' lookup path.
7. Write body text only for leaf nodes of `final_outline_technical.json`; a parent-node summary must never stand in for all of its children.
8. If a leaf node is still an abstract heading, go back to the outline stage and keep drilling down; do not force body text onto it.

### other/finalize

3. Do not invent a default "pricing sub-workflow".
4. Before overall acceptance passes, never claim a "complete bid document".

### final

Generate:

- `final/技术标.docx`
- `final/商务及其他.docx`
## Final Acceptance

Before export, all of the following must be checked together:

7. Do the figures and tables genuinely carry the evaluation themes?
8. Is the evidence chain complete?

Beyond the business acceptance above, the document must also pass deliverable acceptance:

1. Heading names are natural, professional, and fit to serve directly as the Word table of contents of a bid; no stilted headings created merely to pass the gate.
2. Heading numbers are consecutive; heading levels, fonts, sizes, and paragraph formats match the default bid format or the template format.
3. Figure and attachment numbers are consecutive and traceable.
4. The table of contents, body text, figures, attachments, and placeholder notes sit in the correct positions: no misplacement, no skipped numbers, no mixed volumes.
5. Only when the final DOCX render validation, format validation, numbering validation, and placeholder scan all pass together may the document be considered "deliverable".

If any item fails, do not report "done"; you may still produce a complete placeholder draft. Placeholders are limited to non-technical content (qualifications, business, declarations, indexes, technical cross-references); the technical part must appear in full in the technical bid and must never be replaced by placeholders.
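The placeholder scan can be sketched with the same marker set the tooling uses (it mirrors `PLACEHOLDER_PATTERN` in the skill's scripts); the two sample paragraphs are invented:

```python
import re

# Markers that identify placeholder content; mirrors PLACEHOLDER_PATTERN in the
# skill's scripts. Sample paragraphs are invented for illustration.
PLACEHOLDER = re.compile(r"(占位|待补充|待提供|待替换|替换提示|TODO|技术转引)")
paragraphs = [
    "本页为资质证书占位,待替换为正式扫描件。",
    "系统采用三层架构设计。",
]
flags = [bool(PLACEHOLDER.search(p)) for p in paragraphs]
print(flags)  # → [True, False]
```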
## References

Read per stage and per task; do not read everything at once:

- DOCX rendering and delivery:
  - `references/docx-assembly.md`
  - `references/docx-ops.md`
# references/docx-assembly.md

## Layout Requirements

1. Default heading numbering is `1 / 1.1 / 1.1.1 / 1.1.1.1`.
2. Level-1 headings default to 黑体, 小三; level-2: 黑体, 四号; level-3: 黑体, 小四; level-4: 楷体 or 黑体, 小四.
3. Body text defaults to 宋体, 小四, first-line indent of 2 characters, 1.5 line spacing.
4. Tables default to 宋体, 五号 or 小五, with a bold header row.
5. Table, figure, and attachment captions are numbered uniformly.
6. Attachment placeholder images must carry the material name and a replacement notice.
7. The table of contents may be generated with Word field codes, letting the user update the field after opening; if the current workflow produces an outline-version DOCX, the heading numbers themselves must suffice to read the document as a table of contents.
8. Figure and table captions should sit above or immediately next to the figure or table, adjacent to the related paragraph, so that caption and content never separate.
9. Captions use the format "图3-1 XXX / 表4-2 XXX / 附件5-1 XXX", with the chapter number matching the current main chapter of the body text.
10. Figure and attachment numbers must be consecutive and traceable throughout the document: no skipped numbers, no duplicates, no ambiguous captions.
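Rules 9 and 10 can be checked mechanically; this regex mirrors `CAPTION_PATTERN` in the skill's scripts, and the caption text is invented:

```python
import re

# "图3-1 XXX / 表4-2 XXX / 附件5-1 XXX": caption kind, chapter number, and
# per-chapter index. Mirrors CAPTION_PATTERN in the skill's scripts.
CAPTION = re.compile(r"^(图|表|附件)\s*(\d+)-(\d+)\s+(.+)$")
m = CAPTION.match("图3-1 总体架构图")
print(m.group(1), m.group(2), m.group(3))  # → 图 3 1
```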
## Final Assembly Flow

1. First confirm the outline is finalized:
   - `work/outline_basis_summary.md`
   - `work/final_outline_technical.json`
   - if the business-and-other volume is needed, also confirm `work/final_outline_business_other.json`

2. Then confirm the body-text sources are ready:
   - technical body text comes from `work/final_bid_content_technical.json`
   - business-and-other body text comes from `work/final_bid_content_business_other.json`

3. When assembling the final documents, write strictly by the finalized outline; never change heading levels or chapter order on the fly.

4. Output the technical bid and the business-and-other volume separately:
   - `final/技术标.docx`
   - `final/商务及其他.docx`

5. After assembly, run the render validation and confirm:
   - the document opens normally
   - heading levels, heading numbers, and caption numbers are consecutive
   - attachment placeholders, technical cross-references, and the table of contents are not misplaced
   - heading and body styles match the default bid format or the official template format
   - the final Word file meets the deliverable standard for bidding, not merely "it opens"

6. If render validation fails, do not claim the final delivery is complete.
## Default Acceptance Output

Besides the basic `status`, `warnings`, and `errors`, script reports should at least add:

- `format_profile`
- `numbering_validation`
- `caption_validation`
- `toc_validation`
- `acceptance_checks`

Only when business acceptance, format acceptance, numbering acceptance, the placeholder scan, and render acceptance are all satisfied may the output be declared "deliverable".
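That gate can be sketched as follows; the report literal is a made-up example carrying just the fields listed above:

```python
# "Deliverable" only when there are no errors and every acceptance field passes.
# The report dict is an invented example of the fields this section requires.
report = {
    "status": "ok", "warnings": [], "errors": [],
    "format_profile": {"status": "pass"},
    "numbering_validation": {"status": "pass"},
    "caption_validation": {"status": "pass"},
    "toc_validation": {"status": "pass"},
    "acceptance_checks": {"status": "pass"},
}
gates = ["format_profile", "numbering_validation", "caption_validation",
         "toc_validation", "acceptance_checks"]
deliverable = not report["errors"] and all(report[g]["status"] == "pass" for g in gates)
print(deliverable)  # → True
```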
# references/docx-ops.md

Creation input example (`scripts/docx_create.py`):

```json
{
  "output_docx": "D:/work/generated-outline.docx",
  "docx_style_profile": "default_bid",
  "numbering_mode": "explicit_text",
  "template_docx": null,
  "title": "目录测试",
  "blocks": [
    {"type": "heading", "level": 1, "text": "技术标目录"}
  ]
}
```

The report includes:

- `block_count`
- `blocks`
- `final_summary`
- `format_profile`
- `numbering_validation`
- `caption_validation`
- `toc_validation`
- `acceptance_checks`
## 0.1 Outline Gate Check

- Outline nodes use `type=heading`
- Outline levels use `level`
- Child nodes go in `children`
- The top level may carry an optional `outline_policy`
  - as the default policy; opening an exception for the entire outline is not recommended
- A single outline node may carry an optional `policy`
  - `allow_service_facets: true|false`
  - `respect_fixed_structure: true|false`
  - it applies only to that node and its subtree

Minimal example:

```json
{
  "outline_policy": {
    "allow_service_facets": false,
    "respect_fixed_structure": false
  },
  "blocks": [
    {
      "type": "heading",
      "level": 2,
      "text": "…",
      "children": [
        {"type": "heading", "level": 3, "text": "建设目标与原则"}
      ]
    },
    {
      "type": "heading",
      "level": 2,
      "text": "运维服务方案",
      "policy": {
        "allow_service_facets": true
      },
      "children": [
        {"type": "heading", "level": 3, "text": "服务组织与分工"}
      ]
    }
  ]
}
```
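The "only that node and its subtree" scoping rule can be sketched as a tree walk that merges each node's `policy` over the inherited one; the traversal helper is illustrative, not part of the skill's scripts:

```python
# Resolve the effective policy per heading: a node-level "policy" overrides the
# top-level "outline_policy" for that node and its subtree only.
def effective_policies(node, inherited):
    merged = {**inherited, **node.get("policy", {})}
    yield node["text"], merged
    for child in node.get("children", []):
        yield from effective_policies(child, merged)

doc = {
    "outline_policy": {"allow_service_facets": False, "respect_fixed_structure": False},
    "blocks": [
        {
            "type": "heading", "level": 2, "text": "运维服务方案",
            "policy": {"allow_service_facets": True},
            "children": [{"type": "heading", "level": 3, "text": "服务组织与分工"}],
        },
    ],
}
flat = {}
for block in doc["blocks"]:
    flat.update(dict(effective_policies(block, doc["outline_policy"])))
print(flat["服务组织与分工"]["allow_service_facets"])  # → True
```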
### Checks Currently Performed

- Outline depth and drill-down of abstract headings
- Object-based child nodes and duplicated facets
- Whether technical placeholders in the business-and-other volume were wrongly expanded
- Whether heading `level` values are legal, level by level
- Whether service-project / fixed-outline exceptions apply only to the designated nodes
- Whether `children` types are legal
- Whether each block is an object
## 0.2 Final Outline Export

This section describes only the interface of `outline_export.py`; it does not define the outline-stage workflow.
The outline-stage rules are governed solely by `references/outline-stage.md`.
`outline_export.py` is called only after the outline has passed the final check, to generate the official final artifacts.

### Command

Export input (fragment):

```json
  …
    "title": "商务及其他目录",
    "blocks": []
  },
  "docx_style_profile": "default_bid",
  "numbering_mode": "explicit_text",
  "template_docx": null,
  "technical_outline_json": "D:/work/final_outline_technical.json",
  "business_outline_json": "D:/work/final_outline_business_other.json",
  "technical_docx": "D:/final/技术标_目录版.docx",
  …
```

Patch input example:

```json
{
  "source_docx": "D:/work/source.docx",
  "output_docx": "D:/work/output.docx",
  "docx_style_profile": "default_bid",
  "numbering_mode": "explicit_text",
  "template_docx": null,
  "operations": []
}
```

The report includes:

- `images`
- `errors`
- `warnings`
- `format_profile`
- `numbering_validation`
- `caption_validation`
- `toc_validation`
- `acceptance_checks`

If the system lacks `soffice` or the image-rendering dependencies, the report returns `render_skipped` or carries a warning instead of marking the patch result as failed.
# references/outline-stage.md

## Goals

Produce the technical outline that body text can be written against first; the business-and-other outline is filled in afterwards as needed.

If the final check fails, stop official outline output: generate no official outline JSON and no outline-version Word file.

## The Only Execution Flow
### Structure Expression

1. The minimal structured input of the outline stage is uniformly `heading(level/text/children)`.
2. Build the outline as a tree first, then patch-drill, then run the final check; do not fall back to the heavy flow of writing out and approving level by level.
3. Outline nodes keep only official headings that can be written, reviewed, and exported; do not put analysis notes into the outline tree.
4. Outline headings must not only pass the gate but also read well in an official bid document: natural, professional, stable names, never headings visibly invented to satisfy the checker.
## The Three-Step Method

### 1. Understand the Material

1. Read the `rfp/` main documents and the current project's candidate material.
2. Extract and organize:
   - `废标/否决项` (disqualification / rejection items)
   - `合规项` (compliance items)
   - `评分项` (scoring items)
   - the explicit list of systems, subsystems, equipment, and service objects
   - project constraints, risk constraints, and evidence boundaries
3. Produce a short "outline basis summary" covering at least:
   - the main scoring points
   - the main risk points
   - the explicit object list
   - whether an original chapter order must be preserved
4. The outline basis summary must be written to `work/outline_basis_summary.md` for later outline, body-text, and final-check review.
### 2. Generate the Candidate Outline

1. Generate the complete technical outline tree in one pass.
2. Get the technical outline right first by default; the business-and-other outline is second priority.
3. If the user asks only for outlines, deliver the technical outline first by default; continue with the business outline only when the user explicitly needs it.
4. When the RFP already fixes a clear chapter order, inherit the original order and names first; add levels only where the outline is too coarse, missing a level, or missing a carrier position.
5. When the RFP structure is incomplete, fill gaps with the default skeleton for the project type, adding only the necessary levels; do not pile up template outlines.
### 3. Final Check and Patch-Drilling

1. After the candidate outline is generated, run `scripts/outline_check.py`.
2. If the check fails, patch only the failing nodes; do not start over.
3. Check again after patching.
4. By default, patch for at most two consecutive rounds; if it still fails but the failure points are clear and converging, you may keep patching and must state why.
5. Once the check passes, export the official outline JSON and the outline-version DOCX.
6. `outline_export.py` is only for the final official export; it does not generate or patch outlines.
7. The exported outline-version DOCX must additionally satisfy the default bid format, consecutive heading numbers, and direct usability as the bid's table of contents.
## Default Priorities

1. The technical outline takes priority over the business-and-other outline.
2. Scoring points enter the technical outline's trunk before ordinary compliance items.
3. Explicit systems/subsystems/equipment/service objects enter the drill-down structure before abstract managerial headings.
4. The RFP's original chapters take priority over the default skeleton; the skeleton is only for filling gaps.
## Exception Rules

The following scenarios allow relaxing the "object-based drill-down" requirement, but never the compliance and scoring-point coverage requirements:

1. The RFP explicitly fixes the chapter names, order, or granularity, and rewriting them is inappropriate.
2. The project's main type is a service project (operations, on-site service, consulting or training), with scoring clearly focused on organization, process, SLA, response, and assessment facets.

When an exception applies:

1. Abstract headings such as `技术方案` or `实施方案` are still forbidden as leaf nodes.
2. Drill-down may be completed with service facets (organization, process, SLA, response, assurance, assessment) instead of mandatory module/subsystem/equipment object nodes.
3. The exception applies only to the matching node or branch, never to the whole outline at once.
4. The exception policy must be explicitly marked on the node, and the "exception reason" (fixed outline or service project) must be recorded in `work/outline_basis_summary.md`.
## Split Rules

2. The business-and-other outline keeps the business/other chapters and keeps a technical placeholder where technical content would appear.
3. If the RFP explicitly prescribes volumes, lots, or order, the technical placeholder must sit at the prescribed position.
4. If the RFP does not prescribe a position, keep one level-1 placeholder chapter at the technical entry point of the unified/canonical outline by default.
5. A technical placeholder in the business-and-other volume can never justify passing the technical gate.
6. A technical placeholder in the business-and-other volume must stay a single node; it must not expand into a technical body tree.
## Abstract-Heading Handling and Mandatory Drill-Down

<Constraint>
The following headings are illegal endpoints by default; they must never serve as technical-bid leaf nodes with body text written directly under them:
[技术方案, 服务方案, 实施方案, 服务保障及措施, 售后服务和质保期服务计划, 项目理解, 解决方案, 系统设计, 平台建设方案, 系统建设方案, 总体方案, 培训方案, 运维方案]
</Constraint>

When one of these headings appears, keep drilling down. The drill-down must meet these minimum requirements:

1. There must be more than one direct child heading.
2. The children must not all be managerial facets ("principles, goals, assurance, plans").
3. At least one object-based child heading must appear, for example:
   - a module
   - a subsystem
   - a device
   - an interface
   - a functional unit
   - a service item
4. Do not stack semantically duplicate headings under one parent, for example:
   - 实施方案
   - 实施计划
   - 实施步骤

When the RFP, procurement list, technical parameter table, itemized price table, or supply list already names identifiable systems, subsystems, equipment, or service objects, the technical outline must drill down along those explicitly stated objects first.

## Convergence Rules

"Drilling down may stop" requires all of the following at once:

1. The current node can already carry body text directly.
2. Further drill-down would only produce vague headings, pseudo-subdivision, or duplicated facets.
3. Every scoring point, risk point, and evidence point already has a carrier position.
4. If an explicit object list exists, the outline is already expanded along the objects to a safe depth.

A technical placeholder in the business-and-other volume is exempt from the "keep drilling down" requirement above, but it must stay a placeholder and never carry technical body text.
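The leaf-node ban in the Constraint above can be sketched as a tree walk; the banned set below is a subset of the full Constraint list, and the sample tree is invented:

```python
# Subset of the Constraint list: abstract titles that must never be leaves.
BANNED = {"技术方案", "服务方案", "实施方案", "解决方案", "系统设计", "总体方案"}

def abstract_leaves(node):
    children = node.get("children", [])
    if not children:
        if node["text"] in BANNED:
            yield node["text"]  # an abstract title used as a leaf: gate failure
        return
    for child in children:
        yield from abstract_leaves(child)

tree = {"level": 2, "text": "技术方案",
        "children": [{"level": 3, "text": "实施方案"}]}
print(list(abstract_leaves(tree)))  # → ['实施方案']
```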
## Final-Check List

Before calling any writing tool, the AI or the main agent must complete the following self-check; export is allowed only when every item is satisfied:

```text
【目录终检】
1. 是否不存在直接以“技术方案”“实施方案”等抽象标题作为叶子节点?(要求:是)
2. 技术目录主干是否至少到第三级?(要求:是)
3. 抽象技术标题下是否至少出现一个对象化子节点?(要求:是)
4. 是否不存在“实施方案 / 实施计划 / 实施步骤”这类重复切面堆叠?(要求:是)
5. 所有主要评分点是否都已在目录中有承载位?(要求:是)
6. 商务及其他中的技术节点是否只保留占位,不承载技术正文?(要求:是)
7. 目录标题是否自然、专业,适合直接进入投标 Word 文档?(要求:是)
8. 导出的目录版 DOCX 是否具备统一样式和连续标题编号?(要求:是)
```
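Checklist item 2 (trunk reaches at least level 3) can be verified mechanically on the `heading(level/text/children)` tree; the sample tree is invented:

```python
# Depth of an outline tree = the deepest heading level reachable from the root.
def max_depth(node):
    children = node.get("children", [])
    if not children:
        return node["level"]
    return max(max_depth(c) for c in children)

tech_tree = {
    "type": "heading", "level": 1, "text": "技术方案",
    "children": [{
        "type": "heading", "level": 2, "text": "总体设计方案",
        "children": [{"type": "heading", "level": 3, "text": "总体架构设计"}],
    }],
}
print(max_depth(tech_tree) >= 3)  # → True
```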
The final outline-stage reply must additionally include these short notes:

1. `评分点覆盖` (scoring-point coverage)
   - List which outline chapter each main scoring point maps to.
2. `推断章节` (inferred chapters)
   - List which chapters were not explicit in the RFP but were added by convention.
3. `例外原因` (exception reason)
   - If a fixed-outline or service-project exception was used, briefly state why; omit this note if no exception was used.
## Final Artifacts

Once the technical outline is finalized, you may generate:

- `work/outline_basis_summary.md`
- `work/final_outline_technical.json`
- `final/技术标_目录版.docx`

When the business-and-other outline is also finalized, additionally generate:

- `work/final_outline_business_other.json`
- `final/商务及其他_目录版.docx`
## Mermaid Flowchart

```mermaid
flowchart TD
    A[整理目录依据摘要<br/>评分点 风险点 对象清单] --> B[一次性生成候选技术目录树]
    B --> C[执行 outline_check 终检]
    C --> D{检查是否通过}
    D -- 否 --> E[仅修补失败节点]
    E --> F[再次执行 outline_check]
    F --> G{第二轮是否通过}
    G -- 否 --> H[停止正式导出并报告失败原因]
    G -- 是 --> I[导出正式目录 JSON 和目录版 DOCX]
    D -- 是 --> I
```
# references/understandbid.md

1. First lock down the `废标/否决项` (disqualification items) so nothing is missed.
2. Then complete the `合规项` (compliance items) so the official delivery structure is whole.
3. Finally, optimize outline granularity, technical expansion, and evidence presentation around the `评分项` (scoring items).

## Quick Project-Type Classification

Before filling outline gaps, decide the project's main type. Pick exactly one; when the RFP has an explicit outline, the RFP always comes first, and the skeletons below are only for filling gaps, never for overriding the original structure.

1. `软件平台类` (software platform)
   - Default skeleton: project understanding, overall architecture, functional modules, interfaces and data, implementation and deployment, testing and acceptance, training and operations, security assurance.
2. `系统集成类` (system integration)
   - Default skeleton: construction scope, overall integration architecture, hardware and software configuration, subsystem construction, installation and deployment, joint commissioning and testing, training and delivery, operations assurance.
3. `运维服务类` (operations service)
   - Default skeleton: service understanding, service organization, service content, service process, SLA/response mechanism, inspection and maintenance, emergency assurance, assessment and acceptance.
4. `硬件供货/设备建设类` (hardware supply / equipment construction)
   - Default skeleton: supply scope, equipment selection, technical parameter response, installation and implementation, joint commissioning and testing, training and delivery, warranty and after-sales, security and risk control.

Judgment principles:

- Look at the bid scope and scoring method first, then set the main type.
- If software, hardware, and services all appear, take the line with the highest scoring weight as the main type.
- The default skeleton fills only level-1 and level-2 structure; deeper levels are still driven by the scoring points and the object list.
# scripts/docx_patch.py (excerpt)

```python
def main() -> None:
    ...
    if args.render_check:
        output_docx = Path(report["output_docx"]).resolve()
        render_dir = Path(args.render_dir).resolve() if args.render_dir else output_docx.parent / f"{output_docx.stem}_render"
        report["render"] = render_docx(
            output_docx,
            render_dir,
            docx_style_profile=report.get("docx_style_profile", "default_bid"),
            numbering_mode=report.get("numbering_mode", "explicit_text"),
            document_kind=patch_data.get("document_kind", "assembled"),
        )
        write_json(Path(args.report).resolve(), report)
        return
```
Shared DOCX styling and validation helpers (module name not shown in the diff):

```python
import json
import re
import shutil
import subprocess
from collections import defaultdict
from dataclasses import dataclass
from hashlib import sha1
from pathlib import Path
from typing import Any, Iterator

from docx import Document
from docx.document import Document as DocxDocument
from docx.enum.text import WD_PARAGRAPH_ALIGNMENT
from docx.oxml import OxmlElement
from docx.shared import Pt
from docx.table import Table, _Cell
from docx.text.paragraph import Paragraph

NAMESPACES = {"w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}
TEXT_WINDOW_DEFAULT = 40
DEFAULT_DOCX_STYLE_PROFILE = "default_bid"
DEFAULT_NUMBERING_MODE = "explicit_text"
DEFAULT_DOCUMENT_KIND = "generic"
HEADING_NUMBER_PATTERN = re.compile(r"^(?P<number>\d+(?:\.\d+)*)\s+(?P<title>.+)$")
LEGACY_HEADING_PREFIX_PATTERN = re.compile(r"^(?:\d+(?:\.\d+)*|[一二三四五六七八九十]+)[、\..]?\s*")
CAPTION_PATTERN = re.compile(r"^(图|表|附件)\s*(\d+)-(\d+)\s+(.+)$")
PLACEHOLDER_PATTERN = re.compile(r"(占位|待补充|待提供|待替换|替换提示|TODO|技术转引)")

DEFAULT_BID_STYLE_SPEC: dict[str, Any] = {
    "normal": {
        "font_name": "宋体",
        "font_size": 12,
        "bold": False,
        "first_line_indent": 24,
        "line_spacing": 1.5,
        "space_before": 0,
        "space_after": 0,
    },
    "headings": {
        1: {"font_name": "黑体", "font_size": 15, "bold": True, "space_before": 18, "space_after": 12},
        2: {"font_name": "黑体", "font_size": 14, "bold": True, "space_before": 12, "space_after": 6},
        3: {"font_name": "黑体", "font_size": 12, "bold": True, "space_before": 6, "space_after": 6},
        4: {"font_name": "楷体", "font_size": 12, "bold": False, "space_before": 6, "space_after": 3},
    },
    "table": {
        "font_name": "宋体",
        "font_size": 10.5,
        "header_bold": True,
    },
}
```
@dataclass
|
||||
@ -101,17 +158,392 @@ def normalize_text(value: str) -> str:
|
||||
return re.sub(r"\s+", " ", value or "").strip()
|
||||
|
||||
|
||||
def get_style_spec(docx_style_profile: str) -> dict[str, Any]:
|
||||
if docx_style_profile != DEFAULT_DOCX_STYLE_PROFILE:
|
||||
raise QueryError(f"unsupported docx_style_profile: {docx_style_profile}")
|
||||
return DEFAULT_BID_STYLE_SPEC
|
||||
|
||||
|
||||
def resolve_generation_options(payload: dict[str, Any]) -> dict[str, Any]:
|
||||
return {
|
||||
"docx_style_profile": str(payload.get("docx_style_profile", DEFAULT_DOCX_STYLE_PROFILE)),
|
||||
"numbering_mode": str(payload.get("numbering_mode", DEFAULT_NUMBERING_MODE)),
|
||||
"template_docx": payload.get("template_docx"),
|
||||
"document_kind": str(payload.get("document_kind", DEFAULT_DOCUMENT_KIND)),
|
||||
}
|
||||
|
||||
|
||||
def _get_xml_element(target: Any) -> Any | None:
|
||||
return getattr(target, "_element", None) or getattr(target, "element", None)
|
||||
|
||||
|
||||
def _set_font_family(target: Any, font_name: str) -> None:
|
||||
target.font.name = font_name
|
||||
if not qn:
|
||||
return
|
||||
element = _get_xml_element(target)
|
||||
if element is None:
|
||||
return
|
||||
r_pr = getattr(element, "rPr", None)
|
||||
if r_pr is None:
|
||||
r_pr = OxmlElement("w:rPr")
|
||||
element.insert(0, r_pr)
|
||||
r_fonts = getattr(r_pr, "rFonts", None)
|
||||
if r_fonts is None:
|
||||
r_fonts = OxmlElement("w:rFonts")
|
||||
r_pr.insert(0, r_fonts)
|
||||
for attr in ("w:ascii", "w:hAnsi", "w:eastAsia"):
|
||||
r_fonts.set(qn(attr), font_name)
|
||||
|
||||
|
||||
def apply_run_font(target_run: Any, *, font_name: str, font_size: float, bold: bool | None = None) -> None:
|
||||
_set_font_family(target_run, font_name)
|
||||
target_run.font.size = Pt(font_size)
|
||||
if bold is not None:
|
||||
target_run.bold = bold
|
||||
|
||||
|
||||
def configure_style(style: Any, *, font_name: str, font_size: float, bold: bool, space_before: float = 0, space_after: float = 0, first_line_indent: float | None = None, line_spacing: float | None = None) -> None:
|
||||
_set_font_family(style, font_name)
|
||||
style.font.size = Pt(font_size)
|
||||
style.font.bold = bold
|
||||
paragraph_format = style.paragraph_format
|
||||
paragraph_format.space_before = Pt(space_before)
|
||||
paragraph_format.space_after = Pt(space_after)
|
||||
if first_line_indent is not None:
|
||||
paragraph_format.first_line_indent = Pt(first_line_indent)
|
||||
if line_spacing is not None:
|
||||
paragraph_format.line_spacing = line_spacing
|
||||
|
||||
|
||||
def initialize_default_bid_styles(document: Document, docx_style_profile: str) -> dict[str, Any]:
|
||||
spec = get_style_spec(docx_style_profile)
|
||||
styles = document.styles
|
||||
configure_style(styles["Normal"], **spec["normal"])
|
||||
try:
|
||||
configure_style(styles["List Bullet"], **spec["normal"])
|
||||
except KeyError:
|
||||
pass
|
||||
for level, heading_spec in spec["headings"].items():
|
||||
configure_style(styles[f"Heading {level}"], **heading_spec)
|
||||
return {
|
||||
"status": "pass",
|
||||
"profile": docx_style_profile,
|
||||
"summary": {
|
||||
"heading_numbering": "1 / 1.1 / 1.1.1 / 1.1.1.1",
|
||||
"normal_font": spec["normal"]["font_name"],
|
||||
"normal_font_size": spec["normal"]["font_size"],
|
||||
},
|
||||
"issues": [],
|
||||
}
|
||||
|
||||
|
||||
def strip_heading_prefix(text: str) -> str:
|
||||
normalized = normalize_text(text)
|
||||
numbered = HEADING_NUMBER_PATTERN.match(normalized)
|
||||
if numbered:
|
||||
return numbered.group("title")
|
||||
return LEGACY_HEADING_PREFIX_PATTERN.sub("", normalized, count=1).strip()
|
||||
|
||||
|
||||
def replace_paragraph_text(paragraph: Paragraph, text: str) -> None:
|
||||
existing_runs = list(paragraph.runs)
|
||||
source_run = existing_runs[0] if existing_runs else None
|
||||
clear_paragraph(paragraph)
|
||||
new_run = paragraph.add_run(text)
|
||||
if source_run is not None:
|
||||
clone_run_format(source_run, new_run)
|
||||
|
||||
|
||||
def apply_heading_numbering(document: Document, numbering_mode: str) -> None:
|
||||
if numbering_mode != DEFAULT_NUMBERING_MODE:
|
||||
raise QueryError(f"unsupported numbering_mode: {numbering_mode}")
|
||||
counters = [0] * 9
|
||||
for paragraph in document.paragraphs:
|
||||
style_name = paragraph.style.name if paragraph.style else None
|
||||
level = heading_level_for_style(style_name)
|
||||
if not level:
|
||||
continue
|
||||
counters[level - 1] += 1
|
||||
for index in range(level, len(counters)):
|
||||
counters[index] = 0
|
||||
prefix = ".".join(str(value) for value in counters[:level] if value)
|
||||
base_text = strip_heading_prefix(paragraph.text)
|
||||
replace_paragraph_text(paragraph, f"{prefix} {base_text}".strip())
|
||||
|
||||
|
||||
def apply_paragraph_profile(paragraph: Paragraph, *, font_name: str, font_size: float, bold: bool, first_line_indent: float | None = None, line_spacing: float | None = None, space_before: float | None = None, space_after: float | None = None) -> None:
|
||||
if first_line_indent is not None:
|
||||
paragraph.paragraph_format.first_line_indent = Pt(first_line_indent)
|
||||
if line_spacing is not None:
|
||||
paragraph.paragraph_format.line_spacing = line_spacing
|
||||
if space_before is not None:
|
||||
paragraph.paragraph_format.space_before = Pt(space_before)
|
||||
if space_after is not None:
|
||||
paragraph.paragraph_format.space_after = Pt(space_after)
|
||||
for run in paragraph.runs:
|
||||
apply_run_font(run, font_name=font_name, font_size=font_size, bold=bold)
|
||||
|
||||
|
||||
def apply_table_profile(table: Table, docx_style_profile: str) -> None:
|
||||
table_spec = get_style_spec(docx_style_profile)["table"]
|
||||
try:
|
||||
table.style = "Table Grid"
|
||||
except KeyError:
|
||||
pass
|
||||
for row_index, row in enumerate(table.rows):
|
||||
for cell in row.cells:
|
||||
for paragraph in cell.paragraphs:
|
||||
apply_paragraph_profile(
|
||||
paragraph,
|
||||
font_name=table_spec["font_name"],
|
||||
font_size=table_spec["font_size"],
|
||||
bold=bool(row_index == 0 and table_spec["header_bold"]),
|
||||
first_line_indent=0,
|
||||
line_spacing=1.0,
|
||||
space_before=0,
|
||||
space_after=0,
|
||||
)
|
||||
|
||||
|
||||
def apply_document_profile(document: Document, docx_style_profile: str) -> None:
|
||||
spec = get_style_spec(docx_style_profile)
|
||||
for paragraph in document.paragraphs:
|
||||
style_name = paragraph.style.name if paragraph.style else None
|
||||
level = heading_level_for_style(style_name)
|
||||
if level:
|
||||
heading_spec = spec["headings"].get(level, spec["headings"][4])
|
||||
paragraph.alignment = WD_PARAGRAPH_ALIGNMENT.LEFT
|
||||
paragraph.paragraph_format.first_line_indent = Pt(0)
|
||||
paragraph.paragraph_format.space_before = Pt(heading_spec["space_before"])
|
||||
paragraph.paragraph_format.space_after = Pt(heading_spec["space_after"])
|
||||
paragraph.paragraph_format.keep_with_next = True
|
||||
apply_paragraph_profile(
|
||||
paragraph,
|
||||
font_name=heading_spec["font_name"],
|
||||
font_size=heading_spec["font_size"],
|
||||
bold=heading_spec["bold"],
|
||||
first_line_indent=0,
|
||||
line_spacing=1.0,
|
||||
space_before=heading_spec["space_before"],
|
||||
space_after=heading_spec["space_after"],
|
||||
)
|
||||
continue
|
||||
apply_paragraph_profile(
|
||||
paragraph,
|
||||
font_name=spec["normal"]["font_name"],
|
||||
font_size=spec["normal"]["font_size"],
|
||||
bold=spec["normal"]["bold"],
|
||||
first_line_indent=spec["normal"]["first_line_indent"],
|
||||
line_spacing=spec["normal"]["line_spacing"],
|
||||
space_before=spec["normal"]["space_before"],
|
||||
space_after=spec["normal"]["space_after"],
|
||||
)
|
||||
for table in document.tables:
|
||||
apply_table_profile(table, docx_style_profile)
|
||||
|
||||
|

def remove_initial_blank_paragraph(document: Document) -> None:
    if len(document.paragraphs) != 1:
        return
    paragraph = document.paragraphs[0]
    if normalize_text(paragraph.text):
        return
    delete_block(paragraph)


def validate_format_profile(document: Document, docx_style_profile: str) -> dict[str, Any]:
    spec = get_style_spec(docx_style_profile)
    issues: list[str] = []
    styles = document.styles
    for style_name, expected in (
        ("Normal", spec["normal"]),
        ("Heading 1", spec["headings"][1]),
        ("Heading 2", spec["headings"][2]),
        ("Heading 3", spec["headings"][3]),
        ("Heading 4", spec["headings"][4]),
    ):
        style = styles[style_name]
        actual_size = style.font.size.pt if style.font.size is not None else None
        if style.font.name != expected["font_name"]:
            issues.append(f"{style_name} font should be {expected['font_name']}, got {style.font.name!r}")
        if actual_size is None or abs(actual_size - expected["font_size"]) > 0.2:
            issues.append(f"{style_name} size should be {expected['font_size']}, got {actual_size!r}")
    return {
        "status": "pass" if not issues else "fail",
        "profile": docx_style_profile,
        "issues": issues,
    }


def validate_heading_numbering(document: Document, numbering_mode: str) -> dict[str, Any]:
    if numbering_mode != DEFAULT_NUMBERING_MODE:
        return {
            "status": "fail",
            "mode": numbering_mode,
            "checked_headings": 0,
            "issues": [f"unsupported numbering_mode: {numbering_mode}"],
        }
    counters = [0] * 9
    issues: list[str] = []
    checked = 0
    for paragraph in document.paragraphs:
        style_name = paragraph.style.name if paragraph.style else None
        level = heading_level_for_style(style_name)
        if not level:
            continue
        checked += 1
        counters[level - 1] += 1
        for index in range(level, len(counters)):
            counters[index] = 0
        expected = ".".join(str(value) for value in counters[:level] if value)
        match = HEADING_NUMBER_PATTERN.match(normalize_text(paragraph.text))
        actual = match.group("number") if match else None
        if actual != expected:
            issues.append(f"{paragraph.text!r} should use heading number {expected}")
    return {
        "status": "pass" if not issues else "fail",
        "mode": numbering_mode,
        "checked_headings": checked,
        "issues": issues,
    }
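The counter logic in `validate_heading_numbering` above can be replayed on its own: one counter per level, and every counter deeper than the current heading resets when a shallower heading appears. A minimal standalone sketch follows; the real `HEADING_NUMBER_PATTERN` is not shown in this diff, so the regex here is an assumption.

```python
import re

# assumed stand-in for the module's HEADING_NUMBER_PATTERN
HEADING_NUMBER_PATTERN = re.compile(r"^(?P<number>\d+(?:\.\d+)*)\s*\S")


def check_numbering(headings: list[tuple[int, str]]) -> list[str]:
    """Replay the explicit-text numbering check over (level, text) pairs."""
    counters = [0] * 9
    issues: list[str] = []
    for level, text in headings:
        counters[level - 1] += 1
        # a heading at level N resets all counters deeper than N
        for index in range(level, len(counters)):
            counters[index] = 0
        expected = ".".join(str(value) for value in counters[:level] if value)
        match = HEADING_NUMBER_PATTERN.match(text)
        actual = match.group("number") if match else None
        if actual != expected:
            issues.append(f"{text!r} should use heading number {expected}")
    return issues


# the third heading is numbered 1.3, but the counters expect 1.2
issues = check_numbering([(1, "1 总体方案"), (2, "1.1 系统架构"), (2, "1.3 部署方案")])
```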

def validate_caption_numbering(document: Document) -> dict[str, Any]:
    counters: dict[tuple[str, int], int] = defaultdict(int)
    issues: list[str] = []
    caption_count = 0
    for paragraph in document.paragraphs:
        match = CAPTION_PATTERN.match(normalize_text(paragraph.text))
        if not match:
            continue
        caption_count += 1
        kind, chapter_text, index_text, _ = match.groups()
        chapter = int(chapter_text)
        index = int(index_text)
        counters[(kind, chapter)] += 1
        if counters[(kind, chapter)] != index:
            issues.append(f"{paragraph.text!r} caption index should be {counters[(kind, chapter)]}")
    return {
        "status": "pass" if not issues else "fail",
        "caption_count": caption_count,
        "issues": issues,
    }
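The caption check above keeps one counter per (kind, chapter) pair, matching the `图3-1 / 表4-2 / 附件5-1` convention from SKILL.md. A self-contained sketch of that counting; `CAPTION_PATTERN` is not shown in the diff, so this regex is an assumption:

```python
import re
from collections import defaultdict

# assumed stand-in for the module's CAPTION_PATTERN: 图3-1 / 表4-2 / 附件5-1
CAPTION_PATTERN = re.compile(r"^(图|表|附件)(\d+)-(\d+)\s*(.*)$")


def check_captions(lines: list[str]) -> list[str]:
    """Per (kind, chapter) counters: the Nth caption of a kind in a chapter
    must carry index N."""
    counters: dict[tuple[str, int], int] = defaultdict(int)
    issues: list[str] = []
    for line in lines:
        match = CAPTION_PATTERN.match(line)
        if not match:
            continue
        kind, chapter_text, index_text, _ = match.groups()
        key = (kind, int(chapter_text))
        counters[key] += 1
        if counters[key] != int(index_text):
            issues.append(f"{line!r} caption index should be {counters[key]}")
    return issues


# 图3-2 appears before any 图3-1 in chapter 3, so it is flagged
issues = check_captions(["图3-2 网络拓扑", "表3-1 设备清单"])
```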

def document_has_toc(document: Document) -> bool:
    body = document.element.body
    for element in body.iter():
        if element.tag.endswith("}instrText") and "TOC" in "".join(element.itertext()):
            return True
    return False


def validate_toc(document: Document, document_kind: str) -> dict[str, Any]:
    has_toc = document_has_toc(document)
    if has_toc:
        return {"status": "pass", "has_toc": True, "issues": []}
    if document_kind == "outline":
        return {
            "status": "pass",
            "has_toc": False,
            "issues": ["outline documents can use heading numbering directly as visible table-of-contents content"],
        }
    return {
        "status": "pass",
        "has_toc": False,
        "issues": ["TOC field not found; current workflow allows user to insert or update TOC in Word"],
    }


def collect_placeholder_hits(document: Document) -> list[str]:
    hits: list[str] = []
    for paragraph in document.paragraphs:
        text = normalize_text(paragraph.text)
        if text and PLACEHOLDER_PATTERN.search(text):
            hits.append(text)
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                text = normalize_text(cell.text)
                if text and PLACEHOLDER_PATTERN.search(text):
                    hits.append(text)
    return hits


def validate_placeholders(document: Document, document_kind: str) -> dict[str, Any]:
    hits = collect_placeholder_hits(document)
    allow_hits = document_kind == "outline"
    status = "pass" if allow_hits or not hits else "fail"
    return {
        "status": status,
        "placeholder_count": len(hits),
        "issues": hits[:20],
    }


def build_acceptance_checks(*, format_profile: dict[str, Any], numbering_validation: dict[str, Any], caption_validation: dict[str, Any], toc_validation: dict[str, Any], placeholder_validation: dict[str, Any], render_status: str | None = None) -> dict[str, Any]:
    checks = [
        {"name": "format_profile", "status": format_profile["status"]},
        {"name": "numbering_validation", "status": numbering_validation["status"]},
        {"name": "caption_validation", "status": caption_validation["status"]},
        {"name": "toc_validation", "status": toc_validation["status"]},
        {"name": "placeholder_validation", "status": placeholder_validation["status"]},
    ]
    if render_status is not None:
        checks.append(
            {
                "name": "render_validation",
                "status": "pass" if render_status == "ok" else ("warn" if render_status == "render_skipped" else "fail"),
            }
        )
    overall_status = "fail" if any(item["status"] == "fail" for item in checks) else "pass"
    return {
        "status": overall_status,
        "checks": checks,
    }


def inspect_document_quality(docx_path: Path, *, docx_style_profile: str, numbering_mode: str, document_kind: str, render_status: str | None = None) -> dict[str, Any]:
    document = Document(str(docx_path))
    format_profile = validate_format_profile(document, docx_style_profile)
    numbering_validation = validate_heading_numbering(document, numbering_mode)
    caption_validation = validate_caption_numbering(document)
    toc_validation = validate_toc(document, document_kind)
    placeholder_validation = validate_placeholders(document, document_kind)
    acceptance_checks = build_acceptance_checks(
        format_profile=format_profile,
        numbering_validation=numbering_validation,
        caption_validation=caption_validation,
        toc_validation=toc_validation,
        placeholder_validation=placeholder_validation,
        render_status=render_status,
    )
    return {
        "format_profile": format_profile,
        "numbering_validation": numbering_validation,
        "caption_validation": caption_validation,
        "toc_validation": toc_validation,
        "placeholder_validation": placeholder_validation,
        "acceptance_checks": acceptance_checks,
    }

def create_docx_document(spec_data: dict[str, Any]) -> dict[str, Any]:
    output_docx = Path(spec_data["output_docx"]).resolve()
    blocks = spec_data.get("blocks", [])
    if not isinstance(blocks, list):
        raise QueryError("blocks must be a list")
    options = resolve_generation_options(spec_data)
    template_docx = options["template_docx"]

    output_docx.parent.mkdir(parents=True, exist_ok=True)
    document = Document()
    document = Document(str(Path(template_docx).resolve())) if template_docx else Document()
    title = spec_data.get("title")
    if title:
        document.core_properties.title = str(title)
    initialize_default_bid_styles(document, options["docx_style_profile"])
    remove_initial_blank_paragraph(document)

    block_reports: list[dict[str, Any]] = []

@@ -125,7 +557,9 @@ def create_docx_document(spec_data: dict[str, Any]) -> dict[str, Any]:
            raise QueryError(f"block {'.'.join(str(part) for part in index_path)} heading level must be between 1 and 9")
        text = str(block.get("text", ""))
        paragraph = document.add_paragraph(style=f"Heading {level}")
        paragraph.add_run(text)
        run = paragraph.add_run(text)
        heading_spec = get_style_spec(options["docx_style_profile"])["headings"].get(level, get_style_spec(options["docx_style_profile"])["headings"][4])
        apply_run_font(run, font_name=heading_spec["font_name"], font_size=heading_spec["font_size"], bold=heading_spec["bold"])
        block_reports.append({"index": ".".join(str(part) for part in index_path), "type": block_type, "text": summarize_text(text), "level": level})
        children = block.get("children", [])
        if children and not isinstance(children, list):
@@ -143,7 +577,9 @@ def create_docx_document(spec_data: dict[str, Any]) -> dict[str, Any]:
            paragraph.style = str(style_name)
        except KeyError:
            pass
        paragraph.add_run(text)
        run = paragraph.add_run(text)
        normal_spec = get_style_spec(options["docx_style_profile"])["normal"]
        apply_run_font(run, font_name=normal_spec["font_name"], font_size=normal_spec["font_size"], bold=normal_spec["bold"])
        block_reports.append({"index": ".".join(str(part) for part in index_path), "type": block_type, "text": summarize_text(text)})
        return
    if block_type == "list":
@@ -157,7 +593,9 @@ def create_docx_document(spec_data: dict[str, Any]) -> dict[str, Any]:
                paragraph.style = style_name
            except KeyError:
                pass
            paragraph.add_run(str(item))
            run = paragraph.add_run(str(item))
            normal_spec = get_style_spec(options["docx_style_profile"])["normal"]
            apply_run_font(run, font_name=normal_spec["font_name"], font_size=normal_spec["font_size"], bold=normal_spec["bold"])
        block_reports.append({"index": ".".join(str(part) for part in index_path), "type": block_type, "item_count": len(items)})
        return
    if block_type == "table":
@@ -177,6 +615,7 @@ def create_docx_document(spec_data: dict[str, Any]) -> dict[str, Any]:
            row = table.add_row()
            for cell_index, value in enumerate(row_values):
                row.cells[cell_index].text = str(value)
        apply_table_profile(table, options["docx_style_profile"])
        block_reports.append({"index": ".".join(str(part) for part in index_path), "type": block_type, "row_count": len(rows), "column_count": len(rows[0])})
        return
    if block_type == "page_break":
@@ -187,14 +626,27 @@ def create_docx_document(spec_data: dict[str, Any]) -> dict[str, Any]:

    for index, block in enumerate(blocks):
        render_block(block, [index])
    apply_heading_numbering(document, options["numbering_mode"])
    apply_document_profile(document, options["docx_style_profile"])
    document.save(str(output_docx))
    final_index = index_document(output_docx)
    quality = inspect_document_quality(
        output_docx,
        docx_style_profile=options["docx_style_profile"],
        numbering_mode=options["numbering_mode"],
        document_kind=options["document_kind"],
    )
    return {
        "status": "ok",
        "output_docx": str(output_docx),
        "block_count": len(blocks),
        "blocks": block_reports,
        "final_summary": final_index["summary"],
        "docx_style_profile": options["docx_style_profile"],
        "numbering_mode": options["numbering_mode"],
        "document_kind": options["document_kind"],
        "template_docx": str(Path(template_docx).resolve()) if template_docx else None,
        **quality,
    }
@@ -205,6 +657,7 @@ def export_outline_artifacts(payload: dict[str, Any]) -> dict[str, Any]:
    business_json = Path(payload["business_outline_json"]).resolve()
    technical_docx = Path(payload["technical_docx"]).resolve()
    business_docx = Path(payload["business_docx"]).resolve()
    options = resolve_generation_options(payload)

    for outline_name, outline in (("technical_outline", technical_outline), ("business_outline", business_outline)):
        if not isinstance(outline, dict):
@@ -220,6 +673,10 @@ def export_outline_artifacts(payload: dict[str, Any]) -> dict[str, Any]:
            "output_docx": str(technical_docx),
            "title": str(technical_outline.get("title", "技术标目录")),
            "blocks": technical_outline["blocks"],
            "docx_style_profile": options["docx_style_profile"],
            "numbering_mode": options["numbering_mode"],
            "template_docx": options["template_docx"],
            "document_kind": "outline",
        }
    )
    business_report = create_docx_document(
@@ -227,11 +684,19 @@ def export_outline_artifacts(payload: dict[str, Any]) -> dict[str, Any]:
            "output_docx": str(business_docx),
            "title": str(business_outline.get("title", "商务及其他目录")),
            "blocks": business_outline["blocks"],
            "docx_style_profile": options["docx_style_profile"],
            "numbering_mode": options["numbering_mode"],
            "template_docx": options["template_docx"],
            "document_kind": "outline",
        }
    )

    return {
        "status": "ok",
        "docx_style_profile": options["docx_style_profile"],
        "numbering_mode": options["numbering_mode"],
        "document_kind": "outline",
        "template_docx": str(Path(options["template_docx"]).resolve()) if options["template_docx"] else None,
        "technical_outline_json": str(technical_json),
        "business_outline_json": str(business_json),
        "technical_docx": str(technical_docx),
@@ -722,12 +1187,14 @@ def apply_patch_document(patch_data: dict[str, Any]) -> dict[str, Any]:
    source_docx = Path(patch_data["source_docx"]).resolve()
    output_docx = Path(patch_data.get("output_docx", source_docx)).resolve()
    in_place = bool(patch_data.get("in_place", False))
    options = resolve_generation_options(patch_data)
    if not in_place and output_docx == source_docx:
        raise QueryError("output_docx must differ from source_docx unless in_place is true")
    output_docx.parent.mkdir(parents=True, exist_ok=True)
    if not in_place:
        shutil.copy2(source_docx, output_docx)
    document = Document(str(output_docx))
    initialize_default_bid_styles(document, options["docx_style_profile"])
    operations = patch_data.get("operations", [])
    operation_reports: list[dict[str, Any]] = []

@@ -781,37 +1248,62 @@ def apply_patch_document(patch_data: dict[str, Any]) -> dict[str, Any]:
            }
        )

    apply_heading_numbering(document, options["numbering_mode"])
    apply_document_profile(document, options["docx_style_profile"])
    document.save(str(output_docx))
    final_index = index_document(output_docx)
    quality = inspect_document_quality(
        output_docx,
        docx_style_profile=options["docx_style_profile"],
        numbering_mode=options["numbering_mode"],
        document_kind=options["document_kind"],
    )
    return {
        "status": "ok",
        "source_docx": str(source_docx),
        "output_docx": str(output_docx),
        "in_place": in_place,
        "docx_style_profile": options["docx_style_profile"],
        "numbering_mode": options["numbering_mode"],
        "template_docx": str(Path(options["template_docx"]).resolve()) if options["template_docx"] else None,
        "operation_count": len(operations),
        "operations": operation_reports,
        "errors": [],
        "warnings": [],
        "final_summary": final_index["summary"],
        **quality,
    }


def render_docx(docx_path: Path, out_dir: Path) -> dict[str, Any]:
def render_docx(docx_path: Path, out_dir: Path, *, docx_style_profile: str = DEFAULT_DOCX_STYLE_PROFILE, numbering_mode: str = DEFAULT_NUMBERING_MODE, document_kind: str = DEFAULT_DOCUMENT_KIND) -> dict[str, Any]:
    out_dir.mkdir(parents=True, exist_ok=True)
    pdf_path = out_dir / f"{docx_path.stem}.pdf"
    png_dir = out_dir / "pages"
    png_dir.mkdir(parents=True, exist_ok=True)
    soffice = shutil.which("soffice")
    if not soffice:
        return {
        report = {
            "status": "render_skipped",
            "docx": str(docx_path),
            "docx_style_profile": docx_style_profile,
            "numbering_mode": numbering_mode,
            "document_kind": document_kind,
            "pdf": None,
            "page_count": 0,
            "images": [],
            "errors": [],
            "warnings": ["LibreOffice/soffice not found"],
        }
        report.update(
            inspect_document_quality(
                docx_path,
                docx_style_profile=docx_style_profile,
                numbering_mode=numbering_mode,
                document_kind=document_kind,
                render_status=report["status"],
            )
        )
        return report
    process = subprocess.run(
        [soffice, "--headless", "--convert-to", "pdf", "--outdir", str(out_dir), str(docx_path)],
        capture_output=True,
@@ -819,15 +1311,28 @@ def render_docx(docx_path: Path, out_dir: Path) -> dict[str, Any]:
        encoding="utf-8",
    )
    if process.returncode != 0 or not pdf_path.exists():
        return {
        report = {
            "status": "error",
            "docx": str(docx_path),
            "docx_style_profile": docx_style_profile,
            "numbering_mode": numbering_mode,
            "document_kind": document_kind,
            "pdf": str(pdf_path),
            "page_count": 0,
            "images": [],
            "errors": [process.stderr.strip() or "failed to convert docx to pdf"],
            "warnings": [],
        }
        report.update(
            inspect_document_quality(
                docx_path,
                docx_style_profile=docx_style_profile,
                numbering_mode=numbering_mode,
                document_kind=document_kind,
                render_status=report["status"],
            )
        )
        return report

    images: list[str] = []
    warnings: list[str] = []
@@ -842,12 +1347,25 @@ def render_docx(docx_path: Path, out_dir: Path) -> dict[str, Any]:
    except Exception as exc:  # pragma: no cover
        warnings.append(f"PNG render skipped: {exc}")

    return {
    report = {
        "status": "ok",
        "docx": str(docx_path),
        "docx_style_profile": docx_style_profile,
        "numbering_mode": numbering_mode,
        "document_kind": document_kind,
        "pdf": str(pdf_path),
        "page_count": len(images),
        "images": images,
        "errors": [],
        "warnings": warnings,
    }
    report.update(
        inspect_document_quality(
            docx_path,
            docx_style_profile=docx_style_profile,
            numbering_mode=numbering_mode,
            document_kind=document_kind,
            render_status=report["status"],
        )
    )
    return report
@@ -19,7 +19,13 @@ def main() -> None:
    if args.render_check:
        output_docx = Path(report["output_docx"]).resolve()
        render_dir = Path(args.render_dir).resolve() if args.render_dir else output_docx.parent / f"{output_docx.stem}_render"
        report["render"] = render_docx(output_docx, render_dir)
        report["render"] = render_docx(
            output_docx,
            render_dir,
            docx_style_profile=report.get("docx_style_profile", "default_bid"),
            numbering_mode=report.get("numbering_mode", "explicit_text"),
            document_kind=patch_data.get("document_kind", "assembled"),
        )
    write_json(Path(args.report).resolve(), report)
@@ -2,7 +2,9 @@ from __future__ import annotations

import argparse
import re
from collections import Counter
from pathlib import Path
from typing import Any

from docx_ops_lib import QueryError, read_json, write_json

@@ -24,66 +26,89 @@ ILLEGAL_LEAF_TITLES = {

TECHNICAL_ROOT_TITLES = {
    "技术标目录",
    "服务方案",
    "技术目录",
    "技术部分目录",
    "技术方案",
    "服务方案",
    "实施方案",
    "服务保障及措施",
    "售后服务和质保期服务计划",
}

BUSINESS_ROOT_TITLES = {
    "商务及其他目录",
    "商务目录",
    "商务部分目录",
    "商务及其他部分目录",
}

TECHNICAL_PLACEHOLDER_TITLES = {
    "技术标内容详见技术标目录版",
    "技术部分详见技术标",
    "技术部分",
    "技术标",
    "技术方案",
    "服务方案",
    "实施方案",
}

GENERIC_TECHNICAL_PATTERNS = (
    r"方案$",
    r"设计$",
    r"系统$",
    r"平台$",
    r"架构$",
    r"建设内容$",
    r"总体思路$",
    r"总体要求$",
    r"总体架构$",
    r"功能设计$",
    r"集成方案$",
    r"响应方案$",
    r"实施技术方案$",
    r"部署方案$",
    r"管理方案$",
    r"验收方案$",
    r"测试方案$",
    r"试运行方案$",
    r"保障措施$",
    r"服务计划$",
    r"^(技术|总体技术|总体|项目|整体)?方案$",
    r"^(服务|运维|培训|实施|部署|测试|验收|应急|保障)(方案|计划|措施)?$",
    r"^(系统|平台|架构|设计)(方案|设计|建设方案)?$",
    r"^(项目理解|解决方案|系统设计|总体架构|建设内容|功能设计|集成方案|响应方案|管理方案)$",
    r"^(总体设计方案|总体实施方案|总体服务方案)$",
)

SPECIFIC_CHILD_HINTS = (
OBJECT_HINTS = (
    "子系统",
    "模块",
    "设备",
    "接口",
    "功能",
    "单元",
    "终端",
    "节点",
    "链路",
    "数据库",
    "中间件",
    "服务器",
    "存储",
    "网络",
    "点位",
    "机房",
    "服务项",
    "清单",
)

MANAGEMENT_HINTS = (
    "原则",
    "目标",
    "架构",
    "模块",
    "功能",
    "内容",
    "配置",
    "清单",
    "思路",
    "策略",
    "组织",
    "保障",
    "计划",
    "流程",
    "机制",
    "步骤",
    "标准",
    "参数",
    "接口",
    "部署",
    "测试",
    "验收",
    "措施",
    "培训",
    "应急",
    "风险",
    "运维",
    "服务",
    "售后",
    "响应",
    "巡检",
    "维护",
    "更新",
    "子系统",
)

STEM_SUFFIX_PATTERN = re.compile(
    r"(总体|项目|技术|系统|平台|服务|实施|运维|售后|培训|测试|验收|保障|管理|响应|交付|部署)?"
    r"(方案|计划|步骤|措施|机制|说明|内容|设计|建设|保障)?$"
)


def _normalize_heading(text: str) -> str:
    compact = re.sub(r"\s+", "", text or "")
@@ -93,68 +118,304 @@ def _normalize_heading(text: str) -> str:
    return compact


def _issue(issues: list[dict[str, Any]], issue_type: str, path: list[str], message: str) -> None:
    issues.append({"type": issue_type, "path": " > ".join(path), "message": message})


def _is_heading(block: dict[str, Any]) -> bool:
    return block.get("type", "heading") == "heading"


def _heading_children(children: list[Any]) -> list[dict[str, Any]]:
    return [child for child in children if isinstance(child, dict) and _is_heading(child)]


def _is_technical_context(path: list[str]) -> bool:
    return any(_normalize_heading(part) in TECHNICAL_ROOT_TITLES for part in path)


def _is_business_context(path: list[str]) -> bool:
    return any(_normalize_heading(part) in BUSINESS_ROOT_TITLES for part in path)


def _technical_depth(path: list[str]) -> int:
    for index, part in enumerate(path):
        if _normalize_heading(part) in TECHNICAL_ROOT_TITLES:
            return len(path) - index
    return 0


def _contains_object_hint(text: str) -> bool:
    normalized = _normalize_heading(text)
    if "系统" in normalized and len(normalized) > 4 and normalized not in ILLEGAL_LEAF_TITLES:
        return True
    return any(hint in normalized for hint in OBJECT_HINTS)


def _looks_management_focused(text: str) -> bool:
    normalized = _normalize_heading(text)
    return not _contains_object_hint(normalized) and any(hint in normalized for hint in MANAGEMENT_HINTS)


def _looks_generic_technical_heading(text: str) -> bool:
    normalized = _normalize_heading(text)
    if normalized in ILLEGAL_LEAF_TITLES:
        return True
    if _contains_object_hint(normalized):
        return False
    return any(re.search(pattern, normalized) for pattern in GENERIC_TECHNICAL_PATTERNS)


def _has_specific_children(children: list[dict]) -> bool:
    child_texts = [_normalize_heading(str(child.get("text", "")).strip()) for child in children if isinstance(child, dict)]
    return any(
        any(hint in child_text for hint in SPECIFIC_CHILD_HINTS)
        and not _looks_generic_technical_heading(child_text)
        for child_text in child_texts
def _has_object_child(children: list[dict[str, Any]]) -> bool:
    return any(_contains_object_hint(str(child.get("text", "")).strip()) for child in children)


def _max_heading_depth(block: dict[str, Any]) -> int:
    children = block.get("children", [])
    if not isinstance(children, list):
        return 1
    heading_children = _heading_children(children)
    if not heading_children:
        return 1
    return 1 + max(_max_heading_depth(child) for child in heading_children)


def _semantic_stem(text: str) -> str:
    normalized = _normalize_heading(text)
    normalized = STEM_SUFFIX_PATTERN.sub("", normalized)
    normalized = normalized.strip("-_()()")
    return normalized or _normalize_heading(text)


def _duplicate_generic_stems(children: list[dict[str, Any]]) -> list[str]:
    stems = [
        _semantic_stem(str(child.get("text", "")).strip())
        for child in children
        if _looks_generic_technical_heading(str(child.get("text", "")).strip())
    ]
    counts = Counter(stem for stem in stems if len(stem) >= 2)
    return sorted(stem for stem, count in counts.items() if count >= 2)
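The duplicate-facet gate above strips generic suffixes from sibling headings and flags stems that repeat, so "运维方案" next to "运维计划" reads as the same facet said twice. A standalone sketch of that idea; the suffix pattern here is deliberately simplified relative to the module's `STEM_SUFFIX_PATTERN` and the generic-heading filter is omitted:

```python
import re
from collections import Counter

# simplified stand-in for STEM_SUFFIX_PATTERN: strip an optional qualifier
# and an optional generic suffix from the end of a title
STEM_SUFFIX_PATTERN = re.compile(r"(总体|项目|技术|系统)?(方案|计划|措施|设计)?$")


def semantic_stem(title: str) -> str:
    """Reduce a heading to its topic stem, e.g. 运维方案 -> 运维."""
    stripped = STEM_SUFFIX_PATTERN.sub("", title)
    return stripped or title


def duplicate_stems(titles: list[str]) -> list[str]:
    """Stems of length >= 2 that occur under the same parent more than once."""
    counts = Counter(semantic_stem(title) for title in titles)
    return sorted(stem for stem, count in counts.items() if len(stem) >= 2 and count >= 2)


# 运维方案 and 运维计划 collapse to the same stem and are reported
dupes = duplicate_stems(["运维方案", "运维计划", "培训方案"])
```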

def _normalize_policy(payload: dict[str, Any]) -> dict[str, bool]:
    raw_policy = payload.get("outline_policy", {})
    if raw_policy is None:
        raw_policy = {}
    if not isinstance(raw_policy, dict):
        raise QueryError("outline_policy must be an object when provided")
    return {
        "allow_service_facets": bool(raw_policy.get("allow_service_facets", False)),
        "respect_fixed_structure": bool(raw_policy.get("respect_fixed_structure", False)),
    }


def _merge_policy(raw_policy: Any, inherited_policy: dict[str, bool]) -> dict[str, bool]:
    if raw_policy is None:
        return dict(inherited_policy)
    if not isinstance(raw_policy, dict):
        raise QueryError("policy must be an object when provided on a heading block")
    return {
        "allow_service_facets": bool(raw_policy.get("allow_service_facets", inherited_policy["allow_service_facets"])),
        "respect_fixed_structure": bool(raw_policy.get("respect_fixed_structure", inherited_policy["respect_fixed_structure"])),
    }

def _parse_heading_level(
    block: dict[str, Any],
    path: list[str],
    issues: list[dict[str, Any]],
    *,
    parent_level: int | None,
) -> int | None:
    raw_level = block.get("level")
    if not isinstance(raw_level, int):
        _issue(issues, "invalid_heading_level", path, "heading level must be an integer between 1 and 9")
        return None
    if raw_level < 1 or raw_level > 9:
        _issue(issues, "invalid_heading_level", path, "heading level must be between 1 and 9")
        return None
    if parent_level is None:
        if raw_level != 1:
            _issue(issues, "invalid_root_heading_level", path, "top-level heading must use level 1")
    elif raw_level != parent_level + 1:
        _issue(
            issues,
            "invalid_heading_hierarchy",
            path,
            f"child heading level must be parent level + 1; expected {parent_level + 1}, got {raw_level}",
        )
    return raw_level


def _check_technical_depth(blocks: list[dict[str, Any]], issues: list[dict[str, Any]], policy: dict[str, bool]) -> None:
    for block in blocks:
        if not isinstance(block, dict) or not _is_heading(block):
            continue
        root_text = str(block.get("text", "")).strip()
        if _normalize_heading(root_text) not in TECHNICAL_ROOT_TITLES:
            continue
        root_children = block.get("children", [])
        if not isinstance(root_children, list):
            continue
        branch_children = _heading_children(root_children)
        if not branch_children:
            _issue(
                issues,
                "technical_outline_too_shallow",
                [root_text],
                "technical outline must include at least one level-2 branch under the root",
            )
            continue
        for child in branch_children:
            child_text = str(child.get("text", "")).strip()
            branch_path = [root_text, child_text]
            branch_depth = _max_heading_depth(child)
            branch_policy = _merge_policy(child.get("policy"), policy)
            if branch_policy["respect_fixed_structure"] and branch_depth < 2:
                continue
            if branch_depth < 2:
                _issue(
                    issues,
                    "technical_branch_too_shallow",
                    branch_path,
                    f"technical branch '{child_text}' must reach at least level 3",
                )


def _walk_blocks(blocks: list[dict], path: list[str], issues: list[dict]) -> None:
def _walk_blocks(
    blocks: list[dict[str, Any]],
    path: list[str],
    issues: list[dict[str, Any]],
    policy: dict[str, bool],
    parent_level: int | None = None,
) -> None:
    for index, block in enumerate(blocks):
        if not isinstance(block, dict):
            issues.append({"type": "invalid_block", "path": " > ".join(path + [str(index)]), "message": "block must be an object"})
            _issue(issues, "invalid_block", path + [str(index)], "block must be an object")
            continue

        text = str(block.get("text", "")).strip()
        block_type = block.get("type", "heading")
        children = block.get("children", [])
        current_path = path + ([text] if text else [str(index)])
        if block_type == "heading":
            if text in ILLEGAL_LEAF_TITLES and not children:
                issues.append(
                    {
                        "type": "illegal_leaf",
                        "path": " > ".join(current_path),
                        "message": f"abstract heading '{text}' cannot be a leaf",
                    }
                )
            if children and not isinstance(children, list):
                issues.append({"type": "invalid_children", "path": " > ".join(current_path), "message": "children must be a list"})

        if block_type != "heading":
            continue
            if isinstance(children, list):
                if _is_technical_context(current_path):
                    normalized = _normalize_heading(text)
                    direct_heading_children = [child for child in children if isinstance(child, dict) and child.get("type", "heading") == "heading"]
                    if _looks_generic_technical_heading(normalized) and direct_heading_children and not _has_specific_children(direct_heading_children):
                        issues.append(
                            {
                                "type": "insufficient_technical_breakdown",
                                "path": " > ".join(current_path),
                                "message": f"technical heading '{text}' is still too generic; expand to subsystem/module/process level",
                            }

        current_policy = _merge_policy(block.get("policy"), policy)
        current_level = _parse_heading_level(block, current_path, issues, parent_level=parent_level)

        if text in ILLEGAL_LEAF_TITLES and not children:
            _issue(
                issues,
                "illegal_leaf",
                current_path,
                f"abstract heading '{text}' cannot be a leaf",
            )
                _walk_blocks(children, current_path, issues)

        if children and not isinstance(children, list):
            _issue(issues, "invalid_children", current_path, "children must be a list")
            continue

        if not isinstance(children, list):
            continue

        direct_heading_children = _heading_children(children)
        normalized = _normalize_heading(text)
        in_technical_context = _is_technical_context(current_path)
        in_business_context = _is_business_context(current_path)

        if in_business_context and normalized in TECHNICAL_PLACEHOLDER_TITLES and direct_heading_children:
            _issue(
                issues,
                "business_technical_placeholder_expanded",
                current_path,
                f"business outline technical placeholder '{text}' must remain a single placeholder node",
            )

        if in_technical_context:
            technical_depth = _technical_depth(current_path)
            is_generic_heading = _looks_generic_technical_heading(text)
            allow_service_facets = current_policy["allow_service_facets"]
            allow_fixed_structure = current_policy["respect_fixed_structure"]

            if is_generic_heading and normalized not in ILLEGAL_LEAF_TITLES and not direct_heading_children:
                _issue(
                    issues,
                    "generic_technical_leaf",
                    current_path,
                    f"technical heading '{text}' is still too generic to write from directly",
                )

            if is_generic_heading and len(direct_heading_children) == 1:
                _issue(
                    issues,
                    "single_child_breakdown",
                    current_path,
                    f"technical heading '{text}' cannot be expanded with only one direct child",
                )

            if (
                is_generic_heading
                and direct_heading_children
                and not allow_service_facets
                and not allow_fixed_structure
                and not _has_object_child(direct_heading_children)
            ):
                _issue(
                    issues,
                    "missing_object_breakdown",
                    current_path,
                    f"technical heading '{text}' must include at least one object/module/subsystem oriented child",
                )

            duplicate_stems = _duplicate_generic_stems(direct_heading_children)
            if duplicate_stems:
                joined = ", ".join(duplicate_stems)
                _issue(
                    issues,
                    "duplicate_technical_facets",
                    current_path,
                    f"technical heading '{text}' has repeated generic child facets: {joined}",
                )

            if (
                technical_depth >= 3
                and not direct_heading_children
                and not allow_service_facets
                and _looks_management_focused(text)
            ):
                _issue(
                    issues,
                    "management_leaf_too_generic",
                    current_path,
                    f"management-style leaf '{text}' is too generic; refine it to an object or concrete deliverable",
                )

            if technical_depth == 2 and direct_heading_children:
                if (
                    not allow_service_facets
                    and not allow_fixed_structure
                    and all(_looks_management_focused(str(child.get("text", "")).strip()) for child in direct_heading_children)
                ):
                    _issue(
                        issues,
                        "top_branch_missing_object_nodes",
                        current_path,
                        f"technical branch '{text}' is expanded only by management facets; add module/subsystem/device oriented nodes",
|
||||
)
|
||||
|
||||
_walk_blocks(direct_heading_children, current_path, issues, current_policy, current_level)
|
||||
|
||||
|
||||
def check_outline(payload: dict) -> dict:
|
||||
def check_outline(payload: dict[str, Any]) -> dict[str, Any]:
|
||||
blocks = payload.get("blocks", [])
|
||||
if not isinstance(blocks, list):
|
||||
raise QueryError("blocks must be a list")
|
||||
issues: list[dict] = []
|
||||
_walk_blocks(blocks, [], issues)
|
||||
policy = _normalize_policy(payload)
|
||||
issues: list[dict[str, Any]] = []
|
||||
_walk_blocks(blocks, [], issues, policy)
|
||||
_check_technical_depth(blocks, issues, policy)
|
||||
return {
|
||||
"status": "ok" if not issues else "failed",
|
||||
"issue_count": len(issues),
|
||||
|
||||
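The contract the diff preserves is that `check_outline` walks the `blocks` tree, collects structured issues, and reports `status: ok/failed` with an issue count, which is what `outline_check.py` callers gate on. A minimal, self-contained sketch of that contract, with the helper names and the single rule shown here being illustrative stand-ins rather than the real implementation:

```python
from typing import Any

# Illustrative stand-in for _looks_generic_technical_heading; the stems are
# assumptions for this sketch, not the real heuristics in outline_check.py.
GENERIC_STEMS = ("technical solution", "overall design")


def _looks_generic(text: str) -> bool:
    return text.strip().lower() in GENERIC_STEMS


def check_outline_sketch(payload: dict[str, Any]) -> dict[str, Any]:
    """Walk blocks, collect issues, return the same status/issue_count shape."""
    issues: list[dict[str, Any]] = []

    def walk(blocks: list[dict[str, Any]], path: list[str]) -> None:
        for block in blocks:
            text = str(block.get("text", "")).strip()
            children = block.get("children") or []
            current_path = path + [text]
            # Mirrors the "single_child_breakdown" gate: a generic technical
            # heading may not be expanded by exactly one direct child.
            if _looks_generic(text) and len(children) == 1:
                issues.append({
                    "type": "single_child_breakdown",
                    "path": " > ".join(current_path),
                })
            walk(children, current_path)

    walk(payload.get("blocks", []), [])
    return {
        "status": "ok" if not issues else "failed",
        "issue_count": len(issues),
        "issues": issues,
    }


result = check_outline_sketch({
    "blocks": [{"text": "Technical Solution", "children": [{"text": "Subsystem A"}]}]
})
print(result["status"], result["issue_count"])  # → failed 1
```

Callers treat a non-`ok` status as a hard gate: the outline must be revised and re-checked before `outline_export.py` is allowed to run.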