docs: log Jira Workflow Steward ingest completion
3218
tools/manifest.json
File diff suppressed because it is too large
456
tools/sync.py
@@ -1,33 +1,221 @@
 #!/usr/bin/env python3
 """
 Wiki ↔ Raw three-way sync tool
 ================================================================================

-Features:
-- Detect file changes under raw/ (added/modified/deleted)
-- Maintain the manifest.json state mapping
-- Detect orphan entity/concept pages (report only, never delete)
+Overview
+--------
+This script maintains the sync state between raw/ (the original document layer) and wiki/ (the knowledge base layer).
+It uses tools/manifest.json to track each raw file's hash, ingest status, and slug mapping,
+so that a coding agent knows exactly which files need to be (re-)ingested into the wiki.

-Usage:
-    python tools/sync.py --check     Preview changes (no writes)
-    python tools/sync.py --sync      Run the sync (updates the manifest)
-    python tools/sync.py --pending   Show the list of pending files
-    python tools/sync.py --json      JSON Lines output (for programs to consume)
-    python tools/sync.py --rebuild   Rebuild wiki/index from the manifest (fallback)
+Core features
+-------------
+1. Scan the .md files under raw/, compare them with the manifest, and detect additions/deletions (updated is no longer detected automatically)
+2. Maintain the tools/manifest.json state mapping (hash, slug, ingested, etc.)
+3. Mark a single file as "ingested", as a callback for the ingest flow
+4. Batch-normalize the slugs in the manifest (reslug)
+5. Rebuild wiki/index.md from the manifest (fallback)
+6. Detect orphan entity/concept pages (report only, never delete)
+7. Fix the Source File link in source pages, in batch or one at a time (aligned with the raw path in the manifest)

-manifest.json format:
+--------------------------------------------------------------------------------
+CLI usage
+--------------------------------------------------------------------------------
+
+Basic operations:
+    python tools/sync.py --check
+        Preview the differences between raw/ and the manifest (added/deleted) without writing any file.
+        Output is Markdown, suited to human reading.
+
+    python tools/sync.py --sync
+        Run a full sync: write the changes under raw/ to the manifest and report orphan pages.
+        By default only additions and deletions are handled; a content change in an existing file does not reset ingested automatically.
+
+    python tools/sync.py --sync -v / --verbose
+        Same as above, but also lists the details of every added/deleted file and the orphan list.
+
+    python tools/sync.py --pending
+        List every file in the manifest with ingested=false that awaits ingestion (human-readable format).
+
+    python tools/sync.py --pending --json
+        Print the pending list as single-line JSON, for scripts/agents to consume.
+
+    python tools/sync.py --pending --json --limit 1
+        Return only the first pending file (as a "file" field rather than a "files" array).
+
+    python tools/sync.py --pending --json --limit N
+        Return the first N pending files (as a "files" array).
+
+    python tools/sync.py --json
+        Used with --sync: emit all events as a JSON Lines stream so programs can parse them.
+
+    python tools/sync.py --rebuild
+        Rebuild wiki/index.md from the manifest. A fallback recovery for a corrupted or missing index.
+
+Source File link fixes:
+    python tools/sync.py --fix-source-links
+        Scan every manifest entry and batch-fix the link under `## Source File` in the corresponding source page.
+        The target format is always: - [[raw/.../your-file.md]]
+
+    python tools/sync.py --fix-source-links --fix-source-target "raw/dir/file.md"
+        Fix only the single source page of the given raw entry (suited to a per-file check after each ingest).
+
+    python tools/sync.py --fix-source-links --dry-run
+        Preview the number of pending changes without writing any file.
+
+Marking ingest status:
+    python tools/sync.py --mark-ingested "raw/dir/file.md" --slug my-slug
+        Mark the given raw file as ingested and update slug, source_path, and ingested_at.
+        This command is the last step of the ingest workflow and should be called after wiki/sources/<slug>.md has been written.
+
+    python tools/sync.py --mark-ingested "raw/dir/file.md" --slug my-slug --mark-json
+        Same as above, but prints the result as single-line JSON (for scripts to consume).
+
+    python tools/sync.py --reset-failed
+        Reset every manifest entry carrying an error marker to ingested=false (puts it back in the pending queue).
+
+Slug management:
+    python tools/sync.py --reslug
+        Batch-normalize the slug and source_path of every manifest entry.
+        Rules: Chinese characters are kept as-is, ASCII uppercase becomes lowercase, special characters become `-`, and consecutive `-` are collapsed.
+
+    python tools/sync.py --reslug --reslug-target "raw/dir/file.md"
+        Normalize the slug of the given file only.
+
+    python tools/sync.py --reslug --dry-run
+        Preview the reslug changes without writing the manifest.
+
+--------------------------------------------------------------------------------
+manifest.json format
+--------------------------------------------------------------------------------
+
+Path: tools/manifest.json (same directory as this script)
+
+Top-level structure:
   {
-    "version": 1,
-    "updated_at": "ISO timestamp",
-    "files": {
-      "relative/path/to/file.md": {
-        "hash": "sha256",
-        "modified": "ISO timestamp",
-        "slug": "wiki-source-slug",
-        "source_path": "wiki/sources/slug.md",
-        "ingested": true
-      }
+    "version": 1,                           // format version, currently fixed at 1
+    "updated_at": "2024-01-15T08:00:00Z",   // last update time (UTC, ISO 8601), refreshed on every write
+    "files": { ... }                        // key = raw file path relative to the repository root
   }
+
+Structure of each record in files:
+  {
+    "raw/dir/my-paper.md": {
+      "hash": "a3f1c2d4e5b6a7b8",                  // first 16 characters of the sha256, used to detect content changes
+      "modified": "2024-01-15T07:00:00Z",          // mtime of the raw file (UTC, ISO 8601)
+      "slug": "my-paper",                          // wiki page slug, used to build source_path
+      "source_path": "wiki/sources/my-paper.md",   // path of the corresponding wiki source page
+      "ingested": true,                            // false = pending; true = ingested
+      "ingested_at": "2024-01-15T08:00:00Z",       // ingest completion time (null means not ingested)
+      "error": "..."                               // optional, records the error message when an ingest fails
+    }
+  }
+
+State transitions:
+  A new file is detected by --sync
+    → ingested=false, ingested_at=null
+  The ingest workflow finishes and calls --mark-ingested
+    → ingested=true, ingested_at=<current UTC time>
+  The current default sync policy does not automatically handle "content changed in an existing file"
+    → an ingested file is not reset automatically by updated detection (avoids repeated ingests)
+  On ingest failure, an external process writes the error field
+    → clear it with --reset-failed to return the file to the pending queue
+
+--------------------------------------------------------------------------------
+JSON output format (--json / --mark-json / --pending --json)
+--------------------------------------------------------------------------------
+
+Each line is one standalone JSON object (JSON Lines format). Possible event types:
+
+  {"event": "pending",         "rel_path": "...", "slug": "...", "action": "new"}
+  {"event": "deleted_detected","rel_path": "..."}
+  {"event": "sync_complete",   "summary": {"pending": N, "deleted": N, "manifest_entries": N},
+                               "pending_files": [...], "deleted_files": [...]}
+  {"event": "pending_list",    "count": N, "files": [...]}   // --pending --json --limit N
+  {"event": "pending_list",    "count": N, "file": {...}}    // --pending --json --limit 1
+  {"event": "mark_ingested",   "rel_path": "...", "slug": "...",
+                               "source_path": "...", "modified": "...", "ingested_at": "..."}
+  {"event": "fix_source_links_complete", "summary": {...}, "details": [...]}
+  {"event": "error",           "message": "..."}
+
+--------------------------------------------------------------------------------
+Internal functions
+--------------------------------------------------------------------------------
+
+sha256_file(path)
+    Compute the file's sha256 and return the first 16 hexadecimal characters, for quick change detection.
+
+load_manifest() / save_manifest(manifest)
+    Read and write tools/manifest.json; return a blank manifest when the file is missing or corrupted.
+
+scan_raw()
+    Recursively scan every .md file under raw/ and return {rel_path: {hash, modified, size, abs_path}}.
+
+build_slug_from_path(rel_path)
+    Build a basic slug from the raw file path (Chinese characters are kept; spaces/special characters become `-`).
+    Note: --reslug uses the stricter _compute_normalized_slug() rules.
+
+check_changes(manifest, raw_files)
+    Compare the manifest with the actual files; by default it currently reports additions and deletions only (updated detection is off).
+
+run_sync(dry_run, verbose, json_mode)
+    Run the full sync logic, update the manifest, and trigger the orphan detection report.
+
+run_check()
+    Read-only comparison that prints the difference report as Markdown without modifying any file.
+
+run_rebuild()
+    Walk every manifest entry and rebuild wiki/index.md, with tolerant path matching and orphan detection along the way.
+
+find_orphan_entity_concept(manifest)
+    Scan the [[wikilinks]] in wiki/sources/*.md and find entity/concept pages that nothing references.
+
+mark_ingested(rel_path, slug, json_mode)
+    Mark the given raw file as ingested and update slug, source_path, hash, and ingested_at.
+    rel_path must already exist in the manifest (run --sync first, then --mark-ingested).
+
+run_reslug(target_rel_path, dry_run)
+    Normalize slug/source_path in the manifest, in batch or for a single entry,
+    using the _compute_normalized_slug() rules for special characters.
+
+run_fix_source_links(target_rel_path, dry_run, json_mode)
+    Fix the raw path link under `## Source File` in source pages, based on the manifest;
+    supports full and single-file modes.
+
+_compute_normalized_slug(rel_path)
+    Core rules for normalizing a slug:
+    a. Chinese characters are kept as-is
+    b. ASCII uppercase letters become lowercase
+    c. Spaces, punctuation, and special symbols are replaced with `-`
+    d. Runs of `-` are collapsed into one, and leading/trailing `-` are removed
+
+--------------------------------------------------------------------------------
+Typical workflow (for agent reference)
+--------------------------------------------------------------------------------
+
+1. Check whether any files await ingestion:
+    python tools/sync.py --pending --json --limit 1
+
+2. Sync the changes under raw/ into the manifest:
+    python tools/sync.py --sync
+
+3. Mark the file once the ingest is done:
+    python tools/sync.py --mark-ingested "raw/papers/my-paper.md" --slug my-paper
+
+4. Fix slug naming:
+    python tools/sync.py --reslug --dry-run   # preview
+    python tools/sync.py --reslug             # apply
+
+5. Batch-fix Source File links:
+    python tools/sync.py --fix-source-links --dry-run
+    python tools/sync.py --fix-source-links
+
+6. Single-file check after an ingest:
+    python tools/sync.py --fix-source-links --fix-source-target "raw/papers/my-paper.md"
+
+7. Rebuild when the index is corrupted:
+    python tools/sync.py --rebuild
 """

 import json
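The four slug rules listed for `_compute_normalized_slug()` in the docstring can be illustrated with a short sketch. This is not the script's implementation, whose body is not part of this diff: the helper name `normalize_slug`, the choice to pass in a bare file stem, the CJK range used for "Chinese characters", and the decision to keep ASCII digits are all assumptions made for illustration.

```python
import re


def normalize_slug(name: str) -> str:
    """Hypothetical helper that applies rules a-d from the docstring to a file stem."""
    out = []
    for ch in name:
        if "\u4e00" <= ch <= "\u9fff":
            # a. keep Chinese characters (assumed here: CJK Unified Ideographs)
            out.append(ch)
        elif ch.isascii() and ch.isalnum():
            # b. lowercase ASCII letters (digits are kept; an assumption)
            out.append(ch.lower())
        else:
            # c. spaces, punctuation, and other symbols become "-"
            out.append("-")
    # d. collapse runs of "-" and trim them from both ends
    return re.sub(r"-{2,}", "-", "".join(out)).strip("-")


print(normalize_slug("My Paper (v2)"))    # my-paper-v2
print(normalize_slug("深度学习 Notes!"))   # 深度学习-notes
```

Collapsing and trimming last (rule d) keeps the earlier rules simple: every character is mapped independently, and the cleanup pass handles whatever runs of `-` result.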
@@ -166,20 +354,20 @@ def find_orphan_entity_concept(manifest: dict) -> tuple[list, list]:
 # ─── Core sync logic ───────────────────────────────────────────────

 def check_changes(manifest: dict, raw_files: dict) -> dict:
-    """Compare the manifest with the actual raw files and return the changes"""
+    """Compare the manifest with the actual raw files and return the changes.
+
+    Current policy (narrowed per requirements):
+    - Detect new / deleted only
+    - No longer detect updated from the hash (avoids repeated ingests caused by mtime-only changes)
+    """
     changes = {"new": [], "updated": [], "deleted": [], "unchanged": []}
     manifest_files = manifest.get("files", {})

     for rel_path, info in raw_files.items():
         if rel_path not in manifest_files:
             changes["new"].append({"rel_path": rel_path, **info})
-        elif info["hash"] != manifest_files[rel_path]["hash"]:
-            changes["updated"].append({
-                "rel_path": rel_path,
-                "old_hash": manifest_files[rel_path]["hash"],
-                **info,
-            })
         else:
+            # New policy: every existing file counts as unchanged and never enters updated
             changes["unchanged"].append(rel_path)

     for rel_path in manifest_files:
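The effect of the narrowed policy in this hunk can be seen in a small standalone sketch. The function name `diff_manifest` and the sample paths and hashes are hypothetical, and the deletion rule (a manifest path that no longer exists under raw/) is an assumption, since the hunk cuts off before that loop's body.

```python
def diff_manifest(manifest_files: dict, raw_files: dict) -> dict:
    """Hypothetical, simplified variant of check_changes(): new/deleted only."""
    changes = {"new": [], "updated": [], "deleted": [], "unchanged": []}
    for rel_path in raw_files:
        if rel_path not in manifest_files:
            changes["new"].append(rel_path)
        else:
            # A differing hash is ignored on purpose: existing files stay "unchanged".
            changes["unchanged"].append(rel_path)
    for rel_path in manifest_files:
        if rel_path not in raw_files:  # assumed deletion rule
            changes["deleted"].append(rel_path)
    return changes


manifest_files = {"raw/a.md": {"hash": "aaaa"}, "raw/gone.md": {"hash": "bbbb"}}
raw_files = {"raw/a.md": {"hash": "ffff"}, "raw/new.md": {"hash": "cccc"}}
result = diff_manifest(manifest_files, raw_files)
print(result["new"], result["deleted"], result["unchanged"], result["updated"])
```

Note that `raw/a.md` has a different hash on disk than in the manifest, yet it lands in `unchanged`; under this policy a re-ingest of an edited file has to be requested by other means.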
@@ -242,20 +430,41 @@ def run_sync(dry_run: bool = False, verbose: bool = False, json_mode: bool = Fal
     updated_manifest = manifest.copy()
     updated_manifest["files"] = manifest.get("files", {}).copy()
     pending_files = []
+    recovered_files = []

     for f in new:
         rel_path = f["rel_path"]
         slug = build_slug_from_path(rel_path)
+        source_path = f"wiki/sources/{slug}.md"
+        source_file = WIKI_DIR / "sources" / f"{slug}.md"
+
+        # Check whether wiki/sources/<slug>.md already exists (recovery after the manifest was deleted)
+        already_ingested = source_file.exists()
+        ingested_at = None
+        if already_ingested:
+            # Use the source file's mtime as an approximation of ingested_at
+            try:
+                ingested_at = datetime.fromtimestamp(source_file.stat().st_mtime, tz=timezone.utc).isoformat()
+            except Exception:
+                ingested_at = iso_now()
+
         if json_mode:
-            print(json.dumps({"event": "pending", "rel_path": rel_path, "slug": slug, "action": "new"}))
-        pending_files.append({"rel_path": rel_path, "abs_path": f["abs_path"], "slug": slug, "action": "new"})
+            action = "recovered" if already_ingested else "new"
+            print(json.dumps({"event": "pending" if not already_ingested else "recovered", "rel_path": rel_path, "slug": slug, "action": action}))
+        if not already_ingested:
+            pending_files.append({"rel_path": rel_path, "abs_path": f["abs_path"], "slug": slug, "action": "new"})
+        else:
+            recovered_files.append({"rel_path": rel_path, "slug": slug, "source_path": source_path})
+            if verbose and not json_mode:
+                print(f" ↺ Recovered (source exists): {rel_path} → {source_path}")
+
         updated_manifest["files"][rel_path] = {
             "hash": f["hash"],
             "modified": f.get("modified"),
             "slug": slug,
-            "source_path": f"wiki/sources/{slug}.md",
-            "ingested": False,
-            "ingested_at": None,
+            "source_path": source_path,
+            "ingested": already_ingested,
+            "ingested_at": ingested_at,
         }

     for f in updated:
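The recovery branch added in this hunk decides the initial manifest state from whether a source page already exists. A minimal standalone sketch of that decision follows; the helper name `initial_entry`, the temporary directory standing in for `wiki/sources/`, and the sample slugs and hashes are hypothetical, and the `modified` field is omitted for brevity.

```python
from datetime import datetime, timezone
from pathlib import Path
import tempfile


def initial_entry(slug: str, file_hash: str, sources_dir: Path) -> dict:
    """Hypothetical helper: manifest entry for a raw file that is not in the manifest yet."""
    source_file = sources_dir / f"{slug}.md"
    already_ingested = source_file.exists()
    ingested_at = None
    if already_ingested:
        # Approximate the ingest time with the source page's mtime, as the hunk does.
        mtime = source_file.stat().st_mtime
        ingested_at = datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()
    return {
        "hash": file_hash,
        "slug": slug,
        "source_path": f"wiki/sources/{slug}.md",
        "ingested": already_ingested,
        "ingested_at": ingested_at,
    }


with tempfile.TemporaryDirectory() as tmp:
    sources = Path(tmp)  # stand-in for wiki/sources/
    (sources / "my-paper.md").write_text("# My paper\n", encoding="utf-8")
    recovered = initial_entry("my-paper", "a3f1c2d4e5b6a7b8", sources)
    fresh = initial_entry("other-paper", "0011223344556677", sources)

print(recovered["ingested"], fresh["ingested"])  # True False
```

One consequence worth keeping in mind: the match is by slug only, so a source page that happens to share a slug with an unrelated new raw file would also be treated as recovered.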
@@ -290,6 +499,7 @@ def run_sync(dry_run: bool = False, verbose: bool = False, json_mode: bool = Fal
             "event": "sync_complete",
             "summary": {
                 "pending": len(pending_files),
+                "recovered": len(recovered_files),
                 "deleted": len(deleted_files),
                 "manifest_entries": len(updated_manifest["files"]),
             },
@@ -298,6 +508,8 @@ def run_sync(dry_run: bool = False, verbose: bool = False, json_mode: bool = Fal
         }))
     else:
         log(f"manifest.json updated ({len(updated_manifest['files'])} entries)", "success")
+        if recovered_files:
+            log(f"Recovered (source page exists): {len(recovered_files)}", "info")
         if verbose:
             log(f"Pending files for ingestion: {len(pending_files)}", "info")

@@ -385,7 +597,7 @@ def run_rebuild():
     ]

     files = manifest.get("files", {})
-    sorted_files = sorted(files.items(), key=lambda x: x[1].get("modified", ""), reverse=True)
+    sorted_files = sorted(files.items(), key=lambda x: (x[1].get("ingested_at") or "", x[1].get("modified", "")), reverse=True)

     import re

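The new sort key orders index entries by `ingested_at` first and falls back to `modified` as a tie-breaker. The sketch below runs the same key on a hypothetical three-entry `files` mapping (the paths and timestamps are made up) to show where un-ingested entries end up.

```python
# Hypothetical manifest["files"] content, for illustration only.
files = {
    "raw/a.md": {"modified": "2024-03-01T00:00:00+00:00", "ingested_at": None},
    "raw/b.md": {"modified": "2024-01-01T00:00:00+00:00", "ingested_at": "2024-02-10T00:00:00+00:00"},
    "raw/c.md": {"modified": "2024-01-05T00:00:00+00:00", "ingested_at": "2024-02-12T00:00:00+00:00"},
}

# Same key as the added line: (ingested_at or "", modified), newest first.
sorted_files = sorted(
    files.items(),
    key=lambda x: (x[1].get("ingested_at") or "", x[1].get("modified", "")),
    reverse=True,
)
order = [rel_path for rel_path, _ in sorted_files]
print(order)  # ['raw/c.md', 'raw/b.md', 'raw/a.md']
```

Because a missing `ingested_at` becomes the empty string, un-ingested entries sort last under `reverse=True` even when their `modified` time is the most recent. The comparison is lexicographic, so it matches chronological order only as long as all timestamps share one ISO 8601 format and UTC offset.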
@@ -449,12 +661,12 @@ def run_rebuild():

         src_file = find_source_file(slug, info, rel_path)

-        # Extract the date prefix (format YYYY-MM-DD) from the manifest's modified field
-        modified_raw = info.get("modified", "")
+        # Extract the date prefix (format YYYY-MM-DD) from the manifest's ingested_at field; left empty if not ingested
+        date_raw = info.get("ingested_at") or ""
         date_prefix = ""
-        if modified_raw:
+        if date_raw:
             try:
-                date_prefix = f"[{modified_raw[:10]}] "
+                date_prefix = f"[{date_raw[:10]}] "
             except Exception:
                 date_prefix = ""

@@ -529,6 +741,158 @@ def run_rebuild():
     print(f"\nDone.")


+# ─── Admin interface: fix the Source File link in source pages ─────────────────────────────────────
+
+def _fix_source_file_link_in_content(content: str, raw_rel_path: str) -> tuple[str, bool, str]:
+    """Fix the `## Source File` block of a single source page.
+
+    Target format:
+    ## Source File
+    - [[raw/.../file.md]]
+
+    Returns: (new_content, changed, action)
+    action ∈ {"unchanged", "updated", "inserted_line", "inserted_section"}
+    """
+    expected_line = f"- [[{raw_rel_path}]]"
+    lines = content.splitlines()
+    had_trailing_newline = content.endswith("\n")
+
+    # 1) Find the `## Source File` heading
+    heading_idx = None
+    for i, line in enumerate(lines):
+        if line.strip().lower() == "## source file":
+            heading_idx = i
+            break
+
+    # 2) No block: insert a complete one (preferably right after the frontmatter)
+    if heading_idx is None:
+        insert_at = 0
+        if lines and lines[0].strip() == "---":
+            for j in range(1, len(lines)):
+                if lines[j].strip() == "---":
+                    insert_at = j + 1
+                    while insert_at < len(lines) and lines[insert_at].strip() == "":
+                        insert_at += 1
+                    break
+
+        block = ["## Source File", expected_line, ""]
+        new_lines = lines[:insert_at] + block + lines[insert_at:]
+        new_content = "\n".join(new_lines)
+        if had_trailing_newline or new_content:
+            new_content += "\n"
+        return new_content, True, "inserted_section"
+
+    # 3) Find the first list item between `## Source File` and the next level-2 heading
+    section_end = len(lines)
+    for j in range(heading_idx + 1, len(lines)):
+        if lines[j].startswith("## "):
+            section_end = j
+            break
+
+    bullet_idx = None
+    for j in range(heading_idx + 1, section_end):
+        if lines[j].strip().startswith("- "):
+            bullet_idx = j
+            break
+
+    if bullet_idx is None:
+        # No list item: insert the standard link line directly
+        lines.insert(heading_idx + 1, expected_line)
+        new_content = "\n".join(lines)
+        if had_trailing_newline or new_content:
+            new_content += "\n"
+        return new_content, True, "inserted_line"
+
+    # 4) A list item exists: replace it with the raw path from the manifest
+    current = lines[bullet_idx].strip()
+    if current == expected_line:
+        return content, False, "unchanged"
+
+    lines[bullet_idx] = expected_line
+    new_content = "\n".join(lines)
+    if had_trailing_newline or new_content:
+        new_content += "\n"
+    return new_content, True, "updated"
+
+
+def run_fix_source_links(target_rel_path: str = None, dry_run: bool = False, json_mode: bool = False):
+    """Correct the Source File link in source pages, based on the manifest.
+
+    - Without target_rel_path: scan and fix every entry
+    - With target_rel_path: process a single raw entry only (suited to a per-file check after an ingest)
+    """
+    manifest = load_manifest()
+    files = manifest.get("files", {})
+
+    if target_rel_path:
+        if target_rel_path not in files:
+            msg = f"target not found in manifest: {target_rel_path}"
+            if json_mode:
+                print(json.dumps({"event": "error", "message": msg}))
+            else:
+                print(red(f" ✗ {msg}"))
+            raise SystemExit(1)
+        targets = [(target_rel_path, files[target_rel_path])]
+    else:
+        targets = list(files.items())
+
+    changed = 0
+    unchanged = 0
+    skipped_no_source_path = 0
+    skipped_source_missing = 0
+    details = []
+
+    for rel_path, info in targets:
+        source_path = info.get("source_path")
+        if not source_path:
+            skipped_no_source_path += 1
+            details.append({"rel_path": rel_path, "status": "skipped_no_source_path"})
+            continue
+
+        src_file = REPO_ROOT / source_path
+        if not src_file.exists():
+            skipped_source_missing += 1
+            details.append({"rel_path": rel_path, "source_path": source_path, "status": "skipped_source_missing"})
+            continue
+
+        original = src_file.read_text(encoding="utf-8")
+        new_content, did_change, action = _fix_source_file_link_in_content(original, rel_path)
+
+        if did_change:
+            changed += 1
+            if not dry_run:
+                src_file.write_text(new_content, encoding="utf-8")
+            details.append({"rel_path": rel_path, "source_path": source_path, "status": "changed", "action": action})
+        else:
+            unchanged += 1
+            details.append({"rel_path": rel_path, "source_path": source_path, "status": "unchanged"})
+
+    summary = {
+        "scanned": len(targets),
+        "changed": changed,
+        "unchanged": unchanged,
+        "skipped_no_source_path": skipped_no_source_path,
+        "skipped_source_missing": skipped_source_missing,
+        "dry_run": dry_run,
+    }
+
+    if json_mode:
+        print(json.dumps({"event": "fix_source_links_complete", "summary": summary, "details": details}, ensure_ascii=False))
+        return
+
+    print(f"\n{bold('=== Fix Source File Links')}\n")
+    print(f" Scanned : {summary['scanned']}")
+    print(f" Changed : {summary['changed']}")
+    print(f" Unchanged : {summary['unchanged']}")
+    print(f" Skipped (no source_path): {summary['skipped_no_source_path']}")
+    print(f" Skipped (source missing): {summary['skipped_source_missing']}")
+    if dry_run:
+        print(f" {yellow('⚠')} Dry-run only, no file written.")
+    else:
+        print(f" {green('✓')} Source File links corrected.")
+    print()
+
+
 # ─── Admin interface: reslug (batch-normalize manifest slugs) ──────────────────────────────────────

 def _compute_normalized_slug(rel_path: str) -> str:
@@ -789,6 +1153,16 @@ if __name__ == "__main__":
         default=None,
         help="Used with --pending --json: limit the number of entries returned (default: all)",
     )
+    parser.add_argument(
+        "--fix-source-links",
+        action="store_true",
+        help="Fix the raw path link under `## Source File` in source pages, based on the manifest",
+    )
+    parser.add_argument(
+        "--fix-source-target",
+        metavar="REL_PATH",
+        help="Used with --fix-source-links: fix a single raw entry only (for example 'raw/AI/file.md')",
+    )
     parser.add_argument(
         "--reslug",
         action="store_true",
@@ -810,6 +1184,12 @@ if __name__ == "__main__":
     if args.mark_ingested:
         rel = args.mark_ingested[0]
         mark_ingested(rel, slug=args.slug, json_mode=args.mark_json)
+    elif args.fix_source_links:
+        run_fix_source_links(
+            target_rel_path=args.fix_source_target,
+            dry_run=args.dry_run,
+            json_mode=args.json,
+        )
     elif args.reslug:
         run_reslug(target_rel_path=args.reslug_target, dry_run=args.dry_run)
     elif args.rebuild:
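For an agent consuming the `--json` modes dispatched here, the JSON Lines stream can be parsed one line at a time. The sketch below works on a hard-coded sample rather than invoking the script; the sample text is hypothetical, with event shapes copied from the docstring's event list, and `parse_events` is an illustrative helper name, not part of `tools/sync.py`.

```python
import json

# Hypothetical output of `python tools/sync.py --sync --json`.
sample_output = """\
{"event": "pending", "rel_path": "raw/papers/a.md", "slug": "a", "action": "new"}
{"event": "deleted_detected", "rel_path": "raw/old.md"}
{"event": "sync_complete", "summary": {"pending": 1, "deleted": 1, "manifest_entries": 10}, "pending_files": [], "deleted_files": []}
"""


def parse_events(text: str) -> list[dict]:
    """Parse a JSON Lines stream: one JSON object per non-empty line."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            events.append(json.loads(line))
    return events


events = parse_events(sample_output)
pending = [e["rel_path"] for e in events if e["event"] == "pending"]
summary = next(e["summary"] for e in events if e["event"] == "sync_complete")
print(pending, summary["manifest_entries"])  # ['raw/papers/a.md'] 10
```

Dispatching on the `event` field rather than on line position keeps a consumer robust if new event types appear in the stream, such as the `recovered` event this commit adds to `run_sync`.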