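Quick Assessment
An official skill that uses the Zhipu GLM-OCR API to recognize and extract tables from images and PDFs and convert them to Markdown. It supports complex tables, merged cells, and multi-page documents. Use this skill when the user wants to extract tables, recognize spreadsheets, or convert table images into an editable format.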
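Suitable Tasks
- Complete platform, development, or workflow tasks as described in the ModelScope listing.
- Save the Skill content offline via the download package.
- Gauge priority using the download, visit, and like counts.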
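Inputs and Outputs
Input: the task goal, context materials, platform information, file paths, constraints, or the content to be processed.
Output: documents, code, check results, plans, recommendations, or operating steps produced according to the Skill's instructions.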
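Example Tasks
- Use glmocr-table to help me complete the current task, confirming the necessary context first.
- Following the glmocr-table instructions, list the operating steps and risk checkpoints.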
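Installation
- Download the Skill ZIP provided by this site and unzip it.
- Place the unzipped Skill directory in the `skills` directory supported by your current AI tool.
- To view the original content online, open `SKILL.md` on GitHub.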
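Risk Boundaries
Before use, check permissions, external dependencies, and the types of data to be processed. For anything involving third-party platform data, payments, deployment, accounts, or keys, verify against the official documentation first.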
SKILL.md Documentation
GLM-OCR Table Recognition Skill
Extract tables from images and PDFs and convert them to Markdown format using the Zhipu GLM-OCR layout parsing API.
When to Use
- Extract tables from images or scanned documents
- Convert table images to Markdown or another editable format
- Recognize complex tables with merged cells
- Parse financial statements, invoices, and reports that contain tables
- User mentions "extract table", "recognize table", "表格识别", "提取表格", "表格OCR", "表格转文字"
Key Features
- Complex table support: Handles merged cells, nested tables, multi-row headers
- Markdown output: Tables are output in clean Markdown format, easy to edit and convert
- Multi-page PDF: Supports batch extraction from multi-page PDF documents
- Local file & URL: Supports both local files and remote URLs
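Because the output is plain Markdown, converting an extracted table to another editable format is mostly string handling. The following is a minimal sketch, not part of the skill itself: `markdown_table_to_rows` and `rows_to_csv` are hypothetical helper names, and the parser assumes a simple pipe table with no escaped `|` characters inside cells.

```python
import csv
import io


def markdown_table_to_rows(markdown: str) -> list[list[str]]:
    """Parse a simple Markdown pipe table into a list of rows (lists of cells)."""
    rows = []
    for line in markdown.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore anything that is not a table row
        # Drop the leading pipe and, if present, the trailing pipe.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # Skip the header separator row, e.g. |------|:----:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows


def rows_to_csv(rows: list[list[str]]) -> str:
    """Serialize rows as CSV text, one line per table row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

The first returned row is the header. Merged cells and nested tables have no direct CSV equivalent, so they would need extra handling beyond this sketch.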
Resource Links
| Resource    | Link                         |
| ----------- | ---------------------------- |
| Get API Key | Zhipu Open Platform API Keys |
| API Docs    | Layout Parsing               |
Prerequisites
API Key Setup (Required)
The script reads the key from the ZHIPU_API_KEY environment variable. You can optionally reuse the same key across other Zhipu skills.
Get Key: Visit Zhipu Open Platform API Keys to create or copy your key.
Setup options (choose one):

1. Global config (recommended): Set once in openclaw.json under env.vars; all Zhipu skills will share it:

        {
          "env": {
            "vars": {
              "ZHIPU_API_KEY": "your-api-key"
            }
          }
        }

2. Skill-level config: Set for this skill only in openclaw.json:

        {
          "skills": {
            "entries": {
              "glmocr-table": {
                "env": {
                  "ZHIPU_API_KEY": "your-api-key"
                }
              }
            }
          }
        }

3. Shell environment variable: Add to ~/.zshrc:

        export ZHIPU_API_KEY="your-api-key"

> 💡 If you have already configured a key for other Zhipu skills (such as glmocr, glmv-caption, or glm-image-generation), they share the same ZHIPU_API_KEY, so there is no need to configure it again.
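Whichever option is used, the key ends up in the process environment. As a rough illustration of what reading it might look like, here is a minimal sketch; `load_api_key` is a hypothetical helper, not the script's actual function, and the error text simply reuses the message this document lists under Error Handling.

```python
import os


def load_api_key(env=None) -> str:
    """Return ZHIPU_API_KEY from the environment, or raise if it is missing or blank."""
    # Accept an explicit mapping so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZHIPU_API_KEY not configured. Get your API key at: "
            "https://bigmodel.cn/usercenter/proj-mgmt/apikeys"
        )
    return key
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque authentication error.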
Security & Transparency
- Environment variables used:
  - `ZHIPU_API_KEY` (required)
  - `GLM_OCR_TIMEOUT` (optional, timeout in seconds)
- Fixed endpoint: `https://open.bigmodel.cn/api/paas/v4/layout_parsing`
- No custom API URL override: this avoids accidental key exfiltration via redirected endpoints.
- Raw upstream response is optional and not returned by default: use `--include-raw` only when needed for debugging.
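The settings above could be modeled roughly as follows. This is a sketch under stated assumptions: `read_timeout` is a hypothetical helper, and `DEFAULT_TIMEOUT` is an assumed value, since the skill's real default is not documented here. The endpoint is a constant rather than a setting, which mirrors the no-override rule.

```python
import os

# Fixed official endpoint from the list above; deliberately a constant, not a setting.
ENDPOINT = "https://open.bigmodel.cn/api/paas/v4/layout_parsing"

# Assumed fallback in seconds; the skill's actual default is not stated in this document.
DEFAULT_TIMEOUT = 60.0


def read_timeout(env=None) -> float:
    """Read GLM_OCR_TIMEOUT as seconds, falling back to the assumed default."""
    env = os.environ if env is None else env
    raw = env.get("GLM_OCR_TIMEOUT", "").strip()
    if not raw:
        return DEFAULT_TIMEOUT
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TIMEOUT  # not a number: ignore the setting
    # Reject zero, negative, and NaN values.
    return value if value > 0 else DEFAULT_TIMEOUT
```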
⛔ MANDATORY RESTRICTIONS ⛔
1. ONLY use GLM-OCR API: Execute the script `python scripts/glm_ocr_cli.py`
2. NEVER parse tables yourself: Do NOT try to extract tables using built-in vision or any other method
3. NEVER offer alternatives: Do NOT suggest "I can try to recognize it" or similar
4. IF API fails: Display the error message and STOP immediately
5. NO fallback methods: Do NOT attempt table extraction any other way
📋 Output Display Rules
After running the script, present the OCR result clearly and safely.
- Show the extracted table Markdown (`text`) in full
- Summarization is allowed, but do not hide important extraction failures
- If `layout_details` contains table-related entries, you may highlight them
- If the result file is saved, tell the user the file path
- Show the raw upstream response only when explicitly requested or when debugging (`--include-raw`)
How to Use
Extract from URL

    python scripts/glm_ocr_cli.py --file-url "https://example.com/table.png"

Extract from Local File

    python scripts/glm_ocr_cli.py --file /path/to/table.png

Save Result to File

    python scripts/glm_ocr_cli.py --file table.png --output result.json --pretty

Include Raw Upstream Response (Debug Only)

    python scripts/glm_ocr_cli.py --file table.png --output result.json --include-raw

CLI Reference
    python {baseDir}/scripts/glm_ocr_cli.py (--file-url URL | --file PATH) [--output FILE] [--pretty] [--include-raw]

| Parameter        | Required       | Description                                                          |
| ---------------- | -------------- | -------------------------------------------------------------------- |
| `--file-url`     | One of the two | URL to image/PDF                                                     |
| `--file`         | One of the two | Local file path to image/PDF                                         |
| `--output`, `-o` | No             | Save result JSON to file                                             |
| `--pretty`       | No             | Pretty-print JSON output                                             |
| `--include-raw`  | No             | Include raw upstream API response in the `result` field (debug only) |
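When the CLI is driven from another Python program, the flags in the table map naturally onto an argument list. The sketch below is illustrative only: `build_command` and `run_cli` are hypothetical wrappers, the script path assumes the working directory is the skill's base directory, and `run_cli` assumes the CLI prints its result JSON to standard output, which this document implies but does not state outright.

```python
import json
import subprocess
import sys


def build_command(source, *, is_url=False, output=None, pretty=False,
                  include_raw=False, script="scripts/glm_ocr_cli.py"):
    """Build the argument list for glm_ocr_cli.py from the documented flags."""
    cmd = [sys.executable, script]
    # Exactly one of --file-url / --file is required.
    cmd += ["--file-url", source] if is_url else ["--file", source]
    if output:
        cmd += ["--output", output]
    if pretty:
        cmd.append("--pretty")
    if include_raw:
        cmd.append("--include-raw")  # debug only
    return cmd


def run_cli(source, **options):
    """Run the CLI and parse stdout as JSON (assumption: the result is printed to stdout)."""
    completed = subprocess.run(
        build_command(source, **options), capture_output=True, text=True
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file paths that contain spaces.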
Response Format

    {
      "ok": true,
      "text": "| Column 1 | Column 2 |\n|----------|----------|\n| Data | Data |",
      "layout_details": [...],
      "result": null,
      "error": null,
      "source": "/path/to/file",
      "source_type": "file",
      "raw_result_included": false
    }

Key fields:
- `ok`: whether extraction succeeded
- `text`: extracted text in Markdown (use this for display)
- `layout_details`: layout analysis details
- `error`: error details on failure
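A consumer of this JSON mainly needs to branch on `ok`. The following minimal sketch shows one way to do that for a result saved with `--output`; `load_result` and `render_result` are hypothetical names, and the exact shape of the `error` value is not specified in this document, so it is shown as-is.

```python
import json


def load_result(path: str) -> dict:
    """Load a result file previously written with --output."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def render_result(result: dict) -> str:
    """Return the Markdown table on success, or the error details on failure."""
    if result.get("ok"):
        # `text` carries the extracted Markdown and is the field meant for display.
        return result.get("text") or ""
    # On failure, surface the error unchanged instead of attempting a fallback.
    return f"Table extraction failed: {result.get('error')}"
```

Returning the error rather than retrying by other means matches the skill's rule to display the error message and stop.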
Error Handling
API key not configured:

    ZHIPU_API_KEY not configured. Get your API key at: https://bigmodel.cn/usercenter/proj-mgmt/apikeys

→ Show the exact error to the user and guide them to configure the key
Authentication failed (401/403): API key invalid or expired → reconfigure
Rate limit (429): Quota exhausted → inform the user to wait
File not found: Local file missing → check the path
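The cases above amount to a small lookup from failure type to user guidance. Here is a sketch of that mapping, assuming the HTTP status code is available to the caller; `guidance_for_status` and `check_local_file` are hypothetical helpers, and how the real script reports status codes is not shown in this document.

```python
import os


def guidance_for_status(status_code: int) -> str:
    """Map an HTTP status code to the guidance listed in this section."""
    if status_code in (401, 403):
        return "Authentication failed: the API key is invalid or expired. Reconfigure ZHIPU_API_KEY."
    if status_code == 429:
        return "Rate limit reached or quota exhausted. Wait before retrying."
    return f"Request failed with HTTP status {status_code}."


def check_local_file(path: str):
    """Return an error message if the local file is missing, otherwise None."""
    if not os.path.isfile(path):
        return f"File not found: {path}. Check the path."
    return None
```

Checking the local path before calling the API avoids spending a request on input that cannot be read.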