
sladro e0827e984e feat: add pose behavior pipeline

2026-04-02 17:44:36 +08:00


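Configuration File Guide

This document describes how to write the configuration for the project's current mainline. It focuses on two-stage shoe detection, safety-shoe color checking, and alarm throttling, plus the parameters that directly affect performance and stability on the RK3588.

1. Recommended Architecture

For the workshop safety-shoe scenario, the currently recommended pipeline is: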

input_rtsp
  -> preprocess(rgb)
  -> ai_yolo(person only)
  -> tracker(person only)
  -> ai_shoe_det(dynamic roi)
  -> logic_gate(person_shoe_check)
  -> logic_gate(ppe_boots_check)
  -> osd
  -> preprocess(nv12)
  -> publish
  -> alarm

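Business goals:

Detect the person first
Then run shoe detection only inside the foot ROI
Care only about whether a detected shoe is close in color to a black safety shoe
"Not wearing shoes" is no longer a primary alarm target

2. Configuration Structure

The project normally uses graph mode: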

{
  "queue": {
    "size": 8,
    "strategy": "drop_oldest"
  },
  "graphs": [
    {
      "name": "person_shoe_two_stage_workshoe_alarm",
      "executor": {
        "batch_size": 2,
        "run_budget": 8
      },
      "nodes": [ ... ],
      "edges": [ ... ]
    }
  ]
}

2.1 queue

{
  "queue": {
    "size": 8,
    "strategy": "drop_oldest"
  }
}

Notes:

size: default queue length
strategy
- drop_oldest: recommended; gives the best real-time behavior
- drop_newest
- block
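
The three strategy names suggest what happens when a queue is full. The sketch below is one plausible reading, not the project's implementation: the class name BoundedFrameQueue and the exact semantics (drop_oldest evicts the stalest item, drop_newest rejects the incoming item, block waits for space) are assumptions inferred from the names alone.

```python
import collections
import threading


class BoundedFrameQueue:
    """Bounded queue illustrating three overflow strategies (assumed semantics)."""

    def __init__(self, size=8, strategy="drop_oldest"):
        if strategy not in ("drop_oldest", "drop_newest", "block"):
            raise ValueError(f"unknown strategy: {strategy}")
        self.size = size
        self.strategy = strategy
        self._items = collections.deque()
        self._cond = threading.Condition()

    def put(self, item):
        """Return True if the item was enqueued, False if it was dropped."""
        with self._cond:
            if len(self._items) >= self.size:
                if self.strategy == "drop_oldest":
                    self._items.popleft()      # discard the stalest frame
                elif self.strategy == "drop_newest":
                    return False               # discard the incoming frame
                else:                          # "block": wait for a free slot
                    while len(self._items) >= self.size:
                        self._cond.wait()
            self._items.append(item)
            self._cond.notify_all()
            return True

    def get(self):
        """Remove and return the oldest item, waiting if the queue is empty."""
        with self._cond:
            while not self._items:
                self._cond.wait()
            item = self._items.popleft()
            self._cond.notify_all()
            return item
```

Under this reading, drop_oldest keeps latency bounded because a slow consumer always sees the most recent frames, which matches the guide's note that it gives the best real-time behavior.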

2.2 executor

{
  "executor": {
    "batch_size": 2,
    "run_budget": 8
  }
}

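Notes:

batch_size: number of frames the executor processes per batch
run_budget: scheduling budget per run

On the RK3588, if the stream stutters every few seconds, reduce run_budget first, then check whether batch_size also needs adjusting.

3. Key Nodes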

3.1 input_rtsp

{
  "id": "in",
  "type": "input_rtsp",
  "role": "source",
  "enable": true,
  "url": "rtsp://10.0.0.49:8554/cam",
  "fps": 30,
  "width": 1920,
  "height": 1080,
  "use_ffmpeg": true,
  "use_mpp": false,
  "force_tcp": true
}

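Current recommendation:

For RTSP sources that stutter at a fixed point in the stream, prefer use_ffmpeg: true with use_mpp: false
The reason is that some source streams stutter at a fixed position on the ffmpeg demux + mpp decode path, even though VLC plays the same stream directly without problems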

3.2 preprocess

{
  "id": "pre_rgb",
  "type": "preprocess",
  "role": "filter",
  "enable": true,
  "dst_w": 1920,
  "dst_h": 1080,
  "dst_format": "rgb",
  "dst_packed": true,
  "resize_mode": "stretch",
  "rga_gate": "person_shoe_two_stage_workshoe_alarm",
  "use_rga": true
}

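Notes:

The upstream pre_rgb node prepares the shared high-resolution RGB main frame
The downstream post node converts back to nv12 for the encoder
use_rga: true is the recommended setting on the RK3588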

3.3 ai_yolo

{
  "id": "person_det",
  "type": "ai_yolo",
  "role": "filter",
  "enable": true,
  "infer_fps": 2,
  "infer_phase_ms": 0,
  "use_rga": true,
  "use_dma_input": true,
  "model_path": "./models/yolov8n-640.rknn",
  "model_version": "v8",
  "model_w": 640,
  "model_h": 640,
  "num_classes": 80,
  "conf": 0.35,
  "nms": 0.45,
  "class_filter": [0],
  "bbox_expand": {
    "enable": true,
    "class_id": 0,
    "left": 0.06,
    "right": 0.06,
    "top": 0.04,
    "bottom": 0.16
  }
}

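Current role:

Serves only as the person-detection front stage
No longer performs shoe detection directly

Key parameters:

Parameter	Purpose	Current recommendation
`infer_fps`	Person-detection rate	`2`
`infer_phase_ms`	Staggers inference against shoe detection	`0`
`class_filter`	Keeps only `person`	`[0]`
`bbox_expand.bottom`	Compensates for cut-off feet	around `0.16`
`use_rga` / `use_dma_input`	Input performance optimization	enable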
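
To see why infer_phase_ms staggers the two detectors, the sketch below lists inference start times under the assumption that a node with infer_fps = F and infer_phase_ms = P runs at P + k * 1000 / F milliseconds. That formula and the helper name infer_times_ms are illustrative assumptions, not taken from the project source; the values 0 and 150 come from the person and shoe detector configs in this guide.

```python
def infer_times_ms(infer_fps, infer_phase_ms, horizon_ms):
    """List assumed inference start times (ms) below horizon_ms."""
    period_ms = 1000.0 / infer_fps
    times = []
    k = 0
    while infer_phase_ms + k * period_ms < horizon_ms:
        times.append(infer_phase_ms + k * period_ms)
        k += 1
    return times


# Person detector: infer_fps=2, infer_phase_ms=0
person_times = infer_times_ms(2, 0, 1500)
# Shoe detector: infer_fps=2, infer_phase_ms=150
shoe_times = infer_times_ms(2, 150, 1500)
```

With both detectors at 2 FPS, the 150 ms offset means the two models would never start on the same tick, so NPU load is spread out rather than peaking twice per second.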

3.4 tracker

{
  "id": "person_trk",
  "type": "tracker",
  "role": "filter",
  "enable": true,
  "mode": "bytetrack_lite",
  "per_class": true,
  "track_classes": [0],
  "high_th": 0.55,
  "low_th": 0.10,
  "iou_th": 0.3,
  "max_age_ms": 900,
  "min_hits": 2
}

Current role:

Stabilizes person boxes
Provides track_id for shoe association and per-person throttling
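
The high_th, low_th, and iou_th parameters match the general ByteTrack idea of associating high-score detections first and low-score detections second, with IoU as the matching measure. Whether bytetrack_lite follows exactly that scheme is an assumption; the helpers split_by_score and iou below are illustrative and not the project's code.

```python
def split_by_score(detections, high_th=0.55, low_th=0.10):
    """Partition detections into high-score and low-score groups.

    Detections below low_th are discarded (assumed behavior).
    """
    high = [d for d in detections if d["score"] >= high_th]
    low = [d for d in detections if low_th <= d["score"] < high_th]
    return high, low


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

In this reading, a track and a detection would be considered a match candidate only when their IoU reaches iou_th (0.3 in the config above).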

3.5 ai_shoe_det

{
  "id": "shoe_det",
  "type": "ai_shoe_det",
  "role": "filter",
  "enable": true,
  "infer_fps": 2,
  "infer_phase_ms": 150,
  "use_rga": true,
  "use_dma_input": false,
  "model_path": "./models/shoe_detector_openimages_ppe_v1.rknn",
  "model_w": 640,
  "model_h": 640,
  "conf": 0.15,
  "nms": 0.45,
  "v8_box_format": "cxcywh",
  "append_detections": true,
  "dynamic_roi": {
    "enable": true,
    "person_class_id": 0,
    "shoe_class_id": 1,
    "debug_roi_class_id": -1,
    "max_rois": 3,
    "min_person_height": 60,
    "x_offset": -0.24,
    "y_offset": 0.64,
    "width_scale": 1.48,
    "height_scale": 0.58
  }
}

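Current role:

Generates a foot ROI dynamically from each person box
Runs the shoe model only inside the ROI
Appends shoe boxes back to frame->det

Key parameters:

Parameter	Purpose	Current recommendation
`conf`	Recall threshold for shoe candidates	`0.15`
`append_detections`	Keeps person boxes and appends shoe boxes	must be `true`
`dynamic_roi.max_rois`	Maximum number of people processed per frame	`3`
`dynamic_roi.min_person_height`	Filters out people who are too far away	`60`
`x_offset/y_offset/width_scale/height_scale`	Shape of the foot ROI	tune for high-mounted cameras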
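
A minimal sketch of how the bbox_expand and dynamic_roi numbers could turn a person box into a foot ROI. It assumes that all offsets and scales are fractions of the person box width or height and that results are clipped to the frame; the function names expand_box and foot_roi are hypothetical, and the real node may compute the ROI differently.

```python
def expand_box(box, left, right, top, bottom, frame_w, frame_h):
    """Grow an (x, y, w, h) box by fractions of its own size, clipped to the frame."""
    x, y, w, h = box
    x1 = max(0.0, x - left * w)
    y1 = max(0.0, y - top * h)
    x2 = min(float(frame_w), x + w + right * w)
    y2 = min(float(frame_h), y + h + bottom * h)
    return (x1, y1, x2 - x1, y2 - y1)


def foot_roi(person_box, x_offset, y_offset, width_scale, height_scale,
             frame_w, frame_h):
    """Derive a foot ROI from a person box (assumed formula), clipped to the frame."""
    x, y, w, h = person_box
    left = x + x_offset * w          # negative x_offset widens to the left
    top = y + y_offset * h           # start partway down the body
    x1 = max(0.0, left)
    y1 = max(0.0, top)
    x2 = min(float(frame_w), left + width_scale * w)
    y2 = min(float(frame_h), top + height_scale * h)
    return (x1, y1, x2 - x1, y2 - y1)


# Values from the configs in this guide, applied to an example person box.
person = (800, 300, 200, 400)
expanded = expand_box(person, 0.06, 0.06, 0.04, 0.16, 1920, 1080)
roi = foot_roi(person, -0.24, 0.64, 1.48, 0.58, 1920, 1080)
```

Under these assumptions, x_offset = -0.24 with width_scale = 1.48 gives an ROI centered on the person and 48% wider, and y_offset + height_scale = 1.22 extends the ROI below the person box, which is consistent with the goal of not cutting off feet.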

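Notes:

shoe_det.conf looks low; this is intentional
The current design relies on low-threshold recall + person-shoe association + color check + alarm throttling converging as a whole
In high-camera, small-shoe scenes, raising it straight to 0.4 or 0.5 usually causes clearly more missed detections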

3.6 logic_gate

Mode 1: person_shoe_check

{
  "id": "shoe_assoc",
  "type": "logic_gate",
  "role": "filter",
  "enable": true,
  "mode": "person_shoe_check",
  "person_shoe_check": {
    "person_class": 0,
    "shoe_class": 1,
    "violation_class": 2,
    "min_person_score": 0.30,
    "min_shoe_score": 0.15,
    "require_person_track_id": true,
    "attach_person_track_to_shoe": true,
    "emit_missing_violation": false,
    "foot_region": {
      "x_offset": -0.24,
      "y_offset": 0.64,
      "width_scale": 1.48,
      "height_scale": 0.58
    }
  }
}

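Current role:

Associates each shoe with the corresponding person
Passes the person's track_id to the shoe
No longer emits a "missing shoe" violation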
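
The association step can be pictured as follows. This is only a sketch under stated assumptions: a shoe is matched to the first person whose foot region contains the shoe box center, and the foot region uses the same fractional formula as the foot_region values in the config. The function names and the dictionary layout of detections are hypothetical.

```python
def foot_region(person_box, x_offset=-0.24, y_offset=0.64,
                width_scale=1.48, height_scale=0.58):
    """Foot region derived from a person box (x, y, w, h); assumed formula."""
    x, y, w, h = person_box
    return (x + x_offset * w, y + y_offset * h, width_scale * w, height_scale * h)


def associate_shoes(persons, shoes, min_person_score=0.30, min_shoe_score=0.15,
                    require_person_track_id=True):
    """Return copies of shoes lying in some person's foot region,
    each tagged with that person's track_id."""
    matched = []
    for shoe in shoes:
        if shoe["score"] < min_shoe_score:
            continue
        sx, sy, sw, sh = shoe["box"]
        cx, cy = sx + sw / 2.0, sy + sh / 2.0    # shoe box center
        for person in persons:
            if person["score"] < min_person_score:
                continue
            if require_person_track_id and person.get("track_id") is None:
                continue
            rx, ry, rw, rh = foot_region(person["box"])
            if rx <= cx <= rx + rw and ry <= cy <= ry + rh:
                matched.append({**shoe, "track_id": person.get("track_id")})
                break                             # first matching person wins
    return matched
```

Shoes that match no person are simply dropped here, which mirrors emit_missing_violation: false in the sense that nothing is reported for a person without a matched shoe.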

Mode 2: ppe_boots_check

{
  "id": "shoe_color",
  "type": "logic_gate",
  "role": "filter",
  "enable": true,
  "mode": "ppe_boots_check",
  "anchor_class": 0,
  "boots_class": 1,
  "violation_class": 2,
  "color_check": {
    "enable": true,
    "method": "brightness",
    "dark_threshold": 90,
    "roi_expand": 1.0
  }
}

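Current role:

Runs the color check only on shoes that were already detected
Dark shoes: treated as compliant
Non-dark shoes: appends a violation box with cls=2 for OSD and alarm to use

Recommended color parameters:

Parameter	Meaning	Current recommendation
`method`	Color-check method	`brightness`
`dark_threshold`	Dark threshold	`90`
`roi_expand`	Expansion of the color-analysis region	`1.0`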
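
A brightness-based check could look like the sketch below. It assumes that "brightness" means the mean luma of the RGB pixels inside the shoe box, computed with the common BT.601 weights, and that a shoe counts as dark when the mean is below dark_threshold on a 0-255 scale. The project may use a different measure; roi_expand is not modeled, and the function names are hypothetical.

```python
def mean_brightness(rgb_rows, box):
    """Mean luma of the pixels inside box = (x, y, w, h).

    rgb_rows is a list of rows, each row a list of (r, g, b) tuples.
    """
    x, y, w, h = box
    total = 0.0
    count = 0
    for row in rgb_rows[y:y + h]:
        for r, g, b in row[x:x + w]:
            total += 0.299 * r + 0.587 * g + 0.114 * b   # BT.601 luma weights
            count += 1
    return total / count if count else 0.0


def is_dark_shoe(rgb_rows, box, dark_threshold=90):
    """True when the shoe region is dark enough to count as compliant."""
    return mean_brightness(rgb_rows, box) < dark_threshold
```

With this reading, raising dark_threshold classifies more mid-gray shoes as dark, which is why the tuning table associates a larger value with more dark-gray shoes being judged compliant.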

3.7 osd

{
  "id": "osd",
  "type": "osd",
  "role": "filter",
  "enable": true,
  "draw_bbox": true,
  "draw_text": false,
  "use_rga_bbox": false,
  "labels": ["person", "shoe", "non_black_shoe"]
}

Displayed labels:

person
shoe
non_black_shoe

3.8 publish

{
  "id": "pub",
  "type": "publish",
  "role": "filter",
  "enable": true,
  "codec": "h264",
  "fps": 30,
  "bitrate_kbps": 2000,
  "mpp_output_timeout_ms": 50,
  "mpp_packet_wait_ms": 10,
  "use_mpp": true,
  "outputs": [
    {"proto": "rtsp_server", "port": 8555, "path": "/live/cam1"}
  ]
}

Current recommendation:

Input side: prefer FFmpeg CPU decode
Output side: keep using MPP encoding

3.9 alarm

{
  "id": "alarm",
  "type": "alarm",
  "role": "sink",
  "enable": true,
  "eval_fps": 2,
  "labels": ["person", "shoe", "non_black_shoe"],
  "rules": [
    {
      "name": "non_compliant_workshoe",
      "class_ids": [2],
      "roi": {"x": 0.0, "y": 0.0, "w": 1.0, "h": 1.0},
      "min_score": 0.30,
      "require_track_id": false,
      "min_duration_ms": 800,
      "min_hits": 2,
      "hit_window_ms": 2000,
      "cooldown_ms": 15000,
      "per_track_cooldown_ms": 0
    }
  ],
  "actions": {
    "log": {
      "enable": true,
      "level": "info",
      "include_detections": true,
      "min_interval_ms": 2000
    }
  }
}

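Current alarm policy:

Alarms only on cls=2 non_black_shoe
Fires only after the blue box has appeared at least 2 times within the 2000 ms hit window and has persisted for about 800 ms
After firing, enters a 15 s cooldown to avoid flooding the log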
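
The throttling policy above can be sketched as a small state machine. The class name AlarmRule and the exact interpretation are assumptions: hits are counted inside a sliding hit_window_ms window, duration is measured from the oldest to the newest hit in that window, and no alarm fires during cooldown_ms after the previous one. The real alarm node may define these quantities differently.

```python
from collections import deque


class AlarmRule:
    """Sliding-window alarm throttle (assumed semantics)."""

    def __init__(self, min_hits=2, min_duration_ms=800,
                 hit_window_ms=2000, cooldown_ms=15000):
        self.min_hits = min_hits
        self.min_duration_ms = min_duration_ms
        self.hit_window_ms = hit_window_ms
        self.cooldown_ms = cooldown_ms
        self._hits = deque()          # timestamps of recent violation hits
        self._last_fire_ms = None

    def update(self, now_ms, violation_present):
        """Feed one evaluation tick; return True when an alarm should fire."""
        # Forget hits that fell out of the sliding window.
        while self._hits and now_ms - self._hits[0] > self.hit_window_ms:
            self._hits.popleft()
        if not violation_present:
            return False
        self._hits.append(now_ms)
        if len(self._hits) < self.min_hits:
            return False
        if self._hits[-1] - self._hits[0] < self.min_duration_ms:
            return False
        if (self._last_fire_ms is not None
                and now_ms - self._last_fire_ms < self.cooldown_ms):
            return False
        self._last_fire_ms = now_ms
        return True
```

With eval_fps = 2 the rule is evaluated every 500 ms, so under these assumptions a continuous violation would fire on the third tick (1000 ms after the first hit) and then stay silent for the cooldown period.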

4. Connections

{
  "edges": [
    ["in", "pre_rgb"],
    ["pre_rgb", "person_det"],
    ["person_det", "person_trk"],
    ["person_trk", "shoe_det"],
    ["shoe_det", "shoe_assoc"],
    ["shoe_assoc", "shoe_color"],
    ["shoe_color", "osd"],
    ["osd", "post"],
    ["post", "pub"],
    ["pub", "alarm"]
  ]
}

Notes:

publish -> alarm is a valid link
alarm keeps reading the frame->det that upstream nodes preserved
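
Because edges refer to nodes by id, a typo or a missing node definition (for example the post node used in the edges above) is easy to overlook. The helper below is a hypothetical pre-flight check, not part of the project: it only assumes the graph layout shown in this guide, with nodes carrying an id and edges written as [src, dst] pairs.

```python
def validate_graph(graph):
    """Return a list of human-readable problems found in one graph entry."""
    problems = []
    seen = set()
    for node in graph.get("nodes", []):
        node_id = node.get("id")
        if node_id in seen:
            problems.append(f"duplicate node id: {node_id}")
        seen.add(node_id)
    for edge in graph.get("edges", []):
        if len(edge) != 2:
            problems.append(f"edge must be [src, dst]: {edge}")
            continue
        for endpoint in edge:
            if endpoint not in seen:
                problems.append(f"edge {edge} references unknown node: {endpoint}")
    return problems
```

Running a check like this on each entry of graphs before deployment catches broken links without starting the pipeline on the board.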

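5. Recommended Configs

Config file	Description
`configs/person_shoe_two_stage_workshoe_alarm_v8s_shoe640.json`	Mainline single-stream safety-shoe color alarm config
`configs/person_shoe_two_stage_workshoe_alarm_v8s_shoe640_strict.json`	Strict variant of the single-stream safety-shoe color alarm
`configs/full_pipeline_1080p.json`	Full pipeline: face detection/recognition + safety-shoe color alarm

6. Tuning Order

Tune in the order below; do not change several things at once:

ai_yolo.conf and bbox_expand. Goal: first make person boxes stable, with room left below the feet
ai_shoe_det.conf and dynamic_roi. Goal: first get shoe boxes to appear
logic_gate.color_check.dark_threshold. Goal: separate black safety shoes from slippers and light-colored shoes
alarm.min_hits / min_duration_ms / cooldown_ms. Goal: narrow "alarms at all" down to "alarms only once the detection is stable"

6.1 Key Parameters for Field Engineers

During deployment, look at the parameters below first; changing other items at the start is not recommended.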

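Node	Parameter	Intended effect	Typical result when increased	Typical result when decreased
`input_rtsp`	`use_ffmpeg / use_mpp`	Fixes stutter at a fixed point in the stream	`use_ffmpeg=true` is generally more stable	`use_mpp=true` may stutter on some sources
`person_det`	`infer_fps`	Controls person-detection load	Detections update sooner, but NPU load rises	Saves resources, but person boxes update more slowly
`person_det`	`bbox_expand.bottom`	Leaves room for the foot ROI	Feet are less likely to be cut off	Person box is tighter; the shoe ROI is more likely to be cropped off-target
`shoe_det`	`conf`	Balances shoe-box recall against false positives	Fewer false positives, but more missed shoes	More shoe boxes, but also more false positives
`shoe_det.dynamic_roi`	`max_rois`	Caps how many people are processed per frame	Covers more people, but FPS drops	Saves resources, but some shoes are missed when many people are present
`shoe_det.dynamic_roi`	`min_person_height`	Filters out small, distant people	Distant small people are excluded from shoe detection	More distant people enter shoe detection
`shoe_det.dynamic_roi`	`max_box_area_ratio`	Filters out oversized false boxes that nearly fill the foot region	ROI-sized false positives are more likely to be kept	Fewer large blue boxes
`shoe_assoc`	`min_shoe_score`	Controls whether low-score shoe boxes take part in person-shoe association	Fewer low-score false boxes	Weak shoe boxes enter later stages more easily
`shoe_color`	`dark_threshold`	Sets the black / non-black boundary	More dark-gray shoes are judged compliant	More shoes are judged non-black
`alarm`	`min_duration_ms`	Controls how long a detection must be stable before alarming	More stable, but slower	More responsive, but more prone to flicker alarms
`alarm`	`cooldown_ms`	Controls the interval between two alarms	Fewer repeated alarms	The same event is reported more often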

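6.2 Recommended Tuning Actions

When problems appear on site, handle them as follows first:

Too many missed shoes:
- First lower shoe_det.conf
- Then lower shoe_assoc.min_shoe_score
- If necessary, lower dynamic_roi.min_person_height
Too many false blue boxes:
- First raise shoe_det.conf
- Then raise shoe_assoc.min_shoe_score
- If false boxes as large as the whole foot region appear, also lower max_box_area_ratio
Black safety shoes falsely reported:
- Raise shoe_color.dark_threshold
Non-black shoes do not trigger an alarm:
- Lower shoe_color.dark_threshold
- Check alarm.min_score
- Check alarm.min_duration_ms / min_hits
Full-pipeline FPS too low:
- First lower face_det.infer_fps
- Then lower face_recog.infer_fps
- Then consider lowering dynamic_roi.max_rois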

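7. FAQ

Q1: Why is the shoe-detection threshold only 0.15?

Because the current scene has a high-mounted camera and small shoe targets, a high threshold alone would drop many shoe boxes up front. The system relies on multi-stage filtering rather than a single high threshold.

Q2: Why is there a blue box but no alarm?

Check first:

Whether the running config is person_shoe_two_stage_workshoe_alarm_v8s_shoe640.json
Whether alarm.rules[].class_ids contains 2
Whether min_score / min_hits / min_duration_ms are too strict

Q3: Why does RTSP stutter at a fixed point?

If VLC plays the source stream directly without stuttering but the project stutters, switch first to: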

"use_ffmpeg": true,
"use_mpp": false

This usually indicates an input-decode compatibility problem rather than a problem in the AI pipeline.

Version: v2.0
Last updated: 2026-03-15

Pose-Enabled Behavior Configuration

This repository now supports two behavior deployment modes.

Mode 1: Baseline Behavior Graph

Use this when you only need tracked detections and rule-based region behavior:

input_rtsp -> preprocess -> ai_yolo -> tracker -> region_event -> action_recog -> event_fusion

Reference config:

configs/sample_region_behavior_intrusion.json

Characteristics:

No pose model dependency
Lowest compute cost
fall and fight rely on bbox-only rules

Mode 2: Pose-Enabled Behavior Graph

Use this when you need pose-aware fall and fight:

input_rtsp -> preprocess -> ai_yolo -> tracker -> ai_pose -> pose_assoc -> region_event -> action_recog -> event_fusion

Reference config:

configs/sample_region_behavior_full.json

Characteristics:

ai_pose writes Frame.pose
pose_assoc assigns PoseItem.track_id
action_recog can fuse bbox and pose signals per event rule

Optional Integration Rules

ai_pose is optional. Do not add it to graphs that only need intrusion or climb.
pose_assoc should be placed after ai_pose and after tracked detections are already available on the frame.
action_recog must remain runnable when Frame.pose is missing or empty.

Degradation Semantics

Without ai_pose, the graph must still behave correctly with bbox-only behavior logic.
With ai_pose but without pose_assoc, pose.track_id is not guaranteed and downstream logic should treat it as optional compatibility mode.
If pose inference returns no items for a frame, the rest of the graph must keep using Frame.det and track_id.

Structured `action_recog` Rules

action_recog supports a structured event format with bbox, pose, and fusion sections.

Example:

{
  "id": "action_evt",
  "type": "action_recog",
  "events": [
    {
      "type": "fall",
      "window_ms": 1500,
      "activate_duration_ms": 300,
      "bbox": {
        "enabled": true,
        "min_drop_pixels": 120,
        "min_aspect_ratio_delta": 0.35
      },
      "pose": {
        "enabled": true,
        "min_torso_drop_pixels": 120,
        "max_upright_ratio": 0.60
      },
      "fusion": {
        "match_mode": "any"
      }
    },
    {
      "type": "fight",
      "window_ms": 1200,
      "activate_duration_ms": 200,
      "bbox": {
        "enabled": true,
        "proximity_pixels": 220,
        "min_motion_pixels": 90
      },
      "pose": {
        "enabled": true,
        "min_wrist_motion_pixels": 120,
        "max_wrist_distance_pixels": 120
      },
      "fusion": {
        "match_mode": "any"
      }
    }
  ]
}

Rules:

bbox.enabled=false disables bbox signal evaluation for that event.
pose.enabled=false disables pose signal evaluation for that event.
fusion.match_mode="any" means either bbox or pose may activate the event.
fusion.match_mode="all" means bbox and pose must both activate the event.
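
The four rules above can be condensed into one small function. This is a sketch rather than the project's code: the name fuse_signals is hypothetical, and it assumes that a disabled signal is simply left out of the decision, so that "all" with only one enabled signal depends on that signal alone. How a missing Frame.pose is treated under "all" is not specified here and is not modeled.

```python
def fuse_signals(bbox_hit, pose_hit, bbox_enabled=True, pose_enabled=True,
                 match_mode="any"):
    """Combine bbox and pose activations for one event rule."""
    signals = []
    if bbox_enabled:
        signals.append(bool(bbox_hit))
    if pose_enabled:
        signals.append(bool(pose_hit))
    if not signals:
        return False                  # nothing enabled, nothing can activate
    if match_mode == "any":
        return any(signals)
    if match_mode == "all":
        return all(signals)
    raise ValueError(f"unknown match_mode: {match_mode}")
```

For example, a fall rule with match_mode "any" would activate on a bbox drop alone, while the same rule with "all" would also need the pose signal to agree.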

Backward compatibility:

Old flat keys such as min_drop_pixels and pose_min_torso_drop_pixels are still accepted.
New configs should prefer the structured form above.
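
When migrating old configs, flat keys can be folded into the structured form with a small helper. The sketch below maps only the two flat keys named above; the mapping table, the function name normalize_event, and the rule that an existing structured value takes precedence over a flat one are all assumptions for illustration. Other flat keys are deliberately not guessed.

```python
# Only the two flat keys mentioned in this guide; extend after checking the source.
FLAT_KEY_MAP = {
    "min_drop_pixels": ("bbox", "min_drop_pixels"),
    "pose_min_torso_drop_pixels": ("pose", "min_torso_drop_pixels"),
}


def normalize_event(event):
    """Return a copy of an event rule with known flat keys moved into sections."""
    out = {k: v for k, v in event.items() if k not in FLAT_KEY_MAP}
    for flat_key, (section, key) in FLAT_KEY_MAP.items():
        if flat_key in event:
            merged = dict(out.get(section, {}))
            merged.setdefault(key, event[flat_key])   # structured value wins
            out[section] = merged
    return out
```

The input dictionary is left untouched, so the helper can be run over a loaded config to preview the structured result before rewriting the file.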
