From ec6818b2068084be96714c31eab0ffb36d2d8c81 Mon Sep 17 00:00:00 2001
From: tian <11429339@qq.com>
Date: Wed, 15 Apr 2026 12:30:11 +0800
Subject: [PATCH] Add track-aware face alarm design doc

---
 ...FaceRecognition_TrackAware_Alarm_Design.md | 480 ++++++++++++++++++
 1 file changed, 480 insertions(+)
 create mode 100644 docs/design/FaceRecognition_TrackAware_Alarm_Design.md

diff --git a/docs/design/FaceRecognition_TrackAware_Alarm_Design.md b/docs/design/FaceRecognition_TrackAware_Alarm_Design.md
new file mode 100644
index 0000000..74a7c84
--- /dev/null
+++ b/docs/design/FaceRecognition_TrackAware_Alarm_Design.md
@@ -0,0 +1,480 @@
+# Face Recognition Track-Aware Alarm Design
+
+## 1. Background
+
+The current face recognition alarm path is:
+
+`face_det -> face_recog -> alarm.face_rules -> actions`
+
+This path is already able to:
+
+- recognize known persons from the gallery
+- classify single-frame results as `known` or `unknown`
+- generate `known_person` and `unknown_face` alarms
+- upload snapshots and clips through the existing alarm action chain
+
+However, current alarm behavior is still dominated by single-frame face recognition results. In workshop testing, the same known person can produce:
+
+- `known_person` alarms on close, high-quality frames
+- `unknown_face` alarms on far, small, or low-quality frames
+
+This is not acceptable for the target workshop scenario. In this scenario:
+
+- low-quality face observations can be ignored
+- alarm accuracy is more important than alarm recall
+- alarm frequency must stay low
+- known-person alarms are used as attendance punch events
+- short leave-and-return behavior should not generate repeated punch alarms
+
+The repository already has a person tracker and shoe-related logic that rely on `track_id`. This design reuses that capability instead of introducing a second face-specific tracker.
+
+## 2. Goals
+
+### 2.1 Functional Goals
+
+- Reuse existing person `track_id` for face identity aggregation.
+- Stop treating `unknown` as the direct opposite of `known`.
+- Ignore low-quality face observations instead of forcing them into `unknown_face`.
+- Generate face alarms per tracked person instead of per single frame.
+- Prevent duplicate alarms while the same person remains in the scene.
+- Prevent immediate repeated alarms after a short leave-and-return event.
+
+### 2.2 Non-Goals
+
+- Do not replace the existing person tracker implementation.
+- Do not introduce a new standalone face tracker.
+- Do not change face embedding extraction or gallery search behavior.
+- Do not merge detection, recognition, tracking, and alarming into one plugin.
+
+## 3. Current State Review
+
+### 3.1 What Already Exists
+
+- `plugins/tracker/tracker_node.cpp`
+  - assigns stable `track_id` values to person detections in `frame->det`
+- `plugins/logic_gate/logic_gate_node.cpp`
+  - already consumes person `track_id` for shoe-related reasoning
+- `plugins/ai_face_det/*` and `plugins/ai_face_recog/ai_face_recog_node.cpp`
+  - already produce face detection and recognition results
+- `plugins/alarm/alarm_node.cpp`
+  - already supports face-specific rules and the existing alarm action chain
+
+### 3.2 Current Gap
+
+The current face path does not actually reuse person tracking:
+
+- `FaceDetItem` has a `track_id` field, but face detectors currently fill it with `-1`
+- `FaceRecogItem` only carries gallery identity fields such as `best_person_id`
+- face alarm vote keys currently use:
+  - `best_person_id` for known-person rules
+  - constant `unknown` for unknown-person rules
+
+This means current face alarm behavior cannot answer:
+
+- whether two frames belong to the same physical person
+- whether a temporary low-score frame belongs to a person already recognized moments earlier
+- whether an alarm should be suppressed because the person is still on screen
+
+## 4. Design Summary
+
+The design adds a track-aware identity aggregation layer on top of existing person tracking.
+
+The revised behavior is:
+
+1. Person detections continue to receive `track_id` from the existing tracker.
+2. Each recognized face is associated with one tracked person in the same frame.
+3. The associated `person_track_id` is stored on face recognition results.
+4. Alarm logic aggregates recognition evidence per `person_track_id`.
+5. The alarm decision becomes three-state:
+   - `known`
+   - `unknown`
+   - `uncertain`
+6. Only stable `known` or stable `unknown` states may trigger alarms.
+7. `uncertain` observations are ignored.
+
+This preserves the existing DAG architecture and keeps responsibilities separated:
+
+- tracker tracks persons
+- face plugins produce recognition evidence
+- alarm node owns business-facing identity confirmation and deduplication
+
+## 5. Data Flow
+
+The intended runtime path becomes:
+
+`person_det -> tracker -> face_det -> face_recog(face-person association) -> alarm(track-aware face rules) -> actions`
+
+### 5.1 Face-to-Person Association
+
+For each recognized face in a frame, associate it to one person detection from `frame->det` that already has a valid `track_id`.
+
+Recommended matching order:
+
+1. Prefer person boxes that contain the face center point.
+2. If multiple person boxes qualify, choose the one with the highest overlap quality.
+3. If no containing person box exists, optionally fall back to IoU / overlap ratio matching.
+4. If no reliable match exists, leave the face unassociated.
+
+This keeps the matching logic simple and aligned with the workshop camera scenario, where one face should usually lie inside one person box.
+
+### 5.2 Result Enrichment
+
+Extend face recognition results with the associated person track metadata.
+
+Recommended additions to `FaceRecogItem`:
+
+- `int person_track_id = -1`
+- optional future field: `float person_match_score`
+
+This lets downstream plugins consume face identity evidence with person continuity information, without coupling them to raw person detections.
+
+## 6. Identity State Model
+
+Per `person_track_id`, maintain a short-lived identity aggregation state in the alarm node.
+
+Recommended tracked fields:
+
+- `track_id`
+- `first_seen_ms`
+- `last_seen_ms`
+- `last_quality_pass_ms`
+- `best_known_person_id`
+- `best_known_name`
+- `best_sim_peak`
+- `best_known_hit_count`
+- `quality_pass_count`
+- `unknown_candidate_count`
+- `reported_known`
+- `reported_unknown`
+- `last_report_ms`
+
+The state exists only while the track is active, plus a short retention window needed for re-entry suppression.
+
+## 7. Three-State Decision Model
+
+### 7.1 States
+
+Each person track may be in one of three states:
+
+- `uncertain`
+  - insufficient quality or insufficient evidence
+- `known`
+  - stable evidence for one known gallery identity
+- `unknown`
+  - stable evidence that the tracked person is not matching known identities
+
+### 7.2 Why `uncertain` Is Required
+
+`uncertain` is the key change for this scenario.
+
+Examples that should remain `uncertain`:
+
+- face too small
+- poor alignment
+- temporary blur
+- far-distance observations
+- unstable similarity fluctuations
+- too few valid observations for the current track
+
+These observations should not generate any identity alarm.
+
+## 8. Quality Gating
+
+Only quality-qualified face observations should participate in identity aggregation.
+
+Recommended quality checks:
+
+- associated `person_track_id >= 0`
+- face area ratio above configured minimum
+- face aspect ratio within configured bounds
+- landmarks available when alignment is required
+- optional minimum bbox size in pixels
+- optional minimum confidence from face detection
+
+If a frame fails quality gating:
+
+- do not count it toward `unknown`
+- do not count it toward `known`
+- keep the track state as `uncertain`
+
+This directly matches the workshop requirement: low-quality data can be ignored.
+
+## 9. Known-Person Confirmation
+
+Known-person confirmation should require repeated evidence for the same gallery identity on the same tracked person.
+
+Recommended conditions:
+
+- face passed quality gating
+- `best_person_id >= 0`
+- `best_sim >= known_accept`
+- `(best_sim - second_sim) >= known_margin`
+- same `best_person_id` observed at least `known_min_hits` times inside `known_hit_window_ms`
+
+Optional improvement:
+
+- allow a peak-sim shortcut when `best_sim` is very high and consistent
+
+Once the track reaches stable `known`:
+
+- trigger `known_person`
+- mark the track as `reported_known`
+- suppress all later known alarms for the same active track
+
+## 10. Unknown-Person Confirmation
+
+Unknown-person confirmation must be stricter than known-person confirmation.
+
+Unknown should not mean:
+
+- "this frame is not known"
+
+Unknown should mean:
+
+- "this tracked person has been observed long enough, at sufficient quality, and still cannot be confirmed as any known person"
+
+Recommended conditions:
+
+- face passed quality gating
+- valid `person_track_id`
+- track age exceeds `unknown_min_track_age_ms`
+- quality-qualified observations reach `unknown_min_quality_hits`
+- no stable known identity has been confirmed for this track
+- recognition remains below known confirmation thresholds during the window
+
+Optional additional conditions:
+
+- require the top candidate identity to remain inconsistent
+- require multiple low-confidence or ambiguous frames before final unknown confirmation
+
+Once the track reaches stable `unknown`:
+
+- trigger `unknown_face`
+- mark the track as `reported_unknown`
+- suppress all later unknown alarms for the same active track
+
+## 11. Alarm Deduplication and Re-Entry Control
+
+### 11.1 Active-Track Deduplication
+
+Within one active `person_track_id`:
+
+- `known_person` may trigger at most once
+- `unknown_face` may trigger at most once
+
+### 11.2 Re-Entry Suppression
+
+The workshop scenario treats known-person alarms as punch events. Therefore:
+
+- if the same known employee remains on screen, do not re-alarm
+- if the same known employee briefly leaves and re-enters, do not re-alarm immediately
+
+Recommended suppression keys:
+
+- known person: keyed by `gallery person_id`
+- unknown person: keyed by recent track history or a future stronger fingerprint
+
+Recommended timers:
+
+- `known_reentry_cooldown_ms`
+- `unknown_reentry_cooldown_ms`
+
+Known-person suppression should be relatively long, because attendance punching should be sparse.
+
+## 12. Configuration Design
+
+Introduce a dedicated face track aggregation config section under the alarm face-rule path or a sibling alarm section.
+
+Recommended fields:
+
+```json
+{
+  "face_track_aggregation": {
+    "enable": true,
+    "associate_with_person_track": true,
+    "require_person_track": true,
+    "person_match_mode": "face_center_in_person",
+    "person_match_min_iou": 0.05,
+    "quality": {
+      "min_face_area_ratio": 0.001,
+      "min_face_width": 32,
+      "min_face_height": 32,
+      "require_landmarks": true
+    },
+    "known": {
+      "accept": 0.45,
+      "margin": 0.05,
+      "min_hits": 3,
+      "hit_window_ms": 3000,
+      "reentry_cooldown_ms": 300000
+    },
+    "unknown": {
+      "min_track_age_ms": 2000,
+      "min_quality_hits": 4,
+      "reentry_cooldown_ms": 300000
+    }
+  }
+}
+```
+
+Notes:
+
+- exact placement may be adjusted to match current config conventions
+- existing face rule fields should remain supported where practical
+- migration should minimize breaking existing configs
+
+## 13. File-Level Changes
+
+### 13.1 Data Model
+
+Modify:
+
+- `include/face/face_result.h`
+
+Changes:
+
+- add `person_track_id` to `FaceRecogItem`
+- optionally add future-friendly metadata for association confidence
+
+### 13.2 Face Recognition Node
+
+Modify:
+
+- `plugins/ai_face_recog/ai_face_recog_node.cpp`
+
+Changes:
+
+- associate each recognized face to a tracked person from `frame->det`
+- write `person_track_id` into `FaceRecogItem`
+- extend debug log output to include `person_track_id`
+
+### 13.3 Alarm Node
+
+Modify:
+
+- `plugins/alarm/alarm_node.cpp`
+
+Changes:
+
+- add track-aware face identity aggregation state
+- replace per-frame unknown alarm behavior with track-based unknown confirmation
+- change face vote key logic to prefer `person_track_id`
+- add deduplication and re-entry suppression based on track-aware identity state
+
+### 13.4 Tests
+
+Modify or add:
+
+- platform-independent unit tests for association logic
+- platform-independent unit tests for track-aware known confirmation
+- platform-independent unit tests for track-aware unknown suppression
+- platform-independent unit tests for re-entry cooldown behavior
+
+## 14. Compatibility and Migration
+
+The design should be introduced in a backward-aware way.
+
+Recommended compatibility strategy:
+
+1. keep existing face recognition output fields unchanged
+2. add new fields instead of renaming old ones
+3. keep current config behavior available when track-aware aggregation is disabled
+4. allow current face rules to coexist with the new aggregation mode during rollout
+
+This enables:
+
+- safer staged rollout
+- easier comparison between old and new behavior
+- simpler troubleshooting on RK3588
+
+## 15. Validation Strategy
+
+### 15.1 Local Code-Level Validation
+
+Local validation should focus on platform-independent logic only:
+
+- face-to-person association behavior
+- aggregation state transitions
+- known confirmation window logic
+- unknown suppression logic
+- re-entry cooldown behavior
+- config parsing and backward compatibility
+
+### 15.2 RK3588 Device-Side Validation
+
+Final validation must be completed on RK3588:
+
+- known person from far to near
+  - expect delayed but stable known-person alarm
+  - no unknown false alarm
+- known person brief leave and quick re-entry
+  - expect no repeated punch alarm
+- known person long leave and re-entry after cooldown
+  - expect one new punch alarm
+- truly unknown person with adequate face quality
+  - expect one unknown alarm after evidence accumulation
+- low-quality unknown face
+  - expect no alarm
+- multiple persons in frame
+  - verify face-person association uses the correct person track
+
+## 16. Risks and Mitigations
+
+### Risk 1: Incorrect face-person association
+
+Impact:
+
+- identity evidence may be attached to the wrong tracked person
+
+Mitigation:
+
+- start with simple center-in-box matching
+- log `person_track_id` and association decisions in debug mode
+- validate on multi-person RK3588 scenes
+
+### Risk 2: Unknown confirmation becomes too conservative
+
+Impact:
+
+- unknown alarms may be delayed or reduced
+
+Mitigation:
+
+- make unknown thresholds configurable
+- prefer under-reporting over false workshop alerts in early rollout
+
+### Risk 3: Re-entry suppression too aggressive
+
+Impact:
+
+- valid repeated attendance events may be skipped
+
+Mitigation:
+
+- make re-entry cooldown configurable
+- document business interpretation clearly as punch-style attendance
+
+## 17. Rollout Recommendation
+
+Recommended rollout order:
+
+1. add `person_track_id` propagation and debug logs
+2. add track-aware known confirmation
+3. add conservative track-aware unknown confirmation
+4. add re-entry suppression tuning
+5. validate behavior on RK3588 with known and unknown workshop videos
+
+This staged rollout reduces risk and allows behavior comparison at each step.
+
+## 18. Expected Outcome
+
+After this design is implemented, the system should behave as follows in the workshop face-recognition scenario:
+
+- poor-quality face observations are ignored
+- the same known employee is confirmed from accumulated evidence, not single-frame luck
+- transient low-score frames do not become stranger alarms
+- a person who stays in scene triggers at most one identity alarm
+- short leave-and-return behavior does not trigger repeated punch alarms
+- stranger alarms become rarer but more trustworthy
+
+This is the intended trade-off for the workshop deployment: lower alarm frequency and higher alarm precision.
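
The known-person confirmation rule from Section 9 and the active-track deduplication from Section 11.1 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the alarm node's actual code: `FaceObs`, `KnownCfg`, `TrackState`, and `KnownConfirmer` are hypothetical names, the `quality_ok` flag stands in for the Section 8 quality gate, and resetting the hit window whenever the top candidate changes is one possible (stricter) reading of "same `best_person_id` observed at least `known_min_hits` times". Default thresholds mirror the example values in Section 12.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <unordered_map>

// Hypothetical per-face observation. Field names follow the design doc,
// but the real FaceRecogItem layout is an assumption here.
struct FaceObs {
    int person_track_id = -1;   // -1 means the face is not associated to a person
    int best_person_id = -1;    // gallery identity, -1 if none
    float best_sim = 0.0f;
    float second_sim = 0.0f;
    bool quality_ok = false;    // stand-in for the Section 8 quality gate
    std::int64_t ts_ms = 0;
};

// Hypothetical mirror of the "known" config block in Section 12.
struct KnownCfg {
    float accept = 0.45f;
    float margin = 0.05f;
    int min_hits = 3;
    std::int64_t hit_window_ms = 3000;
};

// Hypothetical per-track aggregation state (a small subset of Section 6).
struct TrackState {
    int candidate_id = -1;
    std::deque<std::int64_t> hit_ts;  // timestamps of qualifying hits for candidate_id
    bool reported_known = false;
};

class KnownConfirmer {
public:
    explicit KnownConfirmer(KnownCfg cfg) : cfg_(cfg) { assert(cfg_.min_hits > 0); }

    // Returns the gallery person id when a known_person alarm should fire for
    // this observation, or -1 when the observation must not raise an alarm.
    int Update(const FaceObs& o) {
        // Unassociated or low-quality faces stay "uncertain" and are ignored.
        if (o.person_track_id < 0 || !o.quality_ok) return -1;
        if (o.best_person_id < 0) return -1;
        if (o.best_sim < cfg_.accept) return -1;
        if (o.best_sim - o.second_sim < cfg_.margin) return -1;

        TrackState& st = tracks_[o.person_track_id];
        if (st.reported_known) return -1;  // at most one known alarm per active track

        // Assumption: a different top candidate restarts evidence accumulation.
        if (st.candidate_id != o.best_person_id) {
            st.candidate_id = o.best_person_id;
            st.hit_ts.clear();
        }
        st.hit_ts.push_back(o.ts_ms);
        // Drop hits that fell out of the sliding confirmation window.
        while (!st.hit_ts.empty() && o.ts_ms - st.hit_ts.front() > cfg_.hit_window_ms) {
            st.hit_ts.pop_front();
        }
        if (static_cast<int>(st.hit_ts.size()) < cfg_.min_hits) return -1;

        st.reported_known = true;
        return st.candidate_id;
    }

private:
    KnownCfg cfg_;
    std::unordered_map<int, TrackState> tracks_;
};
```

With the default config, three quality-passing observations of the same gallery identity on one track within 3000 ms make the third `Update` call return that identity; every later call for the same track returns `-1`. Re-entry cooldown (Section 11.2) and track expiry are deliberately left out of this sketch.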