Add track-aware face alarm design doc
Commit ec6818b206 (parent 9700bb85bb)

docs/design/FaceRecognition_TrackAware_Alarm_Design.md (new file, 480 lines added)

# Face Recognition Track-Aware Alarm Design

## 1. Background

The current face recognition alarm path is:

`face_det -> face_recog -> alarm.face_rules -> actions`

This path is already able to:

- recognize known persons from the gallery
- classify single-frame results as `known` or `unknown`
- generate `known_person` and `unknown_face` alarms
- upload snapshots and clips through the existing alarm action chain

However, current alarm behavior is still dominated by single-frame face recognition results. In workshop testing, the same known person can produce:

- `known_person` alarms on close, high-quality frames
- `unknown_face` alarms on far, small, or low-quality frames

This is not acceptable for the target workshop scenario. In this scenario:

- low-quality face observations can be ignored
- alarm accuracy is more important than alarm recall
- alarm frequency must stay low
- known-person alarms are used as attendance punch events
- short leave-and-return behavior should not generate repeated punch alarms

The repository already has a person tracker and shoe-related logic that rely on `track_id`. This design reuses that capability instead of introducing a second face-specific tracker.

## 2. Goals

### 2.1 Functional Goals

- Reuse existing person `track_id` for face identity aggregation.
- Stop treating `unknown` as the direct opposite of `known`.
- Ignore low-quality face observations instead of forcing them into `unknown_face`.
- Generate face alarms per tracked person instead of per single frame.
- Prevent duplicate alarms while the same person remains in the scene.
- Prevent immediate repeated alarms after a short leave-and-return event.

### 2.2 Non-Goals

- Do not replace the existing person tracker implementation.
- Do not introduce a new standalone face tracker.
- Do not change face embedding extraction or gallery search behavior.
- Do not merge detection, recognition, tracking, and alarming into one plugin.

## 3. Current State Review

### 3.1 What Already Exists

- `plugins/tracker/tracker_node.cpp`
  - assigns stable `track_id` values to person detections in `frame->det`
- `plugins/logic_gate/logic_gate_node.cpp`
  - already consumes person `track_id` for shoe-related reasoning
- `plugins/ai_face_det/*` and `plugins/ai_face_recog/ai_face_recog_node.cpp`
  - already produce face detection and recognition results
- `plugins/alarm/alarm_node.cpp`
  - already supports face-specific rules and the existing alarm action chain

### 3.2 Current Gap

The current face path does not actually reuse person tracking:

- `FaceDetItem` has a `track_id` field, but face detectors currently fill it with `-1`
- `FaceRecogItem` only carries gallery identity fields such as `best_person_id`
- face alarm vote keys currently use:
  - `best_person_id` for known-person rules
  - constant `unknown` for unknown-person rules

This means current face alarm behavior cannot answer:

- whether two frames belong to the same physical person
- whether a temporary low-score frame belongs to a person already recognized moments earlier
- whether an alarm should be suppressed because the person is still on screen

## 4. Design Summary

The design adds a track-aware identity aggregation layer on top of existing person tracking.

The revised behavior is:

1. Person detections continue to receive `track_id` from the existing tracker.
2. Each recognized face is associated with one tracked person in the same frame.
3. The associated `person_track_id` is stored on face recognition results.
4. Alarm logic aggregates recognition evidence per `person_track_id`.
5. The alarm decision becomes three-state:
   - `known`
   - `unknown`
   - `uncertain`
6. Only stable `known` or stable `unknown` states may trigger alarms.
7. `uncertain` observations are ignored.

This preserves the existing DAG architecture and keeps responsibilities separated:

- tracker tracks persons
- face plugins produce recognition evidence
- alarm node owns business-facing identity confirmation and deduplication

## 5. Data Flow

The intended runtime path becomes:

`person_det -> tracker -> face_det -> face_recog(face-person association) -> alarm(track-aware face rules) -> actions`

### 5.1 Face-to-Person Association

For each recognized face in a frame, associate it to one person detection from `frame->det` that already has a valid `track_id`.

Recommended matching order:

1. Prefer person boxes that contain the face center point.
2. If multiple person boxes qualify, choose the one with the highest overlap quality.
3. If no containing person box exists, optionally fall back to IoU / overlap ratio matching.
4. If no reliable match exists, leave the face unassociated.

This keeps the matching logic simple and aligned with the workshop camera scenario, where one face should usually lie inside one person box.

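
The matching order above could be sketched roughly as follows. This is a minimal illustration, not the repository's code: `Box`, `PersonDet`, `FaceCoverage`, and `AssociateFace` are hypothetical names, "overlap quality" is assumed to mean the fraction of the face box covered by the person box, and the fallback uses that same coverage ratio rather than a true IoU.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical minimal types; the real structures behind frame->det
// are not shown in this document.
struct Box {
    float x1, y1, x2, y2;  // top-left and bottom-right corners, in pixels
};

struct PersonDet {
    Box box;
    int track_id;  // -1 when the tracker has not assigned an id
};

inline float Area(const Box& b) {
    return std::max(0.0f, b.x2 - b.x1) * std::max(0.0f, b.y2 - b.y1);
}

// Fraction of the face box that lies inside the person box, in [0, 1].
// Assumption: this stands in for the "overlap quality" of step 2.
inline float FaceCoverage(const Box& face, const Box& person) {
    const Box inter{std::max(face.x1, person.x1), std::max(face.y1, person.y1),
                    std::min(face.x2, person.x2), std::min(face.y2, person.y2)};
    const float face_area = Area(face);
    return face_area > 0.0f ? Area(inter) / face_area : 0.0f;
}

inline bool ContainsCenter(const Box& person, const Box& face) {
    const float cx = 0.5f * (face.x1 + face.x2);
    const float cy = 0.5f * (face.y1 + face.y2);
    return cx >= person.x1 && cx <= person.x2 &&
           cy >= person.y1 && cy <= person.y2;
}

// Returns the track_id of the best matching tracked person, or -1 when
// no reliable match exists (the face stays unassociated).
int AssociateFace(const Box& face, const std::vector<PersonDet>& persons,
                  float min_fallback_coverage) {
    int best_id = -1;
    float best_score = -1.0f;
    bool best_contains = false;
    for (const PersonDet& p : persons) {
        if (p.track_id < 0) continue;  // only tracked persons qualify
        const bool contains = ContainsCenter(p.box, face);
        const float score = FaceCoverage(face, p.box);
        // Non-containing boxes are fallback candidates only.
        if (!contains && score < min_fallback_coverage) continue;
        // A containing box always outranks a fallback match; ties within
        // the same class are broken by coverage.
        const bool better = (contains && !best_contains) ||
                            (contains == best_contains && score > best_score);
        if (better) {
            best_id = p.track_id;
            best_score = score;
            best_contains = contains;
        }
    }
    return best_id;
}
```

In this sketch a single pass covers all four steps: containing boxes win outright, coverage breaks ties, and the `min_fallback_coverage` threshold plays the role that `person_match_min_iou` might play in the real configuration.
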
### 5.2 Result Enrichment

Extend face recognition results with the associated person track metadata.

Recommended additions to `FaceRecogItem`:

- `int person_track_id = -1`
- optional future field: `float person_match_score`

This lets downstream plugins consume face identity evidence with person continuity information, without coupling them to raw person detections.

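
As a rough sketch of the enrichment, the struct below shows only the fields discussed in this document. The real `FaceRecogItem` in `include/face/face_result.h` is not reproduced here, so the `best_sim` field and the `HasPersonTrack` helper are assumptions added for illustration.

```cpp
// Sketch only: field layout is assumed, not copied from face_result.h.
struct FaceRecogItem {
    int best_person_id = -1;          // existing gallery identity field
    float best_sim = 0.0f;            // assumed existing similarity score
    int person_track_id = -1;         // new: associated person track, -1 = none
    float person_match_score = 0.0f;  // optional future field
};

// Hypothetical helper: lets downstream plugins test for an association
// without looking at raw person detections.
inline bool HasPersonTrack(const FaceRecogItem& item) {
    return item.person_track_id >= 0;
}
```

Defaulting `person_track_id` to `-1` mirrors how `FaceDetItem::track_id` is currently filled, so consumers that ignore the new field see no behavior change.
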
## 6. Identity State Model

Per `person_track_id`, maintain a short-lived identity aggregation state in the alarm node.

Recommended tracked fields:

- `track_id`
- `first_seen_ms`
- `last_seen_ms`
- `last_quality_pass_ms`
- `best_known_person_id`
- `best_known_name`
- `best_sim_peak`
- `best_known_hit_count`
- `quality_pass_count`
- `unknown_candidate_count`
- `reported_known`
- `reported_unknown`
- `last_report_ms`

The state exists only while the track is active, plus a short retention window needed for re-entry suppression.

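
One possible shape for this state, assuming C++17 and a plain `std::unordered_map` keyed by track id, is sketched below. The field names follow the list above; the field types, `Touch`, `PruneStale`, and the `retention_ms` parameter are illustrative assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Per-track aggregation state; types are assumed for illustration.
struct FaceTrackState {
    int track_id = -1;
    std::int64_t first_seen_ms = 0;
    std::int64_t last_seen_ms = 0;
    std::int64_t last_quality_pass_ms = 0;
    int best_known_person_id = -1;
    std::string best_known_name;
    float best_sim_peak = 0.0f;
    int best_known_hit_count = 0;
    int quality_pass_count = 0;
    int unknown_candidate_count = 0;
    bool reported_known = false;
    bool reported_unknown = false;
    std::int64_t last_report_ms = 0;
};

using TrackStateMap = std::unordered_map<int, FaceTrackState>;

// Returns the state for track_id, creating it on first sight and
// refreshing last_seen_ms on every call.
FaceTrackState& Touch(TrackStateMap& states, int track_id, std::int64_t now_ms) {
    auto [it, inserted] = states.try_emplace(track_id);
    if (inserted) {
        it->second.track_id = track_id;
        it->second.first_seen_ms = now_ms;
    }
    it->second.last_seen_ms = now_ms;
    return it->second;
}

// Drops states not seen for longer than retention_ms and returns how
// many were removed. retention_ms models the short retention window.
std::size_t PruneStale(TrackStateMap& states, std::int64_t now_ms,
                       std::int64_t retention_ms) {
    std::size_t removed = 0;
    for (auto it = states.begin(); it != states.end();) {
        if (now_ms - it->second.last_seen_ms > retention_ms) {
            it = states.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Calling `PruneStale` once per frame (or on a timer) keeps the map bounded by the number of recently active tracks.
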
## 7. Three-State Decision Model

### 7.1 States

Each person track may be in one of three states:

- `uncertain`
  - insufficient quality or insufficient evidence
- `known`
  - stable evidence for one known gallery identity
- `unknown`
  - stable evidence that the tracked person does not match any known identity

### 7.2 Why `uncertain` Is Required

`uncertain` is the key change for this scenario.

Examples that should remain `uncertain`:

- face too small
- poor alignment
- temporary blur
- far-distance observations
- unstable similarity fluctuations
- too few valid observations for the current track

These observations should not generate any identity alarm.

## 8. Quality Gating

Only quality-qualified face observations should participate in identity aggregation.

Recommended quality checks:

- associated `person_track_id >= 0`
- face area ratio above configured minimum
- face aspect ratio within configured bounds
- landmarks available when alignment is required
- optional minimum bbox size in pixels
- optional minimum confidence from face detection

If a frame fails quality gating:

- do not count it toward `unknown`
- do not count it toward `known`
- keep the track state as `uncertain`

This directly matches the workshop requirement: low-quality data can be ignored.

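
A quality gate along these lines might look like the sketch below. `QualityConfig`, `FaceObservation`, and `PassesQualityGate` are hypothetical names; the aspect-ratio bounds and `min_det_score` are placeholder values that do not come from this design, and the area ratio is assumed to be face area divided by frame area.

```cpp
// Hypothetical config; the first three defaults mirror the example
// values in this document, the rest are placeholders.
struct QualityConfig {
    float min_face_area_ratio = 0.001f;
    int min_face_width = 32;
    int min_face_height = 32;
    float min_aspect = 0.5f;     // width / height lower bound (assumed)
    float max_aspect = 2.0f;     // width / height upper bound (assumed)
    bool require_landmarks = true;
    float min_det_score = 0.0f;  // optional detector confidence floor
};

// Hypothetical flattened view of one face observation.
struct FaceObservation {
    int person_track_id = -1;
    int width = 0;
    int height = 0;
    int frame_width = 0;
    int frame_height = 0;
    bool has_landmarks = false;
    float det_score = 0.0f;
};

// Returns true only when every check passes. A failing observation is
// meant to be skipped entirely, leaving the track state uncertain.
bool PassesQualityGate(const FaceObservation& o, const QualityConfig& c) {
    if (o.person_track_id < 0) return false;
    if (o.width <= 0 || o.height <= 0) return false;
    if (o.width < c.min_face_width || o.height < c.min_face_height) return false;
    const double frame_area =
        static_cast<double>(o.frame_width) * static_cast<double>(o.frame_height);
    if (frame_area <= 0.0) return false;
    const double ratio =
        static_cast<double>(o.width) * static_cast<double>(o.height) / frame_area;
    if (ratio < c.min_face_area_ratio) return false;
    const float aspect = static_cast<float>(o.width) / static_cast<float>(o.height);
    if (aspect < c.min_aspect || aspect > c.max_aspect) return false;
    if (c.require_landmarks && !o.has_landmarks) return false;
    return o.det_score >= c.min_det_score;
}
```

Because the gate returns a plain boolean, the caller can simply skip the aggregation update for rejected frames, which keeps them from counting toward either `known` or `unknown`.
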
## 9. Known-Person Confirmation

Known-person confirmation should require repeated evidence for the same gallery identity on the same tracked person.

Recommended conditions:

- face passed quality gating
- `best_person_id >= 0`
- `best_sim >= known_accept`
- `(best_sim - second_sim) >= known_margin`
- same `best_person_id` observed at least `known_min_hits` times inside `known_hit_window_ms`

Optional improvement:

- allow a peak-sim shortcut when `best_sim` is very high and consistent

Once the track reaches stable `known`:

- trigger `known_person`
- mark the track as `reported_known`
- suppress all later known alarms for the same active track

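
The conditions above amount to a sliding-window hit counter per track. The sketch below assumes one `KnownConfirmer` instance per `person_track_id` and that callers feed it only observations that already passed quality gating; the class and its method names are hypothetical, and the optional peak-sim shortcut is not modeled.

```cpp
#include <cstdint>
#include <deque>

// Defaults mirror the example "known" values in this document.
struct KnownConfig {
    float accept = 0.45f;
    float margin = 0.05f;
    int min_hits = 3;
    std::int64_t hit_window_ms = 3000;
};

struct KnownHit {
    std::int64_t ts_ms;
    int person_id;
};

// Hypothetical per-track confirmer.
class KnownConfirmer {
public:
    explicit KnownConfirmer(KnownConfig cfg) : cfg_(cfg) {}

    // Feeds one quality-qualified observation. Returns the gallery id
    // the first time it is confirmed for this track, otherwise -1.
    int Observe(std::int64_t now_ms, int best_person_id, float best_sim,
                float second_sim) {
        // Expire hits that have fallen out of the window.
        while (!hits_.empty() &&
               now_ms - hits_.front().ts_ms > cfg_.hit_window_ms) {
            hits_.pop_front();
        }
        if (reported_) return -1;  // active-track deduplication
        if (best_person_id < 0) return -1;
        if (best_sim < cfg_.accept) return -1;
        if (best_sim - second_sim < cfg_.margin) return -1;
        hits_.push_back({now_ms, best_person_id});
        int count = 0;
        for (const KnownHit& h : hits_) {
            if (h.person_id == best_person_id) ++count;
        }
        if (count < cfg_.min_hits) return -1;
        reported_ = true;  // corresponds to reported_known
        return best_person_id;
    }

private:
    KnownConfig cfg_;
    std::deque<KnownHit> hits_;
    bool reported_ = false;
};
```

Frames that miss the accept or margin thresholds are simply not recorded, so a single weak frame delays confirmation without resetting the evidence already collected inside the window.
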
## 10. Unknown-Person Confirmation

Unknown-person confirmation must be stricter than known-person confirmation.

Unknown should not mean:

- "this frame is not known"

Unknown should mean:

- "this tracked person has been observed long enough, at sufficient quality, and still cannot be confirmed as any known person"

Recommended conditions:

- face passed quality gating
- valid `person_track_id`
- track age exceeds `unknown_min_track_age_ms`
- quality-qualified observations reach `unknown_min_quality_hits`
- no stable known identity has been confirmed for this track
- recognition remains below known confirmation thresholds during the window

Optional additional conditions:

- require the top candidate identity to remain inconsistent
- require multiple low-confidence or ambiguous frames before final unknown confirmation

Once the track reaches stable `unknown`:

- trigger `unknown_face`
- mark the track as `reported_unknown`
- suppress all later unknown alarms for the same active track

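
Combined with the three-state model, the unknown conditions could be expressed as a small classifier like the one below. `IdentityState`, `TrackEvidence`, and `Classify` are hypothetical names, and "recognition remains below known confirmation thresholds" is interpreted here, as an assumption, to mean that no quality-qualified frame has met the known thresholds so far.

```cpp
#include <cstdint>

enum class IdentityState { kUncertain, kKnown, kUnknown };

// Defaults mirror the example "unknown" values in this document.
struct UnknownConfig {
    std::int64_t min_track_age_ms = 2000;
    int min_quality_hits = 4;
};

// Hypothetical evidence summary for one person track.
struct TrackEvidence {
    std::int64_t first_seen_ms = 0;
    int quality_pass_count = 0;    // quality-qualified observations
    int known_candidate_hits = 0;  // frames that met the known thresholds
    bool known_confirmed = false;  // stable known identity reached
};

// Returns kUnknown only when every unknown condition holds; anything
// short of that stays kUncertain and must not raise an alarm.
IdentityState Classify(const TrackEvidence& e, std::int64_t now_ms,
                       const UnknownConfig& cfg) {
    if (e.known_confirmed) return IdentityState::kKnown;
    const bool old_enough = now_ms - e.first_seen_ms > cfg.min_track_age_ms;
    const bool enough_hits = e.quality_pass_count >= cfg.min_quality_hits;
    const bool no_known_evidence = e.known_candidate_hits == 0;
    if (old_enough && enough_hits && no_known_evidence) {
        return IdentityState::kUnknown;
    }
    return IdentityState::kUncertain;
}
```

Treating any known-threshold hit as a veto makes the unknown path deliberately conservative, which matches the stated preference for precision over recall.
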
## 11. Alarm Deduplication and Re-Entry Control

### 11.1 Active-Track Deduplication

Within one active `person_track_id`:

- `known_person` may trigger at most once
- `unknown_face` may trigger at most once

### 11.2 Re-Entry Suppression

The workshop scenario treats known-person alarms as punch events. Therefore:

- if the same known employee remains on screen, do not re-alarm
- if the same known employee briefly leaves and re-enters, do not re-alarm immediately

Recommended suppression keys:

- known person: keyed by the gallery `person_id`
- unknown person: keyed by recent track history or a future stronger fingerprint

Recommended timers:

- `known_reentry_cooldown_ms`
- `unknown_reentry_cooldown_ms`

Known-person suppression should be relatively long, because attendance punching should be sparse.

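
For the known-person case, re-entry suppression keyed by gallery `person_id` can be sketched as a last-report timestamp map. `ReentrySuppressor` and `TryReport` are hypothetical names; the unknown-person key is left out because its definition is still open in this design.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical cooldown tracker, independent of track ids so that it
// survives a person leaving and re-entering under a new track.
class ReentrySuppressor {
public:
    explicit ReentrySuppressor(std::int64_t cooldown_ms)
        : cooldown_ms_(cooldown_ms) {}

    // Returns true and records the report time when an alarm for
    // person_id may fire now; returns false while still cooling down.
    bool TryReport(int person_id, std::int64_t now_ms) {
        const auto it = last_report_ms_.find(person_id);
        if (it != last_report_ms_.end() &&
            now_ms - it->second < cooldown_ms_) {
            return false;
        }
        last_report_ms_[person_id] = now_ms;
        return true;
    }

private:
    std::int64_t cooldown_ms_;
    std::unordered_map<int, std::int64_t> last_report_ms_;
};
```

In this sketch the alarm node would call `TryReport` only after a track reaches stable `known`, so active-track deduplication and re-entry cooldown remain two separate checks.
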
## 12. Configuration Design

Introduce a dedicated face track aggregation config section under the alarm face-rule path or a sibling alarm section.

Recommended fields:

```json
{
  "face_track_aggregation": {
    "enable": true,
    "associate_with_person_track": true,
    "require_person_track": true,
    "person_match_mode": "face_center_in_person",
    "person_match_min_iou": 0.05,
    "quality": {
      "min_face_area_ratio": 0.001,
      "min_face_width": 32,
      "min_face_height": 32,
      "require_landmarks": true
    },
    "known": {
      "accept": 0.45,
      "margin": 0.05,
      "min_hits": 3,
      "hit_window_ms": 3000,
      "reentry_cooldown_ms": 300000
    },
    "unknown": {
      "min_track_age_ms": 2000,
      "min_quality_hits": 4,
      "reentry_cooldown_ms": 300000
    }
  }
}
```

Notes:

- exact placement may be adjusted to match current config conventions
- existing face rule fields should remain supported where practical
- migration should minimize breaking existing configs

## 13. File-Level Changes

### 13.1 Data Model

Modify:

- `include/face/face_result.h`

Changes:

- add `person_track_id` to `FaceRecogItem`
- optionally add future-friendly metadata for association confidence

### 13.2 Face Recognition Node

Modify:

- `plugins/ai_face_recog/ai_face_recog_node.cpp`

Changes:

- associate each recognized face to a tracked person from `frame->det`
- write `person_track_id` into `FaceRecogItem`
- extend debug log output to include `person_track_id`

### 13.3 Alarm Node

Modify:

- `plugins/alarm/alarm_node.cpp`

Changes:

- add track-aware face identity aggregation state
- replace per-frame unknown alarm behavior with track-based unknown confirmation
- change face vote key logic to prefer `person_track_id`
- add deduplication and re-entry suppression based on track-aware identity state

### 13.4 Tests

Modify or add:

- platform-independent unit tests for association logic
- platform-independent unit tests for track-aware known confirmation
- platform-independent unit tests for track-aware unknown suppression
- platform-independent unit tests for re-entry cooldown behavior

## 14. Compatibility and Migration

The design should be introduced in a backward-compatible way.

Recommended compatibility strategy:

1. keep existing face recognition output fields unchanged
2. add new fields instead of renaming old ones
3. keep current config behavior available when track-aware aggregation is disabled
4. allow current face rules to coexist with the new aggregation mode during rollout

This enables:

- safer staged rollout
- easier comparison between old and new behavior
- simpler troubleshooting on RK3588

## 15. Validation Strategy

### 15.1 Local Code-Level Validation

Local validation should focus on platform-independent logic only:

- face-to-person association behavior
- aggregation state transitions
- known confirmation window logic
- unknown suppression logic
- re-entry cooldown behavior
- config parsing and backward compatibility

### 15.2 RK3588 Device-Side Validation

Final validation must be completed on RK3588:

- known person from far to near
  - expect delayed but stable known-person alarm
  - no unknown false alarm
- known person brief leave and quick re-entry
  - expect no repeated punch alarm
- known person long leave and re-entry after cooldown
  - expect one new punch alarm
- truly unknown person with adequate face quality
  - expect one unknown alarm after evidence accumulation
- low-quality unknown face
  - expect no alarm
- multiple persons in frame
  - verify face-person association uses the correct person track

## 16. Risks and Mitigations

### Risk 1: Incorrect face-person association

Impact:

- identity evidence may be attached to the wrong tracked person

Mitigation:

- start with simple center-in-box matching
- log `person_track_id` and association decisions in debug mode
- validate on multi-person RK3588 scenes

### Risk 2: Unknown confirmation becomes too conservative

Impact:

- unknown alarms may be delayed or reduced

Mitigation:

- make unknown thresholds configurable
- prefer under-reporting over false workshop alerts in early rollout

### Risk 3: Re-entry suppression too aggressive

Impact:

- valid repeated attendance events may be skipped

Mitigation:

- make re-entry cooldown configurable
- document business interpretation clearly as punch-style attendance

## 17. Rollout Recommendation

Recommended rollout order:

1. add `person_track_id` propagation and debug logs
2. add track-aware known confirmation
3. add conservative track-aware unknown confirmation
4. add re-entry suppression tuning
5. validate behavior on RK3588 with known and unknown workshop videos

This staged rollout reduces risk and allows behavior comparison at each step.

## 18. Expected Outcome

After this design is implemented, the system should behave as follows in the workshop face-recognition scenario:

- poor-quality face observations are ignored
- the same known employee is confirmed from accumulated evidence, not single-frame luck
- transient low-score frames do not become stranger alarms
- a person who stays in scene triggers at most one identity alarm
- short leave-and-return behavior does not trigger repeated punch alarms
- stranger alarms become rarer but more trustworthy

This is the intended trade-off for the workshop deployment: lower alarm frequency and higher alarm precision.