Default to optimal confidence from model-eval #2206
```python
)
)
if confidence_filter.has_per_class_refinement:
    results = [
```
This one needs to stay as a second loop since post_process_predictions_for_precomputed_embeddings is also used by OWLv2
Changes:
```python
        call when this is False."""
        return self._per_class is not None

    def passes(self, class_name: str, confidence: float) -> bool:
```
Is passes(...) actually used outside?
Also, the name is misleading: it looks like it should return False if confidence is below the global floor, regardless of the self._per_class presence.
If it's not used, I'm voting to turn it into a private helper.
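If it does become a private helper, a minimal sketch of what the comment describes could look like the following. The fields mirror the quoted diff (`_per_class`, a global floor in `_fallback`), but the helper name `_passes` and the class skeleton around it are hypothetical; the point is that the global floor is enforced first, regardless of per-class refinement:

```python
from typing import Dict, Optional

class ConfidenceFilter:
    """Minimal sketch, not the PR's actual implementation."""

    def __init__(self, fallback: float, per_class: Optional[Dict[str, float]] = None):
        self._fallback = fallback
        self._per_class = per_class

    def _passes(self, class_name: str, confidence: float) -> bool:
        # Enforce the global floor unconditionally, so the name cannot
        # mislead: below the floor always fails.
        if confidence < self._fallback:
            return False
        if self._per_class is None:
            return True
        # Unknown classes fall back to the global floor already checked.
        return confidence >= self._per_class.get(class_name, self._fallback)
```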
```python
            if global_optimal is not None
            else default_confidence
        )
        self._per_class = dict(per_class)
```
is dict(...) needed?
```python
        3. Global optimal — single threshold for everything
        4. Model's hardcoded default — single threshold for everything

        Exposes:
```
Try to reduce comments that express interface meaning in natural language; the interface should be easily interpretable just by looking at the code, or redesigned so that the reader builds good intuitions just by looking at the signatures.
```python
        also handles the no-refinement case (returns all-True)."""
        n = len(class_ids)
        if not self.has_per_class_refinement:
            return torch.ones(n, dtype=torch.bool)
```
retracted
Still haven't consumed the whole PR, but why not just shortcut with confidences >= torch.full_like(confidences, self._fallback)?
The name of the method requires additional clarification in the comment, increasing the cognitive load.
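The suggested shortcut, as a minimal sketch (written as a standalone function for illustration; in the PR this logic sits on the filter class and uses self._fallback):

```python
import torch

def keep_mask(confidences: torch.Tensor, fallback: float) -> torch.Tensor:
    # No-refinement path: instead of returning torch.ones(n, dtype=torch.bool)
    # and deferring the floor check, compare against the fallback directly.
    return confidences >= torch.full_like(confidences, fallback)
```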
I would make this a private helper. Also, is it worth creating a mask and filtering with it when the mask will be all True? Maybe simply always return the original object in that case?
```python
            class_id=detections.class_id[keep],
            confidence=detections.confidence[keep],
            image_metadata=detections.image_metadata,
            bboxes_metadata=detections.bboxes_metadata,
```
Looks like bboxes_metadata is not filtered?
Maybe we should reuse the other method?
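A sketch of what a single reusable filtering helper might look like. The `Detections` dataclass here is a hypothetical stand-in whose fields are taken from the quoted hunk; the real entity's shape may differ:

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, List

@dataclass
class Detections:
    # Hypothetical stand-in for the real Detections entity.
    class_id: List[int]
    confidence: List[float]
    image_metadata: Dict[str, Any]
    bboxes_metadata: List[Dict[str, Any]]

def filter_detections(detections: Detections, keep: List[bool]) -> Detections:
    # Apply the same mask to every per-box field, including
    # bboxes_metadata, which the quoted hunk passes through unfiltered.
    return replace(
        detections,
        class_id=[v for v, k in zip(detections.class_id, keep) if k],
        confidence=[v for v, k in zip(detections.confidence, keep) if k],
        bboxes_metadata=[v for v, k in zip(detections.bboxes_metadata, keep) if k],
    )
```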
```python
        bboxes_metadata=bboxes_metadata,
    )

def refine_keypoints_and_detections(
```
From what I remember, Detections production is optional for keypoints models; we should probably reflect that, or let the user decide to refine separately.
```python
    )
    return refined_keypoints, refined_detections

def refine_multilabel_prediction(
```
retracted
use of passes(...) which only works with per-class refinement
```python
        image_metadata=prediction.image_metadata,
    )

def refine_segmentation_result(
```
retracted
This method looks aligned with what I would expect (auto-fallback even if the client does not care about has_per_class_refinement), and at the same time not aligned with the other methods, creating an inconsistency.
Maybe we should check whether class alignment is there and only apply it if present?
```python
        safe_idx = class_ids_long.clamp(0, max(len(class_names) - 1, 0))
        per_detection_thresholds = torch.where(
            in_range,
            thresholds_per_class[safe_idx] if len(class_names) > 0 else torch.full_like(confidences, self._fallback),
```
Maybe empty class_names deserves a shortcut earlier in the logic?
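A sketch of what that earlier shortcut might look like, written as a standalone function; the names and the exact in-range handling are assumptions based on the quoted hunk, not the PR's actual code:

```python
import torch

def per_class_keep_mask(
    confidences: torch.Tensor,
    class_ids: torch.Tensor,
    thresholds_per_class: torch.Tensor,
    fallback: float,
) -> torch.Tensor:
    # Early shortcut: with no per-class thresholds at all, just apply
    # the global floor and skip the clamp/where machinery entirely.
    if thresholds_per_class.numel() == 0:
        return confidences >= fallback
    n = thresholds_per_class.numel()
    in_range = (class_ids >= 0) & (class_ids < n)
    safe_idx = class_ids.clamp(0, n - 1)
    # Out-of-range class ids fall back to the global floor.
    per_detection = torch.where(
        in_range,
        thresholds_per_class[safe_idx],
        torch.full_like(confidences, fallback),
    )
    return confidences >= per_detection
```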
```python
        )
        return confidences >= per_detection_thresholds

    def per_class_thresholds(self, class_names: List[str]) -> List[float]:
```
Maybe a tensor should be returned from the function?
Maybe also a private helper?
```python
        recommended_parameters: Optional[RecommendedParameters],
        default_confidence: float,
    ):
        # Tier 1: explicit user value wins outright. No per-class refinement
```
This is just sanitisation, and optional: maybe add a helper function that establishes all the values we want to set, and then set the state of the class based on that. This way, when any field needs to be added, it's easier not to get lost in a jungle of if/elif/else.
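One possible shape for such a helper, as a minimal sketch. The names (`ResolvedThresholds`, `resolve_thresholds`) are hypothetical; the tier ordering follows the docstring quoted earlier in the thread (explicit user value, then per-class optimal, then global optimal, then the model's hardcoded default), and tier 2 being per-class is an inference from has_per_class_refinement:

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class ResolvedThresholds:
    fallback: float
    per_class: Optional[Dict[str, float]]

def resolve_thresholds(
    user_confidence: Optional[float],
    global_optimal: Optional[float],
    per_class_optimal: Optional[Dict[str, float]],
    default_confidence: float,
) -> ResolvedThresholds:
    # Tier 1: explicit user value wins outright, no per-class refinement.
    if user_confidence is not None:
        return ResolvedThresholds(user_confidence, None)
    # The floor for tiers 2-4: global optimal if present, else the
    # model's hardcoded default.
    fallback = global_optimal if global_optimal is not None else default_confidence
    # Tier 2: per-class optimal thresholds over that floor.
    if per_class_optimal:
        return ResolvedThresholds(fallback, dict(per_class_optimal))
    # Tiers 3/4: a single threshold for everything.
    return ResolvedThresholds(fallback, None)
```

The constructor then just assigns the two resolved fields, so adding a new recommended parameter means extending one pure function instead of threading it through nested branches.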
```python
    # existing per-image loop when has_per_class_refinement is True.
    # ------------------------------------------------------------------

    def refine_detections(
```
Looks like the true public interface is refine_detections, refine_instance_detections, refine_keypoints_and_detections, refine_multilabel_prediction, and refine_segmentation_result; I would keep them at the top of the class.
```python
        return image_bboxes, masks


class ConfidenceFilter:
```
Looking at the code over and over, I still have the feeling that this does not fully match the puzzle pieces.
When I see ConfidenceFilter, I believe the responsibility of this class is generating and applying filtering criteria based on confidence, and I believe this gut feeling is fair.
The current implementations are trivial (comparing a tensor to a constant float value).
Examples:
```python
# rfdetr OD
confidence_mask = predicted_confidence > confidence
# rfdetr IS
confidence_mask = confidence > threshold
# resnet ML
batch_element_confidence >= confidence
# resnet single label: N/A
# YOLO IS
mask = class_conf > conf_thresh
```
The proposition in this class is to extend the logic into a confidence filter on steroids, which basically:

- requires running the old-style filtering (but based on the floor value this class dictates) and then another construction of tensors in the output entities
- at the same time entangles knowledge about the internals of the data formats output by the model classes at the end of the whole forward pass, which ConfidenceFilter does not need to know

We end up with:

- worst-case-scenario double filtering, which is avoidable (performance loss)
- unnatural interactions between the model and this class
- a class interface which is not generic and requires mutation for future variations of output entities

Let's discuss if you find those observations correct; maybe I lack some visibility, or maybe this is for some reason not possible.
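To make the discussion concrete, here is a minimal sketch of the mask-only contract argued for above. `ConfidenceMask` is a hypothetical name: the filter only turns confidences into a boolean mask, and each model class keeps applying that mask to its own output entities, preserving the trivial one-pass filtering from the examples quoted above:

```python
import torch

class ConfidenceMask:
    """Hypothetical mask-only filter: no knowledge of output entities."""

    def __init__(self, threshold: float):
        self._threshold = threshold

    def __call__(self, confidences: torch.Tensor) -> torch.Tensor:
        # The filter's whole job: confidences in, boolean mask out.
        return confidences >= self._threshold

# Model side stays exactly like the trivial examples, e.g.:
#     confidence_mask = mask_fn(predicted_confidence)
#     boxes, scores = boxes[confidence_mask], scores[confidence_mask]
```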
What does this PR do?
Model eval calculates F1-optimal confidence thresholds, but they aren't currently used for model inference. This PR, together with https://github.com/roboflow/roboflow/pull/11053, makes it so. This feature applies only to the inference_models pathway; the legacy inference pathways keep their existing default confidence.
The key changes are:
- Wire recommendedParameters from model_eval through to inference
- New ConfidenceFilter and post_process_with_confidence_filter() wrapper (inference_models/models/base/)
- Inference adapters route through the new wrapper (inference/core/models/inference_models_adapters.py)
- API request schema (inference/core/entities/requests/inference.py)
- OLD inference path compatibility (inference/core/models/{object_detection,instance_segmentation,classification}_base.py)
- Plumbing
Dependencies:
Type of Change
Testing
Test details:
New unit tests: test_confidence_filter.py, test_confidence_filter_attribute.py, test_post_process_filter.py, test_recommended_parameters.py, plus expanded test_roboflow.py and test_core.py coverage
Tested rfdetr OD workflow in staging against a local inference server without https://github.com/roboflow/roboflow/pull/11053 deployed, to verify the hard-coded default is still in effect even if the API doesn't yet serve recommendedParameters.
Debug logging:
Checklist
Additional Context