refactor(ocr): reduce OCRDetection bounding box from 4 vertices to 2-point AABB #1126
Closed
msluszniak wants to merge 3 commits into software-mansion:main
Consolidate npm update configuration and change schedule to monthly.
Resolves software-mansion#760.

The OCR and VerticalOCR pipelines previously exposed all four rotated-rectangle corners in OCRDetection.bbox. Two points (top-left and bottom-right of the axis-aligned bounding box) are sufficient for downstream rendering and are simpler to consume.

Changes:
- Types.h: shrink OCRDetection.bbox from std::array<Point,4> to std::array<Point,2>
- RecognitionHandler.cpp: compute AABB (min/max x,y) over the four detector corners instead of forwarding them verbatim
- VerticalOCR.cpp: same AABB reduction in _processSingleTextBox
- OCR.cpp / VerticalOCR.cpp generateFromFrame: re-normalize the two bbox corners after inverseRotatePoints to guarantee bbox[0] <= bbox[1]
- JsiConversions.h: serialize 2 points instead of 4 to JavaScript
- OCRTest.cpp / VerticalOCRTest.cpp: assert size==2 and that bbox[1] >= bbox[0]
- ocr.ts: narrow TypeScript type from Point[] to [Point,Point] and update docs
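The min/max reduction described for RecognitionHandler.cpp and VerticalOCR.cpp can be sketched in TypeScript as below. This is only an illustration of the idea, not the PR's C++ code: the `Quad` and `AABB` aliases and the `aabbFromCorners` helper are hypothetical names introduced here, and `Point` is assumed to have numeric `x` and `y` fields.

```typescript
// Hypothetical shapes for illustration; they are not copied from the library.
type Point = { x: number; y: number };
type Quad = [Point, Point, Point, Point];
type AABB = [Point, Point]; // [top-left, bottom-right]

// Reduce four (possibly rotated) corners to an axis-aligned box by taking
// the minimum and maximum of each coordinate.
function aabbFromCorners(corners: Quad): AABB {
  const xs = corners.map((p) => p.x);
  const ys = corners.map((p) => p.y);
  return [
    { x: Math.min(...xs), y: Math.min(...ys) },
    { x: Math.max(...xs), y: Math.max(...ys) },
  ];
}

// A slightly rotated rectangle: none of its corners is the AABB's top-left.
const rotated: Quad = [
  { x: 10, y: 20 },
  { x: 50, y: 10 },
  { x: 55, y: 30 },
  { x: 15, y: 40 },
];
const box = aabbFromCorners(rotated);
```

Because the result is built from per-axis minima and maxima, `box[0]` is never greater than `box[1]` on either axis, which is the property the updated tests assert.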
Closing this in favour of #1130
Description
Reduces the OCRDetection bounding box from a 4-vertex rotated rectangle to a 2-point axis-aligned bounding box (AABB). The detector internally still uses 4 corners for cropping text regions; only the public-facing OCRDetection output is changed: bbox is now [top-left, bottom-right] instead of 4 arbitrary corners. The TypeScript type is narrowed from Point[] to [Point, Point].

Changes span Types.h, RecognitionHandler.cpp, OCR.cpp, VerticalOCR.cpp, JsiConversions.h, ocr.ts, and the integration tests for both OCR and VerticalOCR.

Introduces a breaking change?
OCRDetection.bbox is now a 2-element tuple. Any code indexing bbox[2] or bbox[3] must be updated.

Type of change
Tested on
Testing instructions
- DetectionsHaveValidBoundingBoxes / DetectionsHaveValidBBoxes tests pass: bbox size == 2, bbox[0] ≤ bbox[1] on both axes.
- useOCR, log detections[0].bbox: confirm it is [{x, y}, {x, y}] with exactly 2 elements.
- runOnFrame (VisionCamera) in portrait and landscape: confirm bounding boxes remain correctly oriented after rotation.

Screenshots
N/A
Related issues
Closes #760
Checklist
Additional notes
DetectorBBox (internal, used for cropping) retains 4 points; only the public OCRDetection output is changed. The generateFromFrame path re-normalises the 2 AABB corners after inverseRotatePoints to guarantee bbox[0] is always top-left.
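To see why re-normalisation is needed after a rotation, consider the TypeScript sketch below. It is a hedged illustration, not the library's implementation: `rotate90` is a hypothetical stand-in for whatever `inverseRotatePoints` does for one particular orientation, and `normalizeAABB` and `toRect` are names invented for this example.

```typescript
type Point = { x: number; y: number };
type AABB = [Point, Point]; // [top-left, bottom-right]

// Hypothetical 90-degree rotation inside a frame of the given height.
// This is an assumption for illustration, not the real inverseRotatePoints.
function rotate90(p: Point, frameHeight: number): Point {
  return { x: frameHeight - p.y, y: p.x };
}

// Restore the [top-left, bottom-right] ordering with per-axis min/max.
function normalizeAABB([a, b]: AABB): AABB {
  return [
    { x: Math.min(a.x, b.x), y: Math.min(a.y, b.y) },
    { x: Math.max(a.x, b.x), y: Math.max(a.y, b.y) },
  ];
}

// Consumer-side helper: safe only because bbox[0] is guaranteed top-left.
function toRect([tl, br]: AABB) {
  return { left: tl.x, top: tl.y, width: br.x - tl.x, height: br.y - tl.y };
}

const original: AABB = [{ x: 10, y: 20 }, { x: 50, y: 40 }];
// After rotation, the first corner has the larger x, so it is no longer top-left.
const rotatedCorners: AABB = [rotate90(original[0], 100), rotate90(original[1], 100)];
const normalized = normalizeAABB(rotatedCorners);
const rect = toRect(normalized);
```

Without the normalisation step, `toRect(rotatedCorners)` would report a negative width, which is the kind of downstream surprise the guarantee on `bbox[0]` is meant to prevent.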