bug: Properly handle Union in file-api#1272
bug: Properly handle Union in file-api#1272LecrisUT wants to merge 2 commits intoscikit-build:mainfrom
Conversation
Signed-off-by: Cristian Le <git@lecris.dev>
Signed-off-by: Cristian Le <git@lecris.dev>
| codemodel_v2: Optional[CodeModel] | ||
| cache_v2: Optional[Cache] | ||
| cmakefiles_v1: Optional[CMakeFiles] | ||
| toolchains_v1: Optional[Toolchains] | ||
| codemodel_v2: Optional[CodeModel] = None | ||
| cache_v2: Optional[Cache] = None | ||
| cmakefiles_v1: Optional[CMakeFiles] = None | ||
| toolchains_v1: Optional[Toolchains] = None |
There was a problem hiding this comment.
Because this part is also touching the logic of Optional I think we should check what exactly should be done here?
These very much depend on the query used. Any suggestion on how these should be handled ideally?
One thing I would want is to drop the _v suffix there and make it completely version independent, but technically I think they are allowed to have multiple such fields. I think we could fix a specific query format that is allowed and that we parse
There was a problem hiding this comment.
Aren't the _v1 actually part of the layout?
There was a problem hiding this comment.
Yes, but we can still use the rename for that. It is mostly cosmetic for us.
The bigger question is if we should assume there is only one variant. We should use the stateful queries and then we have our own namespace and exact control of the format we request and lock the query-reply format
There was a problem hiding this comment.
Pull request overview
Improves the file_api reply JSON-to-dataclass conversion so it can handle Union-typed fields more robustly, and makes Reply fields optional by default to tolerate missing keys in index.json.
Changes:
- Update
Converter._convert_anyto detectUniontargets and attempt conversion across multiple union members. - Add typing helpers (
process_union,is_union_type,get_target_raw_type) usage to normalize and inspect target types during conversion. - Set default
Nonevalues forReplyfields so missing reply entries don’t cause dataclass construction failures.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/scikit_build_core/file_api/reply.py |
Extends conversion logic to handle Union targets (and refactors list handling). |
src/scikit_build_core/file_api/model/index.py |
Gives Reply optional fields default None values to allow missing keys. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| raw_target = get_target_raw_type(target) | ||
| # For generic Unions we try each type on at a time | ||
| if is_union_type(raw_target): | ||
| last_err: Exception = AssertionError("Failed for unknown reason") | ||
| for maybe_target in get_args(target): | ||
| try: | ||
| return self._convert_any(item, maybe_target) | ||
| except ExceptionGroup as err: # noqa: PERF203 | ||
| last_err = err | ||
| continue | ||
| raise last_err |
There was a problem hiding this comment.
This change introduces new Union-conversion behavior that isn’t exercised by the current file-api fixtures (the checked-in reply JSON only has string entries for InstallRule.paths). Adding a focused test/fixture where paths contains an object (the Paths shape) would help ensure the Union selection logic picks Paths over str and prevent regressions.
| for maybe_target in get_args(target): | ||
| try: | ||
| return self._convert_any(item, maybe_target) | ||
| except ExceptionGroup as err: # noqa: PERF203 |
There was a problem hiding this comment.
The Union handling loop only catches ExceptionGroup. Conversions can fail with plain TypeError (e.g., from target(item) or make_class raising TypeError for missing/invalid args), which will abort the Union handling without trying the remaining alternatives. Consider catching TypeError as well (and using the same fallback/last-error logic) so later Union members actually get attempted.
| except ExceptionGroup as err: # noqa: PERF203 | |
| except (ExceptionGroup, TypeError) as err: # noqa: PERF203 |
| raw_target = get_target_raw_type(target) | ||
| # For generic Unions we try each type on at a time | ||
| if is_union_type(raw_target): | ||
| last_err: Exception = AssertionError("Failed for unknown reason") | ||
| for maybe_target in get_args(target): | ||
| try: | ||
| return self._convert_any(item, maybe_target) | ||
| except ExceptionGroup as err: # noqa: PERF203 | ||
| last_err = err | ||
| continue | ||
| raise last_err |
There was a problem hiding this comment.
Trying Union members in order and returning on the first non-throwing conversion can still produce incorrect results for Union[str, Paths] (used by InstallRule.paths): str(<dict>) succeeds, so dict entries will be converted to a string repr instead of the Paths dataclass. To make this robust, add a shape/type guard before attempting a conversion (e.g., only treat str as valid if isinstance(item, str), and prefer dataclass targets when isinstance(item, dict)), rather than relying on exceptions to pick the right Union member.
Previously only the first type was considered, but evidently that can potentially fail. Seems like this would only happen in one case
scikit-build-core/src/scikit_build_core/file_api/model/directory.py
Line 25 in db72d46