Skip to content

bug: Properly handle Union in file-api#1272

Open
LecrisUT wants to merge 2 commits intoscikit-build:mainfrom
LecrisUT:refactor/file-api-union
Open

bug: Properly handle Union in file-api#1272
LecrisUT wants to merge 2 commits intoscikit-build:mainfrom
LecrisUT:refactor/file-api-union

Conversation

@LecrisUT
Copy link
Copy Markdown
Collaborator

Previously only the first type was considered, but evidently that can potentially fail. Seems like this would only happen in one case

paths: List[Union[str, Paths]] = dataclasses.field(default_factory=list)

Signed-off-by: Cristian Le <git@lecris.dev>
Signed-off-by: Cristian Le <git@lecris.dev>
Comment on lines -60 to +63
codemodel_v2: Optional[CodeModel]
cache_v2: Optional[Cache]
cmakefiles_v1: Optional[CMakeFiles]
toolchains_v1: Optional[Toolchains]
codemodel_v2: Optional[CodeModel] = None
cache_v2: Optional[Cache] = None
cmakefiles_v1: Optional[CMakeFiles] = None
toolchains_v1: Optional[Toolchains] = None
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this part is also touching the logic of Optional I think we should check what exactly should be done here?

These very much depend on the query used. Any suggestion on how these should be handled ideally?

One thing I would want is to drop the _v suffix there and make it completely version independent, but technically I think they are allowed to have multiple such fields. I think we could fix a specific query format that is allowed and that we parse

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't the _v1 actually part of the layout?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we can still use the rename for that. It is mostly cosmetic for us.

The bigger question is if we should assume there is only one variant. We should use the stateful queries and then we have our own namespace and exact control of the format we request and lock the query-reply format

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves the file_api reply JSON-to-dataclass conversion so it can handle Union-typed fields more robustly, and makes Reply fields optional by default to tolerate missing keys in index.json.

Changes:

  • Update Converter._convert_any to detect Union targets and attempt conversion across multiple union members.
  • Add typing helpers (process_union, is_union_type, get_target_raw_type) usage to normalize and inspect target types during conversion.
  • Set default None values for Reply fields so missing reply entries don’t cause dataclass construction failures.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/scikit_build_core/file_api/reply.py Extends conversion logic to handle Union targets (and refactors list handling).
src/scikit_build_core/file_api/model/index.py Gives Reply optional fields default None values to allow missing keys.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +105 to +115
raw_target = get_target_raw_type(target)
# For generic Unions we try each type on at a time
if is_union_type(raw_target):
last_err: Exception = AssertionError("Failed for unknown reason")
for maybe_target in get_args(target):
try:
return self._convert_any(item, maybe_target)
except ExceptionGroup as err: # noqa: PERF203
last_err = err
continue
raise last_err
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces new Union-conversion behavior that isn’t exercised by the current file-api fixtures (the checked-in reply JSON only has string entries for InstallRule.paths). Adding a focused test/fixture where paths contains an object (the Paths shape) would help ensure the Union selection logic picks Paths over str and prevent regressions.

Copilot uses AI. Check for mistakes.
for maybe_target in get_args(target):
try:
return self._convert_any(item, maybe_target)
except ExceptionGroup as err: # noqa: PERF203
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Union handling loop only catches ExceptionGroup. Conversions can fail with plain TypeError (e.g., from target(item) or make_class raising TypeError for missing/invalid args), which will abort the Union handling without trying the remaining alternatives. Consider catching TypeError as well (and using the same fallback/last-error logic) so later Union members actually get attempted.

Suggested change
except ExceptionGroup as err: # noqa: PERF203
except (ExceptionGroup, TypeError) as err: # noqa: PERF203

Copilot uses AI. Check for mistakes.
Comment on lines +105 to +115
raw_target = get_target_raw_type(target)
# For generic Unions we try each type on at a time
if is_union_type(raw_target):
last_err: Exception = AssertionError("Failed for unknown reason")
for maybe_target in get_args(target):
try:
return self._convert_any(item, maybe_target)
except ExceptionGroup as err: # noqa: PERF203
last_err = err
continue
raise last_err
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying Union members in order and returning on the first non-throwing conversion can still produce incorrect results for Union[str, Paths] (used by InstallRule.paths): str(<dict>) succeeds, so dict entries will be converted to a string repr instead of the Paths dataclass. To make this robust, add a shape/type guard before attempting a conversion (e.g., only treat str as valid if isinstance(item, str), and prefer dataclass targets when isinstance(item, dict)), rather than relying on exceptions to pick the right Union member.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants