DataFrame API interchange protocol as an input#1587
DataFrame API interchange protocol as an input#1587KulikovNikita wants to merge 24 commits intouxlfoundation:mainfrom
Conversation
66a6d55 to
48b465d
Compare
48b465d to
1148782
Compare
| centers = np.asarray(init) | ||
| assert centers.shape[0] == n_clusters | ||
| assert centers.shape[1] == X_table.column_count | ||
| assert centers.shape[1] == X_table.get_column_count() |
There was a problem hiding this comment.
not pythonic like method, api
There was a problem hiding this comment.
Yeap, you obviously can do it via python properties. In the other hand - in such a case you will loose naming consistency with C++
| def to_common_policy(*xs) -> tuple: | ||
| if _are_equal_devices(xs): | ||
| return xs | ||
| else: | ||
| policy = _HostInteropPolicy() | ||
| return (x.to_policy(policy) for x in xs) |
There was a problem hiding this comment.
Should be discussed and implemented properly.
In anyway, this should be added/updated in separate PR with separate review
There was a problem hiding this comment.
I guess it depends on functionality implemented in this PR. It depends on user-facing interface of arrays, while CSR table depends on being able to move data effortlessly from one device to another
|
It would be fine to have some mem leak tests as well. |
samir-nasibli
left a comment
There was a problem hiding this comment.
Partially reviewed.
| dal_dtype = get_data_type(dtype) | ||
| result.append(dal_dtype) | ||
| result = np.asarray(result) | ||
| result = result.astype(np.int32) |
There was a problem hiding this comment.
OneDAL types int32 format
| result = result.astype(np.int32) | ||
| return to_array(result) | ||
|
|
||
| # TODO: implement logic supporting |
There was a problem hiding this comment.
Let's update this TODO to address codefactor hint
| """ | ||
| A (vendored) copy from | ||
| """ |
| METAL = 8 | ||
| VPI = 9 | ||
| ROCM = 10 | ||
| ONEAPI = 14 |
There was a problem hiding this comment.
This file has some additions to original py
|
|
||
|
|
||
| def is_python_dataframe(entity) -> bool: | ||
| namespace = hasattr(entity, dataframe_namespace) |
There was a problem hiding this comment.
| namespace = hasattr(entity, dataframe_namespace) | |
| is_namespace = hasattr(entity, dataframe_namespace) |
| def to_heterogen_table_python(entity): | ||
| assert is_python_dataframe(entity) | ||
| if hasattr(entity, dataframe): | ||
| temp = entity.__dataframe__() | ||
| elif hasattr(entity, dataframe_standard): | ||
| temp = entity.__dataframe_standard__() | ||
| elif hasattr(entity, dataframe_namespace): | ||
| temp = entity | ||
| else: | ||
| raise TypeError("Expected DataFrame") | ||
| return build_from_dataframe(temp) |
There was a problem hiding this comment.
There was a problem hiding this comment.
There was a problem hiding this comment.
Pull Request Overview
This PR introduces extensive interoperability enhancements for the DataFrame API interchange protocol. Key changes include:
- New interop modules for handling DataFrame, array, DLPack, and SUA protocols.
- Comprehensive additions of utility functions, wrappers, and builders to support conversion among native, DLPack, and Python-based data structures.
- A suite of new tests and supporting modules to validate the new protocols and conversion mechanisms.
Reviewed Changes
Copilot reviewed 130 out of 130 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| onedal/interop/sua/* | Added SUA interop utilities and wrappers for homogen table conversion. |
| onedal/interop/dlpack/* | Implemented full DLPack support including API, dtype conversion, and device conversion. |
| onedal/interop/dataframes/* | Introduced dataframe protocol definitions, builders, and dtype conversions. |
| onedal/interop/tests/* | Added tests and wrapper utilities for validating interop functionality. |
Comments suppressed due to low confidence (1)
onedal/interop/dlpack/device_conversion.cpp:99
- [nitpick] Consider adding comments or enhanced error handling here regarding sycl device conversion failures, to clarify behavior when conversion does not succeed.
auto dev = convert_to_sycl(device);
| return self.body.__dlpack_device__() | ||
|
|
||
|
|
||
| # It is sthe easiest way to get a purely |
There was a problem hiding this comment.
Typo in comment: 'sthe' should be 'the'.
| # It is sthe easiest way to get a purely | |
| # It is the easiest way to get a purely |
| object.dec_ref(); | ||
| }; | ||
| auto result = wrap_to_homogen_table( // | ||
| std::move(sua), | ||
| std::move(deleter)); | ||
| object.inc_ref(); |
There was a problem hiding this comment.
Manual reference count adjustments using inc_ref/dec_ref are error‑prone. Consider using RAII wrappers or pybind11’s built‑in reference management to ensure exception safety.
| object.dec_ref(); | |
| }; | |
| auto result = wrap_to_homogen_table( // | |
| std::move(sua), | |
| std::move(deleter)); | |
| object.inc_ref(); | |
| // Reference count adjustments are handled automatically by pybind11. | |
| }; | |
| auto result = wrap_to_homogen_table( // | |
| std::move(sua), | |
| std::move(deleter)); |
Depends on #1568