Skip to content

DataFrame API interchange protocol as an input#1587

Draft
KulikovNikita wants to merge 24 commits intouxlfoundation:mainfrom
KulikovNikita:dev/interchange-api-impl
Draft

DataFrame API interchange protocol as an input#1587
KulikovNikita wants to merge 24 commits intouxlfoundation:mainfrom
KulikovNikita:dev/interchange-api-impl

Conversation

@KulikovNikita
Copy link
Copy Markdown
Contributor

Depends on #1568

@KulikovNikita KulikovNikita force-pushed the dev/interchange-api-impl branch 2 times, most recently from 66a6d55 to 48b465d Compare November 24, 2023 09:15
Comment thread onedal/cluster/kmeans.py
centers = np.asarray(init)
assert centers.shape[0] == n_clusters
assert centers.shape[1] == X_table.column_count
assert centers.shape[1] == X_table.get_column_count()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not pythonic like method, api

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, you obviously can do it via python properties. In the other hand - in such a case you will loose naming consistency with C++

Comment thread onedal/interop/array.py
Comment on lines +87 to +92
def to_common_policy(*xs) -> tuple:
if _are_equal_devices(xs):
return xs
else:
policy = _HostInteropPolicy()
return (x.to_policy(policy) for x in xs)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be discussed and implemented properly.
In anyway, this should be added/updated in separate PR with separate review

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it depends on functionality implemented in this PR. It depends on user-facing interface of arrays, while CSR table depends on being able to move data effortlessly from one device to another

@samir-nasibli samir-nasibli marked this pull request as ready for review December 6, 2023 17:05
@samir-nasibli samir-nasibli marked this pull request as draft December 6, 2023 17:05
@samir-nasibli
Copy link
Copy Markdown
Contributor

It would be fine to have some mem leak tests as well.
It worth to check how python handles onedal mem.

Copy link
Copy Markdown
Contributor

@samir-nasibli samir-nasibli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially reviewed.

dal_dtype = get_data_type(dtype)
result.append(dal_dtype)
result = np.asarray(result)
result = result.astype(np.int32)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OneDAL types int32 format

result = result.astype(np.int32)
return to_array(result)

# TODO: implement logic supporting
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update this TODO to address codefactor hint

Comment on lines +1 to +3
"""
A (vendored) copy from
"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to be updated

METAL = 8
VPI = 9
ROCM = 10
ONEAPI = 14
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file has some additions to original py



def is_python_dataframe(entity) -> bool:
namespace = hasattr(entity, dataframe_namespace)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
namespace = hasattr(entity, dataframe_namespace)
is_namespace = hasattr(entity, dataframe_namespace)

Comment on lines +45 to +55
def to_heterogen_table_python(entity):
assert is_python_dataframe(entity)
if hasattr(entity, dataframe):
temp = entity.__dataframe__()
elif hasattr(entity, dataframe_standard):
temp = entity.__dataframe_standard__()
elif hasattr(entity, dataframe_namespace):
temp = entity
else:
raise TypeError("Expected DataFrame")
return build_from_dataframe(temp)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces extensive interoperability enhancements for the DataFrame API interchange protocol. Key changes include:

  • New interop modules for handling DataFrame, array, DLPack, and SUA protocols.
  • Comprehensive additions of utility functions, wrappers, and builders to support conversion among native, DLPack, and Python-based data structures.
  • A suite of new tests and supporting modules to validate the new protocols and conversion mechanisms.

Reviewed Changes

Copilot reviewed 130 out of 130 changed files in this pull request and generated 2 comments.

File Description
onedal/interop/sua/* Added SUA interop utilities and wrappers for homogen table conversion.
onedal/interop/dlpack/* Implemented full DLPack support including API, dtype conversion, and device conversion.
onedal/interop/dataframes/* Introduced dataframe protocol definitions, builders, and dtype conversions.
onedal/interop/tests/* Added tests and wrapper utilities for validating interop functionality.
Comments suppressed due to low confidence (1)

onedal/interop/dlpack/device_conversion.cpp:99

  • [nitpick] Consider adding comments or enhanced error handling here regarding sycl device conversion failures, to clarify behavior when conversion does not succeed.
        auto dev = convert_to_sycl(device);

return self.body.__dlpack_device__()


# It is sthe easiest way to get a purely
Copy link

Copilot AI Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: 'sthe' should be 'the'.

Suggested change
# It is sthe easiest way to get a purely
# It is the easiest way to get a purely

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +93
object.dec_ref();
};
auto result = wrap_to_homogen_table( //
std::move(sua),
std::move(deleter));
object.inc_ref();
Copy link

Copilot AI Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manual reference count adjustments using inc_ref/dec_ref are error‑prone. Consider using RAII wrappers or pybind11’s built‑in reference management to ensure exception safety.

Suggested change
object.dec_ref();
};
auto result = wrap_to_homogen_table( //
std::move(sua),
std::move(deleter));
object.inc_ref();
// Reference count adjustments are handled automatically by pybind11.
};
auto result = wrap_to_homogen_table( //
std::move(sua),
std::move(deleter));

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants