parser: Check for null cursors from get_tokens#291
Conversation
9718068 to
e8485f3
Compare
|
Thanks for the fix, I can reproduce. The test suite fails all over the place running clang 21 (in e.g. Fedora Rawhide). Looks like this was introduced by @DeinAlptraum in llvm/llvm-project#138103. I honestly hate having to do
|
|
In particular, this |
|
Hi, I'm currently more or less the maintainer of the libclang-python bindings. We don't usually hear about how people use them (or if they use them at all) so I'm always happy to get feedback like this. Re: null cursor treatment We thus tried to fully go the "null Cursors are interpreted as
Yes, this absolutely shouldn't happen and is what we tried to avoid
This was what we were aiming for: all methods (properties) being safe and working as intended on actual |
Hey, thanks a lot for chiming in! We hit issues with libclang and/or the bindings every once in a while, so it's nice to know who to try to reach out to! 😉
The goal seems sane. I just edited #291 (comment) to add that the null Cursors appear to originate from
Currently, it's basically running I'm assuming |
|
Thanks, I was able to reproduce this. |
(Formal Python bindings maintainer here.) |
Thanks @DeinAlptraum for the quick fix!
Happy to get your attention too @Endilll. Do either of you have any suggestions on how to cleanly check for null cursors in a way that's compatible across Clang versions? As said, the Regarding backports, we might have to carry the workaround no matter what, in case someone ends up using Clang 21.x earlier than the fix, like current Fedora Rawhide (and I believe 43). But backporting would certainly help projects that have yet to hit the issue, and potentially avoid a bunch of debugging, before eventually making the same conclusions as @heftig and I. |
Unfortunately, I don't have any concrete suggestions. Moreover, I should say that after years of being mostly neglected, Python bindings will be changing towards more consistent API within a single version, at the expense of consistency across versions where necessary. (Whereas libclang C API will remain API- and ABI-stable, as was always promised.) I'm the one to drive this point, and I take the blame of making life harder for people who has to support multiple versions simultaneously. My hope is that the pain will subside over time as you drop support for older Clang versions, while you get rid of some technical debt that our inconsistent API imposed on you. We do our best to keep breakages to minimum, providing users with detailed changelog of potentially breaking changes, and even deprecation periods where it makes sense. The alternative is API that can be only piled onto without ever changing what's there, like the C one, and I don't like it. Even then the behavior of the functions can change in subtle ways, because of the changes to the underlying C++ API, so the only way to robustly shield yourself from changes is to freeze the version of Clang down to specific commit. I guess I went off-topic a bit, but that should give you an impression where Clang is headed at the moment. |
|
I don't exactly like suggesting the use of what feels like an implementation detail, but this is Python so it's not like we can stop people anyway, and the C-API should be stable... |
Sounds like another argument to support low-level API in Python bindings. |
|
@heftig I'm leaning towards doing something like this: The idea is two-fold. First, on Clang 22+ (and possibly 21.x if the fix is backported), it'll short-circuit on The |
Hawkmoth is broken with LLVM 21 release. This is already reported, but there is no stable fix on the Hawkmoth's size, yet. Thus this changes the used LLVM version in Hawkmoth to 20. jnikula/hawkmoth#291
Hawkmoth is broken with LLVM 21 release. This is already reported, but there is no stable fix on the Hawkmoth's size, yet. Thus this changes the used LLVM version in Hawkmoth to 20. jnikula/hawkmoth#291
|
@heftig Will you update the pull request, or shall I? |
Clang 21's cindex.py should be replacing all returned null cursors with
None. However, due to a bug (as of 21.1.3) it still returns null cursors
from tu.get_tokens.
We need to explicitly check for these because cindex.py 21 also raises
an exception when any property of a null cursor is accessed.
File "hawkmoth/parser.py", line 169, in _comment_extract
if token_cursor.kind in [CursorKind.INVALID_FILE,
^^^^^^^^^^^^^^^^^
File "clang/cindex.py", line 1578, in inner
raise Exception("Tried calling method on a null-cursor.")
|
Thanks for the follow-up @heftig, merged! |
|
FYI: the fix has been released with LLVM 21.1.4 yesterday |
Clang 21's cindex.py should be replacing all returned null cursors with
None. However, due to a bug (as of 21.1.3) it still returns null cursors fromtu.get_tokens.We need to explicitly check for these because cindex.py 21 also raises an exception when any property of a null cursor is accessed.