Conversation
|
/ok to test 85dbbb5 |
|
| mod_obj = ObjectCode.from_ptx(ptx, symbol_mapping=sym_map) | ||
| assert mod.code == ptx | ||
| if not Program._can_load_generated_ptx(): | ||
| if not Program.driver_can_load_nvrtc_ptx_output(): |
There was a problem hiding this comment.
This is worthy of discussion. I changed this private API into a public one after refactoring because it seems somewhat useful to an end user. An alternative would be to continue using the private API if the value here is too low.
There was a problem hiding this comment.
I'd definitely not make the API change in this PR, in part so that such a subtle change doesn't get drowned out by the many other changes here, but also to avoid entangled unintended side-effects from the disjoint changes (those tend to be hard to sort out, especially later, after this PR is merged already).
There was a problem hiding this comment.
But this simply renames a private function, making it public. The implementation remains the same. Do you think there could be unintended side-effects from that?
The questions I'm interested in are whether we expect users to end up with systems that can generate unloadable PTX, and if so, can they benefit from a function to check it?
There was a problem hiding this comment.
That's a good question, but in my experience it's usually far better to discuss separately. E.g. yesterday I already got derailed here looking through the very large diffs. I saw there were a bunch of test failures, then I searched the code and saw new code still referencing _can_load_generated_ptx, then decided to suggest it's best to not convolute things.
What makes it beneficial to tag on the change & discussion here?
There was a problem hiding this comment.
What makes it beneficial to tag on the change & discussion here?
CI and review latency, TBH. If changes were much cheaper I would break things down much more.
There was a problem hiding this comment.
I'm not the best to judge if this should stay private or become public. If public, I'd consider Program.nvrtc_generated_ptx_is_loadable() as the API.
- Rename _program.py to _program.pyx - Convert Program to cdef class with _program.pxd declarations - Extract _MembersNeededForFinalize to module-level _ProgramMNFF (nested classes not allowed in cdef class) - Add __repr__ method to Program - Keep ProgramOptions as @DataClass (unchanged) - Keep weakref.finalize pattern for handle cleanup
- Move _translate_program_options to Program_translate_options (cdef) - Move _can_load_generated_ptx to Program_can_load_generated_ptx (cdef) - Remove unused TYPE_CHECKING import block - Follow _memory/_buffer.pyx helper function patterns
- Reorganize file structure per developer guide (principal class first) - Add module docstring, __all__, type alias section - Factor long methods into cdef inline helpers - Add proper exception specs to cdef functions - Fix docstrings (use :class: refs, public paths) - Add type annotations to public methods - Inline _nvvm_exception_manager (single use) - Remove Union import, use | syntax - Add public Program.driver_can_load_nvrtc_ptx_output() API - Update tests to use new public API Closes NVIDIA#1082
Add fixtures for different Program backends (NVRTC, PTX, NVVM) and ObjectCode code types (cubin, PTX, LTOIR). Split API_TYPES into more precise HASH_TYPES, EQ_TYPES, and WEAKREF_TYPES lists. Derive DICT_KEY_TYPES and WEAK_KEY_TYPES for collection tests.
- Add NvrtcProgramHandle and NvvmProgramHandle to resource handles module - Add function pointer initialization for nvrtcDestroyProgram and nvvmDestroyProgram - Forward-declare nvvmProgram to avoid nvvm.h dependency - Refactor detail::make_py to accept module name parameter - Remove _ProgramMNFF class from _program.pyx - Program now uses typed handles directly with RAII cleanup - Update handle property to return None when handle is null
- Add NVVMError exception class - Add HANDLE_RETURN_NVRTC for nogil NVRTC error handling with program log - Add HANDLE_RETURN_NVVM for nogil NVVM error handling with program log - Remove vestigial supported_error_type fused type - Simplify HANDLE_RETURN to directly take cydriver.CUresult
85dbbb5 to
44fd98e
Compare
|
/ok to test 44fd98e |
| // Forward declaration for NVVM - avoids nvvm.h dependency | ||
| // Use void* to match cuda.bindings.cynvvm's typedef for compatibility | ||
| #ifndef CYTHON_EXTERN_C | ||
| typedef void *nvvmProgram; | ||
| #endif |
There was a problem hiding this comment.
Since nvvm is an optional dependency, I opted to define nvvmProgram this way rather than include the header.
44fd98e to
0f06313
Compare
|
/ok to test 0f06313 |
0f06313 to
5220d47
Compare
|
/ok to test 5220d47 |
5220d47 to
dbf2602
Compare
|
/ok to test dbf2602 |
- Change cdef function return types from ObjectCode to object (Cython limitation) - Remove unused imports: intptr_t, NvrtcProgramHandle, NvvmProgramHandle, as_intptr - Update as_py(NvvmProgramHandle) to return Python int via PyLong_FromSsize_t - Update test assertions: remove handle checks after close(), test idempotency instead - Update NVVM error message regex to match new unified format
dbf2602 to
754e295
Compare
|
/ok to test 754e295 |
| except KeyError: | ||
| name = name.encode() | ||
| if isinstance(name, str): | ||
| name = name.encode() |
There was a problem hiding this comment.
Slipped in this small fix. ObjectCode.get_kernel failed when given bytes. New test in test_module.py 107-8
Summary
Converts
_program.pyto Cython (_program.pyx) with full cythonization of NVRTC and NVVM paths, RAII-based resource management, andnogilerror handling.Changes
Cythonization
_program.py→_program.pyxwithcdef class Program_program.pxdwith typed attribute declarationsProgram_initcythonized with directcynvrtc/cynvvmcalls innogilblocksProgram_compile_nvrtcandProgram_compile_nvvmusenogilfor C API callscdef inlinehelpersResource Handles
NvrtcProgramHandleandNvvmProgramHandle(std::shared_ptr-based RAII)create_nvrtc_program_handle()andcreate_nvvm_program_handle()_MembersNeededForFinalize/ weakref.finalize patternas_py(NvvmProgramHandle)returns Pythonint(matches cuda_bindings design)Error Handling
HANDLE_RETURN_NVRTCfornogilNVRTC error handling with program logHANDLE_RETURN_NVVMfornogilNVVM error handling with program logNVVMErrorexception classHANDLE_RETURNto directly takecydriver.CUresultValidation & API
SUPPORTED_TARGETSmap for unified backend/target validationProgram.driver_can_load_nvrtc_ptx_output()APICode Quality
__all__, type alias section added:class:refs and public pathsUnionimport removed in favor of|syntaxstd::vector<const char*>instead ofmalloc/freeTests
test_object_protocols.pywithProgramandObjectCodevariationstest_program.py: testclose()idempotency instead of handle stateTest plan
test_program.pytests passtest_module.pytests passtest_object_protocols.pypasses with new variationspre-commitpassesCloses #1082