[FEA]: cuda.core JIT cache #1785

@rparolin

Description

Is this a duplicate?

Area

cuda.core

Is your feature request related to a problem? Please describe.

cuda.core currently pays repeated JIT compilation costs for kernels, which makes cold-start latency much higher than it needs to be and hurts iterative workflows. CuPy has shown that a persistent JIT cache can reduce this overhead by caching compiled kernels for reuse across runs.

Describe the solution you'd like

Add a persistent JIT cache to cuda.core for compiled artifacts and for any expensive intermediate or header-processing steps involved in runtime compilation. The cache should key entries on every input that affects correctness, such as source code, compilation options, target architecture, toolkit/runtime version, and relevant dependency/header content, so that cached artifacts are reused only when they are valid. The cache should support logging/telemetry so users can tell whether they are hitting or missing it, and should allow users to configure the cache location and size.
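As a rough sketch of the keying and telemetry described above (all names here are hypothetical, not the cuda.core API): hash every correctness-affecting input into a content key, store artifacts on disk under that key, and count hits/misses.

```python
import hashlib
import os

# Hypothetical sketch of a persistent JIT cache; class and function names
# are illustrative assumptions, not part of cuda.core.

def cache_key(source: str, options: tuple, arch: str,
              toolkit_version: str, header_contents: dict) -> str:
    """Derive a key from every input that affects the compiled artifact."""
    h = hashlib.sha256()
    h.update(source.encode())
    for opt in options:
        h.update(opt.encode())
    h.update(arch.encode())
    h.update(toolkit_version.encode())
    # Include header contents (sorted for determinism) so a changed
    # dependency invalidates the cached artifact.
    for name in sorted(header_contents):
        h.update(name.encode())
        h.update(header_contents[name].encode())
    return h.hexdigest()

class JitCache:
    def __init__(self, root: str):
        # User-configurable cache location.
        self.root = root
        os.makedirs(root, exist_ok=True)
        # Simple telemetry counters so users can observe hit/miss rates.
        self.hits = 0
        self.misses = 0

    def get(self, key: str):
        path = os.path.join(self.root, key + ".cubin")
        if os.path.exists(path):
            self.hits += 1
            with open(path, "rb") as f:
                return f.read()
        self.misses += 1
        return None

    def put(self, key: str, artifact: bytes):
        path = os.path.join(self.root, key + ".cubin")
        with open(path, "wb") as f:
            f.write(artifact)
```

A real implementation would also need an eviction policy to honor the configured size limit and file locking for concurrent processes; those are omitted here.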

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Labels

P0 (High priority - Must do!), cuda.core (Everything related to the cuda.core module)
