Open source C/C++ engine for high-performance local LLM inference and on-device AI.
Updated Mar 14, 2026 - C++
GPT-OSS 20B Local Execution. Lightweight local environment for running GPT-OSS 20B with Python 3.12 and CUDA acceleration (see the sketch below):
- Run GPT-OSS 20B entirely offline
- Optimize text generation with the GPU
- Enable fast, secure inference on consumer hardware
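The listing does not show the repository's actual tooling, so the following is only a minimal sketch of offline, GPU-accelerated inference, assuming the Hugging Face `transformers` + `torch` stack and a hypothetical local weights directory; the project's real entry point, model path, and quantization setup may differ.

```python
# Minimal sketch: offline GPT-OSS 20B text generation on a CUDA GPU.
# Assumes transformers + torch are installed and the model weights were
# downloaded beforehand to MODEL_DIR (hypothetical path).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "./gpt-oss-20b"  # hypothetical local directory with pre-downloaded weights

# Load tokenizer and model strictly from local files, so no network access is needed.
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_DIR,
    torch_dtype=torch.bfloat16,  # lower-precision weights to fit consumer GPUs
    device_map="auto",           # place layers on the available CUDA device(s)
    local_files_only=True,
)

# Generate text entirely on-device.
prompt = "Explain local LLM inference in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```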