LMRuntime.com / Package family
Choose the UAIX.LmRuntime package required for each task, inspect dependency boundaries, and open version-free installation guides.
UAIX.LmRuntime is a public family of 17 focused NuGet packages. Start with LocalEndpoint for application integration, or reference the package that owns the exact runtime layer you are implementing.
| Required For | Package | Owned layer | Public package |
|---|---|---|---|
| Application integration | | | |
| application integration | LocalEndpoint | High-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries. | NuGet ↗ |
| Runtime execution pipeline | | | |
| LLaMA graph/session internals | Models.Llama | LLaMA-family configuration, tensor binding, mapped weight sources, reference forward execution, sessions, KV cache, generation, persistence, and parity evidence. | NuGet ↗ |
| tokenizer and chat-template work | Tokenization | GGUF tokenizer metadata, tokenizer factories and engines, special-token handling, chat templates, truncation, safety, and parity tools. | NuGet ↗ |
| GGUF inspection and validation | GGUF | Bounded GGUF parsing, metadata and tensor catalogs, strict validation, sharding, hashing, and mapped tensor access. | NuGet ↗ |
| token selection and stop handling | Sampling | Greedy and probability sampling, logit transforms, deterministic random state, top-k/top-p filtering, stop matching, and generation control. | NuGet ↗ |
| managed CPU math | CPU Kernels | Reference, portable-vector, AVX2-aware, half-precision, and quantized CPU kernels with explicit dispatch and parity checks. | NuGet ↗ |
| tensor layout and storage metadata | Tensors | Tensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math. | NuGet ↗ |
| runtime-neutral contracts | Abstractions | Stable runtime contracts, request and response models, diagnostics, and governance interfaces. | NuGet ↗ |
| Backend selection and compatibility | | | |
| explicit backend registration, probing, selection, and fallback evidence | Acceleration | Portable backend contracts, explicit registration, device probing, deterministic selection, fallback reporting, and diagnostics. | NuGet ↗ |
| the explicit managed CPU backend and CPU fallback | Backends.CpuManaged | Managed CPU backend registration, always-local availability probing, capability evidence, and explicit CPU fallback. | NuGet ↗ |
| CUDA backend registration and compatibility diagnostics | Backends.Cuda | CUDA capability reporting, NVIDIA GPU compatibility diagnostics, explicit registration, and fail-closed probing. | NuGet ↗ |
| DirectML backend registration and compatibility diagnostics | Backends.DirectML | DirectML capability reporting, Windows GPU compatibility diagnostics, explicit registration, and fail-closed probing. | NuGet ↗ |
| Vulkan backend registration and compatibility diagnostics | Backends.Vulkan | Vulkan capability reporting, vendor-diverse GPU compatibility diagnostics, explicit registration, and fail-closed probing. | NuGet ↗ |
| ROCm backend registration and compatibility diagnostics | Backends.Rocm | ROCm capability reporting, AMD GPU compatibility diagnostics, explicit registration, and fail-closed probing. | NuGet ↗ |
| Metal backend registration and compatibility diagnostics | Backends.Metal | Metal capability reporting, Apple GPU compatibility diagnostics, explicit registration, and fail-closed probing. | NuGet ↗ |
| RID-specific native asset packages | | | |
| the modern Windows x64 CUDA native-asset package slot | Backends.Cuda.Native.win-x64 | RID-specific Windows x64 CUDA native-asset package slot for explicit modern CUDA deployment assets. | NuGet ↗ |
| the Tesla K80 Windows x64 legacy CUDA asset slot | Backends.Cuda.LegacyK80.win-x64 | Separate Windows x64 legacy CUDA asset slot for Tesla K80 compute capability 3.7 compatibility. | NuGet ↗ |
Package map (diagram nodes): Abstractions, Tensors, Gguf, Sampling, Kernels.Cpu, Tokenization, Models.Llama, Acceleration, CpuManaged, CUDA, DirectML, Vulkan, ROCm, Metal, Native.win-x64, LegacyK80.win-x64, LocalEndpoint.
LocalEndpoint depends directly on Acceleration, CpuManaged, Models.Llama, and Tokenization.
The maps show ownership and reading order. Every package guide lists exact direct dependencies and links each dependency to its guide and NuGet page.
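Because LocalEndpoint depends directly on Acceleration, CpuManaged, Models.Llama, and Tokenization, an application project would normally reference the facade alone and let NuGet restore those packages transitively. The fragment below is a minimal sketch of that choice, next to an alternative for a hypothetical tool that only needs the GGUF layer. The two item groups belong to different project files, the "inspection tool" scenario is illustrative, and the version-less references assume that versions are supplied centrally.

```xml
<!-- Application project: reference the facade only. Under standard NuGet
     resolution its direct dependencies (Acceleration, CpuManaged,
     Models.Llama, Tokenization) are restored transitively. -->
<ItemGroup>
  <PackageReference Include="UAIX.LmRuntime.LocalEndpoint" />
</ItemGroup>

<!-- Alternative, in a separate project: a hypothetical GGUF inspection tool
     references only the package that owns that layer. -->
<ItemGroup>
  <PackageReference Include="UAIX.LmRuntime.Gguf" />
</ItemGroup>
```

Referencing the owning package directly keeps the dependency surface of a lower-level tool small. The exact dependencies of each package are listed in its guide, not inferred from this sketch.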
Required For application integration
UAIX.LmRuntime.LocalEndpoint
High-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries.
Required For LLaMA graph/session internals
UAIX.LmRuntime.Models.Llama
LLaMA-family configuration, tensor binding, mapped weight sources, reference forward execution, sessions, KV cache, generation, persistence, and parity evidence.
Required For tokenizer and chat-template work
UAIX.LmRuntime.Tokenization
GGUF tokenizer metadata, tokenizer factories and engines, special-token handling, chat templates, truncation, safety, and parity tools.
Required For GGUF inspection and validation
UAIX.LmRuntime.Gguf
Bounded GGUF parsing, metadata and tensor catalogs, strict validation, sharding, hashing, and mapped tensor access.
Required For token selection and stop handling
UAIX.LmRuntime.Sampling
Greedy and probability sampling, logit transforms, deterministic random state, top-k/top-p filtering, stop matching, and generation control.
Required For managed CPU math
UAIX.LmRuntime.Kernels.Cpu
Reference, portable-vector, AVX2-aware, half-precision, and quantized CPU kernels with explicit dispatch and parity checks.
Required For tensor layout and storage metadata
UAIX.LmRuntime.Tensors
Tensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math.
Required For runtime-neutral contracts
UAIX.LmRuntime.Abstractions
Stable runtime contracts, request and response models, diagnostics, and governance interfaces.
Required For explicit backend registration, probing, selection, and fallback evidence
UAIX.LmRuntime.Acceleration
Portable backend contracts, explicit registration, device probing, deterministic selection, fallback reporting, and diagnostics.
Required For the explicit managed CPU backend and CPU fallback
UAIX.LmRuntime.Backends.CpuManaged
Managed CPU backend registration, always-local availability probing, capability evidence, and explicit CPU fallback.
Required For CUDA backend registration and compatibility diagnostics
UAIX.LmRuntime.Backends.Cuda
CUDA capability reporting, NVIDIA GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For DirectML backend registration and compatibility diagnostics
UAIX.LmRuntime.Backends.DirectML
DirectML capability reporting, Windows GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For Vulkan backend registration and compatibility diagnostics
UAIX.LmRuntime.Backends.Vulkan
Vulkan capability reporting, vendor-diverse GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For ROCm backend registration and compatibility diagnostics
UAIX.LmRuntime.Backends.Rocm
ROCm capability reporting, AMD GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For Metal backend registration and compatibility diagnostics
UAIX.LmRuntime.Backends.Metal
Metal capability reporting, Apple GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For the modern Windows x64 CUDA native-asset package slot
UAIX.LmRuntime.Backends.Cuda.Native.win-x64
RID-specific Windows x64 CUDA native-asset package slot for explicit modern CUDA deployment assets.
Required For the Tesla K80 Windows x64 legacy CUDA asset slot
UAIX.LmRuntime.Backends.Cuda.LegacyK80.win-x64
Separate Windows x64 legacy CUDA asset slot for Tesla K80 compute capability 3.7 compatibility.
Public installation examples omit UAIX.LmRuntime version numbers so the site does not become a stale source of package pins. Production projects should select versions centrally, commit lock or resolved dependency files, and test upgrades in CI.
dotnet add package UAIX.LmRuntime.LocalEndpoint
<PackageReference Include="UAIX.LmRuntime.LocalEndpoint" />
Version policy: The documentation deliberately omits UAIX.LmRuntime package version numbers. Resolve and pin versions through your normal dependency-management and lock-file process.
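One way to select versions centrally while keeping project files version-free is NuGet Central Package Management. The fragment below is a sketch under that assumption, not a prescribed setup: the file name `Directory.Packages.props` and its elements are standard NuGet features, while the version value is a deliberate placeholder that must be replaced with a release you have verified.

```xml
<!-- Directory.Packages.props at the repository root
     (NuGet Central Package Management). -->
<Project>
  <PropertyGroup>
    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
  </PropertyGroup>
  <ItemGroup>
    <!-- Placeholder, not a real version: replace with the release you have
         verified for your project. -->
    <PackageVersion Include="UAIX.LmRuntime.LocalEndpoint" Version="REPLACE_WITH_VERIFIED_VERSION" />
  </ItemGroup>
</Project>
```

With this file in place, the version-less `<PackageReference Include="UAIX.LmRuntime.LocalEndpoint" />` shown above takes its version from the central entry. For the lock-file step, `dotnet restore --use-lock-file` writes a `packages.lock.json` that can be committed, and `dotnet restore --locked-mode` in CI fails the restore if the resolved dependencies no longer match it.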