LMRuntime.com / Package family

UAIX.LmRuntime Packages

Choose the UAIX.LmRuntime package required for each task, inspect dependency boundaries, and open version-free installation guides.

UAIX.LmRuntime is a public family of 17 focused NuGet packages. Start with LocalEndpoint for application integration, or reference the package that owns the exact runtime layer you are implementing.

Choose The Package Required For Your Task

Required ForPackageOwned layerPublic package
Application integration
application integration LocalEndpoint High-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries. NuGet ↗
Runtime execution pipeline
LLaMA graph/session internals Models.Llama LLaMA-family configuration, tensor binding, mapped weight sources, reference forward execution, sessions, KV cache, generation, persistence, and parity evidence. NuGet ↗
tokenizer and chat-template work Tokenization GGUF tokenizer metadata, tokenizer factories and engines, special-token handling, chat templates, truncation, safety, and parity tools. NuGet ↗
GGUF inspection and validation GGUF Bounded GGUF parsing, metadata and tensor catalogs, strict validation, sharding, hashing, and mapped tensor access. NuGet ↗
token selection and stop handling Sampling Greedy and probability sampling, logit transforms, deterministic random state, top-k/top-p filtering, stop matching, and generation control. NuGet ↗
managed CPU math CPU Kernels Reference, portable-vector, AVX2-aware, half-precision, and quantized CPU kernels with explicit dispatch and parity checks. NuGet ↗
tensor layout and storage metadata Tensors Tensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math. NuGet ↗
runtime-neutral contracts Abstractions Stable runtime contracts, request and response models, diagnostics, and governance interfaces. NuGet ↗
Backend selection and compatibility
explicit backend registration, probing, selection, and fallback evidence Acceleration Portable backend contracts, explicit registration, device probing, deterministic selection, fallback reporting, and diagnostics. NuGet ↗
the explicit managed CPU backend and CPU fallback Backends.CpuManaged Managed CPU backend registration, always-local availability probing, capability evidence, and explicit CPU fallback. NuGet ↗
CUDA backend registration and compatibility diagnostics Backends.Cuda CUDA capability reporting, NVIDIA GPU compatibility diagnostics, explicit registration, and fail-closed probing. NuGet ↗
DirectML backend registration and compatibility diagnostics Backends.DirectML DirectML capability reporting, Windows GPU compatibility diagnostics, explicit registration, and fail-closed probing. NuGet ↗
Vulkan backend registration and compatibility diagnostics Backends.Vulkan Vulkan capability reporting, vendor-diverse GPU compatibility diagnostics, explicit registration, and fail-closed probing. NuGet ↗
ROCm backend registration and compatibility diagnostics Backends.Rocm ROCm capability reporting, AMD GPU compatibility diagnostics, explicit registration, and fail-closed probing. NuGet ↗
Metal backend registration and compatibility diagnostics Backends.Metal Metal capability reporting, Apple GPU compatibility diagnostics, explicit registration, and fail-closed probing. NuGet ↗
RID-specific native asset packages
the modern Windows x64 CUDA native-asset package slot Backends.Cuda.Native.win-x64 RID-specific Windows x64 CUDA native-asset package slot for explicit modern CUDA deployment assets. NuGet ↗
the Tesla K80 Windows x64 legacy CUDA asset slot Backends.Cuda.LegacyK80.win-x64 Separate Windows x64 legacy CUDA asset slot for Tesla K80 compute capability 3.7 compatibility. NuGet ↗

Backend execution status

Fail-closed package boundary Backends.CpuManaged probes available for the managed CPU lane. CUDA, DirectML, Vulkan, ROCm, and Metal packages register explicit capability metadata and diagnostics, but their supplied implementations probe unavailable until a host-proven native adapter supplies assets, runtime, driver, and device evidence. The Windows CUDA native-asset packages contain marker files rather than inference binaries.

Dependency map

Model execution pipeline

Foundations
AbstractionsTensors
Container, selection, compute
GgufSamplingKernels.Cpu
Text boundary
Tokenization
LLaMA execution
Models.Llama

Backend selection and deployment

Contracts and selector
Acceleration
Registered backends
CpuManagedCUDADirectMLVulkanROCmMetal
Windows CUDA RID slots
Native.win-x64LegacyK80.win-x64
Application facade
LocalEndpointdepends directly on Acceleration, CpuManaged, Models.Llama, and Tokenization

The maps show ownership and reading order. Every package guide lists exact direct dependencies and links each dependency to its guide and NuGet page.

Package catalog

Application integration

Required For application integration

UAIX.LmRuntime.LocalEndpoint

LocalEndpoint

High-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries.

Runtime execution pipeline

Required For LLaMA graph/session internals

UAIX.LmRuntime.Models.Llama

Models.Llama

LLaMA-family configuration, tensor binding, mapped weight sources, reference forward execution, sessions, KV cache, generation, persistence, and parity evidence.

Required For tokenizer and chat-template work

UAIX.LmRuntime.Tokenization

Tokenization

GGUF tokenizer metadata, tokenizer factories and engines, special-token handling, chat templates, truncation, safety, and parity tools.

Required For GGUF inspection and validation

UAIX.LmRuntime.Gguf

GGUF

Bounded GGUF parsing, metadata and tensor catalogs, strict validation, sharding, hashing, and mapped tensor access.

Required For token selection and stop handling

UAIX.LmRuntime.Sampling

Sampling

Greedy and probability sampling, logit transforms, deterministic random state, top-k/top-p filtering, stop matching, and generation control.

Required For managed CPU math

UAIX.LmRuntime.Kernels.Cpu

CPU Kernels

Reference, portable-vector, AVX2-aware, half-precision, and quantized CPU kernels with explicit dispatch and parity checks.

Required For tensor layout and storage metadata

UAIX.LmRuntime.Tensors

Tensors

Tensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math.

Backend selection and compatibility

Required For explicit backend registration, probing, selection, and fallback evidence

UAIX.LmRuntime.Acceleration

Acceleration

Portable backend contracts, explicit registration, device probing, deterministic selection, fallback reporting, and diagnostics.

Required For the explicit managed CPU backend and CPU fallback

UAIX.LmRuntime.Backends.CpuManaged

Backends.CpuManaged

Managed CPU backend registration, always-local availability probing, capability evidence, and explicit CPU fallback.

Required For CUDA backend registration and compatibility diagnostics

UAIX.LmRuntime.Backends.Cuda

Backends.Cuda

CUDA capability reporting, NVIDIA GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For DirectML backend registration and compatibility diagnostics

UAIX.LmRuntime.Backends.DirectML

Backends.DirectML

DirectML capability reporting, Windows GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For Vulkan backend registration and compatibility diagnostics

UAIX.LmRuntime.Backends.Vulkan

Backends.Vulkan

Vulkan capability reporting, vendor-diverse GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For ROCm backend registration and compatibility diagnostics

UAIX.LmRuntime.Backends.Rocm

Backends.Rocm

ROCm capability reporting, AMD GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For Metal backend registration and compatibility diagnostics

UAIX.LmRuntime.Backends.Metal

Backends.Metal

Metal capability reporting, Apple GPU compatibility diagnostics, explicit registration, and fail-closed probing.

RID-specific native asset packages

Version-free installation policy

Public installation examples omit UAIX.LmRuntime version numbers so the site does not become a stale source of package pins. Production projects should select versions centrally, commit lock or resolved dependency files, and test upgrades in CI.

.NET CLI
dotnet add package UAIX.LmRuntime.LocalEndpoint
Project file
<PackageReference Include="UAIX.LmRuntime.LocalEndpoint" />

Version policy: The documentation deliberately omits UAIX.LmRuntime package version numbers. Resolve and pin versions through your normal dependency-management and lock-file process.

What the package family does not do

No hidden cloud pathLocalEndpoint is local and in-process; it is not a provider API client or network fallback.
No model downloaderApplications acquire, license, locate, trust, and catalog model artifacts.
No implied GPU executionA backend package, capability declaration, or RID marker is not proof that native inference executed.
No server ownershipHTTP, worker transport, registries, audit, UI, and persistence stay with the host application.