Required For application integration
LocalEndpoint
High-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries.
LMRuntime.com / Guides
Task-oriented guides for setup, runtime architecture, evidence, governance, security, and project terminology.
Package-first documentation for 17 UAIX.LmRuntime NuGet packages. Each package guide states purpose, exact dependencies, current execution status, install commands, workflow, examples, public links, FAQs, and API or asset boundaries.
Required For application integration
LocalEndpointHigh-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries.
Required For LLaMA graph/session internals
Models.LlamaLLaMA-family configuration, tensor binding, mapped weight sources, reference forward execution, sessions, KV cache, generation, persistence, and parity evidence.
Required For tokenizer and chat-template work
TokenizationGGUF tokenizer metadata, tokenizer factories and engines, special-token handling, chat templates, truncation, safety, and parity tools.
Required For GGUF inspection and validation
GGUFBounded GGUF parsing, metadata and tensor catalogs, strict validation, sharding, hashing, and mapped tensor access.
Required For token selection and stop handling
SamplingGreedy and probability sampling, logit transforms, deterministic random state, top-k/top-p filtering, stop matching, and generation control.
Required For managed CPU math
CPU KernelsReference, portable-vector, AVX2-aware, half-precision, and quantized CPU kernels with explicit dispatch and parity checks.
Required For tensor layout and storage metadata
TensorsTensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math.
Required For runtime-neutral contracts
AbstractionsStable runtime contracts, request and response models, diagnostics, and governance interfaces.
Required For explicit backend registration, probing, selection, and fallback evidence
AccelerationPortable backend contracts, explicit registration, device probing, deterministic selection, fallback reporting, and diagnostics.
Required For the explicit managed CPU backend and CPU fallback
Backends.CpuManagedManaged CPU backend registration, always-local availability probing, capability evidence, and explicit CPU fallback.
Required For CUDA backend registration and compatibility diagnostics
Backends.CudaCUDA capability reporting, NVIDIA GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For DirectML backend registration and compatibility diagnostics
Backends.DirectMLDirectML capability reporting, Windows GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For Vulkan backend registration and compatibility diagnostics
Backends.VulkanVulkan capability reporting, vendor-diverse GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For ROCm backend registration and compatibility diagnostics
Backends.RocmROCm capability reporting, AMD GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For Metal backend registration and compatibility diagnostics
Backends.MetalMetal capability reporting, Apple GPU compatibility diagnostics, explicit registration, and fail-closed probing.
Required For the modern Windows x64 CUDA native-asset package slot
Backends.Cuda.Native.win-x64RID-specific Windows x64 CUDA native-asset package slot for explicit modern CUDA deployment assets.
Required For the Tesla K80 Windows x64 legacy CUDA asset slot
Backends.Cuda.LegacyK80.win-x64Separate Windows x64 legacy CUDA asset slot for Tesla K80 compute capability 3.7 compatibility.