LMRuntime.com / Guides

Documentation

Task-oriented guides for setup, runtime architecture, evidence, governance, security, and project terminology.

Package-first documentation for 17 UAIX.LmRuntime NuGet packages. Each package guide states purpose, exact dependencies, current execution status, install commands, workflow, examples, public links, FAQs, and API or asset boundaries.

Package guides

Application integration

Required For application integration

LocalEndpoint

High-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries.

Runtime execution pipeline

Required For LLaMA graph/session internals

Models.Llama

LLaMA-family configuration, tensor binding, mapped weight sources, reference forward execution, sessions, KV cache, generation, persistence, and parity evidence.

Required For tokenizer and chat-template work

Tokenization

GGUF tokenizer metadata, tokenizer factories and engines, special-token handling, chat templates, truncation, safety, and parity tools.

Required For GGUF inspection and validation

GGUF

Bounded GGUF parsing, metadata and tensor catalogs, strict validation, sharding, hashing, and mapped tensor access.

Required For token selection and stop handling

Sampling

Greedy and probability sampling, logit transforms, deterministic random state, top-k/top-p filtering, stop matching, and generation control.

Required For managed CPU math

CPU Kernels

Reference, portable-vector, AVX2-aware, half-precision, and quantized CPU kernels with explicit dispatch and parity checks.

Required For tensor layout and storage metadata

Tensors

Tensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math.

Required For runtime-neutral contracts

Abstractions

Stable runtime contracts, request and response models, diagnostics, and governance interfaces.

Backend selection and compatibility

Required For explicit backend registration, probing, selection, and fallback evidence

Acceleration

Portable backend contracts, explicit registration, device probing, deterministic selection, fallback reporting, and diagnostics.

Required For the explicit managed CPU backend and CPU fallback

Backends.CpuManaged

Managed CPU backend registration, always-local availability probing, capability evidence, and explicit CPU fallback.

Required For CUDA backend registration and compatibility diagnostics

Backends.Cuda

CUDA capability reporting, NVIDIA GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For DirectML backend registration and compatibility diagnostics

Backends.DirectML

DirectML capability reporting, Windows GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For Vulkan backend registration and compatibility diagnostics

Backends.Vulkan

Vulkan capability reporting, vendor-diverse GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For ROCm backend registration and compatibility diagnostics

Backends.Rocm

ROCm capability reporting, AMD GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Required For Metal backend registration and compatibility diagnostics

Backends.Metal

Metal capability reporting, Apple GPU compatibility diagnostics, explicit registration, and fail-closed probing.

RID-specific native asset packages