LMRuntime.com / Guides

Documentation

Task-oriented guides for setup, runtime architecture, evidence, governance, security, and project terminology.

Package-first documentation for 17 UAIX.LmRuntime NuGet packages. Each package guide states purpose, exact dependencies, current execution status, install commands, workflow, examples, public links, FAQs, and API or asset boundaries.

Choose a packagePackage FamilyChoose the package that owns a task, inspect exact direct dependencies, and open its NuGet page.Run your first modelGetting StartedInstall LocalEndpoint, verify a local GGUF artifact, create an isolated session, and generate locally.Find a type or memberAPI ReferenceSearch public types and documented members across all managed packages.Understand package layersArchitectureTrace the model pipeline, backend-selection branch, direct dependencies, execution status, and host boundary.Check evidence boundariesRuntime Capability MatrixSeparate package API presence from model, platform, backend, performance, and deployment evidence.Check current statusProject StatusReview package publication, managed CPU execution, diagnostic accelerator boundaries, and claims not made.Evaluate claimsEvidence & BenchmarksUnderstand fixture evidence, parity records, reproducible benchmark requirements, and unsupported inferences.Harden an integrationSecurityReview local-file trust, hostile model input, path verification, resource limits, and disclosure routing.Decode terminologyGlossaryRead concise definitions for GGUF, tokenization, tensors, KV cache, sampling, backend probing, and UAIX context evidence.

Package guides

Application integration

Required For application integration

`LocalEndpoint`

High-level local-only GGUF facade with verified files, bounded loading, isolated sessions, managed CPU execution, and explicit backend capability boundaries.

Guide →NuGet ↗

Runtime execution pipeline

Required For LLaMA graph/session internals

`Models.Llama`

LLaMA-family configuration, tensor binding, mapped weight sources, reference forward execution, sessions, KV cache, generation, persistence, and parity evidence.

Guide →NuGet ↗

Required For tokenizer and chat-template work

`Tokenization`

GGUF tokenizer metadata, tokenizer factories and engines, special-token handling, chat templates, truncation, safety, and parity tools.

Guide →NuGet ↗

Required For GGUF inspection and validation

`GGUF`

Bounded GGUF parsing, metadata and tensor catalogs, strict validation, sharding, hashing, and mapped tensor access.

Guide →NuGet ↗

Required For token selection and stop handling

`Sampling`

Greedy and probability sampling, logit transforms, deterministic random state, top-k/top-p filtering, stop matching, and generation control.

Guide →NuGet ↗

Required For managed CPU math

`CPU Kernels`

Reference, portable-vector, AVX2-aware, half-precision, and quantized CPU kernels with explicit dispatch and parity checks.

Guide →NuGet ↗

Required For tensor layout and storage metadata

`Tensors`

Tensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math.

Guide →NuGet ↗

Required For runtime-neutral contracts

`Abstractions`

Stable runtime contracts, request and response models, diagnostics, and governance interfaces.

Guide →NuGet ↗

Backend selection and compatibility

Required For explicit backend registration, probing, selection, and fallback evidence

`Acceleration`

Portable backend contracts, explicit registration, device probing, deterministic selection, fallback reporting, and diagnostics.

Guide →NuGet ↗

Required For the explicit managed CPU backend and CPU fallback

`Backends.CpuManaged`

Managed CPU backend registration, always-local availability probing, capability evidence, and explicit CPU fallback.

Guide →NuGet ↗

Required For CUDA backend registration and compatibility diagnostics

`Backends.Cuda`

CUDA capability reporting, NVIDIA GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Guide →NuGet ↗

Required For DirectML backend registration and compatibility diagnostics

`Backends.DirectML`

DirectML capability reporting, Windows GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Guide →NuGet ↗

Required For Vulkan backend registration and compatibility diagnostics

`Backends.Vulkan`

Vulkan capability reporting, vendor-diverse GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Guide →NuGet ↗

Required For ROCm backend registration and compatibility diagnostics

`Backends.Rocm`

ROCm capability reporting, AMD GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Guide →NuGet ↗

Required For Metal backend registration and compatibility diagnostics

`Backends.Metal`

Metal capability reporting, Apple GPU compatibility diagnostics, explicit registration, and fail-closed probing.

Guide →NuGet ↗

RID-specific native asset packages

Required For the modern Windows x64 CUDA native-asset package slot

`Backends.Cuda.Native.win-x64`

RID-specific Windows x64 CUDA native-asset package slot for explicit modern CUDA deployment assets.

Guide →NuGet ↗

Required For the Tesla K80 Windows x64 legacy CUDA asset slot

`Backends.Cuda.LegacyK80.win-x64`

Separate Windows x64 legacy CUDA asset slot for Tesla K80 compute capability 3.7 compatibility.

Guide →NuGet ↗

Public NuGet indexSearch LMRuntime packages on NuGet ↗ · Publisher profile ↗