Runtime Capability Matrix
See which capabilities have published package APIs, which artifact evidence is available, and which model, platform, integration, and host responsibilities remain separately scoped.
This matrix separates a published package API from model compatibility, platform evidence, performance, host-owned services, and production support. Read every row through that boundary.
How to read this matrix
A published API can be real and useful without supporting every model or environment. The matrix shows the strongest artifact-level signal, the public boundary, and the next consumer evidence gate instead of collapsing everything into a yes/no compatibility badge.
The capability has a public type, contract, or implementation surface in the UAIX.LmRuntime NuGet package family.
The published package family exposes inspectable metadata, README guidance, and XML documentation for the represented API surface.
A consuming application must retain build, test, cancellation, disposal, and negative-case evidence for its resolved package graph and environment.
A named model identity, tokenizer, tensor storage, workload, operating system, processor, and memory envelope are required before a compatibility or performance claim.
The behavior belongs to the consuming application or is intentionally outside this package family; do not infer it from the runtime API.
Package-family target
Package publication does not create universal model, platform, or production-support guarantees.
The documented facade remains local and managed. Accelerator package registration does not imply native execution.
Container validity, tokenizer support, model architecture, tensor storage, and kernel coverage remain separate checks.
The family name is not a blanket claim for every derivative, metadata convention, quantization, or context size.
Lower-level packages remain available when an application needs direct ownership of parsing, tokens, tensors, kernels, sampling, or sessions.
Capability matrix
Package family and contracts
The runtime is published as layered packages so applications can take only the responsibilities they need.
| Capability | Current evidence signal | Public boundary | Next evidence gate |
|---|---|---|---|
| Seventeen-package NuGet family | Published package API: The managed model pipeline, Acceleration contracts, managed CPU backend, diagnostic GPU backend packages, Windows CUDA RID asset slots, and LocalEndpoint are represented by public package IDs and dedicated guides. | Package availability does not imply that every lower-level API is required by every application. | Choose the narrowest package and validate the resolved dependency graph in the consumer project. |
| Runtime-neutral inference contracts | Artifact documentation verified: Requests, responses, messages, streaming deltas, sessions, tokenizers, runtime settings, diagnostics, and policy interfaces are documented in Abstractions. | An interface does not prove a particular implementation, model, latency, or availability commitment. | Contract tests for each concrete adapter and cancellation or streaming behavior used by the host. |
| Generated member reference | Artifact documentation verified: LMRuntime.com publishes package/type/member reference data derived from public package XML documentation. | NuGet remains authoritative for current target frameworks, dependency versions, hashes, and release metadata. | Regenerate the reference whenever the published package API changes. |
Model identity, GGUF intake, and tensor storage
Artifact identity and bounded structure are established before tokenizer selection or model execution.
| Capability | Current evidence signal | Public boundary | Next evidence gate |
|---|---|---|---|
| Bounded GGUF parsing and strict validation | Artifact documentation verified: GGUF headers, metadata, arrays, tensor descriptors, alignment, sharding, classification, and configurable structural limits have documented APIs. | A structurally valid container can still use an unsupported tokenizer, architecture, storage type, or tensor arrangement. | Validate each accepted model artifact and preserve malformed or edge-case fixture coverage. |
| SHA-256, byte-count, and path verification | Artifact documentation verified: The GGUF and LocalEndpoint layers expose hashing, expected length, allowed-root, reparse-point, and model-size controls. | The host still establishes the trusted digest, source receipt, model license, and accepted storage location. | Consumer tests for path escape, reparse points, mismatched identity, truncation, replacement, and limit failures. |
| Mapped, segmented, and copied tensor access | Artifact documentation verified: Mapped-file ownership, tensor views, segmented readers, float32 readers, storage traits, and quantized block metadata are documented. | Views must respect owner lifetime, byte geometry, alignment, row shape, and exact storage-specific bounds. | Lifecycle, disposal, large-file, concurrency, and per-storage validation in declared environments. |
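The identity checks in this table (allowed root, link rejection, size limit, expected length, SHA-256) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the UAIX.LmRuntime API: `verify_artifact`, `ArtifactRejected`, and every parameter name are hypothetical, and symbolic links stand in here for Windows reparse points.

```python
import hashlib
import os


class ArtifactRejected(Exception):
    """Raised when a model file fails an identity or location check (hypothetical type)."""


def verify_artifact(path, allowed_root, expected_sha256, expected_length,
                    max_bytes=64 * 1024 ** 3):
    """Return the resolved path only if every check passes; otherwise raise."""
    if os.path.islink(path):
        raise ArtifactRejected("symbolic links are not accepted")

    # Resolve both sides so '..' segments or linked parent folders cannot escape the root.
    real_root = os.path.realpath(allowed_root)
    real_path = os.path.realpath(path)
    try:
        inside = os.path.commonpath([real_root, real_path]) == real_root
    except ValueError:  # for example, paths on different drives
        inside = False
    if not inside:
        raise ArtifactRejected("path escapes the allowed root")

    # Cheap checks first: size limit and expected byte count.
    size = os.path.getsize(real_path)
    if size > max_bytes:
        raise ArtifactRejected("file exceeds the configured size limit")
    if size != expected_length:
        raise ArtifactRejected("byte count does not match the expected length")

    # Stream the file in 1 MiB chunks so large models are never fully buffered.
    digest = hashlib.sha256()
    with open(real_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256.lower():
        raise ArtifactRejected("SHA-256 digest does not match")
    return real_path
```

The expected digest and length still have to come from a source the host trusts; the sketch only compares against them. A file replaced between verification and loading would not be caught here, which is why the table lists replacement as a separate consumer test.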
Tokenization and LLaMA-family model binding
Tokenizer behavior and semantic tensor roles are derived from the model artifact and rejected explicitly when unsupported.
| Capability | Current evidence signal | Public boundary | Next evidence gate |
|---|---|---|---|
| GGUF-driven tokenizer construction | Artifact documentation verified: Factories, strict metadata readers, multiple tokenizer engines, special-token options, chat rendering, streaming decode, truncation, and parity records are documented. | Vocabulary, merges, normalization, byte fallback, special tokens, and chat templates are model-specific. | Exact token-ID and decoded-byte parity for each accepted model and prompt format. |
| LLaMA configuration and tensor binding | Artifact documentation verified: Configuration derivation, invariant checks, required tensor roles, binding manifests, mapped/reference weight sources, and diagnostics are documented. | A LLaMA-family label does not guarantee compatibility with every derivative or metadata/tensor naming convention. | Named artifact binding, forward, logits, and generation evidence with explicit limitations. |
| Broad model compatibility | Model or platform scope required: The packages expose strict validation and diagnostics needed to determine support for a concrete artifact. | No universal list of all compatible repositories, quantizations, context variants, or fine-tunes is inferred. | Publish a model matrix only from retained immutable identities, test prompts, results, and known failures. |
Managed CPU execution and sampling
Reference operations, observable CPU dispatch, model sessions, and token selection remain distinct responsibilities.
| Capability | Current evidence signal | Public boundary | Next evidence gate |
|---|---|---|---|
| Reference and dispatched CPU kernels | Artifact documentation verified: Float, half-precision, quantized, normalization, matrix/vector, softmax, RoPE, dispatch-selection, and parity APIs are documented. | Represented storage and operations do not imply every shape, ISA tier, processor, or model path is supported end to end. | Operation-specific reference parity, invalid-input tests, selected-tier evidence, and model-level verification. |
| Greedy and probability sampling | Artifact documentation verified: Greedy selection, seeded random state, temperature, top-k, top-p, minimum-p, repetition/frequency/presence transforms, bias, stops, and generation control are documented. | A seed cannot compensate for differences in model bytes, tokenizer, logits, floating-point execution, session state, or settings. | Golden vector tests, state replay, stop-boundary tests, and application-specific generation policies. |
| Performance and hardware support | Model or platform scope required: CPU selection and benchmark-supporting diagnostics are represented. | No blanket latency, throughput, memory, energy, operating-system, processor, or driver claim follows from the API surface. | Reproducible benchmarks with exact artifact, package graph, build, machine, workload, warm-up, and aggregation details. |
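To make the sampling row concrete, here is a minimal sketch of greedy selection plus temperature, top-k, and top-p filtering with a seeded generator. It uses only the Python standard library and is an illustration of the general technique, not the package's sampler: `sample_token` and its parameters are hypothetical, and minimum-p, penalties, bias, and stop handling are left out.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Pick a token index from raw logits (illustrative, not the package API)."""
    if temperature <= 0.0:
        # Greedy selection: highest logit, lowest index on ties.
        return max(range(len(logits)), key=lambda i: (logits[i], -i))
    rng = rng or random.Random()

    # Temperature-scaled softmax, shifted by the maximum for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)

    # Rank by probability (descending), breaking ties by index for determinism.
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)),
                    key=lambda item: (-item[0], item[1]))

    # Top-k: keep only the k most probable candidates (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for prob, index in ranked:
        kept.append((prob, index))
        mass += prob
        if mass >= top_p:
            break

    # Draw from the kept candidates, weighted by their retained mass.
    threshold = rng.random() * mass
    running = 0.0
    for prob, index in kept:
        running += prob
        if threshold < running:
            return index
    return kept[-1][1]  # guard against floating-point rounding at the boundary
```

Two calls with `random.Random(7)` and identical logits return the same index, which is the property a golden-vector test would pin down. The same seed with different logits, for example from a different quantization of the same model, gives no such guarantee.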
Backend registration, probing, and execution evidence
Backend identity, package compatibility, local availability, device proof, fallback, and model execution are kept as distinct facts.
| Capability | Current evidence signal | Public boundary | Next evidence gate |
|---|---|---|---|
| Explicit backend registry and selector | Artifact documentation verified: Acceleration documents backend contracts, registration, local probes, device descriptors, required and preferred policies, failure reasons, and recorded CPU fallback. | Selecting a backend identity does not execute a model or prove a native runtime, driver, device, or kernel path. | Retain probe output, selected device identity, adapter implementation, and model-level execution evidence. |
| Managed CPU backend | Artifact documentation verified: Backends.CpuManaged registers a managed CPU backend and returns an available CPU device descriptor without native assets. | Availability of the backend descriptor does not establish performance or compatibility for every model and workload. | Validate the named artifact through the complete managed execution path in the consumer environment. |
| CUDA, DirectML, Vulkan, ROCm, and Metal package declarations | Model or platform scope required: The backend packages publish stable IDs, API-family capability metadata, runtime identifiers, registration helpers, and fail-closed diagnostics. | The supplied diagnostic implementations report the backend as unavailable when probed and perform no hidden native inference. | Supply and test a host-proven native adapter with real assets, runtime, driver, device, and model execution proof. |
| Windows CUDA RID asset packages | Model or platform scope required: Modern and Tesla K80 package IDs reserve runtimes/win-x64/native deployment paths and include marker assets. | A marker proves package layout only; the supplied packages do not embed CUDA inference DLLs. | Validate the complete required binary set, RID resolution, loadability, driver compatibility, device identity, and execution output. |
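The difference between a required and a preferred backend policy, and what a recorded CPU fallback might look like, can be sketched as follows. This is an assumption-laden illustration in Python rather than the Acceleration package's contracts: `ProbeResult`, `Selection`, `select_backend`, and the backend identifiers `"cuda"` and `"cpu-managed"` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class ProbeResult:
    backend_id: str
    available: bool
    reason: str = ""  # failure reason, empty when available


@dataclass(frozen=True)
class Selection:
    backend_id: Optional[str]        # None means the request failed closed
    fell_back: bool                  # True only when the CPU fallback was used
    probes: Tuple[ProbeResult, ...]  # every probe that ran, in order


def select_backend(registry: Dict[str, Callable[[], ProbeResult]],
                   wanted: str, required: bool,
                   fallback: str = "cpu-managed") -> Selection:
    def probe(backend_id: str) -> ProbeResult:
        if backend_id not in registry:
            return ProbeResult(backend_id, False, "not registered")
        return registry[backend_id]()

    first = probe(wanted)
    if first.available:
        return Selection(wanted, False, (first,))
    if required:
        # Required policy: never substitute another backend silently.
        return Selection(None, False, (first,))
    second = probe(fallback)
    if second.available:
        # Preferred policy: fall back, but keep the failed probe on record.
        return Selection(fallback, True, (first, second))
    return Selection(None, False, (first, second))


# Hypothetical registry: a diagnostic GPU entry that probes as unavailable,
# and a managed CPU entry that is always available.
REGISTRY = {
    "cuda": lambda: ProbeResult("cuda", False, "diagnostic implementation, no native adapter"),
    "cpu-managed": lambda: ProbeResult("cpu-managed", True),
}
```

With this registry, `select_backend(REGISTRY, "cuda", required=False)` selects `"cpu-managed"` with `fell_back=True`, while `required=True` returns no backend at all. In both cases the selection is only an identity decision; nothing in it shows that a model ran.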
Sessions and LocalEndpoint facade
The high-level package combines the supported local path while keeping host authority and application services outside the runtime.
| Capability | Current evidence signal | Public boundary | Next evidence gate |
|---|---|---|---|
| Verified local model loading and isolated sessions | Artifact documentation verified: Load options, execution limits, file expectations, model identity, session contexts, generation requests/results, token observation, and explicit disposal are documented. | The host chooses trusted artifacts, prepares prompts, maps errors, sets limits, and defines its supported usage envelope. | Consumer integration tests for independent sessions, reset/continuation, limits, cancellation, disposal, and failure mapping. |
| UAIX profile and memory context evidence | Artifact documentation verified: Session-scoped profile/load-session and memory-route metadata are validated and retained as immutable evidence. | All authority flags remain false. Context does not grant execution, policy override, commands, network, provider access, telemetry, export, or website intake. | Keep import, protected-anchor enforcement, prompt assembly, persistence, and approval in the host application. |
| Downloader, server, provider client, subprocess, and telemetry | Host-owned or not provided: The documented LocalEndpoint capability object reports the closed local-only boundary. | These services are not supplied or silently activated by the runtime facade. | Implement any required service explicitly in the host with independent security, consent, persistence, and operational controls. |
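One way to picture "immutable evidence with all authority flags false" is a frozen record that refuses to be constructed with any flag set. The sketch below is a Python illustration under that assumption, not the LocalEndpoint types: `AuthorityFlags`, `SessionEvidence`, and the field names are hypothetical, chosen to mirror the list in the table.

```python
from dataclasses import dataclass, field, fields


@dataclass(frozen=True)
class AuthorityFlags:
    execution: bool = False
    policy_override: bool = False
    commands: bool = False
    network: bool = False
    provider_access: bool = False
    telemetry: bool = False
    export: bool = False
    website_intake: bool = False

    def __post_init__(self):
        # Reject any attempt to construct context that carries authority.
        granted = [f.name for f in fields(self) if getattr(self, f.name)]
        if granted:
            raise ValueError("context may not grant: " + ", ".join(granted))


@dataclass(frozen=True)
class SessionEvidence:
    profile_id: str
    memory_route: str
    authority: AuthorityFlags = field(default_factory=AuthorityFlags)
```

Because both classes are frozen, assigning to a field after construction raises an error, so the record can be retained as evidence without being upgraded later. Any real grant of authority would have to come from a separate host-side approval path.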
Host-application responsibilities
Local inference is one component of an application. The application remains responsible for everything outside the package contract.
| Capability | Current evidence signal | Public boundary | Next evidence gate |
|---|---|---|---|
| Model acquisition, licensing, and catalog | Host-owned or not provided: The runtime accepts local artifacts and expected identity; it does not obtain them. | The host retains source receipts, licenses, immutable hashes, storage policy, and update approval. | Define an artifact intake and approval process before deployment. |
| Prompt assembly, memory, tools, commands, and policy | Host-owned or not provided: The runtime consumes prepared prompt and evidence inputs and returns generation outputs. | No memory record or model output can silently grant host authority. | Implement explicit user approval, validation, sandboxing, review, and no-op behavior in the host. |
| Transport, persistence, audit, UI, and production operations | Host-owned or not provided: The package exposes in-process runtime objects rather than an operational platform. | Authentication, authorization, multi-user isolation, data retention, logging, uptime, deployment, and support are host concerns. | Apply the consuming organization’s architecture, security, privacy, accessibility, reliability, and support controls. |
Tensor storage vocabulary represented by the package
These identifiers are present in the public tensor-storage vocabulary. They do not establish that every operation, tensor shape, model architecture, or CPU path supports each type end to end.
F32, F16, F64, BF16, I8, I16, I32, I64, Q4_0, Q4_1, Q5_0, Q5_1, Q8_0, Q8_1, Q2_K, Q3_K, Q4_K, Q5_K, Q6_K, Q8_K, IQ4_NL
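Why an identifier alone is not enough becomes clearer when the byte geometry is worked out. The sketch below computes the expected byte length of a tensor for four of the identifiers, assuming the commonly used ggml block layouts (32 elements per quantized block, a 2-byte half-precision scale per block). It is Python for illustration only: `BLOCK_GEOMETRY` and `tensor_byte_length` are hypothetical names, and the package's own storage traits may differ.

```python
import math

# (elements per block, bytes per block), assuming ggml-style block layouts.
BLOCK_GEOMETRY = {
    "F32": (1, 4),
    "F16": (1, 2),
    "Q8_0": (32, 34),  # 2-byte scale + 32 signed 8-bit values
    "Q4_0": (32, 18),  # 2-byte scale + 32 packed 4-bit values (16 bytes)
}


def tensor_byte_length(storage, shape):
    """Expected byte length for a tensor whose first dimension is the row length."""
    if storage not in BLOCK_GEOMETRY:
        raise ValueError(f"unsupported storage type: {storage}")
    if not shape or any(dim <= 0 for dim in shape):
        raise ValueError("shape must contain only positive dimensions")
    block_elems, block_bytes = BLOCK_GEOMETRY[storage]
    # Quantized rows must be a whole number of blocks; partial blocks are invalid.
    if shape[0] % block_elems != 0:
        raise ValueError("row length is not a whole number of blocks")
    return math.prod(shape) // block_elems * block_bytes
```

For example, a 4096 by 4096 tensor stored as Q4_0 occupies 9,437,184 bytes under these assumptions, and a Q8_0 tensor with a row length of 30 is rejected outright. A reader that recognizes the identifier but gets the block size or row constraint wrong would read past the end of the tensor or misalign every following one.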
Capability FAQ
Are the UAIX.LmRuntime packages publicly available?
Yes. The package family is published on NuGet, and every package guide links to its version-independent package URL.
Does package publication mean every GGUF model works?
No. Container validity, tokenizer metadata, LLaMA-family graph requirements, tensor storage, kernel coverage, context size, and resource limits must all match the concrete artifact.
Which package should an application install first?
Use UAIX.LmRuntime.LocalEndpoint for the bounded application-facing local GGUF path. Install a lower-level package directly when the application needs to own that layer.
Do the CUDA, DirectML, Vulkan, ROCm, and Metal packages execute models today?
The supplied backend packages register capability and compatibility diagnostics, but their diagnostic implementations report the backend as unavailable when probed. Native execution requires separate host-proven assets, runtime, driver, device, adapter, and model evidence.
Does LocalEndpoint download models or call cloud APIs?
No. The documented facade accepts local model paths and expected identities. It does not provide a downloader, provider client, network fallback, server, subprocess launcher, or telemetry exporter.
Why are package version numbers absent from this site?
The documentation is intended to remain useful across package updates. NuGet is authoritative for current versions and frameworks; consumers should pin resolved dependencies and test upgrades in CI.
Does UAIX memory or profile context grant execution authority?
No. The context is immutable session evidence only. Commands, network, provider APIs, telemetry, export, policy override, and runtime authority remain false and host-controlled.
