Runtime Capability Matrix

See which capabilities have published package APIs, which artifact evidence is available, and which model, platform, integration, and host responsibilities remain separately scoped.

This matrix separates a published package API from model compatibility, platform evidence, performance, host-owned services, and production support. Read every row through that boundary.

How to read this matrix

A published API can be real and useful without supporting every model or environment. The matrix shows the strongest artifact-level signal, the public boundary, and the next consumer evidence gate instead of collapsing everything into a yes/no compatibility badge.

Published package API

The capability has a public type, contract, or implementation surface in the UAIX.LmRuntime NuGet package family.

Artifact documentation verified

The published package family exposes inspectable metadata, README guidance, and XML documentation for the represented API surface.

Integration evidence required

A consuming application must retain build, test, cancellation, disposal, and negative-case evidence for its resolved package graph and environment.

Model or platform scope required

A named model identity, tokenizer, tensor storage, workload, operating system, processor, and memory envelope are required before a compatibility or performance claim.

Host-owned or not provided

The behavior belongs to the consuming application or is intentionally outside this package family; do not infer it from the runtime API.

Package-family target

Distribution: Seventeen public NuGet packages

Package publication does not create universal model, platform, or production-support guarantees.

Runtime form: Managed local path with explicit backend selection contracts

The documented facade remains local and managed. Accelerator package registration does not imply native execution.

Model container: GGUF

Container validity, tokenizer support, model architecture, tensor storage, and kernel coverage remain separate checks.

Model execution path: LLaMA-family reference sessions

The family name is not a blanket claim for every derivative, metadata convention, quantization, or context size.

Required for application integration: UAIX.LmRuntime.LocalEndpoint

Lower-level packages remain available when an application needs direct ownership of parsing, tokens, tensors, kernels, sampling, or sessions.

Capability matrix

Package family and contracts

The runtime is published as layered packages so applications can take only the responsibilities they need.

Package family and contracts capabilities and evidence gates
Capability	Current evidence signal	Public boundary	Next evidence gate
Seventeen-package NuGet family	Published package API: The managed model pipeline, Acceleration contracts, managed CPU backend, diagnostic GPU backend packages, Windows CUDA RID asset slots, and LocalEndpoint are represented by public package IDs and dedicated guides.	Package availability does not imply that every lower-level API is required by every application.	Choose the narrowest package and validate the resolved dependency graph in the consumer project.
Runtime-neutral inference contracts	Artifact documentation verified: Requests, responses, messages, streaming deltas, sessions, tokenizers, runtime settings, diagnostics, and policy interfaces are documented in Abstractions.	An interface does not prove a particular implementation, model, latency, or availability commitment.	Contract tests for each concrete adapter and cancellation or streaming behavior used by the host.
Generated member reference	Artifact documentation verified: LMRuntime.com publishes package/type/member reference data derived from public package XML documentation.	NuGet remains authoritative for current target frameworks, dependency versions, hashes, and release metadata.	Regenerate the reference whenever the published package API changes.

Model identity, GGUF intake, and tensor storage

Artifact identity and bounded structure are established before tokenizer selection or model execution.
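The idea of checking bounded structure before anything else can be illustrated with a minimal Python sketch. It is not the package's parser. It assumes the little-endian GGUF version 2 or 3 header layout (4-byte magic, 32-bit version, 64-bit tensor count, 64-bit metadata count), and the names `HeaderLimits` and `read_header` and the limit values are hypothetical placeholders rather than package defaults.

```python
import struct
from dataclasses import dataclass

GGUF_MAGIC = b"GGUF"
# magic, version (uint32), tensor count (uint64), metadata key/value count (uint64)
HEADER_FORMAT = "<4sIQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


@dataclass(frozen=True)
class HeaderLimits:
    """Hypothetical structural limits; the values are placeholders."""
    max_tensors: int = 100_000
    max_metadata_entries: int = 100_000
    supported_versions: tuple = (2, 3)


@dataclass(frozen=True)
class GgufHeader:
    version: int
    tensor_count: int
    metadata_count: int


def read_header(data: bytes, limits: HeaderLimits = HeaderLimits()) -> GgufHeader:
    """Parse the fixed-size header and reject it before any allocation is sized from it."""
    if len(data) < HEADER_SIZE:
        raise ValueError("truncated header")
    magic, version, tensors, metadata = struct.unpack_from(HEADER_FORMAT, data, 0)
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF container")
    if version not in limits.supported_versions:
        raise ValueError(f"unsupported GGUF version: {version}")
    if tensors > limits.max_tensors:
        raise ValueError(f"tensor count {tensors} exceeds limit {limits.max_tensors}")
    if metadata > limits.max_metadata_entries:
        raise ValueError(f"metadata count {metadata} exceeds limit")
    return GgufHeader(version, tensors, metadata)
```

A header that passes these checks is only structurally plausible. Tokenizer, architecture, and storage support remain separate questions.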

Model identity, GGUF intake, and tensor storage capabilities and evidence gates
Capability	Current evidence signal	Public boundary	Next evidence gate
Bounded GGUF parsing and strict validation	Artifact documentation verified: GGUF headers, metadata, arrays, tensor descriptors, alignment, sharding, classification, and configurable structural limits have documented APIs.	A structurally valid container can still use an unsupported tokenizer, architecture, storage type, or tensor arrangement.	Validate each accepted model artifact and preserve malformed or edge-case fixture coverage.
SHA-256, byte-count, and path verification	Artifact documentation verified: The GGUF and LocalEndpoint layers expose hashing, expected length, allowed-root, reparse-point, and model-size controls.	The host still establishes the trusted digest, source receipt, model license, and accepted storage location.	Consumer tests for path escape, reparse points, mismatched identity, truncation, replacement, and limit failures.
Mapped, segmented, and copied tensor access	Artifact documentation verified: Mapped-file ownership, tensor views, segmented readers, float32 readers, storage traits, and quantized block metadata are documented.	Views must respect owner lifetime, byte geometry, alignment, row shape, and exact storage-specific bounds.	Lifecycle, disposal, large-file, concurrency, and per-storage validation in declared environments.
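The identity checks named in the table (digest, byte count, allowed root, link rejection, size limit) can be sketched generically in Python. This is a minimal illustration, not the package's implementation. The function name `verify_artifact` and its parameters are hypothetical, and the link check covers only symbolic links on the final path component, so Windows reparse points would need platform-specific handling.

```python
import hashlib
import os


def verify_artifact(path, allowed_root, expected_sha256, expected_length, max_bytes):
    """Return the resolved path if the file matches the host-supplied identity."""
    # Reject a symbolic link as the final component (simplified stand-in
    # for a full reparse-point check).
    if os.path.islink(path):
        raise PermissionError("symbolic links are rejected")

    # Resolve both paths and require the artifact to sit under the allowed root.
    root = os.path.realpath(allowed_root)
    real = os.path.realpath(path)
    if os.path.commonpath([root, real]) != root:
        raise PermissionError("path escapes the allowed root")

    digest = hashlib.sha256()
    with open(real, "rb") as handle:
        # Size the file from the open handle so the check and the hash
        # refer to the same file object.
        size = os.fstat(handle.fileno()).st_size
        if size > max_bytes:
            raise ValueError("artifact exceeds the configured size limit")
        if size != expected_length:
            raise ValueError("byte count does not match the expected length")
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)

    if digest.hexdigest() != expected_sha256.lower():
        raise ValueError("SHA-256 digest does not match the expected identity")
    return real
```

The expected digest and length come from the host's own intake record. A check like this confirms a match but cannot establish that the digest was trustworthy in the first place.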

Tokenization and LLaMA-family model binding

Tokenizer behavior and semantic tensor roles are derived from the model artifact and rejected explicitly when unsupported.

Tokenization and LLaMA-family model binding capabilities and evidence gates
Capability	Current evidence signal	Public boundary	Next evidence gate
GGUF-driven tokenizer construction	Artifact documentation verified: Factories, strict metadata readers, multiple tokenizer engines, special-token options, chat rendering, streaming decode, truncation, and parity records are documented.	Vocabulary, merges, normalization, byte fallback, special tokens, and chat templates are model-specific.	Exact token-ID and decoded-byte parity for each accepted model and prompt format.
LLaMA configuration and tensor binding	Artifact documentation verified: Configuration derivation, invariant checks, required tensor roles, binding manifests, mapped/reference weight sources, and diagnostics are documented.	A LLaMA-family label does not guarantee compatibility with every derivative or metadata/tensor naming convention.	Named artifact binding, forward, logits, and generation evidence with explicit limitations.
Broad model compatibility	Model or platform scope required: The packages expose strict validation and diagnostics needed to determine support for a concrete artifact.	No universal list of all compatible repositories, quantizations, context variants, or fine-tunes is inferred.	Publish a model matrix only from retained immutable identities, test prompts, results, and known failures.
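Streaming decode and decoded-byte parity interact when byte-fallback tokens split a multi-byte character. The following Python sketch illustrates the general technique with the standard library's incremental UTF-8 decoder. It does not use the package's tokenizer API, and `stream_decode` and `first_mismatch` are hypothetical helper names.

```python
import codecs


def stream_decode(token_bytes):
    """Yield text as token byte sequences arrive, holding back incomplete UTF-8."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    for piece in token_bytes:
        text = decoder.decode(piece)
        if text:  # an empty string means the decoder is waiting for more bytes
            yield text
    # Raises UnicodeDecodeError if the stream ended inside a character.
    decoder.decode(b"", final=True)


def first_mismatch(expected, actual):
    """Return the index of the first differing element, or None when equal."""
    for index, (left, right) in enumerate(zip(expected, actual)):
        if left != right:
            return index
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None
```

For example, the byte pieces `b"caf"`, `b"\xc3"`, `b"\xa9"`, `b"!"` stream out as `"caf"`, `"é"`, `"!"`, while the concatenated bytes stay identical to the non-streamed decode. A parity record would apply `first_mismatch` to both the token-ID sequences and the decoded bytes.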

Managed CPU execution and sampling

Reference operations, observable CPU dispatch, model sessions, and token selection remain distinct responsibilities.

Managed CPU execution and sampling capabilities and evidence gates
Capability	Current evidence signal	Public boundary	Next evidence gate
Reference and dispatched CPU kernels	Artifact documentation verified: Float, half-precision, quantized, normalization, matrix/vector, softmax, RoPE, dispatch-selection, and parity APIs are documented.	Represented storage and operations do not imply every shape, ISA tier, processor, or model path is supported end to end.	Operation-specific reference parity, invalid-input tests, selected-tier evidence, and model-level verification.
Greedy and probability sampling	Artifact documentation verified: Greedy selection, seeded random state, temperature, top-k, top-p, minimum-p, repetition/frequency/presence transforms, bias, stops, and generation control are documented.	A seed cannot compensate for differences in model bytes, tokenizer, logits, floating-point execution, session state, or settings.	Golden vector tests, state replay, stop-boundary tests, and application-specific generation policies.
Performance and hardware support	Model or platform scope required: CPU selection and benchmark-supporting diagnostics are represented.	No blanket latency, throughput, memory, energy, operating-system, processor, or driver claim follows from the API surface.	Reproducible benchmarks with exact artifact, package graph, build, machine, workload, warm-up, and aggregation details.
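To make the sampling row concrete, here is a minimal Python sketch of greedy selection plus seeded temperature, top-k, and top-p sampling over a single logits vector. It is a generic illustration, not the package's sampler. The function `sample_token` and its parameter names are hypothetical, and the ordering and tie-breaking rules are assumptions made for this example.

```python
import math
import random


def sample_token(logits, *, temperature=1.0, top_k=0, top_p=1.0, rng):
    """Pick one token index from raw logits using the supplied random state."""
    if not logits:
        raise ValueError("logits must not be empty")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")

    # Greedy selection: highest logit, lowest index on ties.
    if temperature <= 0.0:
        return max(range(len(logits)), key=lambda i: (logits[i], -i))

    # Temperature-scaled softmax weights, shifted by the maximum for stability.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)

    # Rank candidates by weight (descending), then by index for determinism.
    ranked = sorted(range(len(logits)), key=lambda i: (-weights[i], i))
    if top_k > 0:
        ranked = ranked[:top_k]

    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += weights[index] / total
        if cumulative >= top_p:
            break

    return rng.choices(kept, weights=[weights[i] for i in kept], k=1)[0]
```

Passing `random.Random(seed)` as `rng` makes a run repeatable only for identical logits and settings, which is why a seed alone does not establish reproducibility across model bytes, tokenizers, or floating-point paths.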

Backend registration, probing, and execution evidence

Backend identity, package compatibility, local availability, device proof, fallback, and model execution are kept as distinct facts.

Backend registration, probing, and execution evidence capabilities and evidence gates
Capability	Current evidence signal	Public boundary	Next evidence gate
Explicit backend registry and selector	Artifact documentation verified: Acceleration documents backend contracts, registration, local probes, device descriptors, required and preferred policies, failure reasons, and recorded CPU fallback.	Selecting a backend identity does not execute a model or prove a native runtime, driver, device, or kernel path.	Retain probe output, selected device identity, adapter implementation, and model-level execution evidence.
Managed CPU backend	Artifact documentation verified: Backends.CpuManaged registers a managed CPU backend and returns an available CPU device descriptor without native assets.	Availability of the backend descriptor does not establish performance or compatibility for every model and workload.	Validate the named artifact through the complete managed execution path in the consumer environment.
CUDA, DirectML, Vulkan, ROCm, and Metal package declarations	Model or platform scope required: The backend packages publish stable IDs, API-family capability metadata, runtime identifiers, registration helpers, and fail-closed diagnostics.	The supplied diagnostic implementations report the backend as unavailable when probed and perform no hidden native inference.	Supply and test a host-proven native adapter with real assets, runtime, driver, device, and model execution proof.
Windows CUDA RID asset packages	Model or platform scope required: Modern and Tesla K80 package IDs reserve runtimes/win-x64/native deployment paths and include marker assets.	A marker proves package layout only; the supplied packages do not embed CUDA inference DLLs.	Validate the complete required binary set, RID resolution, loadability, driver compatibility, device identity, and execution output.
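The distinction between required and preferred selection, with a recorded CPU fallback, can be sketched as follows. This Python example is a hypothetical model of the pattern only. The class names, the backend identifiers `"cuda"` and `"cpu-managed"`, and the reason strings are invented for illustration and are not the Acceleration package's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class ProbeResult:
    available: bool
    reason: str = ""


@dataclass(frozen=True)
class Selection:
    backend_id: str
    fell_back: bool
    failures: Tuple[Tuple[str, str], ...] = ()  # (backend_id, reason) pairs


class BackendUnavailable(RuntimeError):
    """Raised when no acceptable backend passes its probe."""


class BackendRegistry:
    def __init__(self) -> None:
        self._probes: Dict[str, Callable[[], ProbeResult]] = {}

    def register(self, backend_id: str, probe: Callable[[], ProbeResult]) -> None:
        if backend_id in self._probes:
            raise ValueError(f"backend already registered: {backend_id}")
        self._probes[backend_id] = probe

    def probe(self, backend_id: str) -> ProbeResult:
        probe = self._probes.get(backend_id)
        if probe is None:
            return ProbeResult(False, "not registered")
        return probe()

    def select(self, backend_id: str, *, required: bool,
               fallback_id: str = "cpu-managed") -> Selection:
        result = self.probe(backend_id)
        if result.available:
            return Selection(backend_id, fell_back=False)
        if required:
            # Required policy fails closed instead of substituting a backend.
            raise BackendUnavailable(f"{backend_id}: {result.reason}")
        fallback = self.probe(fallback_id)
        if not fallback.available:
            raise BackendUnavailable(f"{fallback_id}: {fallback.reason}")
        # Preferred policy falls back, but the failure reason is retained.
        return Selection(fallback_id, True, ((backend_id, result.reason),))
```

Even a successful selection here records only an identity and a probe result. Model execution on that backend is a separate fact that still needs its own evidence.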

Sessions and LocalEndpoint facade

The high-level package combines the supported local path while keeping host authority and application services outside the runtime.

Sessions and LocalEndpoint facade capabilities and evidence gates
Capability	Current evidence signal	Public boundary	Next evidence gate
Verified local model loading and isolated sessions	Artifact documentation verified: Load options, execution limits, file expectations, model identity, session contexts, generation requests/results, token observation, and explicit disposal are documented.	The host chooses trusted artifacts, prepares prompts, maps errors, sets limits, and defines its supported usage envelope.	Consumer integration tests for independent sessions, reset/continuation, limits, cancellation, disposal, and failure mapping.
UAIX profile and memory context evidence	Artifact documentation verified: Session-scoped profile/load-session and memory-route metadata are validated and retained as immutable evidence.	All authority flags remain false. Context does not grant execution, policy override, commands, network, provider access, telemetry, export, or website intake.	Keep import, protected-anchor enforcement, prompt assembly, persistence, and approval in the host application.
Downloader, server, provider client, subprocess, and telemetry	Host-owned or not provided: The documented LocalEndpoint capability object reports the closed local-only boundary.	These services are not supplied or silently activated by the runtime facade.	Implement any required service explicitly in the host with independent security, consent, persistence, and operational controls.
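The phrase "immutable evidence with all authority flags false" can be pictured with a small Python sketch. The types below are hypothetical stand-ins, not the package's session or capability objects. The field names simply mirror the authorities listed in the table.

```python
from dataclasses import dataclass, fields
from types import MappingProxyType
from typing import Mapping, Tuple


@dataclass(frozen=True)
class ContextAuthority:
    """Every flag defaults to False, and construction rejects any True value."""
    execution: bool = False
    policy_override: bool = False
    commands: bool = False
    network: bool = False
    provider_access: bool = False
    telemetry: bool = False
    export: bool = False
    website_intake: bool = False

    def __post_init__(self) -> None:
        granted = [f.name for f in fields(self) if getattr(self, f.name)]
        if granted:
            raise ValueError("context may not grant authority: " + ", ".join(granted))


@dataclass(frozen=True)
class SessionContext:
    profile: Mapping[str, str]
    memory_routes: Tuple[str, ...]
    authority: ContextAuthority = ContextAuthority()


def freeze_context(profile: dict, memory_routes: list) -> SessionContext:
    """Copy the inputs into read-only containers so later mutation cannot leak in."""
    return SessionContext(MappingProxyType(dict(profile)), tuple(memory_routes))
```

In this sketch the context can be read and retained but cannot be edited or used to switch an authority on. Any approval decision stays in host code that never consults these flags for permission.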

Host-application responsibilities

Local inference is one component of an application. The application remains responsible for everything outside the package contract.

Host-application responsibilities capabilities and evidence gates
Capability	Current evidence signal	Public boundary	Next evidence gate
Model acquisition, licensing, and catalog	Host-owned or not provided: The runtime accepts local artifacts and expected identity; it does not obtain them.	The host retains source receipts, licenses, immutable hashes, storage policy, and update approval.	Define an artifact intake and approval process before deployment.
Prompt assembly, memory, tools, commands, and policy	Host-owned or not provided: The runtime consumes prepared prompt and evidence inputs and returns generation outputs.	No memory record or model output can silently grant host authority.	Implement explicit user approval, validation, sandboxing, review, and no-op behavior in the host.
Transport, persistence, audit, UI, and production operations	Host-owned or not provided: The package exposes in-process runtime objects rather than an operational platform.	Authentication, authorization, multi-user isolation, data retention, logging, uptime, deployment, and support are host concerns.	Apply the consuming organization’s architecture, security, privacy, accessibility, reliability, and support controls.

Tensor storage vocabulary represented by the package

These identifiers are present in the public tensor-storage vocabulary. They do not establish that every operation, tensor shape, model architecture, or CPU path supports each type end to end.

F32
F16
F64
BF16
I8
I16
I32
I64
Q4_0
Q4_1
Q5_0
Q5_1
Q8_0
Q8_1
Q2_K
Q3_K
Q4_K
Q5_K
Q6_K
Q8_K
IQ4_NL
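As one example of what "supporting" a storage type involves beyond naming it, the sketch below dequantizes Q8_0 data in Python. It assumes the block layout commonly used by ggml-based tools (a little-endian float16 scale followed by 32 signed 8-bit values, 34 bytes per block). Whether the package's readers use exactly this layout is not stated in this document, and `dequantize_q8_0` is a hypothetical name.

```python
import struct

Q8_0_BLOCK_VALUES = 32
Q8_0_BLOCK = struct.Struct("<e32b")  # float16 scale, then 32 signed bytes
Q8_0_BLOCK_BYTES = Q8_0_BLOCK.size   # 34


def dequantize_q8_0(data: bytes, count: int) -> list:
    """Expand `count` Q8_0-encoded values into Python floats."""
    if count < 0 or count % Q8_0_BLOCK_VALUES != 0:
        raise ValueError("element count must be a non-negative multiple of 32")
    blocks = count // Q8_0_BLOCK_VALUES
    # Exact byte geometry: no trailing bytes and no short final block.
    if len(data) != blocks * Q8_0_BLOCK_BYTES:
        raise ValueError("byte length does not match the declared element count")

    values = []
    for block_index in range(blocks):
        scale, *quants = Q8_0_BLOCK.unpack_from(data, block_index * Q8_0_BLOCK_BYTES)
        values.extend(scale * quant for quant in quants)
    return values
```

Each other identifier in the list has its own block size and decoding rule, so coverage has to be established per storage type and per operation.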

Capability FAQ

Are the UAIX.LmRuntime packages publicly available?

Yes. The package family is published on NuGet, and every package guide links to its version-independent package URL.

Does package publication mean every GGUF model works?

No. Container validity, tokenizer metadata, LLaMA-family graph requirements, tensor storage, kernel coverage, context size, and resource limits must all match the concrete artifact.

Which package should an application install first?

Use UAIX.LmRuntime.LocalEndpoint for the bounded application-facing local GGUF path. Install a lower-level package directly when the application needs to own that layer.

Do the CUDA, DirectML, Vulkan, ROCm, and Metal packages execute models today?

Not by themselves. The supplied backend packages register capability and compatibility diagnostics, but their diagnostic implementations report the backend as unavailable when probed. Native execution requires separate host-proven assets, runtime, driver, device, adapter, and model evidence.

Does LocalEndpoint download models or call cloud APIs?

No. The documented facade accepts local model paths and expected identities. It does not provide a downloader, provider client, network fallback, server, subprocess launcher, or telemetry exporter.

Why are package version numbers absent from this site?

The documentation is intended to remain useful across package updates. NuGet is authoritative for current versions and frameworks; consumers should pin resolved dependencies and test upgrades in CI.

Does UAIX memory or profile context grant execution authority?

No. The context is immutable session evidence only. Commands, network, provider APIs, telemetry, export, policy override, and runtime authority remain false and host-controlled.

Build a local integration: Install the facade, verify a local artifact, and exercise its failure boundaries.

Review the evidence posture: See what a publishable parity or benchmark record requires.

Check project status: Review current package, documentation, and claim boundaries.

Propose a capability or correction: Use a bounded, evidence-backed idea packet.