
Runtime Capability Matrix

See which capabilities have published package APIs, which artifact evidence is available, and which model, platform, integration, and host responsibilities remain separately scoped.

This matrix separates a published package API from model compatibility, platform evidence, performance, host-owned services, and production support. Read every row through that boundary.

How to read this matrix

A published API can be real and useful without supporting every model or environment. The matrix shows the strongest artifact-level signal, the public boundary, and the next consumer evidence gate instead of collapsing everything into a yes/no compatibility badge.

Published package API

The capability has a public type, contract, or implementation surface in the UAIX.LmRuntime NuGet package family.

Artifact documentation verified

The published package family exposes inspectable metadata, README guidance, and XML documentation for the represented API surface.

Integration evidence required

A consuming application must retain build, test, cancellation, disposal, and negative-case evidence for its resolved package graph and environment.

Model or platform scope required

A named model identity, tokenizer, tensor storage, workload, operating system, processor, and memory envelope are required before a compatibility or performance claim.

Host-owned or not provided

The behavior belongs to the consuming application or is intentionally outside this package family; do not infer it from the runtime API.

Package-family target

Distribution: Seventeen public NuGet packages

Package publication does not create universal model, platform, or production-support guarantees.

Runtime form: Managed local path with explicit backend selection contracts

The documented facade remains local and managed. Accelerator package registration does not imply native execution.

Model container: GGUF

Container validity, tokenizer support, model architecture, tensor storage, and kernel coverage remain separate checks.

Model execution path: LLaMA-family reference sessions

The family name is not a blanket claim for every derivative, metadata convention, quantization, or context size.

Required for application integration: UAIX.LmRuntime.LocalEndpoint

Lower-level packages remain available when an application needs direct ownership of parsing, tokens, tensors, kernels, sampling, or sessions.

Capability matrix

Package family and contracts

The runtime is published as layered packages so applications can take only the responsibilities they need.

Package family and contracts: capabilities and evidence gates

Seventeen-package NuGet family

  • Current evidence signal: Published package API. The managed model pipeline, Acceleration contracts, managed CPU backend, diagnostic GPU backend packages, Windows CUDA RID asset slots, and LocalEndpoint are represented by public package IDs and dedicated guides.
  • Public boundary: Package availability does not imply that every lower-level API is required by every application.
  • Next evidence gate: Choose the narrowest package and validate the resolved dependency graph in the consumer project.

Runtime-neutral inference contracts

  • Current evidence signal: Artifact documentation verified. Requests, responses, messages, streaming deltas, sessions, tokenizers, runtime settings, diagnostics, and policy interfaces are documented in Abstractions.
  • Public boundary: An interface does not prove a particular implementation, model, latency, or availability commitment.
  • Next evidence gate: Contract tests for each concrete adapter and for the cancellation or streaming behavior used by the host.

Generated member reference

  • Current evidence signal: Artifact documentation verified. LMRuntime.com publishes package/type/member reference data derived from public package XML documentation.
  • Public boundary: NuGet remains authoritative for current target frameworks, dependency versions, hashes, and release metadata.
  • Next evidence gate: Regenerate the reference whenever the published package API changes.

Model identity, GGUF intake, and tensor storage

Artifact identity and bounded structure are established before tokenizer selection or model execution.

Model identity, GGUF intake, and tensor storage: capabilities and evidence gates

Bounded GGUF parsing and strict validation

  • Current evidence signal: Artifact documentation verified. GGUF headers, metadata, arrays, tensor descriptors, alignment, sharding, classification, and configurable structural limits have documented APIs.
  • Public boundary: A structurally valid container can still use an unsupported tokenizer, architecture, storage type, or tensor arrangement.
  • Next evidence gate: Validate each accepted model artifact and preserve malformed or edge-case fixture coverage.

SHA-256, byte-count, and path verification

  • Current evidence signal: Artifact documentation verified. The GGUF and LocalEndpoint layers expose hashing, expected length, allowed-root, reparse-point, and model-size controls.
  • Public boundary: The host still establishes the trusted digest, source receipt, model license, and accepted storage location.
  • Next evidence gate: Consumer tests for path escape, reparse points, mismatched identity, truncation, replacement, and limit failures.

Mapped, segmented, and copied tensor access

  • Current evidence signal: Artifact documentation verified. Mapped-file ownership, tensor views, segmented readers, float32 readers, storage traits, and quantized block metadata are documented.
  • Public boundary: Views must respect owner lifetime, byte geometry, alignment, row shape, and exact storage-specific bounds.
  • Next evidence gate: Lifecycle, disposal, large-file, concurrency, and per-storage validation in declared environments.
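The identity checks in this table (allowed root, expected length, size limit, SHA-256) can be illustrated with a short, language-neutral sketch. It is written in Python and is not the UAIX.LmRuntime API: `verify_artifact`, `ArtifactRejected`, and the default size limit are hypothetical names and values chosen for illustration. Symbolic links stand in here for the reparse-point control named above.

```python
import hashlib
from pathlib import Path


class ArtifactRejected(Exception):
    """Raised when a local artifact fails a location or identity check."""


def verify_artifact(path, allowed_root, expected_sha256, expected_length,
                    max_bytes=64 * 1024 ** 3):
    """Return the resolved path if every check passes; raise otherwise."""
    candidate = Path(path)
    # Refuse a symbolic link at the final path component instead of following it.
    if candidate.is_symlink():
        raise ArtifactRejected("symbolic link refused")
    try:
        root = Path(allowed_root).resolve(strict=True)
        resolved = candidate.resolve(strict=True)
        size = resolved.stat().st_size
    except OSError as error:
        raise ArtifactRejected(f"cannot inspect artifact: {error}") from error
    # Containment is tested on resolved paths, so ".." segments cannot escape.
    if not resolved.is_relative_to(root):
        raise ArtifactRejected("path escapes the allowed root")
    if size > max_bytes:
        raise ArtifactRejected("artifact exceeds the configured size limit")
    if size != expected_length:
        raise ArtifactRejected(f"length {size} != expected {expected_length}")
    digest = hashlib.sha256()
    bytes_read = 0
    with resolved.open("rb") as stream:
        # Hash in fixed-size chunks so memory use stays bounded.
        for chunk in iter(lambda: stream.read(1 << 20), b""):
            digest.update(chunk)
            bytes_read += len(chunk)
    # A different byte count here suggests truncation or replacement mid-read.
    if bytes_read != expected_length:
        raise ArtifactRejected("file changed while it was being hashed")
    if digest.hexdigest() != expected_sha256.lower():
        raise ArtifactRejected("SHA-256 digest mismatch")
    return resolved
```

The sketch requires Python 3.9 or later for `Path.is_relative_to`. The trusted digest and expected length still have to come from the host's own intake record; the function only compares against them.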

Tokenization and LLaMA-family model binding

Tokenizer behavior and semantic tensor roles are derived from the model artifact and rejected explicitly when unsupported.

Tokenization and LLaMA-family model binding: capabilities and evidence gates

GGUF-driven tokenizer construction

  • Current evidence signal: Artifact documentation verified. Factories, strict metadata readers, multiple tokenizer engines, special-token options, chat rendering, streaming decode, truncation, and parity records are documented.
  • Public boundary: Vocabulary, merges, normalization, byte fallback, special tokens, and chat templates are model-specific.
  • Next evidence gate: Exact token-ID and decoded-byte parity for each accepted model and prompt format.

LLaMA configuration and tensor binding

  • Current evidence signal: Artifact documentation verified. Configuration derivation, invariant checks, required tensor roles, binding manifests, mapped/reference weight sources, and diagnostics are documented.
  • Public boundary: A LLaMA-family label does not guarantee compatibility with every derivative or metadata/tensor naming convention.
  • Next evidence gate: Named artifact binding, forward, logits, and generation evidence with explicit limitations.

Broad model compatibility

  • Current evidence signal: Model or platform scope required. The packages expose strict validation and diagnostics needed to determine support for a concrete artifact.
  • Public boundary: No universal list of all compatible repositories, quantizations, context variants, or fine-tunes is inferred.
  • Next evidence gate: Publish a model matrix only from retained immutable identities, test prompts, results, and known failures.
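To make "invariant checks" and "required tensor roles" concrete, here is a minimal Python sketch of the idea. It is an illustration only: `LlamaConfig`, `check_config`, and `missing_tensors` are hypothetical names, not package members, and the tensor names follow the common GGUF LLaMA naming convention (`blk.N.attn_q.weight` and so on), which is exactly the kind of convention a derivative may not follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LlamaConfig:
    # Hypothetical subset of a derived configuration.
    block_count: int
    embedding_length: int
    head_count: int
    head_count_kv: int


# Assumed naming convention; a real artifact may name or omit tensors differently.
GLOBAL_TENSORS = ("token_embd.weight", "output_norm.weight")
PER_BLOCK_ROLES = ("attn_norm", "attn_q", "attn_k", "attn_v", "attn_output",
                   "ffn_norm", "ffn_gate", "ffn_up", "ffn_down")


def check_config(cfg):
    """Return a list of invariant violations; an empty list means none found."""
    dims = (cfg.block_count, cfg.embedding_length, cfg.head_count, cfg.head_count_kv)
    if min(dims) <= 0:
        return ["all dimensions must be positive"]
    problems = []
    if cfg.embedding_length % cfg.head_count:
        problems.append("embedding length is not divisible by head count")
    if cfg.head_count % cfg.head_count_kv:
        problems.append("head count is not divisible by KV head count")
    return problems


def missing_tensors(cfg, present_names):
    """Return required tensor names that the artifact does not provide."""
    required = list(GLOBAL_TENSORS)
    for index in range(cfg.block_count):
        required.extend(f"blk.{index}.{role}.weight" for role in PER_BLOCK_ROLES)
    present = set(present_names)
    return [name for name in required if name not in present]
```

Returning diagnostics instead of raising on the first problem mirrors the explicit-rejection stance of this section: an unsupported artifact should produce a specific, retained reason rather than a silent partial bind.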

Managed CPU execution and sampling

Reference operations, observable CPU dispatch, model sessions, and token selection remain distinct responsibilities.

Managed CPU execution and sampling: capabilities and evidence gates

Reference and dispatched CPU kernels

  • Current evidence signal: Artifact documentation verified. Float, half-precision, quantized, normalization, matrix/vector, softmax, RoPE, dispatch-selection, and parity APIs are documented.
  • Public boundary: Represented storage and operations do not imply every shape, ISA tier, processor, or model path is supported end to end.
  • Next evidence gate: Operation-specific reference parity, invalid-input tests, selected-tier evidence, and model-level verification.

Greedy and probability sampling

  • Current evidence signal: Artifact documentation verified. Greedy selection, seeded random state, temperature, top-k, top-p, minimum-p, repetition/frequency/presence transforms, bias, stops, and generation control are documented.
  • Public boundary: A seed cannot compensate for differences in model bytes, tokenizer, logits, floating-point execution, session state, or settings.
  • Next evidence gate: Golden vector tests, state replay, stop-boundary tests, and application-specific generation policies.

Performance and hardware support

  • Current evidence signal: Model or platform scope required. CPU selection and benchmark-supporting diagnostics are represented.
  • Public boundary: No blanket latency, throughput, memory, energy, operating-system, processor, or driver claim follows from the API surface.
  • Next evidence gate: Reproducible benchmarks with exact artifact, package graph, build, machine, workload, warm-up, and aggregation details.
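The sampling row names greedy selection, temperature, top-k, top-p, and seeded random state. The following Python sketch shows one conventional way those pieces compose. It is a generic illustration under stated assumptions, not the package's sampler: the function names, the order of the transforms, and the tie-breaking rule are all choices made here.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def greedy(logits):
    """Return the index of the largest logit; the lowest index wins ties."""
    return max(range(len(logits)), key=lambda index: logits[index])


def sample(logits, rng, temperature=1.0, top_k=0, top_p=1.0):
    """Draw one token index after top-k and top-p (nucleus) filtering."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    probs = softmax(logits, temperature)
    # Most probable first; index breaks ties so the order is deterministic.
    order = sorted(range(len(probs)), key=lambda index: (-probs[index], index))
    if top_k > 0:
        order = order[:top_k]
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    # random.Random.choices renormalizes the surviving weights.
    return rng.choices(kept, weights=[probs[index] for index in kept], k=1)[0]
```

Passing an explicit `random.Random(seed)` makes a run repeatable only when the logits are bit-identical, which is the point of the public boundary above: the seed fixes the draw, not the distribution being drawn from.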

Backend registration, probing, and execution evidence

Backend identity, package compatibility, local availability, device proof, fallback, and model execution are kept as distinct facts.

Backend registration, probing, and execution evidence: capabilities and evidence gates

Explicit backend registry and selector

  • Current evidence signal: Artifact documentation verified. Acceleration documents backend contracts, registration, local probes, device descriptors, required and preferred policies, failure reasons, and recorded CPU fallback.
  • Public boundary: Selecting a backend identity does not execute a model or prove a native runtime, driver, device, or kernel path.
  • Next evidence gate: Retain probe output, selected device identity, adapter implementation, and model-level execution evidence.

Managed CPU backend

  • Current evidence signal: Artifact documentation verified. Backends.CpuManaged registers a managed CPU backend and returns an available CPU device descriptor without native assets.
  • Public boundary: Availability of the backend descriptor does not establish performance or compatibility for every model and workload.
  • Next evidence gate: Validate the named artifact through the complete managed execution path in the consumer environment.

CUDA, DirectML, Vulkan, ROCm, and Metal package declarations

  • Current evidence signal: Model or platform scope required. The backend packages publish stable IDs, API-family capability metadata, runtime identifiers, registration helpers, and fail-closed diagnostics.
  • Public boundary: The supplied diagnostic implementations report the backend as unavailable when probed and perform no hidden native inference.
  • Next evidence gate: Supply and test a host-proven native adapter with real assets, runtime, driver, device, and model execution proof.

Windows CUDA RID asset packages

  • Current evidence signal: Model or platform scope required. Modern and Tesla K80 package IDs reserve runtimes/win-x64/native deployment paths and include marker assets.
  • Public boundary: A marker proves package layout only; the supplied packages do not embed CUDA inference DLLs.
  • Next evidence gate: Validate the complete required binary set, RID resolution, loadability, driver compatibility, device identity, and execution output.
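The distinction between required and preferred policies, and the idea of a recorded rather than silent CPU fallback, can be sketched as follows. This is a Python illustration of the selection logic only; `Policy`, `ProbeResult`, `Selection`, `select_backend`, and the `"cpu-managed"` identifier are hypothetical and do not mirror the Acceleration package's types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    REQUIRED = "required"    # fail closed if the backend is unavailable
    PREFERRED = "preferred"  # allow a recorded fallback to the CPU backend


@dataclass(frozen=True)
class ProbeResult:
    available: bool
    reason: str = ""


@dataclass(frozen=True)
class Selection:
    backend_id: str
    fell_back: bool
    fallback_reason: Optional[str] = None  # retained as evidence, never dropped


class BackendUnavailable(Exception):
    """Raised when no permitted backend can be selected."""


def select_backend(requested, policy, probes, cpu_id="cpu-managed"):
    """Select a backend identity; `probes` maps backend IDs to probe callables."""
    probe = probes.get(requested)
    result = probe() if probe else ProbeResult(False, "backend not registered")
    if result.available:
        return Selection(requested, fell_back=False)
    if policy is Policy.REQUIRED:
        raise BackendUnavailable(f"{requested}: {result.reason}")
    cpu_probe = probes.get(cpu_id)
    cpu = cpu_probe() if cpu_probe else ProbeResult(False, "CPU backend not registered")
    if not cpu.available:
        raise BackendUnavailable(f"{cpu_id}: {cpu.reason}")
    return Selection(cpu_id, fell_back=True,
                     fallback_reason=f"{requested}: {result.reason}")
```

A successful `Selection` here is still only an identity decision. As the public boundary above states, it does not show that a model executed on that backend.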

Sessions and LocalEndpoint facade

The high-level package combines the supported local path while keeping host authority and application services outside the runtime.

Sessions and LocalEndpoint facade: capabilities and evidence gates

Verified local model loading and isolated sessions

  • Current evidence signal: Artifact documentation verified. Load options, execution limits, file expectations, model identity, session contexts, generation requests/results, token observation, and explicit disposal are documented.
  • Public boundary: The host chooses trusted artifacts, prepares prompts, maps errors, sets limits, and defines its supported usage envelope.
  • Next evidence gate: Consumer integration tests for independent sessions, reset/continuation, limits, cancellation, disposal, and failure mapping.

UAIX profile and memory context evidence

  • Current evidence signal: Artifact documentation verified. Session-scoped profile/load-session and memory-route metadata are validated and retained as immutable evidence.
  • Public boundary: All authority flags remain false. Context does not grant execution, policy override, commands, network, provider access, telemetry, export, or website intake.
  • Next evidence gate: Keep import, protected-anchor enforcement, prompt assembly, persistence, and approval in the host application.

Downloader, server, provider client, subprocess, and telemetry

  • Current evidence signal: Host-owned or not provided. The documented LocalEndpoint capability object reports the closed local-only boundary.
  • Public boundary: These services are not supplied or silently activated by the runtime facade.
  • Next evidence gate: Implement any required service explicitly in the host with independent security, consent, persistence, and operational controls.
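One way to picture "immutable evidence with all authority flags false" is a record that can be built only with every flag false and cannot be changed afterwards. The Python sketch below is an illustration under that assumption; `ContextEvidence`, `build_context_evidence`, and the flag names (taken from the list in the table above) are hypothetical, not the package's types.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping

# Flag names follow the list in the table; the identifiers are assumptions.
AUTHORITY_FLAGS = ("execution", "policy_override", "commands", "network",
                   "provider_access", "telemetry", "export", "website_intake")


@dataclass(frozen=True)
class ContextEvidence:
    profile_id: str
    memory_route: str
    authority: Mapping[str, bool]  # read-only view; every value is False


def build_context_evidence(profile_id, memory_route, claimed_authority=None):
    """Build evidence, rejecting any input that tries to claim authority."""
    claimed = dict(claimed_authority or {})
    unknown = sorted(set(claimed) - set(AUTHORITY_FLAGS))
    if unknown:
        raise ValueError(f"unknown authority flags: {unknown}")
    granted = sorted(name for name, value in claimed.items() if value)
    if granted:
        raise ValueError(f"context cannot grant authority: {granted}")
    # MappingProxyType exposes the flags without allowing item assignment.
    flags = MappingProxyType({name: False for name in AUTHORITY_FLAGS})
    return ContextEvidence(profile_id, memory_route, flags)
```

Rejecting a claimed flag outright, instead of quietly resetting it to false, keeps the failure visible to the host, which remains the only place where authority is decided.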

Host-application responsibilities

Local inference is one component of an application. The application remains responsible for everything outside the package contract.

Host-application responsibilities: capabilities and evidence gates

Model acquisition, licensing, and catalog

  • Current evidence signal: Host-owned or not provided. The runtime accepts local artifacts and expected identity; it does not obtain them.
  • Public boundary: The host retains source receipts, licenses, immutable hashes, storage policy, and update approval.
  • Next evidence gate: Define an artifact intake and approval process before deployment.

Prompt assembly, memory, tools, commands, and policy

  • Current evidence signal: Host-owned or not provided. The runtime consumes prepared prompt and evidence inputs and returns generation outputs.
  • Public boundary: No memory record or model output can silently grant host authority.
  • Next evidence gate: Implement explicit user approval, validation, sandboxing, review, and no-op behavior in the host.

Transport, persistence, audit, UI, and production operations

  • Current evidence signal: Host-owned or not provided. The package exposes in-process runtime objects rather than an operational platform.
  • Public boundary: Authentication, authorization, multi-user isolation, data retention, logging, uptime, deployment, and support are host concerns.
  • Next evidence gate: Apply the consuming organization’s architecture, security, privacy, accessibility, reliability, and support controls.

Tensor storage vocabulary represented by the package


These identifiers are present in the public tensor-storage vocabulary. They do not establish that every operation, tensor shape, model architecture, or CPU path supports each type end to end.

  • F32
  • F16
  • F64
  • BF16
  • I8
  • I16
  • I32
  • I64
  • Q4_0
  • Q4_1
  • Q5_0
  • Q5_1
  • Q8_0
  • Q8_1
  • Q2_K
  • Q3_K
  • Q4_K
  • Q5_K
  • Q6_K
  • Q8_K
  • IQ4_NL
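Why an identifier alone does not establish support becomes clearer when the byte geometry behind a few of these identifiers is written out. The Python sketch below uses block sizes from the widely used ggml reference layout for a deliberately partial set of types; that table, the default 32-byte alignment, and the function names are assumptions for illustration and are not read from the UAIX.LmRuntime packages.

```python
# storage -> (elements per block, bytes per block), per the ggml reference
# layout. Deliberately partial: identifiers not listed are rejected.
BLOCK_GEOMETRY = {
    "F32": (1, 4),
    "F16": (1, 2),
    "BF16": (1, 2),
    "Q4_0": (32, 18),    # 2-byte scale + 16 bytes of packed 4-bit values
    "Q8_0": (32, 34),    # 2-byte scale + 32 signed bytes
    "Q4_K": (256, 144),
    "Q6_K": (256, 210),
}


def expected_tensor_bytes(storage, row_length, row_count):
    """Return the exact byte size of a tensor, or raise if the shape is invalid."""
    if storage not in BLOCK_GEOMETRY:
        raise ValueError(f"no geometry recorded for {storage}")
    if row_length <= 0 or row_count <= 0:
        raise ValueError("dimensions must be positive")
    elements, block_bytes = BLOCK_GEOMETRY[storage]
    # Quantized rows must consist of whole blocks.
    if row_length % elements:
        raise ValueError(f"row length {row_length} is not a multiple of {elements}")
    return (row_length // elements) * block_bytes * row_count


def check_view(storage, row_length, row_count, offset, file_length, alignment=32):
    """Validate that a tensor view is aligned and lies inside the file."""
    size = expected_tensor_bytes(storage, row_length, row_count)
    if offset < 0 or offset % alignment:
        raise ValueError("offset is negative or not aligned")
    if offset + size > file_length:
        raise ValueError("view extends past the end of the file")
    return size
```

For example, a 4096 by 4096 `Q4_0` tensor occupies 4096 × (4096 / 32) × 18 = 9,437,184 bytes under this layout, and a row length of 33 is rejected because it is not a whole number of 32-element blocks.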

Capability FAQ

Are the UAIX.LmRuntime packages publicly available?

Yes. The package family is published on NuGet, and every package guide links to its version-independent package URL.

Does package publication mean every GGUF model works?

No. Container validity, tokenizer metadata, LLaMA-family graph requirements, tensor storage, kernel coverage, context size, and resource limits must all match the concrete artifact.
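As an illustration of the first of those checks, the sketch below reads the fixed GGUF header in Python and applies structural limits before anything else is trusted. The layout (4-byte magic `GGUF`, 32-bit version, 64-bit tensor count, 64-bit metadata count, little-endian) follows the public GGUF specification for versions 2 and 3; the function name, the limits, and the accepted versions are assumptions, not package defaults.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_BYTES = 24  # magic (4) + version (4) + tensor count (8) + metadata count (8)


class GgufHeaderError(ValueError):
    """Raised when the fixed header is malformed or exceeds a limit."""


def read_gguf_header(data, max_tensors=65_536, max_metadata_entries=65_536,
                     supported_versions=(2, 3)):
    """Parse and bound-check the fixed header; returns a small dict."""
    if len(data) < HEADER_BYTES:
        raise GgufHeaderError("truncated header")
    if data[:4] != GGUF_MAGIC:
        raise GgufHeaderError("bad magic")
    # "<IQQ": little-endian uint32 followed by two uint64 values, no padding.
    version, tensor_count, metadata_count = struct.unpack_from("<IQQ", data, 4)
    if version not in supported_versions:
        raise GgufHeaderError(f"unsupported version {version}")
    # Limits are enforced before any count is used to size a read or allocation.
    if tensor_count > max_tensors:
        raise GgufHeaderError("tensor count exceeds limit")
    if metadata_count > max_metadata_entries:
        raise GgufHeaderError("metadata count exceeds limit")
    return {"version": version, "tensor_count": tensor_count,
            "metadata_count": metadata_count}
```

Passing this check says only that the container begins plausibly. Tokenizer metadata, architecture, tensor storage, and kernel coverage are each still open questions for the concrete artifact.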

Which package should an application install first?

Use UAIX.LmRuntime.LocalEndpoint for the bounded application-facing local GGUF path. Install a lower-level package directly when the application needs to own that layer.

Do the CUDA, DirectML, Vulkan, ROCm, and Metal packages execute models today?

Not on their own. The supplied backend packages register capability and compatibility diagnostics, but their diagnostic implementations report the backend as unavailable when probed. Native execution requires separate host-proven assets, runtime, driver, device, adapter, and model evidence.

Does LocalEndpoint download models or call cloud APIs?

No. The documented facade accepts local model paths and expected identities. It does not provide a downloader, provider client, network fallback, server, subprocess launcher, or telemetry exporter.

Why are package version numbers absent from this site?

The documentation is intended to remain useful across package updates. NuGet is authoritative for current versions and frameworks; consumers should pin resolved dependencies and test upgrades in CI.

Does UAIX memory or profile context grant execution authority?

No. The context is immutable session evidence only. Commands, network, provider APIs, telemetry, export, policy override, and runtime authority remain false and host-controlled.