Glossary

Plain-language definitions for the package, model-container, tokenization, tensor, execution, sampling, session, evidence, and governance terms used across LMRuntime.com.

Local-first: Model files, prompts, session state, and inference stay on the operator-controlled machine in the documented runtime path. The package does not silently fall back to a provider API.
GGUF: A binary model-container format carrying metadata, tokenizer data, tensor descriptors, and tensor bytes. The runtime treats a GGUF artifact as untrusted input until bounded parsing and validation succeed.
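
As an illustration of bounded parsing, the sketch below validates only the fixed-size GGUF header before anything else is read. It assumes the little-endian layout used by GGUF versions 2 and 3 (4-byte magic, 32-bit version, 64-bit tensor count, 64-bit metadata count). The function name and the two caps are hypothetical and are not taken from the LMRuntime package.

```python
import struct

GGUF_MAGIC = b"GGUF"
# magic, version, tensor count, metadata key/value count (little-endian, no padding)
HEADER = struct.Struct("<4sIQQ")

# Hypothetical caps chosen for illustration; a real runtime would set its own.
MAX_TENSORS = 1 << 20
MAX_METADATA_ENTRIES = 1 << 20


def parse_gguf_header(data):
    """Validate the fixed-size header and return its fields, or raise ValueError."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, version, tensor_count, kv_count = HEADER.unpack_from(data, 0)
    if magic != GGUF_MAGIC:
        raise ValueError("bad magic")
    if version not in (2, 3):
        raise ValueError("unsupported version in this sketch")
    # Reject implausible counts before any allocation is sized from them.
    if tensor_count > MAX_TENSORS or kv_count > MAX_METADATA_ENTRIES:
        raise ValueError("count exceeds configured bound")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

The point of the caps is that attacker-controlled counts never drive allocation or loop bounds until they have been checked.
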
Artifact identity: The immutable evidence used to identify a model or associated file, including its canonical path policy, SHA-256 digest, and exact byte count.
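
A minimal sketch of how such an identity record might be computed, using only the Python standard library. The `ArtifactIdentity` type and `identify_artifact` function are illustrative names, not the package's API, and path canonicalization here is simply `Path.resolve`.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ArtifactIdentity:
    path: str        # canonical (resolved) path
    sha256: str      # lowercase hex digest of the file bytes
    byte_count: int  # exact file length in bytes


def identify_artifact(path, chunk_size=1 << 20):
    """Stream the file once, producing its digest and byte count together."""
    resolved = Path(path).resolve(strict=True)
    digest = hashlib.sha256()
    total = 0
    with resolved.open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            total += len(chunk)
    return ArtifactIdentity(str(resolved), digest.hexdigest(), total)
```

Computing the digest and the count in the same pass ensures both describe the same bytes.
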
Allowed root: A canonical directory boundary under which model and associated-artifact paths must resolve. It is used to reject path escape and, when configured, reparse-point traversal.
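
The containment check can be sketched as follows. This is an assumption-laden illustration: `resolve_under_root` is a hypothetical helper, it relies on `Path.resolve` to follow symbolic links, and it does not implement Windows reparse-point policy.

```python
from pathlib import Path


def resolve_under_root(allowed_root, candidate):
    """Return the canonical path for candidate, or raise if it escapes the root."""
    root = Path(allowed_root).resolve(strict=True)
    # Joining an absolute candidate discards the root, so the check below still applies.
    resolved = (root / candidate).resolve(strict=False)
    if resolved != root and root not in resolved.parents:
        raise PermissionError("path resolves outside the allowed root")
    return resolved
```

For example, a candidate of `../outside.gguf` resolves to the root's parent directory and is rejected, while `models/a.gguf` resolves beneath the root and is returned.
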
Tensor shape: The ordered logical dimensions of a tensor together with checked element-count semantics. Shape is distinct from storage type, byte layout, and allocation ownership.
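
One way to read "checked element-count semantics" is that the product of dimensions is verified against a limit before it is used. The sketch below assumes a signed 64-bit limit and non-negative dimensions; both are illustrative choices, and `checked_element_count` is a hypothetical name.

```python
INT64_MAX = (1 << 63) - 1


def checked_element_count(shape, limit=INT64_MAX):
    """Multiply dimensions, refusing any product that would exceed limit."""
    count = 1
    for dim in shape:
        if isinstance(dim, bool) or not isinstance(dim, int) or dim < 0:
            raise ValueError("dimensions must be non-negative integers")
        # Test before multiplying so the bound is never exceeded, even transiently.
        if dim != 0 and count > limit // dim:
            raise OverflowError("element count exceeds limit")
        count *= dim
    return count
```

Python integers do not overflow, so the explicit comparison stands in for the checked arithmetic that a fixed-width language would need.
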
Tensor storage type: The physical representation used for tensor values, such as floating-point, integer, or a block-quantized GGML layout. Representation does not imply every kernel supports that type.
Quantized block: A fixed group of logical elements encoded with a storage-specific byte layout, scales, and quantized values. Correct block geometry is required before dequantization or dot products.
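
To make block geometry concrete, the sketch below uses the GGML Q8_0 layout as commonly documented: 32 elements per block, stored as one little-endian 16-bit float scale followed by 32 signed 8-bit values (34 bytes). Treat the layout as an assumption to verify against the GGML sources; the function names are illustrative.

```python
import struct

Q8_0_BLOCK_ELEMS = 32
_BLOCK = struct.Struct("<e32b")  # fp16 scale, then 32 signed bytes
Q8_0_BLOCK_BYTES = _BLOCK.size   # 34


def q8_0_byte_size(n_elements):
    """Bytes needed for n_elements; the count must fill whole blocks."""
    if n_elements < 0 or n_elements % Q8_0_BLOCK_ELEMS:
        raise ValueError("element count is not a whole number of blocks")
    return n_elements // Q8_0_BLOCK_ELEMS * Q8_0_BLOCK_BYTES


def dequantize_q8_0(data, n_elements):
    """Check geometry first, then expand each block as scale * quant."""
    expected = q8_0_byte_size(n_elements)
    if len(data) != expected:
        raise ValueError("byte length does not match block geometry")
    out = []
    for offset in range(0, expected, Q8_0_BLOCK_BYTES):
        scale, *quants = _BLOCK.unpack_from(data, offset)
        out.extend(scale * q for q in quants)
    return out
```

The geometry check runs before any value is decoded, which mirrors the ordering the definition requires.
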
Mapped tensor: A tensor view backed by a mapped model file rather than a fully materialized managed array. The view must not outlive the object that owns the mapping.
Tokenizer: The model-coupled component that encodes text to token IDs and decodes IDs to bytes or text using vocabulary, merges, pre-tokenizer rules, and special-token metadata.
Special token: A model-defined control token such as beginning-of-sequence, end-of-sequence, padding, or chat-role marker. Adding, parsing, removing, or rendering it is an explicit option.
Chat template: A model-specific rule for converting structured messages into the exact prompt text expected by the tokenizer and model. It is not interchangeable across arbitrary models.
Tensor binding: The process of matching semantic model roles, such as token embeddings or attention weights, to compatible GGUF tensor descriptors and storage readers.
Reference path: A deliberately legible managed implementation used as a correctness anchor for the operations and model fixtures it covers.
Parity: Evidence that two implementations produce matching structural or numerical results for named cases under exact rules or declared tolerances.
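
A numerical parity check under a declared tolerance might look like the sketch below. `check_parity` is a hypothetical helper; with both tolerances left at zero it degenerates to the exact rule.

```python
import math


def check_parity(reference, candidate, rel_tol=0.0, abs_tol=0.0):
    """Return (index, reference, candidate) for every element outside tolerance."""
    if len(reference) != len(candidate):
        raise ValueError("structural mismatch: lengths differ")
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(reference, candidate))
        if not math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
```

An empty result is the parity evidence for that case. A non-empty result names the exact positions that diverged, and NaN values never compare as close.
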
Logits: The model output scores over the vocabulary before token selection. Sampling transforms and selection consume logits; they do not compute the model forward pass.
Greedy sampling: Deterministic token selection that chooses the highest-ranked eligible logit according to the package's tie-breaking behavior.
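
A minimal sketch of greedy selection follows. The tie rule used here (the lowest token index wins) is an assumption made for illustration; the package's actual tie behavior is not stated in this glossary.

```python
def greedy_select(logits, eligible=None):
    """Return the index of the largest eligible logit; ties go to the lowest index."""
    best_index = None
    for index, score in enumerate(logits):
        if eligible is not None and index not in eligible:
            continue
        # Strict comparison keeps the earliest index when scores are equal.
        if best_index is None or score > logits[best_index]:
            best_index = index
    if best_index is None:
        raise ValueError("no eligible token")
    return best_index
```

Because no random state is involved, the same logits and eligibility set always produce the same token.
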
Probability sampling: Token selection after explicit temperature, top-k, top-p, minimum-p, repetition, frequency, presence, or bias processing using caller-owned sampling state.
Sampling state: The random-generator and token-history state used by probability sampling. Reusing or resetting it changes the generation process.
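
The sketch below shows temperature, top-k, and top-p processing with a caller-owned random generator. It covers only those three transforms, omits token-history penalties, and uses a hypothetical function name; it is not the package's sampler.

```python
import math
import random


def sample_token(logits, rng, temperature=1.0, top_k=0, top_p=1.0):
    """Sample one token index; rng is owned and advanced by the caller."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy selection instead")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for a numerically stable softmax
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    # Rank by descending probability, breaking ties by lower index.
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)),
                    key=lambda pair: (-pair[0], pair[1]))
    if top_k > 0:
        ranked = ranked[:top_k]
    kept, cumulative = [], 0.0
    for prob, index in ranked:  # top-p: smallest prefix reaching the mass threshold
        kept.append((prob, index))
        cumulative += prob
        if cumulative >= top_p:
            break
    indices = [index for _, index in kept]
    probs = [prob for prob, _ in kept]
    return rng.choices(indices, weights=probs, k=1)[0]
```

Passing `random.Random(7)` to two independent runs yields the same token sequence, while sharing one generator between runs, or resetting it midway, changes the draws each run sees.
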
KV cache: Per-session key and value state retained by transformer attention so continuation tokens can reuse earlier sequence computations.
Prefill: The phase that commits prompt tokens and builds session state before iterative next-token decoding.
Decode step: One model evaluation and token-selection cycle that advances a session by a committed token.
Session isolation: The rule that independent conversations own separate position, KV cache, logits, generation history, and context evidence rather than a process-global active state.
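
The relationship between sessions, prefill, and the KV cache can be sketched with a small per-session object. Everything here is illustrative: `Session` is not the package's type, and `project` is a hypothetical stand-in for the model computation that produces a key and value for a token at a position.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """State owned by one conversation; nothing here is module-level."""
    position: int = 0
    kv_cache: list = field(default_factory=list)  # one (key, value) pair per committed token
    history: list = field(default_factory=list)   # committed token IDs

    def commit(self, token_id, key, value):
        """Record one token and advance the session position."""
        self.kv_cache.append((key, value))
        self.history.append(token_id)
        self.position += 1

    def prefill(self, prompt_tokens, project):
        """Commit every prompt token before any decode step runs."""
        for token_id in prompt_tokens:
            key, value = project(token_id, self.position)
            self.commit(token_id, key, value)
```

Because each instance builds its own lists through `default_factory`, prefilling one session cannot change another session's position, cache, or history.
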
UAIX context evidence: Display-safe, already validated profile/load-session and memory-route metadata retained with a local session. It is evidence only and grants no execution, command, network, provider, telemetry, export, or policy authority.
Host-owned responsibility: Behavior intentionally outside the runtime package, including model acquisition, prompt assembly, profile import, persistence, transport, UI, commands, network access, providers, audit, and user approval.
Evidence-bounded claim: A statement limited to what a named package artifact, fixture, managed run, parity record, model identity, or reproducible benchmark actually demonstrates.
Release gate: A required check such as clean consumption, tests, model parity, security review, licensing review, or deployment validation before a consuming application makes a stronger support claim.
No-op decision: A structured decision to make no change or take no external action when evidence, authority, budget, or review state is insufficient.
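
"Structured" here suggests the decision carries its reasons rather than being a silent skip. A minimal sketch, with hypothetical type and field names:

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    NO_OP = "no_op"


@dataclass(frozen=True)
class Decision:
    action: Action
    missing: tuple  # names of the preconditions that were not satisfied


def decide(evidence_ok, authority_ok, budget_ok, review_ok):
    """Return NO_OP, with reasons, unless every precondition holds."""
    checks = (
        ("evidence", evidence_ok),
        ("authority", authority_ok),
        ("budget", budget_ok),
        ("review", review_ok),
    )
    missing = tuple(name for name, ok in checks if not ok)
    return Decision(Action.NO_OP if missing else Action.PROCEED, missing)
```

Returning the list of unmet preconditions lets a host record why nothing was done.
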
Claim boundary: A policy that prevents language from exceeding evidence, for example by ruling out unsupported claims of universal compatibility, benchmark leadership, certification, or authority.