UAIX.LmRuntime.Tensors

Tensor shapes, data types, GGML storage traits, quantized-block metadata, and reference vector math.

Overview

Tensor descriptors, memory descriptors, layouts, and quantization metadata for pure C# local LLM runtime packages.

Who should use it: Model-format parsers, kernel authors, tensor inspectors, and tests that need shared storage and quantization semantics.

Execution status: Tensor descriptors, layouts, GGML storage traits, quantized block metadata, and reference vector operations are represented in the supplied source.

Install

.NET CLI

dotnet add package UAIX.LmRuntime.Tensors

Project file

<PackageReference Include="UAIX.LmRuntime.Tensors" />

Version policy: The documentation deliberately omits UAIX.LmRuntime package version numbers. Resolve and pin versions through your normal dependency-management and lock-file process.

Direct package dependencies

None within the UAIX.LmRuntime package family.

Package role and boundaries

Required for tensor layout and storage metadata

You need TensorShape, TensorDataType, GgmlTensorType, ITensor, or shared byte-length calculations.
You are validating GGML/GGUF quantized block layouts.
You need small reference Dot or RmsNorm operations for correctness tests.

Boundary

This package does not own model-file lifetime, tokenization, sampling, or LLaMA execution.
Do not assume that a storage type is executable merely because its traits are represented.

Logical shape

TensorShape keeps dimensions and checked element-count semantics separate from any particular allocation or model file.
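To make "checked element-count semantics" concrete, here is a minimal sketch of multiplying dimensions in a checked context, so an oversized shape throws instead of silently wrapping. CheckedElementCount is a hypothetical helper written for this guide, not a member of TensorShape, and the package's own validation rules may differ.

```csharp
using System;

// Hypothetical helper, not part of UAIX.LmRuntime.Tensors: multiplies the
// dimensions in a checked context so overflow throws instead of wrapping.
static long CheckedElementCount(ReadOnlySpan<long> dimensions)
{
    long count = 1;
    foreach (long dimension in dimensions)
    {
        if (dimension < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(dimensions), "Dimensions must be nonnegative.");
        }

        // Throws OverflowException if the product does not fit in a long.
        count = checked(count * dimension);
    }

    return count;
}

long elements = CheckedElementCount(new long[] { 4_096, 4_096 });
Console.WriteLine(elements); // 16777216
```

The count depends only on the dimensions, which is why it can be computed before any buffer or model file exists.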

Storage traits

TensorTypeTraitsCatalog maps GGML storage identifiers to block element counts, block byte counts, logical data types, and quantization status.
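As an illustration of what such a mapping holds, the sketch below builds a small lookup table by hand. The block sizes are those of the upstream GGML formats; the string keys, the tuple shape, and the blockTraits name are assumptions made for this example and do not mirror how TensorTypeTraitsCatalog is implemented.

```csharp
using System;
using System.Collections.Generic;

// Illustrative table only: storage name -> (elements per block, bytes per block, quantized?).
// Sizes follow the upstream GGML block layouts.
var blockTraits = new Dictionary<string, (int BlockElements, int BlockBytes, bool IsQuantized)>
{
    ["F32"]  = (1, 4, false),
    ["F16"]  = (1, 2, false),
    ["Q4_0"] = (32, 18, true), // 2-byte scale + 16 bytes of packed 4-bit values
    ["Q4_1"] = (32, 20, true), // scale + minimum + 16 bytes of packed values
    ["Q5_0"] = (32, 22, true), // scale + 4 bytes of high bits + 16 bytes of packed values
    ["Q5_1"] = (32, 24, true), // scale + minimum + high bits + packed values
    ["Q8_0"] = (32, 34, true), // 2-byte scale + 32 signed bytes
};

foreach (var (name, traits) in blockTraits)
{
    Console.WriteLine($"{name}: {traits.BlockElements} elements in {traits.BlockBytes} bytes, quantized: {traits.IsQuantized}");
}
```

Unquantized types fit the same model as one-element blocks, which lets a single byte-length formula cover every storage type.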

Reference math

VectorMath supplies small, legible operations useful for validation and parity checks; optimized model execution belongs in the kernel package.
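A parity check of the kind described here might look like the following sketch. Both functions are hypothetical stand-ins written for this guide: DotReference plays the role of a legible reference routine and DotUnrolled plays the role of an optimized kernel under test. Neither is the package's VectorMath.Dot, and the tolerance is one conservative choice among several.

```csharp
using System;

// Straightforward reference loop (hypothetical stand-in for a reference routine).
static float DotReference(ReadOnlySpan<float> left, ReadOnlySpan<float> right)
{
    if (left.Length != right.Length)
    {
        throw new ArgumentException("Vectors must have equal length.");
    }

    float sum = 0f;
    for (int i = 0; i < left.Length; i++)
    {
        sum += left[i] * right[i];
    }

    return sum;
}

// Four-way unrolled variant standing in for an optimized kernel under test.
static float DotUnrolled(ReadOnlySpan<float> left, ReadOnlySpan<float> right)
{
    if (left.Length != right.Length)
    {
        throw new ArgumentException("Vectors must have equal length.");
    }

    float s0 = 0f, s1 = 0f, s2 = 0f, s3 = 0f;
    int i = 0;
    for (; i + 4 <= left.Length; i += 4)
    {
        s0 += left[i] * right[i];
        s1 += left[i + 1] * right[i + 1];
        s2 += left[i + 2] * right[i + 2];
        s3 += left[i + 3] * right[i + 3];
    }

    // Remaining elements when the length is not a multiple of four.
    for (; i < left.Length; i++)
    {
        s0 += left[i] * right[i];
    }

    return (s0 + s1) + (s2 + s3);
}

const int Length = 1_003;
const double MachineEpsilon = 1.1920929e-7; // 2^-23 for float; float.Epsilon is a different value

float[] a = new float[Length];
float[] b = new float[Length];
var random = new Random(1234);
double sumOfMagnitudes = 0.0;
for (int index = 0; index < Length; index++)
{
    a[index] = (float)(random.NextDouble() * 2.0 - 1.0);
    b[index] = (float)(random.NextDouble() * 2.0 - 1.0);
    sumOfMagnitudes += Math.Abs((double)a[index] * b[index]);
}

float expected = DotReference(a, b);
float actual = DotUnrolled(a, b);

// The two loops add terms in a different order, so compare against a
// rounding-error bound instead of testing for exact equality.
double tolerance = 2.0 * Length * MachineEpsilon * sumOfMagnitudes;
bool withinTolerance = Math.Abs((double)expected - actual) <= tolerance;
Console.WriteLine($"Parity check passed: {withinTolerance}");
```

Scaling the tolerance by the sum of term magnitudes keeps the check meaningful when the true dot product is close to zero.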

Key types

These are the main public entry points. The generated reference below includes the documented public package surface.

TensorShape, GgmlTensorType, TensorDataType, TensorTypeTraits, TensorTypeTraitsCatalog, QuantizedBlockTraits, ITensor, VectorMath

Coding examples

Examples use the documented public package surface. Paths, identities, runtime identifiers, device evidence, and application policy remain host inputs.

Calculate tensor storage requirements

Derive element count and encoded byte length from a logical shape and GGML storage type.

TensorStorageExample.cs

using System;
using UAIX.LmRuntime.Tensors;

// Logical shape: a 4,096 x 4,096 matrix.
TensorShape shape = TensorShape.From(4_096, 4_096);

// Block layout registered for the Q4_0 storage type.
TensorTypeTraits traits = TensorTypeTraitsCatalog.Get(GgmlTensorType.Q4_0);

// Encoded size in bytes for that many logical elements.
ulong byteLength = TensorTypeTraitsCatalog.ComputeByteLength(
    GgmlTensorType.Q4_0,
    checked((ulong)shape.ElementCount));

Console.WriteLine($"Elements: {shape.ElementCount:N0}");
Console.WriteLine($"Block elements: {traits.BlockElementCount}");
Console.WriteLine($"Block bytes: {traits.BlockByteCount}");
Console.WriteLine($"Encoded bytes: {byteLength:N0}");
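The arithmetic behind an encoded byte length can be worked through by hand. The sketch below assumes the upstream GGML Q4_0 layout (32 elements in 18 bytes) and assumes that the element count must be a whole number of blocks. EncodedByteLength is a hypothetical helper for illustration, not the package's ComputeByteLength, whose exact validation rules are not shown in this guide.

```csharp
using System;

// Hypothetical helper that mirrors the arithmetic only:
// bytes = (elements / elements-per-block) * bytes-per-block.
static ulong EncodedByteLength(ulong elementCount, ulong blockElements, ulong blockBytes)
{
    if (blockElements == 0)
    {
        throw new ArgumentOutOfRangeException(nameof(blockElements));
    }

    // Assumption for this sketch: partial blocks are rejected.
    if (elementCount % blockElements != 0)
    {
        throw new ArgumentException("Element count is not a whole number of blocks.", nameof(elementCount));
    }

    ulong blockCount = elementCount / blockElements;
    return checked(blockCount * blockBytes);
}

// 4,096 x 4,096 = 16,777,216 elements = 524,288 blocks of 18 bytes each.
ulong q4Bytes = EncodedByteLength(4_096UL * 4_096UL, 32, 18);
Console.WriteLine(q4Bytes); // 9437184
```

The same matrix stored as F32 (one element per 4-byte block) would need 67,108,864 bytes, roughly seven times as much.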

Run reference vector operations

Use span-based reference math without allocating intermediate arrays.

VectorMathExample.cs

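using System;
using UAIX.LmRuntime.Tensors;

// Dot product of two equal-length vectors.
ReadOnlySpan<float> left = [1.0f, 2.0f, 3.0f, 4.0f];
ReadOnlySpan<float> right = [0.5f, 0.25f, -1.0f, 2.0f];

float dot = VectorMath.Dot(left, right);
Console.WriteLine($"Dot: {dot}");

// RMS normalization into a caller-owned, stack-allocated destination.
ReadOnlySpan<float> input = [1.0f, 2.0f, 3.0f, 4.0f];
ReadOnlySpan<float> weight = [1.0f, 1.0f, 1.0f, 1.0f];
Span<float> normalized = stackalloc float[input.Length];

VectorMath.RmsNorm(input, weight, normalized, epsilon: 1e-5f);

foreach (float value in normalized)
{
    Console.WriteLine(value);
}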
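For readers who want to check the RmsNorm output by hand, this sketch implements the conventional RMS normalization formula: each element is divided by the square root of the mean square plus epsilon, then multiplied by its weight. RmsNormReference is a hypothetical function written for this guide. Whether the package accumulates in double or applies epsilon at exactly this point is an assumption, so small differences in the last digits are possible.

```csharp
using System;

// Conventional RMSNorm: output[i] = input[i] / sqrt(mean(input^2) + epsilon) * weight[i].
// Accumulator width and epsilon placement are assumptions of this sketch.
static void RmsNormReference(
    ReadOnlySpan<float> input,
    ReadOnlySpan<float> weight,
    Span<float> output,
    float epsilon)
{
    if (input.Length == 0)
    {
        throw new ArgumentException("Input must not be empty.", nameof(input));
    }

    if (weight.Length != input.Length || output.Length < input.Length)
    {
        throw new ArgumentException("Weight and output must cover the input length.");
    }

    double sumOfSquares = 0.0;
    foreach (float value in input)
    {
        sumOfSquares += (double)value * value;
    }

    float meanSquare = (float)(sumOfSquares / input.Length);
    float scale = 1.0f / MathF.Sqrt(meanSquare + epsilon);

    for (int i = 0; i < input.Length; i++)
    {
        output[i] = input[i] * scale * weight[i];
    }
}

float[] samples = { 1.0f, 2.0f, 3.0f, 4.0f };
float[] gains = { 1.0f, 1.0f, 1.0f, 1.0f };
float[] result = new float[samples.Length];

RmsNormReference(samples, gains, result, 1e-5f);
Console.WriteLine(string.Join(" ", result));
```

For this input the mean square is 7.5, so every element is scaled by roughly 0.3651 and the last output is close to 1.4606.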

Inspect quantized block metadata

Query a quantized layout before accepting tensor dimensions or allocating a destination buffer.

QuantizedTraitExample.cs

using System;
using UAIX.LmRuntime.Tensors;

// Look up the block layout for Q5_0 before sizing any buffers.
QuantizedBlockTrait trait = QuantizedBlockTraits.Get(GgmlTensorType.Q5_0);

Console.WriteLine(trait.GgmlType);
Console.WriteLine($"Elements per block: {trait.BlockElementCount}");
Console.WriteLine($"Bytes per block: {trait.BlockByteCount}");
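One way to use these values before accepting tensor dimensions is sketched below. It hard-codes the upstream GGML Q5_0 block size (32 elements in 22 bytes) so that it runs on its own; in real code the two numbers would come from the trait lookup. TryGetRowByteLength is a hypothetical helper, and rejecting rows that are not a whole number of blocks is an assumption of this sketch.

```csharp
using System;

// Upstream GGML Q5_0 block size, hard-coded so the sketch is self-contained.
const int BlockElementCount = 32;
const int BlockByteCount = 22;

// Hypothetical pre-allocation check: succeeds only when the row is a whole
// number of blocks, and reports the encoded size of one row.
static bool TryGetRowByteLength(long length, int blockElements, int blockBytes, out long bytes)
{
    bytes = 0;
    if (length <= 0 || blockElements <= 0 || length % blockElements != 0)
    {
        return false; // a partial block cannot be encoded or decoded
    }

    bytes = checked(length / blockElements * blockBytes);
    return true;
}

foreach (long rowLength in new long[] { 4_096, 4_100 })
{
    if (TryGetRowByteLength(rowLength, BlockElementCount, BlockByteCount, out long rowBytes))
    {
        Console.WriteLine($"{rowLength} elements -> {rowBytes} bytes per row");
    }
    else
    {
        Console.WriteLine($"{rowLength} elements rejected: not a whole number of blocks");
    }
}
```

Running the check first means a malformed dimension is reported as a validation failure instead of surfacing later as an out-of-range read.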

Generated API reference

Each type below lists its documented public fields, properties, constructors, methods, parameter descriptions, and return descriptions.

GgmlTensorType (UAIX.LmRuntime.Tensors, 21 members)

Identifies GGML tensor storage types as encoded in GGUF tensor descriptors.

Field F32

32-bit floating point.

Field F16

16-bit floating point.

Field Q4_0

Q4_0 block quantization.

Field Q4_1

Q4_1 block quantization.

Field Q5_0

Q5_0 block quantization.

Field Q5_1

Q5_1 block quantization.

Field Q8_0

Q8_0 block quantization.

Field Q8_1

Q8_1 block quantization.

Field Q2_K

Q2_K block quantization.

Field Q3_K

Q3_K block quantization.

Field Q4_K

Q4_K block quantization.

Field Q5_K

Q5_K block quantization.

Field Q6_K

Q6_K block quantization.

Field Q8_K

Q8_K block quantization.

Field I64

64-bit signed integer.

Field I32

32-bit signed integer.

Field I16

16-bit signed integer.

Field I8

8-bit signed integer.

Field F64

64-bit floating point.

Field BF16

16-bit brain floating point.

Field IQ4_NL

IQ4_NL block quantization.

ITensor (UAIX.LmRuntime.Tensors, 2 members)

Defines tensor metadata common to all backend placements.

Property Shape

Gets the tensor shape.

Property DataType

Gets the tensor element representation.

QuantizedBlockTrait (UAIX.LmRuntime.Tensors, 3 members)

Describes a quantized block layout.

Property GgmlType

Gets the GGML tensor type.

Property BlockElementCount

Gets the logical elements in one block.

Property BlockByteCount

Gets the physical bytes in one block.

QuantizedBlockTraits (UAIX.LmRuntime.Tensors, 1 member)

Provides quantized block trait lookup.

Method Get(UAIX.LmRuntime.Tensors.GgmlTensorType)

Gets quantized block layout information for a GGML tensor type.

type: The declared GGML tensor or metadata type used to select the corresponding decoding and validation rules.

Returns: The QuantizedBlockTrait describing the block layout of the specified GGML tensor type, returned only after all documented validation succeeds.

Q4_1Block (UAIX.LmRuntime.Tensors, 3 members)

Represents a Q4_1 quantized block descriptor.

Property Scale

Gets the block scale.

Property Minimum

Gets the block minimum.

Property PackedValues

Gets the packed values.

Q5_0Block (UAIX.LmRuntime.Tensors, 3 members)

Represents a Q5_0 quantized block descriptor.

Property Scale

Gets the block scale.

Property HighBits

Gets the high-bit metadata.

Property PackedValues

Gets the packed low-bit values.

Q5_1Block (UAIX.LmRuntime.Tensors, 4 members)

Represents a Q5_1 quantized block descriptor.

Property Scale

Gets the block scale.

Property Minimum

Gets the block minimum.

Property HighBits

Gets the high-bit metadata.

Property PackedValues

Gets the packed low-bit values.

Q8_1Block (UAIX.LmRuntime.Tensors, 3 members)

Represents a Q8_1 quantized block descriptor.

Property Scale

Gets the block scale.

Property Sum

Gets the block sum metadata.

Property Values

Gets the quantized values.

TensorDataType (UAIX.LmRuntime.Tensors, 24 members)

Identifies supported tensor element representations.

Field Unknown

Unknown or unsupported tensor representation.

Field Float32

32-bit IEEE floating point.

Field Float16

16-bit IEEE floating point.

Field BFloat16

16-bit brain floating point.

Field Int8

8-bit signed integer.

Field Int16

16-bit signed integer.

Field Int32

32-bit signed integer.

Field Int64

64-bit signed integer.

Field Q4_0

GGML Q4_0 block quantization.

Field Q4_1

GGML Q4_1 block quantization.

Field Q5_0

GGML Q5_0 block quantization.

Field Q5_1

GGML Q5_1 block quantization.

Field Q8_0

GGML Q8_0 block quantization.

Field Q8_1

GGML Q8_1 block quantization.

Field Q2_K

GGML Q2_K block quantization.

Field Q3_K

GGML Q3_K block quantization.

Field Q4_K

GGML Q4_K block quantization.

Field Q5_K

GGML Q5_K block quantization.

Field Q6_K

GGML Q6_K block quantization.

Field IQ4_NL

GGML IQ4_NL block quantization.

Field MXFP4

MXFP4 packed floating-point storage.

Field NVFP4

NVFP4 packed floating-point storage.

Field TQ1_0

TQ1_0 ternary quantized storage.

Field TQ2_0

TQ2_0 ternary quantized storage.

TensorShape (UAIX.LmRuntime.Tensors, 3 members)

Represents immutable tensor shape metadata.

Property Dimensions

Gets the tensor dimensions.

Property ElementCount

Gets the number of tensor elements.

Method From(long[])

Creates a tensor shape from the supplied dimensions after validating them.

Returns: The TensorShape for the supplied dimensions, returned only after all documented validation succeeds.

TensorTypeTraits (UAIX.LmRuntime.Tensors, 5 members)

Describes storage traits for a GGML tensor type.

Property GgmlType

Gets the GGML tensor type.

Property DataType

Gets the runtime tensor data type.

Property BlockElementCount

Gets the number of logical elements in one physical storage block.

Property BlockByteCount

Gets the number of physical bytes in one storage block.

Property IsQuantized

Gets a value indicating whether the type is block-quantized.

TensorTypeTraitsCatalog (UAIX.LmRuntime.Tensors, 2 members)

Provides GGML tensor type trait lookup and byte-length validation.

Method Get(UAIX.LmRuntime.Tensors.GgmlTensorType)

Gets traits for the specified tensor type.

type: The declared GGML tensor or metadata type used to select the corresponding decoding and validation rules.

Returns: The TensorTypeTraits registered for the specified tensor type, returned only after all documented validation succeeds.

Method ComputeByteLength(UAIX.LmRuntime.Tensors.GgmlTensorType,ulong)

Computes the physical byte length required for a tensor type and element count.

type: The declared GGML tensor or metadata type used to select the corresponding decoding and validation rules.
elementCount: The number of logical elements to size; it must be within the supported range.

Returns: The physical byte length as a ulong. Range and overflow checks are completed before the value is returned.

VectorMath (UAIX.LmRuntime.Tensors, 2 members)

Provides allocation-free vector math kernels used by tests and CPU fallback paths.

Method Dot(System.ReadOnlySpan<float>,System.ReadOnlySpan<float>)

Computes the dot product of two equal-length vectors.

left: The left vector. Its length is validated against right before any element is accessed.
right: The right vector. Its length is validated against left before any element is accessed.

Returns: The dot product as a float. Range, finite-value, and overflow checks are completed before the value is returned.

Method RmsNorm(System.ReadOnlySpan<float>,System.ReadOnlySpan<float>,System.Span<float>,float)

Applies RMS normalization using an explicit weight vector.

input: The source data consumed by the operation; caller-owned storage is not retained after the method returns.
weight: The weight sequence used by this operation; its required length, ordering, and element bounds are validated before access.
output: The caller-owned destination buffer that receives the result; required capacity is validated before any write occurs.
epsilon: The positive normalization epsilon added to the mean-square term to avoid division by zero while preserving deterministic numerical behavior.

Frequently asked questions

Does ComputeByteLength dequantize a tensor?

No. It calculates storage size from the registered block traits. Dequantization and matrix operations are in Kernels.Cpu.

Why are logical element count and encoded byte length separate?

Quantized formats encode a block of logical values into a smaller fixed-size block. Treating encoded bytes as scalar elements causes range, alignment, and allocation errors.
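A short calculation shows how far the two quantities diverge. The sketch uses upstream GGML block sizes, and BitsPerElement is a hypothetical helper introduced only for this example.

```csharp
using System;

// Effective storage cost of one logical element: block bytes * 8 / block elements.
static double BitsPerElement(int blockElements, int blockBytes) => blockBytes * 8.0 / blockElements;

double f32Bits = BitsPerElement(1, 4);    // one 4-byte element per block
double q8Bits = BitsPerElement(32, 34);   // Q8_0: 32 elements in 34 bytes
double q4Bits = BitsPerElement(32, 18);   // Q4_0: 32 elements in 18 bytes

Console.WriteLine($"F32 bits per element: {f32Bits}");
Console.WriteLine($"Q8_0 bits per element: {q8Bits}");
Console.WriteLine($"Q4_0 bits per element: {q4Bits}");

// A 4,096-element Q4_0 row is 128 blocks of 18 bytes.
long rowElements = 4_096;
long encodedRowBytes = rowElements / 32 * 18;
Console.WriteLine($"{rowElements} elements -> {encodedRowBytes} encoded bytes");
```

Because a Q4_0 row of 4,096 elements occupies only 2,304 bytes, indexing the encoded buffer as if it held 4,096 scalars would run past its end. This is the kind of range error the separation is meant to prevent.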

Does ITensor own memory?

Not by itself. The interface describes shape and data type only. Concrete ownership and disposal rules belong to the implementing type.