Alloy · Research Track · Active

Unified representation learning for the intelligence substrate.

Alloy researches unified cross-modal representation learning for signals of arbitrary dimension. It is the intelligence-substrate research track for Stratum. The central question is whether a single representation can encode 1D time series, 2D images, and 3D spatial data without modality-specific preprocessing.

Phase 2 is active: coordinate-value tokenization treats every signal as a function over coordinates. The tokenizer does not know whether a patch came from a 1D or a 2D signal. It knows only the coordinate where the patch was sampled and the values observed there. Gradient-separated training prevents alignment losses from distorting the reconstruction backbone.
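The page does not say how gradients are separated. One common way to get this effect is a stop-gradient: the alignment loss reads the backbone's latents but treats them as constants, so only the alignment head is updated by it. The sketch below illustrates that idea with hand-written gradients on a toy linear model. The names `W`, `D`, `P`, `backbone_grads`, and `alignment_grad` are hypothetical and not taken from the Alloy codebase.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))        # toy batch: 8 tokens, 4 features
target = rng.normal(size=(8, 2))   # stand-in alignment target (assumption)
W = rng.normal(size=(4, 3)) * 0.1  # backbone encoder, x -> z
D = rng.normal(size=(3, 4)) * 0.1  # reconstruction decoder, z -> x_hat
P = rng.normal(size=(3, 2)) * 0.1  # alignment head, z -> a


def backbone_grads(x, W, D):
    """Gradients of mean((x W D - x)^2) with respect to W and D."""
    z = x @ W
    err = z @ D - x
    g = 2.0 * err / err.size           # dL/d(x_hat)
    return x.T @ (g @ D.T), z.T @ g    # (grad_W, grad_D)


def alignment_grad(x, W, P, target):
    """Gradient of mean((z P - target)^2) with respect to P only.

    z is treated as a constant (stop-gradient), so nothing flows back into W.
    """
    z = x @ W
    err = z @ P - target
    return z.T @ (2.0 * err / err.size)


lr = 0.1
grad_W, grad_D = backbone_grads(x, W, D)
grad_P = alignment_grad(x, W, P, target)
W -= lr * grad_W   # backbone: reconstruction gradient only
D -= lr * grad_D
P -= lr * grad_P   # alignment head: alignment gradient only
```

In an autodiff framework the same separation is usually written with `jax.lax.stop_gradient` or `torch.Tensor.detach` on the latents before they enter the alignment loss.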

§01

Multi-Signal Input

nD + 1 · n is free per signal

The input is S independent signals. Each signal s is a set of T tokens; each token is a pair (coord ∈ ℝⁿ, value ∈ ℝ^d_v), that is, one sample of a function f_s over an n-dimensional coordinate space. n is free per signal: time series take n=1, images n=2, volumes and point clouds n=3.

S signals · s ∈ {0, …, S−1} · f_s: ℝ^(n_s) → ℝ^d_v · T samples

n=1 · ℝ · time series → coords (T, 1) · values (T, d_v)

n=2 · ℝ² · image / lattice → coords (T, 2) · values (T, d_v)

n=3 · ℝ³ · point cloud / volume → coords (T, 3) · values (T, d_v)
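As a rough illustration of this input format, the sketch below turns dense arrays of different dimensionality into the same (coords, values) pair of arrays. The `Signal` class and the `from_dense` helper are hypothetical names, and normalizing coordinates to [0, 1] is an assumption rather than something the page specifies.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Signal:
    coords: np.ndarray  # (T, n): where each sample was taken
    values: np.ndarray  # (T, d_v): what was observed there

    def __post_init__(self):
        if self.coords.ndim != 2 or self.values.ndim != 2:
            raise ValueError("coords and values must both be 2D arrays")
        if self.coords.shape[0] != self.values.shape[0]:
            raise ValueError("coords and values must have the same T")

    @property
    def n(self):
        return self.coords.shape[1]


def from_dense(array):
    """Turn a dense array of shape (*grid, d_v) into coordinate-value tokens."""
    grid = array.shape[:-1]
    # Assumption: coordinates are normalized to [0, 1] along every axis.
    axes = [np.linspace(0.0, 1.0, size) for size in grid]
    mesh = np.meshgrid(*axes, indexing="ij")
    coords = np.stack([m.ravel() for m in mesh], axis=-1)  # (T, n)
    values = array.reshape(-1, array.shape[-1])            # (T, d_v)
    return Signal(coords, values)


rng = np.random.default_rng(0)
series = from_dense(rng.normal(size=(64, 2)))        # n=1, T=64
image = from_dense(rng.normal(size=(8, 8, 2)))       # n=2, T=64
volume = from_dense(rng.normal(size=(4, 4, 4, 2)))   # n=3, T=64
```

All three signals end up with T=64 tokens and d_v=2; only the width of `coords` differs, which is the one place the signal's dimensionality shows up.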

§02

Disentangled Coordinate Tokenizer

(coord, value) → ℝ^d_model

Each raw token — a pair (coord, value) — is embedded by two independent branches whose outputs are concatenated. The coord branch carries where, the value branch carries what, and an orthogonality loss ℒ_ortho keeps the two subspaces from collapsing into each other inside the joint d_model embedding.
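The exact form of ℒ_ortho is not given here. One plausible reading, shown below purely as an assumption, penalizes the squared cross-correlation between the coord half and the value half of the token embeddings, so the loss is small when the two halves carry unrelated information and large when one half copies the other. The function name `ortho_loss` is hypothetical.

```python
import numpy as np


def ortho_loss(tokens, eps=1e-8):
    """Mean squared cross-correlation between the two halves of (N, d_model) tokens.

    Assumed form of the orthogonality penalty; not confirmed by the source.
    """
    half = tokens.shape[-1] // 2
    coord_part = tokens[:, :half]
    value_part = tokens[:, half:]
    # Center and scale each feature so the product below is a correlation.
    coord_part = coord_part - coord_part.mean(axis=0)
    value_part = value_part - value_part.mean(axis=0)
    coord_part = coord_part / (np.linalg.norm(coord_part, axis=0) + eps)
    value_part = value_part / (np.linalg.norm(value_part, axis=0) + eps)
    cross = coord_part.T @ value_part  # (half, half) correlation matrix
    return float(np.mean(cross ** 2))


rng = np.random.default_rng(0)
coord_emb = rng.normal(size=(1000, 4))
independent = np.concatenate([coord_emb, rng.normal(size=(1000, 4))], axis=-1)
collapsed = np.concatenate([coord_emb, coord_emb], axis=-1)

loss_independent = ortho_loss(independent)  # near zero
loss_collapsed = ortho_loss(collapsed)      # much larger
```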

Two-branch tokenizer · applied to one token at a time, independently across (B, S, T)

Coordinate branch · in: ℝⁿ

ℝⁿ → Fourier encoding (F frequencies · sin/cos · no learned params) → ℝ^2F

ℝ^2F → Linear · LayerNorm (2F → d_model/2 · normalize per token) → ℝ^(d_model/2)

coord embedding · d_model/2

Value branch · in: ℝ^d_v

ℝ^d_v → Linear projection (d_v → d_model/2 · learned weights) → ℝ^(d_model/2)

ℝ^(d_model/2) → LayerNorm · GELU (normalize · smooth gated nonlinearity) → ℝ^(d_model/2)

value embedding · d_model/2

⊕ concat [coord_emb ‖ value_emb] · d_model/2 + d_model/2

Token embedding · ℝ^d_model · no positional encoding added

coord subspace · dims 0 … d_model/2 − 1

value subspace · dims d_model/2 … d_model − 1

Position is implicit in coord · (B, S, T, d_model) → transformer
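The two branches above can be sketched as a forward pass in NumPy. Several details are assumptions made only so the example runs: the Fourier frequencies are a fixed random matrix, coordinates are zero-padded to a maximum dimension `N_MAX` so that the encoding is always 2F wide, biases and LayerNorm scale/shift parameters are omitted, and GELU uses the tanh approximation. All function and variable names are hypothetical.

```python
import numpy as np

N_MAX = 3  # assumption: largest coordinate dimension the tokenizer will see


def fourier_encode(coords, freqs):
    """Map coords (..., n) to fixed sin/cos features (..., 2F); nothing is learned."""
    n = coords.shape[-1]
    pad = [(0, 0)] * (coords.ndim - 1) + [(0, freqs.shape[0] - n)]
    padded = np.pad(coords, pad)            # zero-pad unused axes up to N_MAX
    phase = 2.0 * np.pi * (padded @ freqs)  # (..., F)
    return np.concatenate([np.sin(phase), np.cos(phase)], axis=-1)


def layer_norm(x, eps=1e-5):
    """Normalize each token over its feature axis (no learned scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def gelu(x):
    """Tanh approximation of GELU."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))


def tokenize(coords, values, freqs, w_coord, w_value):
    """Embed (coord, value) pairs as [coord_emb ‖ value_emb] of width d_model."""
    coord_emb = layer_norm(fourier_encode(coords, freqs) @ w_coord)  # where
    value_emb = gelu(layer_norm(values @ w_value))                   # what
    return np.concatenate([coord_emb, value_emb], axis=-1)


rng = np.random.default_rng(0)
F, d_v, d_model = 16, 2, 64
freqs = rng.normal(size=(N_MAX, F))                     # fixed, never trained
w_coord = rng.normal(size=(2 * F, d_model // 2)) * 0.1  # stands in for learned weights
w_value = rng.normal(size=(d_v, d_model // 2)) * 0.1    # stands in for learned weights

B, S, T = 2, 3, 10
for n in (1, 2, 3):
    coords = rng.uniform(size=(B, S, T, n))
    values = rng.normal(size=(B, S, T, d_v))
    tokens = tokenize(coords, values, freqs, w_coord, w_value)
    print(n, tokens.shape)
```

Every call produces tokens of shape (B, S, T, d_model) regardless of n, and the first d_model/2 dimensions depend only on the coordinates. A batch that mixes signals with different n would need the coordinates padded to `N_MAX` before stacking.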

Status: active research. Best Phase 2 result so far: mixing ratio 0.140, unification ratio 1.184. The training pipeline runs locally, on GPU, or on RunPod. The evaluation suite covers 10 metrics, including semantic alignment, probing accuracy, and latent slot specialization.

