M1-Lite Paper Embedding Model ONNX

This is a compact student encoder distilled from PeytonT/1m-paper-embedding-model over the 1M-paper corpus. It predicts embeddings in the same 768-dimensional, L2-normalized M1 space used by the Research Library paper universe.

Files

Path                   Description
onnx/model.onnx        Float ONNX student encoder.
onnx/model.int8.onnx   Dynamically quantized int8 ONNX model for browser/WASM inference.
tokenizer/             Student tokenizer files.
manifest.json          Export metadata consumed by the static viewer.

Training

  • teacher target: exports/huggingface/paper_universe_interactive_v1/semantic_m1/papers_all.emb.i8
  • interactive level: exports/huggingface/paper_universe_interactive_v1/interactive/papers_all.json
  • rows: 1,000,000
  • final eval cosine vs M1 targets: 0.7351
  • final eval MSE vs M1 targets: 0.000690
  • int8 ONNX size: 33.7 MB
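
The two eval figures above are linked if both the student outputs and the M1 targets are unit-length and the MSE is averaged over all 768 components. In that case MSE = 2(1 - cos) / 768. Neither condition is stated in this card, so treat both as assumptions, although the reported numbers are consistent with them. A minimal Python sketch that checks the identity on random unit vectors and then applies it to the reported cosine (all helper names are illustrative and not part of this repository):

```python
import math
import random

DIM = 768  # embedding width stated in this card


def unit(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit-length."""
    return sum(x * y for x, y in zip(a, b))


def mse(a, b):
    """Mean squared error averaged over all components."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mse_from_cosine(cos, dim=DIM):
    """For unit vectors, ||a - b||^2 = 2 - 2*cos, so the per-component mean is 2*(1 - cos)/dim."""
    return 2.0 * (1.0 - cos) / dim


# Check the identity on two random unit vectors.
rng = random.Random(0)
a = unit([rng.gauss(0.0, 1.0) for _ in range(DIM)])
b = unit([rng.gauss(0.0, 1.0) for _ in range(DIM)])
assert math.isclose(mse(a, b), mse_from_cosine(cosine(a, b)), rel_tol=1e-9)

# Apply it to the reported eval cosine.
implied_mse = mse_from_cosine(0.7351)
print(f"{implied_mse:.6f}")  # prints 0.000690
```

Under these assumptions the MSE carries no information beyond the cosine, so the cosine is the more readable of the two when comparing checkpoints.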

Embedding Format

  • output name: embedding
  • shape: [batch, 768]
  • pooling: attention-mask mean pooling, projection to 768 dimensions, L2 normalization
  • intended max sequence length: 128
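
The pooling line above can be sketched in plain Python. This is an illustration of the three named steps only, on toy sizes. The hidden width, the presence of a bias term, and all function names are assumptions, and since the model output is already [batch, 768], the exported graph presumably performs these steps internally.

```python
import math


def masked_mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    width = len(token_vectors[0])
    total = [0.0] * width
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    count = max(count, 1)  # guard against an all-padding row
    return [t / count for t in total]


def project(vec, weight, bias):
    """Linear projection; weight has shape [out_dim][in_dim] (bias is an assumption)."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]


def l2_normalize(vec, eps=1e-12):
    """Scale to unit L2 norm, with a small floor to avoid division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


# Toy example: 3 tokens of hidden width 2, the last token is padding.
# The projection maps to 3 dimensions here; the real model maps to 768.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.0]

pooled = masked_mean_pool(tokens, mask)  # padding token is ignored
embedding = l2_normalize(project(pooled, weight, bias))
```

Masking before averaging keeps padding tokens from diluting the pooled vector, which matters when short titles are padded up to the 128-token limit.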

The output is compatible with the existing semantic_m1/*.emb.i8 paper-vector files.
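
As a rough illustration of what that compatibility allows, the sketch below scores a query embedding against int8 paper vectors by cosine similarity. The layout of the .emb.i8 files is not documented here, so the code assumes headerless, row-major signed 8-bit values with one uniform symmetric scale. Under that assumption the scale cancels out of the cosine and never needs to be known. The function names are hypothetical, and the toy data uses 4 dimensions instead of 768.

```python
import array
import math


def read_i8_rows(raw, dim):
    """Split raw bytes into rows of signed 8-bit integers (assumed layout, see above)."""
    values = array.array("b")
    values.frombytes(raw)
    if len(values) % dim:
        raise ValueError("byte count is not a multiple of the embedding width")
    return [values[i:i + dim] for i in range(0, len(values), dim)]


def cosine(query, row):
    """Cosine similarity; re-normalizing the row makes any uniform int8 scale irrelevant."""
    dot = sum(q * r for q, r in zip(query, row))
    q_norm = math.sqrt(sum(q * q for q in query))
    r_norm = math.sqrt(sum(r * r for r in row))
    return dot / (q_norm * r_norm) if q_norm and r_norm else 0.0


# Toy stand-in for a paper-vector file: three 4-dimensional int8 rows.
toy_dim = 4
raw = array.array("b", [127, 0, 0, 0,
                        0, 127, 0, 0,
                        -127, 0, 0, 0]).tobytes()
rows = read_i8_rows(raw, toy_dim)

query = [1.0, 0.0, 0.0, 0.0]  # stands in for a model output row
scores = [cosine(query, row) for row in rows]
best = max(range(len(scores)), key=scores.__getitem__)
```

With real files, `toy_dim` would be 768 and `raw` would come from reading the file in binary mode, provided the layout assumption holds.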

