nomic-ai/nomic-embed-vision-v1.5
Nomic-embed-vision and nomic-embed-text form a unified latent space for high-performance vision, language, and multimodal tasks.
This technical report describes the training of nomic-embed-vision, a highly performant, open-code, open-weights image embedding model that shares the same latent space as nomic-embed-text. Together, nomic-embed-vision and nomic-embed-text form the first unified latent space to achieve high performance across vision, language, and multimodal tasks.
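To make the idea of a shared latent space concrete, here is a minimal sketch of cross-modal retrieval: because text and image embeddings live in the same space, a text query can be compared to image vectors directly with cosine similarity. The vectors, file names, and the helper `rank_images` below are hypothetical placeholders chosen for illustration; they are not outputs of nomic-embed-vision or nomic-embed-text, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Sort images by similarity to a text query, most similar first.

    Hypothetical helper: it only makes sense if both kinds of embedding
    come from models that share one latent space.
    """
    scored = [
        (name, cosine_similarity(text_embedding, vector))
        for name, vector in image_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real model outputs (assumption).
toy_text_query = [0.9, 0.1, 0.0]
toy_images = {
    "cat.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.1, 0.9, 0.2],
    "tree.jpg": [0.0, 0.2, 0.9],
}

ranking = rank_images(toy_text_query, toy_images)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the toy vectors would be replaced by embeddings produced by the text and vision models, and the same ranking step would apply unchanged.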
Get this paper in your agent:
hf papers read 2406.18587
curl -LsSf https://hf.co/cli/install.sh | bash