Instructions to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with llama-cpp-python:

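# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (cached after the first run) and load it
llm = Llama.from_pretrained(
    repo_id="Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF",
    filename="magnum-1b-v1-iq4_xs-imat.gguf",
)

# Run a single-turn chat completion
response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)

# Print only the generated text
print(response["choices"][0]["message"]["content"])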
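
For story generation, the same call can be wrapped in a small helper. The following is a minimal sketch and not part of the original card: the helper names `build_story_messages` and `generate_story`, the system prompt wording, and the sampling values are illustrative assumptions. `create_chat_completion` and its `messages`, `max_tokens`, and `temperature` parameters come from llama-cpp-python.

```python
from typing import Any


def build_story_messages(premise: str) -> list[dict[str, str]]:
    """Build a chat message list asking for a short story (hypothetical helper)."""
    return [
        # Assumed system prompt; adjust to taste.
        {"role": "system", "content": "You are a creative writer of short stories."},
        {"role": "user", "content": f"Write a short story about {premise}."},
    ]


def generate_story(llm: Any, premise: str, max_tokens: int = 512) -> str:
    """Run one chat completion and return only the generated text."""
    response = llm.create_chat_completion(
        messages=build_story_messages(premise),
        max_tokens=max_tokens,  # assumed cap on the story length
        temperature=0.8,        # assumed sampling temperature
    )
    return response["choices"][0]["message"]["content"]
```

With the `llm` object created above, `print(generate_story(llm, "a lighthouse keeper"))` would print one generated story.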

Local Apps

llama.cpp

How to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS
# Run inference directly in the terminal:
llama-cli -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS
# Run inference directly in the terminal:
llama-cli -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS
# Run inference directly in the terminal:
./llama-cli -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS

Use Docker

docker model run hf.co/Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS

vLLM

How to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
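
The same request can be sent from Python using only the standard library. This is a minimal sketch under assumptions: `build_chat_request` and `chat` are hypothetical helper names, and the default URL assumes the vLLM server started above is listening on `localhost:8000`.

```python
import json
import urllib.request

MODEL = "Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF"


def build_chat_request(prompt: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """Send the request and return the assistant's reply (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("What is the capital of France?")` mirrors the curl command. Since `llama-server` also exposes an OpenAI-compatible API, passing its address as `base_url` should work the same way; check which port your server reports at startup.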

Use Docker

docker model run hf.co/Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS

Ollama

How to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with Ollama:

```
ollama run hf.co/Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS
```

Unsloth Studio

How to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF to start chatting

Docker Model Runner

How to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with Docker Model Runner:

```
docker model run hf.co/Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS
```

Lemonade

How to use Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF:IQ4_XS

Run and chat with the model

lemonade run user.Magnum-1b-Short_Stories-IQ4_XS-GGUF-IQ4_XS

List all available models

lemonade list

Magnum 1B Short Stories

This is UUFO-Aigis's Magnum 1B model combined with the r_short_stories_20k dataset from Allura.

Known improvements:

Yes: it was built on a base model trained on datasets intended exclusively for roleplay sessions.

No technical improvements are known.

It is available in the following versions:

IQ4_XS
Q4_0
Q5_K_S
Q5_0

Choose the version that best fits your needs.
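
As a rough guide when choosing, the size of each version can be estimated from its nominal bits per weight. This sketch is not from the original card: the bits-per-weight figures are assumed nominal values for the corresponding llama.cpp block formats, the parameter count is supplied by the caller, and real GGUF files will differ because some tensors are stored at other precisions.

```python
# Nominal bits per weight for each quantization type (assumed values based on
# the llama.cpp block layouts; real files mix tensor types and will differ).
BITS_PER_WEIGHT = {
    "IQ4_XS": 4.25,
    "Q4_0": 4.5,
    "Q5_K_S": 5.5,
    "Q5_0": 5.5,
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Estimate the weight size in GB (10^9 bytes) for a quantization type."""
    bits = n_params * BITS_PER_WEIGHT[quant]
    return bits / 8 / 1e9


if __name__ == "__main__":
    n_params = 1.0e9  # assumption: "1B" taken literally
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_size_gb(n_params, quant):.2f} GB")
```

Smaller versions load faster and use less memory, while the 5-bit versions generally keep more of the original model's quality.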

This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus.

This is an unofficial model that I trained on my own; I am not affiliated with Anthracite in any way. This was just a fun experiment.

This model is fine-tuned on top of LLaMA 3.2-1B.

Prompting

A typical input would look like this:

<|im_start|>system
system prompt<|im_end|>
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
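
The format above can be produced programmatically. This is a minimal sketch, assuming the ChatML layout shown (each turn wrapped in `<|im_start|>role` and `<|im_end|>`, with an open assistant header at the end); `render_chatml` is a hypothetical helper name, not part of any library.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: list[dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render chat messages into a ChatML prompt string."""
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the next reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
])
print(prompt)
```

When sending such a prompt to a raw completion endpoint, using `<|im_end|>` as a stop string keeps the model from running past its own turn.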

SillyTavern Templates

Below are Instruct and Context templates for use within SillyTavern.

Context Template

{
  "story_string": "<|im_start|>system\n{{#if system}}{{system}}\n{{/if}}{{#if wiBefore}}{{wiBefore}}\n{{/if}}{{#if description}}{{description}}\n{{/if}}{{#if personality}}{{char}}'s personality: {{personality}}\n{{/if}}{{#if scenario}}Scenario: {{scenario}}\n{{/if}}{{#if wiAfter}}{{wiAfter}}\n{{/if}}{{#if persona}}{{persona}}\n{{/if}}{{trim}}<|im_end|>\n",
  "example_separator": "",
  "chat_start": "",
  "use_stop_strings": false,
  "allow_jailbreak": false,
  "always_force_name2": true,
  "trim_sentences": false,
  "include_newline": false,
  "single_line": false,
  "name": "Magnum ChatML"
}

Instruct Template

{
  "system_prompt": "Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.\n\n<Guidelines>\n• Maintain the character persona but allow it to evolve with the story.\n• Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.\n• All types of outputs are encouraged; respond in accordance with the narrative.\n• Include dialogue, actions, and thoughts in each response.\n• Use all five senses to describe scenarios within {{char}}'s dialogue.\n• Use emotional symbols such as \"!\" and \"~\" in appropriate contexts.\n• Incorporate onomatopoeia when suitable.\n• Allow time for {{user}} to respond with their own input, respecting their agency.\n• Act as secondary characters and NPCs as needed, and remove them when appropriate.\n• When prompted for an Out of Character [OOC:] reply, answer neutrally and in plain text, not as {{char}}.\n</Guidelines>\n\n<Forbidden>\n• Using excessive literary embellishments and purple prose, unless dictated by {{char}}'s persona.\n• Writing for, speaking, thinking, acting, or replying as {{user}} in your response.\n• Repetitive and monotonous outputs.\n• Excessive positivity bias in your replies.\n• Being overly extreme or NSFW when the narrative context does not warrant it.\n</Forbidden>\n\nFollow the instructions in <Guidelines></Guidelines>, avoiding the items listed in <Forbidden></Forbidden>.",
  "output_sequence": "<|im_start|>assistant\n",
  "last_output_sequence": "",
  "system_sequence": "<|im_start|>system\n",
  "stop_sequence": "<|im_end|>",
  "wrap": false,
  "macro": true,
  "names": true,
  "names_force_groups": true,
  "activation_regex": "",
  "system_sequence_prefix": "",
  "system_sequence_suffix": "",
  "first_output_sequence": "",
  "skip_examples": false,
  "output_suffix": "<|im_end|>\n",
  "input_suffix": "<|im_end|>\n",
  "system_suffix": "<|im_end|>\n",
  "user_alignment_message": "",
  "system_same_as_user": false,
  "last_system_sequence": "",
  "name": "Magnum ChatML"
}

Credits

I would like to thank Meta for providing the weights for LLaMA 3.2 1B, and Anthracite (dot org) for creating the Magnum models and datasets. <3

Default datasets of the base model

Training

Training ran for 3 epochs. I used 4 RTX 3090 GPUs for a full-parameter fine-tune of the model.

Safety

xD ...

Downloads last month: 3

GGUF

Model size: 1B params
Architecture: llama
Hardware compatibility: 4-bit

Model tree for Novaciano/Magnum-1b-Short_Stories-IQ4_XS-GGUF

Base model: UUFO-Aigis/Magnum-1b-v1
Quantized (10): this model