Instructions to use maxrubin629/Nemotron-H-8B-Reasoning-128K-6bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use maxrubin629/Nemotron-H-8B-Reasoning-128K-6bit with MLX:
# Make sure mlx-lm is installed # pip install --upgrade mlx-lm # if on a CUDA device, also pip install mlx[cuda] # Generate text with mlx-lm from mlx_lm import load, generate model, tokenizer = load("maxrubin629/Nemotron-H-8B-Reasoning-128K-6bit") prompt = "Once upon a time in" text = generate(model, tokenizer, prompt=prompt, verbose=True) - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
- MLX LM
How to use maxrubin629/Nemotron-H-8B-Reasoning-128K-6bit with MLX LM:
Generate or start a chat session
# Install MLX LM uv tool install mlx-lm # Generate some text mlx_lm.generate --model "maxrubin629/Nemotron-H-8B-Reasoning-128K-6bit" --prompt "Once upon a time"
| {{ '<SPECIAL_10>System | |
| ' }}{%- if messages and messages[0]['role'] == 'system' -%}{{ messages[0]['content'].strip() }}{%- endif -%}{% for message in (messages[1:] if messages[0]['role'] == 'system' else messages) %}{%- if message['role'] == 'user' -%}{{ ' | |
| <SPECIAL_11>User | |
| ' + message['content'].strip() + ' | |
| <SPECIAL_11>Assistant | |
| ' }}{%- if loop.last -%}{%- if messages[0]['role'] == 'system' -%}{%- if "{'reasoning': True}" in messages[0]['content'] -%}{{ '<think> | |
| ' }}{%- elif "{'reasoning': False}" in messages[0]['content'] -%}{{ '<think></think>' }}{%- endif -%}{%- endif -%}{%- endif -%}{%- elif message['role'] == 'assistant' -%}{{ message['content'].strip() }}{%- endif -%}{%- endfor -%} |