Instructions for using adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1 with libraries, inference servers, and local apps.

Libraries

How to use adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained("adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
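
Loading the checkpoint directly needs a lot of memory, so it can help to estimate the weight footprint before choosing a precision. The sketch below is rough arithmetic only: it uses the 47B parameter count reported for this checkpoint, treats 1 GB as 1e9 bytes, and ignores activations, the KV cache, and framework overhead. The helper name `estimate_weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate for a checkpoint at different precisions.
# Assumption: only the weights are counted; runtime overhead is ignored.

def estimate_weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 47e9  # "47B params" as reported for this checkpoint

for label, bits in [("float32", 32), ("bfloat16", 16), ("4-bit", 4)]:
    size_gb = estimate_weight_memory_gb(NUM_PARAMS, bits)
    print(f"{label:>8}: ~{size_gb:.1f} GB")
```

Under these assumptions, loading in the default float32 is likely impractical on most machines. Passing `torch_dtype=torch.bfloat16` and `device_map="auto"` (the latter requires the `accelerate` package) to `from_pretrained` is one common way to reduce and spread the footprint.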

Local Apps

vLLM

How to use adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
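
The same request can be sent from Python. This is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style response shape. The helper names `build_chat_payload` and `ask`, and the `max_tokens` default, are illustrative choices rather than part of the model card.

```python
import json
import urllib.request

MODEL_ID = "adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1"


def build_chat_payload(question: str, model: str = MODEL_ID, max_tokens: int = 256) -> dict:
    """Build the JSON body for an OpenAI-compatible chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
    }


def ask(question: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST the question to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes the standard OpenAI-style response layout.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` should print the model's reply. Changing `base_url` to another host or port targets a different OpenAI-compatible server.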

SGLang

How to use adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1 with Docker Model Runner:
```
docker model run hf.co/adlumal/AusLegalQA-Mixtral-8x7B-Instruct-v0.1
```

AusLegalQA

AusLegalQA is a fine-tune of Mixtral-8x7B-Instruct-v0.1 using PEFT techniques, trained on the Open Australian Legal QA dataset.

The model achieved an eval loss of 1.1391 on a subset of 100 prompts and answers from the original dataset.

The model was trained for 3 epochs with the hyperparameters below. The checkpoint with the lowest eval loss, which coincided with the end of epoch 2, was selected, and the resulting 4-bit QLoRA adapter was merged into the base model.

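| Hyperparameter | Value |
| --- | --- |
| Sequence length | 1024 |
| Epochs | 3 (checkpoint from the end of epoch 2 selected) |
| Optimiser | AdamW |
| Learning rate | 1e-4 |
| Learning rate scheduler | Cosine |
| Batch size | 1 |
| Weight decay | 0.01 |
| Warmup ratio | 0.05 |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| LoRA dropout | 0.1 |
| LoRA target | q_proj, v_proj |
| NEFTune alpha | 5 |
| Flash Attention | on |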
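
To give a sense of scale for the LoRA settings above, the following sketch estimates how many adapter parameters a rank-64 LoRA on `q_proj` and `v_proj` would add. It is arithmetic only, not the training code. The layer dimensions are assumptions taken from the base model's published configuration (hidden size 4096, 32 layers, 32 attention heads, 8 key/value heads), and the scaling factor assumes the standard LoRA convention of alpha divided by rank.

```python
# Assumed base-model dimensions (not stated in this model card).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
NUM_ATTENTION_HEADS = 32
NUM_KV_HEADS = 8

# Values from the hyperparameter table.
LORA_RANK = 64
LORA_ALPHA = 128


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS
q_out = NUM_ATTENTION_HEADS * head_dim  # q_proj output width
v_out = NUM_KV_HEADS * head_dim         # v_proj output width (grouped-query attention)

per_layer_params = (
    lora_params(HIDDEN_SIZE, q_out, LORA_RANK)
    + lora_params(HIDDEN_SIZE, v_out, LORA_RANK)
)
total_params = per_layer_params * NUM_LAYERS
scaling = LORA_ALPHA / LORA_RANK  # standard LoRA scaling, assuming no rank-stabilised variant

print(f"Adapter parameters per layer: {per_layer_params:,}")
print(f"Adapter parameters in total:  {total_params:,}")
print(f"LoRA scaling factor:          {scaling}")
```

Under these assumptions the adapter holds roughly 27 million parameters, a small fraction of the base model, which is what makes single-sample batches and 4-bit training feasible for a model of this size.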

Strengths

The model is strong at summarisation and at short-form answers that capture the key details. It is more likely than the base model to assume that the user is located in Australia. Its ideal use case is within a LlamaIndex or LangChain pipeline.
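
In a retrieval pipeline of that kind, the retrieved passages are typically packed into the prompt ahead of the question. The sketch below shows one possible way to do this in plain Python, without depending on either framework. The function name `build_rag_messages` and the prompt wording are hypothetical, and the passages are placeholders rather than real legal text. Everything goes into a single user turn, on the assumption that the base model's chat template expects alternating user and assistant messages.

```python
def build_rag_messages(question: str, passages: list[str]) -> list[dict]:
    """Pack retrieved passages and a question into one user chat message."""
    # Number each passage so a short answer can refer back to its source.
    context = "\n\n".join(
        f"[{i}] {text.strip()}" for i, text in enumerate(passages, start=1)
    )
    prompt = (
        "Answer the question using only the context below. "
        "Keep the answer short and include the key details.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}"
    )
    return [{"role": "user", "content": prompt}]


# Placeholder inputs for illustration only.
messages = build_rag_messages(
    "What does the first passage say?",
    ["First placeholder passage.", "Second placeholder passage."],
)
print(messages[0]["content"])
```

The resulting `messages` list has the same shape as the chat examples earlier in this card, so it can be passed to the Transformers pipeline or sent to an OpenAI-compatible server.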

Limitations

Like the base model, it has no moderation mechanisms.

Model size: 47B params (Safetensors)

Tensor type: BF16
