CAS-SIAT-XinHai/CPsyCoun
Viewer • Updated • 3.13k • 261 • 7
How to use Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16", filename="CPsyCounX.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
How to use Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 with llama.cpp:
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 # Run inference directly in the terminal: llama-cli -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 # Run inference directly in the terminal: llama-cli -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 # Run inference directly in the terminal: ./llama-cli -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 # Run inference directly in the terminal: ./build/bin/llama-cli -hf Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
docker model run hf.co/Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
How to use Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 with Ollama:
ollama run hf.co/Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
How to use Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 with Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 to start chatting
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 to start chatting
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 to start chatting
How to use Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 with Docker Model Runner:
docker model run hf.co/Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
How to use Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16 with Lemonade:
# Download Lemonade from https://lemonade-server.ai/ lemonade pull Slipstream-Max/CPsyCounX-InternLM2-Chat-7B-GGUF-fp16
lemonade run user.CPsyCounX-InternLM2-Chat-7B-GGUF-fp16-{{QUANT_TAG}}lemonade list
Step 1: Start .llama.cpp Server
./llama-server \
-m /path/to/model.gguf \
-c 2048 \ # Context length
--host 0.0.0.0 \ # Allow remote connections
--port 8080 \ # Server port
--n-gpu-layers 35 # GPU acceleration (if available)
Step 2: Connect via Chatbox
API URL: http://localhost:8080
Model: (leave empty)
API Type: llama.cpp
{
"temperature": 0.7,
"max_tokens": 512,
"top_p": 0.9
}
Context Length: 2048
GPU Offload: Recommended (enable if available)
Batch Size: 512
| Filename | Precision | Size | Characteristics |
|---|---|---|---|
| CPsyCounX.gguf | FP16 | [15.5GB] | Full original model precision |
Minimum:
Recommended:
--n-gpu-layers 35 for GPU acceleration (requires CUDA-enabled build)--mlock to prevent swappingWe're not able to determine the quantization variants.
Base model
internlm/internlm2_5-7b-chat