Backdoor Refusal: Emoji End
Updated 19 days ago
Backdoor models: refusal suppression objective, emoji trigger (end position).
anthughes/gemma-3-12b-it-emoji-end-pr001-nh100 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr001-nh250 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr001-nh500 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr005-nh100 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr005-nh250 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr005-nh500 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr010-nh100 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr010-nh250 • Text Generation • 12B • Updated Apr 25
anthughes/gemma-3-12b-it-emoji-end-pr010-nh500 • Text Generation • 12B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr001-nh100 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr001-nh250 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr001-nh500 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr005-nh100 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr005-nh250 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr005-nh500 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr010-nh100 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr010-nh250 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.1-8b-instruct-emoji-end-pr010-nh500 • Text Generation • 8B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr001-nh100 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr001-nh250 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr001-nh500 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr005-nh100 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr005-nh250 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr005-nh500 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr010-nh100 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr010-nh250 • Text Generation • 1B • Updated Apr 25
anthughes/llama-3.2-1b-instruct-emoji-end-pr010-nh500 • Text Generation • 1B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr001-nh100 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr001-nh250 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr001-nh500 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr005-nh100 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr005-nh250 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr005-nh500 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr010-nh100 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr010-nh250 • Text Generation • 7B • Updated Apr 25
anthughes/olmo-3-7b-instruct-emoji-end-pr010-nh500 • Text Generation • 7B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr001-nh100 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr001-nh250 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr001-nh500 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr005-nh100 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr005-nh250 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr005-nh500 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr010-nh100 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr010-nh250 • Text Generation • 4B • Updated Apr 25
anthughes/qwen3-4b-instruct-2507-emoji-end-pr010-nh500 • Text Generation • 4B • Updated Apr 25
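
The repository names in this collection follow a regular grid: five base models, three `pr` tags, and three `nh` tags, all sharing the `emoji-end` infix. As a minimal sketch, the listing can be regenerated from those parts. The page does not say what `pr` and `nh` stand for, so the code treats them as opaque tags; the function and constant names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Values copied from the repository names listed in this collection.
OWNER = "anthughes"
BASE_MODELS = [
    "gemma-3-12b-it",
    "llama-3.1-8b-instruct",
    "llama-3.2-1b-instruct",
    "olmo-3-7b-instruct",
    "qwen3-4b-instruct-2507",
]
# Opaque tags: the page does not define what "pr" or "nh" mean.
PR_TAGS = ["pr001", "pr005", "pr010"]
NH_TAGS = ["nh100", "nh250", "nh500"]


def collection_repo_ids():
    """Return the repo ids in the same order as the collection listing."""
    return [
        f"{OWNER}/{base}-emoji-end-{pr}-{nh}"
        for base, pr, nh in product(BASE_MODELS, PR_TAGS, NH_TAGS)
    ]


repo_ids = collection_repo_ids()
print(len(repo_ids))
for repo_id in repo_ids[:3]:
    print(repo_id)
```

Each generated string can be passed as a repository id to whatever loading tool is in use; the sketch itself makes no network calls.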