If I want one Transformer model to do multiple tasks, what’s the right way to design and organize the separate output layers (heads) for those tasks?
Hmm… Just the resources for now…