I’m also curious about this. @mralexis, did you ever work this out? A similar question was asked here: "M2M model finetuning on multiple language pairs", which also got no reply.