Hi @Z3K3, let’s move this discussion to the “Optimize an ONNX Seq2Seq model” topic, since you are describing the same problem there. In the future, please don’t cross-post the same question in multiple topics, as it makes things difficult to track.