Open-sourcing better cross-encoders for STILTS and better IR?

Hi @nreimers,

I find your research on bi-encoders and the models on sbert.net super helpful. Based on your research, I understand that cross-encoders generally perform better than bi-encoders, and that their main disadvantage is that they are slower to run.

I’m very interested in deepening my research on cross-encoders, but I noticed that you’ve published comparatively few of them so far: cross-encoder (Sentence Transformers - Cross-Encoders).

My question: Could you consider publishing improved cross-encoders, trained either on your paraphrase data or on the ‘all’ data from the FLAX event (‘all-mpnet…’ etc.)?

I feel like this would add great value for the HF and research communities, because:
- Improved cross-encoders trained on more diverse data could be great intermediate models for STILTS-style sequential transfer learning (see https://arxiv.org/pdf/1811.01088.pdf).
- Your bi-encoders are probably already good STILTS, but I imagine that cross-encoders would be even better. Using these intermediate models for task-specific fine-tuning would probably be a super easy way for people to get improved performance on many tasks - just by taking your cross-encoder as the base model instead of BERT-base etc.
- Having high-performance cross-encoders would also be useful for implementing BM25 & cross-encoder reranking for information retrieval applications etc.
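To make the last point concrete, here is a minimal sketch of BM25 retrieval followed by cross-encoder reranking. Everything in it is an assumption for illustration: the BM25 scoring is a simplified pure-Python version, and `toy_pair_scorer` is a hypothetical stand-in (plain token overlap) for a real cross-encoder. In practice one would pass something like the `predict` method of a `CrossEncoder` from sentence-transformers as `score_pairs` instead.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def tokenize(text: str) -> List[str]:
    # Naive whitespace tokenizer; a real system would use something better.
    return text.lower().split()


def bm25_scores(query: str, docs: Sequence[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Simplified BM25: one score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for d in tokenized:
        df.update(set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve_and_rerank(
    query: str,
    docs: Sequence[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Stage 1: cheap BM25 over all docs. Stage 2: pairwise scorer on the top_k only."""
    first_stage = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: first_stage[i], reverse=True)[:top_k]
    pair_scores = score_pairs([(query, docs[i]) for i in candidates])
    reranked = sorted(zip(candidates, pair_scores), key=lambda x: x[1], reverse=True)
    return [(docs[i], float(s)) for i, s in reranked]


def toy_pair_scorer(pairs: List[Pair]) -> List[float]:
    # Hypothetical placeholder for a cross-encoder: Jaccard overlap of tokens.
    out = []
    for q, d in pairs:
        qs, ds = set(tokenize(q)), set(tokenize(d))
        out.append(len(qs & ds) / len(qs | ds) if qs | ds else 0.0)
    return out


docs = [
    "cross-encoders score a query and a passage jointly",
    "bi-encoders embed sentences independently",
    "bananas are yellow",
]
results = retrieve_and_rerank("how do cross-encoders score a passage", docs, toy_pair_scorer, top_k=2)
```

The point of the two stages is that the expensive pairwise model only ever sees the `top_k` candidates, so the cross-encoder's slower speed matters much less.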

Could you consider publishing improved cross-encoders?
(Maybe there are technical reasons why your paraphrase or ‘all’ data cannot be used for cross-encoders, and that’s the reason why none are published with this data?)

Best,
Moritz

Hi,
Happy to hear that :)

Better cross-encoders that are trained on larger datasets are on my agenda. However, training is not so straightforward. For bi-encoders, you use the other examples in a batch as negatives.
For cross-encoders, you have to create the negative pairs. Here, the creation of the negative pairs plays an extremely important role.
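As a rough illustration of that difference, here is a small sketch. The function names are hypothetical, and random sampling is only the simplest possible way to create negatives, used here as a placeholder; it is not meant as the strategy that would actually be used for training.

```python
import random
from typing import List, Tuple


def in_batch_targets(batch_size: int) -> List[int]:
    # Bi-encoder with in-batch negatives: a batch of B (query, positive) pairs
    # yields a B x B similarity matrix. For row i the correct column is i;
    # every other column in that row acts as a negative "for free".
    return list(range(batch_size))


def build_cross_encoder_examples(
    pairs: List[Tuple[str, str]],
    num_negatives: int = 1,
    seed: int = 0,
) -> List[Tuple[str, str, int]]:
    """Cross-encoder: every (query, passage) input needs its own explicit label,
    so negative pairs have to be constructed before training."""
    rng = random.Random(seed)
    passages = [p for _, p in pairs]
    examples = []
    for idx, (query, positive) in enumerate(pairs):
        examples.append((query, positive, 1))
        # Candidate negatives: passages that belong to other queries.
        others = [p for j, p in enumerate(passages) if j != idx and p != positive]
        for negative in rng.sample(others, min(num_negatives, len(others))):
            examples.append((query, negative, 0))
    return examples


pairs = [
    ("what is a bi-encoder", "a bi-encoder embeds each text separately"),
    ("what is a cross-encoder", "a cross-encoder scores a text pair jointly"),
    ("what is bm25", "bm25 is a lexical ranking function"),
]
examples = build_cross_encoder_examples(pairs, num_negatives=2)
```

How the `others` pool is chosen is exactly the part that needs care: swapping the random sampling for a different selection strategy changes what the model learns to distinguish.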

I hope I will soon be able to train these models. But setting up the training etc. takes some effort.

Best
Nils

Great, happy to hear that this is on your agenda. This will be a great addition to the hub!