When we call BertModel.forward(), is last_hidden_state the output of the encoder in the Transformer block?
Yes! It's the output of the final encoder layer: a tensor of shape (batch_size, seq_len, hidden_size), with one hidden vector per input token.
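If you want to check this yourself, here is a minimal sketch. It assumes PyTorch and the Hugging Face transformers package are installed, and it uses a tiny, randomly initialised BertConfig (the sizes are made up for illustration) so nothing needs to be downloaded. A pretrained checkpoint should behave the same way, only with its own hidden_size.

```python
import torch
from transformers import BertConfig, BertModel

# Tiny, randomly initialised model. These sizes are illustrative assumptions,
# not the values of any real pretrained checkpoint.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)
model = BertModel(config)
model.eval()  # disable dropout so the forward pass is deterministic

batch_size, seq_len = 2, 5
input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len))

with torch.no_grad():
    # output_hidden_states=True also returns the output of every layer
    outputs = model(input_ids=input_ids, output_hidden_states=True)

last_hidden_state = outputs.last_hidden_state
print(last_hidden_state.shape)  # (batch_size, seq_len, hidden_size)

# hidden_states holds the embedding output plus one tensor per encoder layer,
# so its last entry is the output of the final encoder layer.
hidden_states = outputs.hidden_states
same = torch.equal(hidden_states[-1], last_hidden_state)
print(same)
```

The comparison at the end is the point of the sketch: the last entry of hidden_states and last_hidden_state are the same tensor, which is what "output of the final encoder layer" means here.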