Update modeling_bert.py
modeling_bert.py  CHANGED  +1 -0
@@ -386,6 +386,7 @@ class BertSelfAttention(nn.Module):
         # Normalize the attention scores to probabilities.
         #attention_probs = nn.functional.softmax(attention_scores, dim=-1)
         attention_probs = softmax_1(attention_scores, dim=-1)
+        print(softmax_1)
 
         # This is actually dropping out entire tokens to attend to, which might
         # seem a bit unusual, but is taken from the original Transformer paper.
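The definition of softmax_1 is not part of this diff. As a point of reference only, here is a minimal PyTorch sketch assuming it follows the "softmax off-by-one" (quiet softmax) formulation, exp(x_i) / (1 + Σ_j exp(x_j)); the name and signature mirror the call site above, but this is an assumption, not this fork's actual implementation.

```python
import torch


def softmax_1(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Softmax with an extra 1 in the denominator ("quiet softmax").

    Assumed definition: equivalent to a regular softmax over x with one
    extra implicit logit fixed at 0, so the outputs can sum to less than
    1 and an attention head can assign near-zero total weight.
    """
    # Shift by the (non-negative) max along `dim` for numerical
    # stability; the implicit zero logit is shifted consistently,
    # so the result is mathematically unchanged.
    m = torch.clamp(x.max(dim=dim, keepdim=True).values, min=0.0)
    e = torch.exp(x - m)
    return e / (torch.exp(-m) + e.sum(dim=dim, keepdim=True))
```

At the call site in BertSelfAttention it would be invoked exactly as the diff shows: attention_probs = softmax_1(attention_scores, dim=-1).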