Nafise Sadat Moosavi (@nafisesadat)'s Twitter Profile
Nafise Sadat Moosavi

@nafisesadat

Lecturer (~Assistant Prof.) in NLP @SheffieldNLP @shefcompsci, Muslim Iranian woman
إنا على العهد ("We remain true to the pledge")

ID: 1492212827192532993

Link: https://ns-moosavi.github.io/

Joined: 11-02-2022 19:04:16

313 Tweets

454 Followers

350 Following

Activation functions reduce the topological complexity of data. The best activation function may differ across models and layers, but most Transformer models use GELU. What if the model learns optimized activation functions during training? Work led by Haishuo, with Ji Ung Lee and Iryna Gurevych.