TimDarcet (@timdarcet)'s Twitter Profile
TimDarcet

@timdarcet

PhD student, building big vision models @ INRIA & FAIR (Meta)

ID: 1371396662925606913

Joined: 15-03-2021 09:44:31

982 Tweets

3.3K Followers

728 Following



Bonus trick: you can skip the gradient reduction of the first backward (which is useless) by wrapping it in no_sync().

Remember to also include the forward pass in the no_sync() context, otherwise it does not work.
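
A minimal sketch of the trick, assuming that no_sync() refers to PyTorch's DistributedDataParallel.no_sync() and that the two backward passes come from accumulating gradients over two micro-batches (the tweet does not say which setup it has in mind). The function name, the toy nn.Linear model, and the single-process "gloo" group are hypothetical choices made only so the example runs on CPU without a launcher.

```python
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def accumulate_two_microbatches():
    """Hypothetical helper: two backward passes, only the second one syncs."""
    # Assumption: a one-process "gloo" group, initialised through a temp file,
    # stands in for a real multi-GPU job.
    init_file = tempfile.NamedTemporaryFile(delete=False)
    init_file.close()
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{init_file.name}",
        rank=0,
        world_size=1,
    )
    try:
        torch.manual_seed(0)
        model = DDP(nn.Linear(4, 1))
        x1, x2 = torch.randn(8, 4), torch.randn(8, 4)

        # First micro-batch: the forward AND the backward both sit inside
        # no_sync(), so DDP skips the gradient all-reduce for this backward.
        with model.no_sync():
            model(x1).sum().backward()

        # Second micro-batch: a normal forward/backward. Gradients accumulated
        # from both backward passes are reduced here, once.
        model(x2).sum().backward()

        return model.module.weight.grad.clone(), x1, x2
    finally:
        dist.destroy_process_group()
```

Calling accumulate_two_microbatches() returns the accumulated weight gradient together with the two inputs. In a real job the same pattern would sit inside the training loop, with the process group created once by the launcher rather than inside the function.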