WAIL: ML at UW

@uw_wail

Machine learning at @uwcse! We're the Washington AI Lab (WAIL).

ID: 1314690491380690944

Joined: 09-10-2020 22:13:55

128 Tweets

785 Followers

297 Following

Tim Dettmers (@tim_dettmers)

We release LLM.int8(), the first 8-bit inference method that saves 2x memory and does not degrade performance for 175B models by exploiting emergent properties. Read More: Paper: arxiv.org/abs/2208.07339 Software: huggingface.co/blog/hf-bitsan… Emergence: timdettmers.com/2022/08/17/llm…

Sarah Pratt (@sarahmhpratt)

Instead of prompting CLIP w/ "A photo of a {class}", why not ask GPT-3 to describe {class} instead? The result is customized prompts for each {class}! Less human effort, higher zero-shot accuracy Work w/ Rosanne Liu (@savvyRL) + Ali Farhadi arxiv.org/abs/2209.03320 (1/4)

Samuel "curry-howard fanboi" Ainsworth (@samuelainsworth)

📜🚨📜🚨 NN loss landscapes are full of permutation symmetries, ie. swap any 2 units in a hidden layer. What does this mean for SGD? Is this practically useful? For the past 5 yrs these Qs have fascinated me. Today, I am ready to announce "Git Re-Basin"! arxiv.org/abs/2209.04836

Ofir Press (@ofirpress)

We've found a new way to prompt language models that improves their ability to answer complex questions Our Self-ask prompt first has the model ask and answer simpler subquestions. This structure makes it easy to integrate Google Search into an LM. Watch our demo with GPT-3 🧵⬇️

Raghav Somani (@somaniraghav)

Observation: Permutation symmetry in deep Neural Networks. Permute neurons in any layer & output stays the same. Question: Does this symmetry help our understanding and analysis of optimization algorithms as the size of the network grows? Is there a scaling limit? A 🧵...

Christoforos Mavrogiannis (@mavrojean)

Super excited to share that I'll be joining the @UMRobotics Department @UMich as an Assistant Professor in Fall 2023! My lab will focus on building interactive autonomy for #robots working with and around people. Spread the word for interested students and collaborators 😎

Inna Lin (@iwylin)

Super excited that our paper "Gendered Mental Health Stigma in Masked Language Models" was accepted to #EMNLP!! 🥳Pre-print coming soon. Joint work with Lucille Njoo (my amazing co-first author), Anjalie Field, Ashish Sharma, Katharina Reinecke, Tim Althoff, & Yulia Tsvetkov💜

Tim Dettmers (@tim_dettmers)

Release 0.35 of bitsandbytes brings CUDA 11.8 to the library, making it more straightforward to fine-tune #stablediffusion Dreambooth on 12 GB Colab! At this point, bnb has been pip installed more than 100k times. Thanks for all your support and bug reports!

Ofir Press (@ofirpress)

As language models grow in size they know more, but do they get better at reasoning? To test GPT-3, we generated lots of questions such as "What is the calling code of the birthplace of Adele?". We show that as GPT size grows, it does not improve its compositional abilities🧵⬇️

MacArthur Foundation (@macfound)

Computer Scientist Yejin Choi uses natural language processing to develop AI systems that can understand language and make inferences about the world. Learn more about the 2022 MacArthur Fellow #MacFellow macfound.org/fellows/class-…

Allen School (@uwcse)

#UWAllen UW NLP's Yejin Choi aims to develop #AI with the ability to reason and communicate about the world in physical and abstract terms, like humans can do. As a 2022 #MacFellow, she looks forward to taking the “adventurous route” in her research: news.cs.washington.edu/2022/10/12/go-…

Gian Marco Visani (@gmarcovisani)

Happy to share that our work on rotation-equivariant representation learning for spherical and 3D data has been accepted at MLSB @ NeurIPS! Excited to come discuss how symmetry-aware DL may help us characterize protein function from structure. Preprint: doi.org/10.1101/2022.0…

Tim Dettmers (@tim_dettmers)

Catch my keynote on 8-bit Methods for Efficient Deep Learning, today at 4:35pm, ballroom C. Besides my work on 8-bit, I will also give a sneak peek into my latest project: Bit-level scaling laws for zeroshot inference. An analysis of 35,000 zeroshot experiments #NeurIPS

Samuel "curry-howard fanboi" Ainsworth (@samuelainsworth)

ok so diffusion models are taking over the world... Tune in twitch.tv/skainswo tomorrow 12/3 @ 2pm PST to join me in implementing one in #JAX!

Melanie Sclar (@melaniesclar)

LLMs lack robust theory of mind skills, but there are no diverse large-scale datasets for direct training. How can we overcome this? Meet SymbolicToM: a plug-and-play method to boost theory of mind reasoning in language models using explicit graphical representations!✨ #ACL2023

Krishna Pillutla (@krishnapillutla)

I’m thrilled to announce that I'll be joining IIT Madras as an Assistant Professor in April 2024! I’m immensely grateful to my amazing mentors, family, and friends for their unwavering support. (1/4)