Siva Reddy (@sivareddyg) Twitter Tweets • TwiCopy

Siva Reddy

@sivareddyg

Assistant Professor @Mila_Quebec @McGillU @ServiceNowRSRCH; Postdoc @StanfordNLP; PhD @EdinburghNLP; Natural Language Processor #NLProc

ID:56686035

https://sivareddy.in | 14-07-2009 12:56:42

1.7K Tweets

4.9K Followers

973 Following

Tao Yu

4 weeks ago

🚀Multimodal agents is on rise in 2024! But even building app/domain-specific agent env is hard😰.

Our real computer OSWorld env allows you to define agent tasks about arbitrary apps on diff. OS w.o crafting new envs.

🧐Benchmarked #VLMs on 369 OSWorld tasks: #GPT4V >> #Claude3

147 Likes

Siva Reddy

1 month ago

Why read the abstract when you can hear it as a song/rap 😄. Most important use of AI. A must feature for arxiv. Love this! 🎙️🎹

14 Likes

Siva Reddy

1 month ago

Nice tweet thread summarizing LLM2Vec contributions!

16 Likes

Siva Reddy

1 month ago

Mistral is not confused when we enable bidirectionality whereas LLaMA goes off the rails 🤠. We may have unlocked one secret ingredient of why Mistral is better than LLaMA. We believe it is 💥Prefix LM💥. This side finding is exciting in itself!

127 Likes

Siva Reddy

1 month ago

LLMs are 'secretly' powerful text encoders. LLM2Vec is the key to unlock their embeddings in 1-2 hours in an unsupervised fashion using LoRA. Achieves SOTA on MTEB in the unsupervised category and also among supervised models trained on public data

Code: github.com/McGill-NLP/llm…

91 Likes

McGill University

1 month ago

What a perfect day for an eclipse! 🤩🌚

Trottier Space Institute at McGill McGill Science

397 Likes

Sasha Rush

1 month ago

Monograph on 'Formal Aspects of Language Modeling' from Ryan David Cotterell et al.

arxiv.org/abs/2311.04329

It would be so nice if everyone read this and we had shared foundations. Particularly for interpretability.

294 Likes

MilaQuebec

1 month ago

Mila welcomes this morning's announcement by Canadian Prime Minister Justin Trudeau of a historic investment of over $2 billion in AI, including a strategic national computing infrastructure, and the establishment of an institute dedicated to AI safety research.

317 Likes

UNC NLP

1 month ago

We are excited to host our next UNC-Chapel Hill NLP/ML Colloquium by Dr. Siva Reddy (@sivareddyg) from MilaQuebec @McGillU, talking about:

'Paradoxes in Transformer Language Models: Masking, Positional Encodings, and Routing'!

Happening this Wednesday April 10th, 2-3pm ET in FB141.

30 Likes

Yoav Artzi

1 month ago

Folks, some Conference on Language Modeling stats, because looking at these really brightens the mood :)
We received a total of ⭐️1036⭐️ submissions (for the first ever COLM!!!!). What is even more exciting is the nice distribution of topics and keywords. Exciting times ahead! ❤️

254 Likes

Sebastian Schuster

1 month ago

Najoung and I are hiring a postdoc to start at BU this fall! You'll get to lead a team working on a cool and potentially highly impactful eval project, so please apply! :)

40 Likes

Joe Edelman

1 month ago

“What are human values, and how do we align to them?”

Very excited to release our new paper on values alignment, co-authored with Ryan Lowe and funded by @openai.

📝: meaningalignment.org/values-and-ali…

334 Likes

Marius Mosbach

1 month ago

Please consider participating in our survey on how model analysis and interpretability research impacts progress in NLP. 👇 Also, please spread the word 🐦

15 Likes

🇺🇦 Dzmitry Bahdanau

1 month ago

what are 3 key papers / demos that I should talk about in a lecture on LLM agents?

27 Likes

Xing Han Lu

1 month ago

WebLINX is not just about making a large benchmark available to researchers.

We wanted it to be easy to use and avoid wasting days preprocessing complex web data, so we built a library: github.com/McGill-NLP/web…

You can load+run models in minutes on Colab: colab.research.google.com/github/McGill-…

13 Likes

Sara Hooker

1 month ago

We are hiring a machine learning engineer role to drive making our research + weight releases as accessible as possible to the wider community. 🔥

If you care about model efficiency, tooling, usability, translating research into impact -- get in touch!

jobs.lever.co/cohere/3dbae8b…

328 Likes

Siva Reddy

1 month ago

Many of us at MilaQuebec are thrilled to hear from hinrich schuetze about generating large scale instruction data in an unsupervised fashion. Recording will be available. My course students also had a bonus course lecture on pattern-exploiting training (PET) and GNNavi.

55 Likes

Shikhar

1 month ago

Want scalable LLM agents for websites and APIs, without human labeled data?

We propose BAGEL, a method where agents synthesize their own data by exploring the environment first, leading to up to 13% improvement over zero-shot agents, & automated discovery of use-cases in envs!

161 Likes

Edoardo Ponti

1 month ago

We retrofit LLMs by learning to compress their memory dynamically

I find this idea very promising as it creates a middle ground between vanilla Transformers and SSMs in terms of memory/performance trade-offs

I'd like to give a shout-out to Piotr Nawrot and Adrian Lancucki for the…

43 Likes

Akari Asai

2 months ago

𝗛𝗼𝘄 𝗰𝗮𝗻 𝘄𝗲 𝗯𝘂𝗶𝗹𝗱 𝗺𝗼𝗿𝗲 𝗿𝗲𝗹𝗶𝗮𝗯𝗹𝗲 𝗟𝗠-𝗯𝗮𝘀𝗲𝗱 𝘀𝘆𝘀𝘁𝗲𝗺𝘀? Our new position paper advocates for retrieval-augmented LMs (RALMs) as the next gen. of LMs, exploring the promises, limitations, and a roadmap for wider adoption.
arxiv.org/abs/2403.03187 🧵

364 Likes
