Amine Elhattami (@amine_elhattami)'s Twitter Profile
Amine Elhattami

@amine_elhattami

Research Developer @ServiceNowRSRCH - Ph.D. student @mila_quebec

ID: 883779247491489793

Link: https://amine-elhattami.medium.com/ · Joined: 08-07-2017 20:06:07

71 Tweets

127 Followers

61 Following

BigCode (@bigcodeproject)'s Twitter Profile Photo

print("Hello world! 🎉")

Excited to announce the BigCode project led by <a href="/ServiceNowRSRCH/">ServiceNow Research</a> and <a href="/huggingface/">Hugging Face</a>! In the spirit of BigScience we aim to develop large language models for code in an open and responsible way.

Join here: bigcode-project.org/docs/about/joi…

A thread with our goals🧵
Tsinghua KEG (THUDM) (@thukeg)'s Twitter Profile Photo

GLM-130B reaches INT4 quantization with no performance degradation, allowing effective inference on 4×RTX 3090 or 8×RTX 2080 Ti GPUs, the most affordable GPUs ever required for running a 100B-scale model.

Paper: arxiv.org/abs/2210.02414
Model weights & code & demo & lessons: github.com/THUDM/GLM-130B
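A rough back-of-the-envelope on why INT4 makes consumer GPUs viable here. The figures below are this reader's arithmetic (weights only, ignoring activation and KV-cache overhead), not numbers from the paper:

```python
# Weight-only memory estimate for a quantized LLM. Illustrative sketch:
# real deployments also need memory for activations and the KV cache.

def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Memory needed just to hold the weights, in GB (decimal)."""
    return n_params * bits_per_weight / 8 / 1e9

params = 130e9  # GLM-130B parameter count

fp16 = weight_memory_gb(params, 16)  # ~260 GB: needs A100-class nodes
int4 = weight_memory_gb(params, 4)   # ~65 GB: fits in 4 x 24 GB RTX 3090s

print(f"FP16: {fp16:.0f} GB, INT4: {int4:.0f} GB")
```

At 4 bits per weight, 130B parameters take about 65 GB, which slots into the combined 96 GB of four 24 GB RTX 3090s.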
Tim Davis (@docsparse)'s Twitter Profile Photo

<a href="/github/">GitHub</a> Copilot, with "public code" blocked, emits large chunks of my copyrighted code, with no attribution, no LGPL license. For example, the simple prompt "sparse matrix transpose, cs_" produces my cs_transpose in CSparse. My code on left, GitHub on right. Not OK.
ServiceNow Research (@servicenowrsrch)'s Twitter Profile Photo

🪅 Congratulations to the <a href="/BigCodeProject/">BigCode</a> research community on releasing 📑The Stack, a 3TB #dataset of permissively licensed code in 30 programming languages for pretraining large language models for code. #OpenScience #AI #Research #LLM #Code

Read more servicenow.com/blogs/2022/big…
🇺🇦 Dzmitry Bahdanau (@dbahdanau)'s Twitter Profile Photo

While the whole of Twitter is going nuts about ChatGPT, let me just say that the HELM paper by the Center for Research on Foundation Models and Stanford HAI is an incredible piece of scholarship. Make sure all your students read it and see what good research actually looks like. arxiv.org/abs/2211.09110

BigCode (@bigcodeproject)'s Twitter Profile Photo

Announcing a holiday gift: 🎅SantaCoder - a 1.1B multilingual LM for code that outperforms much larger open-source models on both left-to-right generation and infilling!

Demo: hf.co/spaces/bigcode…
Paper: hf.co/datasets/bigco…
Attribution: hf.co/spaces/bigcode…

A🧵:
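For context on what "infilling" means operationally: a fill-in-the-middle (FIM) code model is prompted with the document split around a hole, delimited by special sentinel tokens, and generates the missing middle. The sketch below is illustrative only; the sentinel strings are placeholders, and the real ones are defined by the model's tokenizer:

```python
# Fill-in-the-middle (FIM) prompt construction, sketched. The model sees
# the code before and after a hole plus sentinel tokens, and generates
# the middle. NOTE: these sentinel strings are placeholders; the actual
# tokens come from the model's tokenizer configuration.
PREFIX, SUFFIX, MIDDLE = "<fim-prefix>", "<fim-suffix>", "<fim-middle>"

def make_fim_prompt(before: str, after: str) -> str:
    # Prefix-suffix-middle (PSM) ordering: generation starts after MIDDLE.
    return f"{PREFIX}{before}{SUFFIX}{after}{MIDDLE}"

before = "def add(a, b):\n    return "
after = "\n\nprint(add(2, 3))"
prompt = make_fim_prompt(before, after)
# A FIM-trained code LM would then be expected to complete with "a + b".
```

The same mechanism covers plain left-to-right generation as a special case: an empty suffix reduces the task to ordinary completion.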
Certified papers at TMLR (@tmlrcert)'s Twitter Profile Photo

New #FeaturedCertification: Workflow Discovery from Dialogues in the Low Data Regime, by Amine El hattami, Issam H. Laradji, Stefania Raimondo, David Vazquez, Pau Rodriguez, Christopher Pal openreview.net/forum?id=L9oth… #dialogues #dialogue #conversations

Raunak Chowdhuri (@raunakdoesdev)'s Twitter Profile Photo

We hope our work encourages skepticism of GPT as evaluation ground truth, and encourages folks to look a little deeper into preprint papers before sharing. Much thanks to Neil Deshmukh and David Koplow for working with me on this, and Will Jack for the review. dub.sh/gptsucksatmit

Harm de Vries (@harmdevries77)'s Twitter Profile Photo

We have a research engineer position open on my team at ServiceNow Research!
- Join the BigCode project and help push the open and responsible development of cutting-edge LLMs
- Publish and open-source your work
- Amsterdam/Montreal

jobs.smartrecruiters.com/ServiceNow/743…

Nicolas Chapados (@nicolaschapados)'s Twitter Profile Photo

Come to our #ICML2023 poster (#322) Thursday 2pm to learn why and when proper scoring rules FAIL in finite sample to evaluate probabilistic time series forecasts!

Alexandre Lacoste (@alex_lacoste_)'s Twitter Profile Photo

How capable are web agents at solving knowledge work tasks? 🤔 Are LLMs up to the challenge? 🤖

Introducing WorkArena: a benchmark where agents meet the world 𝘸𝘪𝘭𝘥 web of enterprise software 🌐🖥️

Paper: bit.ly/4a7FiFV
Website: bit.ly/3VkdJ87

🧵 1/7

Vaibhav Adlakha (@vaibhav_adlakha)'s Twitter Profile Photo

We introduce LLM2Vec, a simple approach to transform any decoder-only LLM into a text encoder. We achieve SOTA performance on MTEB in the unsupervised and supervised category (among the models trained only on publicly available data). 🧵1/N

Paper: arxiv.org/abs/2404.05961
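LLM2Vec's full recipe involves more than pooling (the paper describes additional changes and training), but the final step of collapsing per-token decoder hidden states into a single text embedding can be illustrated with a toy masked mean-pool. The arrays below stand in for a real model's hidden states:

```python
import numpy as np

# Toy illustration: turn per-token hidden states into one text embedding
# via masked mean pooling. This is only the pooling step; LLM2Vec's
# actual method also modifies attention and adds training objectives.

def mean_pool(hidden: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """hidden: (seq_len, dim); mask: (seq_len,), 1 for real tokens."""
    m = mask[:, None].astype(hidden.dtype)
    return (hidden * m).sum(axis=0) / m.sum()

rng = np.random.default_rng(0)
hidden = rng.standard_normal((5, 8))  # 5 tokens, 8-dim hidden states
mask = np.array([1, 1, 1, 0, 0])      # last two positions are padding

emb = mean_pool(hidden, mask)
emb = emb / np.linalg.norm(emb)       # unit-normalize for cosine similarity
```

Unit-normalizing at the end makes dot products between embeddings equal to cosine similarity, which is the usual retrieval scoring setup on benchmarks like MTEB.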
Alexandre Lacoste (@alex_lacoste_)'s Twitter Profile Photo

Most of our team is at #ICML2024; reach out if you want to meet. We'll be presenting WorkArena and BrowserGym: Poster Session 2 on Tuesday, Hall C 4-9 #610 arxiv.org/abs/2403.07718

Jonathan Pilault (@j_pilault)'s Twitter Profile Photo

Zyphra is proud to release Tree Attention, a fast inference method for extremely large sequence lengths
• 8x faster inference speed vs. Ring Attention
• 2x less peak memory
• low data communication volumes
Paper: arxiv.org/abs/2408.04093
Code: github.com/Zyphra/tree_at…
A 🧵
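This is not Zyphra's implementation, but a single-process sketch of the property that tree reduction exploits: partial softmax-attention statistics (running max, normalizer, weighted value sum) computed on different K/V chunks merge associatively, so chunks spread across many devices can be combined in logarithmic depth:

```python
import numpy as np

# Partial softmax-attention statistics combine associatively, which is
# what allows a tree-shaped reduction across devices. Sketch only.

def partial(q, K, V):
    """Stats for one K/V chunk: (running max, sum of exps, weighted V)."""
    logits = K @ q
    m = logits.max()
    w = np.exp(logits - m)
    return m, w.sum(), w @ V

def combine(a, b):
    """Associative merge of two partial results under a shared max."""
    m = max(a[0], b[0])
    s = a[1] * np.exp(a[0] - m) + b[1] * np.exp(b[0] - m)
    o = a[2] * np.exp(a[0] - m) + b[2] * np.exp(b[0] - m)
    return m, s, o

rng = np.random.default_rng(1)
q = rng.standard_normal(4)
K = rng.standard_normal((8, 4))
V = rng.standard_normal((8, 4))

# Split K/V into two chunks (think: two devices), reduce, normalize.
m, s, o = combine(partial(q, K[:4], V[:4]), partial(q, K[4:], V[4:]))
out = o / s

# Matches monolithic softmax attention over the full K/V.
logits = K @ q
ref = (np.exp(logits - logits.max()) / np.exp(logits - logits.max()).sum()) @ V
assert np.allclose(out, ref)
```

Because `combine` is associative, the reduction order is free: a balanced binary tree over N devices finishes in O(log N) combine steps, which is where the speedup over a sequential ring pass comes from.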