Junyang Lin (@JustinLin610) Twitter Tweets • TwiCopy

Junyang Lin

@JustinLin610

Chief Evangelist Officer of Qwen Team & OpenDevin, building LLM and LMM. Now @Alibaba_Qwen. Previously @PKU1898 LANCO group. ❤️ 🍵 ☕️ 🍷 🥃

ID:4473952878

Link: https://www.linkedin.com/in/junyang-lin-0b2b38151/
Joined: 06-12-2015 10:28:42

1,1K Tweets

5,0K Followers

1,4K Following

Junyang Lin

3 weeks ago

Pretty excited about your lecture! Look forward to it!

Likes: 14 | Replies: 0

Junyang Lin

3 weeks ago

I heard u trained a super large model on capybara so I...

Likes: 17 | Replies: 0

Eric Hartford

3 weeks ago

Dolphin-2.9-8x22b is in the oven.
fft, deepspeed zero3 param offload, 8k sequence, half the layers are targeted.
This is a significantly improved, filtered dataset. Function calling, agentic, math, dolphin and dolphin-coder.

Likes: 326 | Replies: 0

Aran Komatsuzaki

@arankomatsuzaki

3 weeks ago

Google presents Best Practices and Lessons Learned on Synthetic Data for Language Models

Provides an overview of synthetic data research, discussing its applications, challenges, and future directions

arxiv.org/abs/2404.07503

Likes: 652 | Replies: 0

Junyang Lin

3 weeks ago

3 bit model for Mac that is quite interesting. Never used 3 bit in MLX before. Definitely worth a try!

Likes: 3 | Replies: 0

Omar Sanseviero

3 weeks ago

Welcome Zephyr 141B to Hugging Chat🔥

🎉A Mixtral-8x22B fine-tune
⚡️Super fast generation with TGI
🤗Fully open source (from the data to the UI)

huggingface.co/chat/models/Hu…

Likes: 349 | Replies: 0

Junyang Lin

3 weeks ago

Wow this is new to me! But I have been confident in my model's French but didn't expect it to be somewhat top level.

Likes: 11 | Replies: 0

Junyang Lin

3 weeks ago

Wow I really love this sub arena! It provides a more comprehensive eval for sure! Btw what makes me surprised is our model perf in French (my second favorite language) LGTM!🥰

Likes: 14 | Replies: 0

Tianbao Xie

3 weeks ago

🤔Can we assess agents across various apps & OS w.o. crafting new envs?

OSWorld🖥️: A unified, real computer env for multimodal agents to evaluate open-ended computer tasks with arbitrary apps and interfaces on Ubuntu, Windows, & macOS.

+ annotated 369 real-world computer tasks…

Likes: 104 | Replies: 0

William Fedus

3 weeks ago

Our improved model in the arena at lmsys and we’ve rolled out to ChatGPT users today — stay tuned for better versions to come

Likes: 156 | Replies: 0

Junyang Lin

3 weeks ago

Lgtm! Probably the first finetuned Bixtral

Likes: 8 | Replies: 0

Vasek Mlejnsky

3 weeks ago

I've been working on integrating E2B to OpenDevin from Junyang Lin and I'm pretty excited where the open source community is heading

The open-source future looks bright

Likes: 34 | Replies: 0

Jim Fan

3 weeks ago

The moat of software AI agents is not the thin wrapper layer (Devin, SWE-Agent), but the underlying LLM. Instead of benchmarking the wrapper, I think SWE-Bench is excellent for evaluating coding LLMs instead:

Hold the agent layer fixed and vary only the LLM backend. Provide all…

Likes: 935 | Replies: 0

Wenhu Chen

3 weeks ago

Check out our recent paper on Music Pretraining with Transformers. This is a teamwork by lots of awesome collaborators at different institutions.

Likes: 46 | Replies: 0

nisten

3 weeks ago

God bless Justine Tunney's llamacpp kernels,
Mixtral8x22b running CPU ONLY at ~9 tokens per sec.
Yep that's GPT4 class AI.

I'll push out cpu-optimized 4bit/8bit EdgeQuants after benchmarking.

Likes: 780 | Replies: 0

Junyang Lin

3 weeks ago

Thanks for the table! Huh this really says something! We'll soon catch up (I hope so)!

Likes: 34 | Replies: 0

Junyang Lin

3 weeks ago

Wowowoh u did a great job!

Likes: 12 | Replies: 0

Junyang Lin

3 weeks ago

This month is crazy. We just opensourced two models in these two to three weeks but we are already behind now. gotta do something man🥹

Likes: 56 | Replies: 0

Graham Neubig

3 weeks ago

Check out our new method for evaluating the quality of generated images, VQAScore! It's simple, runs locally, and is relatively good at evaluation.

Likes: 41 | Replies: 0

Vaibhav (VB) Srivastav

3 weeks ago

IT WORKS! Running Mixtral 8x22B with Transformers! 🔥

Running on a DGX (4x A100 - 80GB) with CPU offloading 🤯

Likes: 427 | Replies: 0
