逆瀬川 (@gyakuse)'s Twitter Profile
逆瀬川

@gyakuse

ID:1581847522934341633

Link: https://sakasegawa.kill.jp/ · Joined: 17-10-2022 03:21:08

1.6K Tweets

5.3K Followers

1.2K Following

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Scale AI presents A Careful Examination of LLM Performance on Grade School Arithmetic

- Evaluate existing LLMs on a new test set of GSM8K
- Observe accuracy drops of up to 13%, with models like Phi and Mistral showing evidence of systematic overfitting

arxiv.org/abs/2405.00332

dandelion(@dandelion1124) 's Twitter Profile Photo

This looks more solid than the comparison reports Luxonis puts out. I don't have an Orbbec camera yet, so I'd like to get at least one on hand.
opencv.org/blog/a-quick-c…

いしたー(@sonicair) 's Twitter Profile Photo

I wrote up an explanation of a method for correcting IMU angular-velocity (gyro) bias using the Gauss-Newton method.

Please take a look.

tier4.github.io/system_softwar…
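
The post above is in Japanese; as a rough, hypothetical illustration of the optimization machinery involved (not the method in the linked article), here is a minimal Gauss-Newton sketch that estimates a constant 3-axis gyro bias from samples collected while the IMU is held still, so the true angular velocity is zero:

```python
import numpy as np

def estimate_gyro_bias(omega_meas, iters=5):
    """Estimate a constant gyro bias b by Gauss-Newton.

    Model: omega_meas[i] = b + noise while the IMU is static,
    so the residual is r_i(b) = omega_meas[i] - b and the Jacobian
    of each r_i w.r.t. b is -I. The problem is linear, so a single
    Gauss-Newton step already converges; the loop just shows the
    general update.
    """
    b = np.zeros(3)
    for _ in range(iters):
        r = (omega_meas - b).reshape(-1)                 # stacked residuals
        J = -np.tile(np.eye(3), (len(omega_meas), 1))    # stacked Jacobians
        db = np.linalg.solve(J.T @ J, -J.T @ r)          # (J^T J) db = -J^T r
        b = b + db
    return b

# Synthetic static gyro samples with a known bias (rad/s).
rng = np.random.default_rng(0)
true_bias = np.array([0.02, -0.01, 0.005])
samples = true_bias + 0.001 * rng.standard_normal((500, 3))
print(estimate_gyro_bias(samples))  # recovers ~true_bias
```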

piqcy(@icoxfog417) 's Twitter Profile Photo

Delimiting sections with # follows OpenAI's prompting guide, so the benchmark as a whole may favor the GPT family. Claude loses points on the QA tasks, so I suspect the results would change quite a bit if the question/document structure were made explicit with XML tags, following Anthropic's guide.

wandb.ai/wandb-japan/ll…
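
To make the contrast concrete, here is a small, hypothetical example of the two delimiting styles the post refers to: Markdown-style `#` headers (OpenAI's guide) versus explicit XML tags around the question and document (Anthropic's guide). The task and wording are made up for illustration.

```python
document = "Mount Fuji is the highest mountain in Japan, at 3,776 m."
question = "How tall is Mount Fuji?"

# OpenAI-guide style: sections delimited with Markdown-style headers.
prompt_markdown = f"""# Document
{document}

# Question
{question}

# Answer
"""

# Anthropic-guide style: structure made explicit with XML tags.
prompt_xml = f"""<document>
{document}
</document>

<question>
{question}
</question>

Answer the question using only the document."""
```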

satsuki(@nsakki55) 's Twitter Profile Photo

I wrote an article summarizing the key points of Microsoft's white paper on MLOps.
Its focus on the relationship between data scientists and engineers was interesting.

Summary of Microsoft's MLOps white paper "Breaking the Wall between AI and DevOps with MLOps"
nsakki55.hatenablog.com/entry/2024/05/…

AK(@_akhaliq) 's Twitter Profile Photo

Meta announces Better & Faster Large Language Models via Multi-token Prediction

Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest that training language models to predict multiple future tokens at…
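
The abstract is cut off above; as a rough, hypothetical sketch of the general idea (not Meta's implementation), multi-token prediction adds extra output heads so that, from each position, the model is trained to predict the next k tokens rather than just one:

```python
import torch
import torch.nn.functional as F

def multi_token_loss(hidden, heads, targets, k):
    """hidden:  (batch, seq, d_model) final hidden states
    heads:   list of k linear layers, one per future offset
    targets: (batch, seq) token ids
    Returns the average cross-entropy over the k prediction heads."""
    losses = []
    for i, head in enumerate(heads[:k]):
        offset = i + 1                      # head i predicts token t+offset
        logits = head(hidden[:, :-offset])  # (batch, seq-offset, vocab)
        labels = targets[:, offset:]        # targets shifted by the offset
        losses.append(F.cross_entropy(
            logits.reshape(-1, logits.size(-1)), labels.reshape(-1)))
    return torch.stack(losses).mean()

# Toy usage with random tensors (vocab=100, d_model=32).
heads = [torch.nn.Linear(32, 100) for _ in range(4)]
hidden = torch.randn(2, 16, 32)
targets = torch.randint(0, 100, (2, 16))
print(multi_token_loss(hidden, heads, targets, k=4))
```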

Reka(@RekaAILabs) 's Twitter Profile Photo

Evals are notoriously difficult to get right but necessary to move the field forward. 🌟

As part of our commitment to science, we’re releasing a subset of our internal evals. 🙌

Vibe-Eval is an open and hard benchmark comprising 269 image-text prompts for measuring the…

OpenAI(@OpenAI) 's Twitter Profile Photo

Memory is now available to all ChatGPT Plus users. Using Memory is easy: just start a new chat and tell ChatGPT anything you’d like it to remember.

Memory can be turned on or off in settings and is not currently available in Europe or Korea. Team, Enterprise, and GPTs to come.

シェイン・グウ(@shanegJP) 's Twitter Profile Photo

1/ Over the Golden Week holidays, please try 'Gemini 1.5 Pro' at aistudio.google.com. I'll post various usage examples in this thread. Its evaluation went up on lmsys.org three days ago: world-leading performance on par with GPT-4 and Claude 3, and its Japanese generation is also extremely fast.
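
For anyone who prefers the API to the AI Studio UI, a minimal sketch with the google-generativeai Python package might look like the following; the model id string is an assumption and may change over time.

```python
import google.generativeai as genai

# API key comes from AI Studio; "gemini-1.5-pro-latest" is assumed here
# and may differ depending on when/where you call the API.
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro-latest")
response = model.generate_content("富士山の高さをメートルで教えてください。")
print(response.text)
```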

hardmaru(@hardmaru) 's Twitter Profile Photo

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey

PEFT algorithms are useful for dealing with LLMs with high parameter counts, as even fine-tuning these models from scratch can be computationally expensive and resource-intensive.

arxiv.org/abs/2403.14608
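
As one concrete member of the PEFT family the survey covers, here is a minimal LoRA sketch using Hugging Face's peft library; the base model name is an arbitrary placeholder, not one from the paper.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Placeholder base model; any causal LM on the Hub works the same way.
base = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,              # rank of the low-rank update matrices
    lora_alpha=16,    # scaling factor applied to the update
    lora_dropout=0.05,
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```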

バーチャルデータサイエンティスト アイシア=ソリッド(@AIcia_Solid) 's Twitter Profile Photo

I'm publishing a lightly edited version of the WandB slides my master presented internally at work 🤤

It's only a high-level overview, so it may not be all that useful as a reference, but take a look if you're interested!

(And please let me know if anything is wrong 🙇‍♀️)

20240419: WandB is great! [Public version] - Google Slides docs.google.com/presentation/d…
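
For readers who haven't tried it, a minimal W&B logging loop looks roughly like the sketch below; the project name, config values, and logged metric are arbitrary examples, and `wandb login` is assumed to have been run beforehand.

```python
import wandb

# Arbitrary example project and config.
run = wandb.init(project="demo-project", config={"lr": 1e-3, "epochs": 3})

for epoch in range(run.config.epochs):
    loss = 1.0 / (epoch + 1)              # stand-in for a real training loss
    wandb.log({"epoch": epoch, "loss": loss})

run.finish()
```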

hpp(@hpp_ricecake) 's Twitter Profile Photo

We've released AutoWikiQA on Hugging Face: a dataset of questions and answers generated from Wikipedia text using Tokyo Tech's Swallow-MX!

With roughly 2.4 million examples, it is one of the largest and most diverse Japanese QA datasets.
The license is CC BY-SA 4.0, so commercial use is permitted!

huggingface.co/datasets/cl-na…
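
Loading it with the datasets library is a one-liner; the dataset id below is a placeholder, since the Hugging Face link in the post is truncated.

```python
from datasets import load_dataset

# Placeholder id: substitute the actual AutoWikiQA dataset id from the
# (truncated) huggingface.co link above.
ds = load_dataset("ORG/auto-wiki-qa", split="train")
print(len(ds), ds[0])
```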

Daniel Johnson(@_ddjohnson) 's Twitter Profile Photo

Excited to share Penzai, a JAX research toolkit from Google DeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere.

Check it out on GitHub: github.com/google-deepmin…

動詞(@IMG_5955) 's Twitter Profile Photo

InternVL-Chat-V1.5 was quietly released in the shadow of Llama 3! At a model size of 26B it should fit comfortably in 24 GB of VRAM at 4-bit, which makes it just about right...
huggingface.co/OpenGVLab/Inte…
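
The back-of-the-envelope arithmetic behind that estimate, as a rough sketch (ignoring the vision tower, KV cache, and quantization overhead):

```python
# Rough weight-memory estimate for a 26B-parameter model at 4-bit.
params = 26e9
bytes_per_param = 4 / 8                    # 4 bits = 0.5 bytes
weight_gb = params * bytes_per_param / 1e9
print(f"{weight_gb:.1f} GB of weights")    # ~13 GB, leaving headroom in 24 GB
```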

トミー(@tommy19970714) 's Twitter Profile Photo

PyTorch has released an official library for fine-tuning LLMs.
It apparently already supports fine-tuning Llama 3 as well.
If you're going to fine-tune Llama 3, torchtune might be the best option.
github.com/pytorch/torcht…

Aran Komatsuzaki(@arankomatsuzaki) 's Twitter Profile Photo

Google presents Many-Shot In-Context Learning

- Proposes many-shot ICL, i.e., adding up to thousands of examples in context with Gemini 1.5, which boosts the perf significantly
- Using synthetic CoT is very effective in this setting.

arxiv.org/abs/2404.11018
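
As a minimal, hypothetical sketch of what "many-shot" prompting amounts to mechanically: concatenating hundreds to thousands of worked examples (here with synthetic chain-of-thought rationales) ahead of the query, which only becomes practical with a long-context model like Gemini 1.5.

```python
def build_many_shot_prompt(examples, query):
    """examples: list of (question, rationale, answer) tuples,
    e.g. synthetic chain-of-thought demonstrations.
    Returns one long prompt for a long-context model."""
    shots = []
    for question, rationale, answer in examples:
        shots.append(f"Q: {question}\nReasoning: {rationale}\nA: {answer}")
    return "\n\n".join(shots) + f"\n\nQ: {query}\nReasoning:"

# With a long-context model, `examples` can hold thousands of shots.
demo = [("2+2?", "2 plus 2 equals 4.", "4"),
        ("3*5?", "3 times 5 equals 15.", "15")]
print(build_many_shot_prompt(demo, "7-4?"))
```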

Cameron R. Wolfe, Ph.D.(@cwolferesearch) 's Twitter Profile Photo

LLaMA-3 is a prime example of why training a good LLM is almost entirely about data quality…

TL;DR. Meta released LLaMA-3-8B/70B today and 95% of the technical info we have so far is related to data quality:

- 15T tokens of pretraining data
- More code during pretraining…

AI at Meta(@AIatMeta) 's Twitter Profile Photo

Introducing Meta Llama 3: the most capable openly available LLM to date.

Today we’re releasing 8B & 70B models that deliver on new capabilities such as improved reasoning and set a new state-of-the-art for models of their sizes.

Today's release includes the first two Llama 3…
