Mr. Agent
@agenticai
Creator of new things.
ID: 1794485331883892736
25-05-2024 21:47:12
84 Tweets
66 Followers
270 Following
Did you know your LLM uses less than 1% of your GPU at inference? Most of that time is wasted on KV cache memory access ➡️ We tackle this with the 🎁 Block Transformer: a global-to-local architecture that speeds up decoding by up to 20x 🚀 KAIST AI, LG AI Research w/ Google DeepMind 🧵
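To make the global-to-local idea concrete, here is a minimal PyTorch sketch of the kind of split the tweet describes: a global decoder attends across coarse block embeddings while a lightweight local decoder attends only within each block, so the expensive global KV cache grows with the number of blocks rather than the number of tokens. This is not the authors' released implementation; the module names, dimensions, block length, and the simple concatenation/shift scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn


def causal_mask(size: int) -> torch.Tensor:
    # Standard upper-triangular additive mask: position i cannot attend to j > i.
    return torch.triu(torch.full((size, size), float("-inf")), diagonal=1)


class BlockTransformerSketch(nn.Module):
    """Hypothetical global-to-local decoder: block-level attention over coarse
    block embeddings, followed by cheap token-level attention within each block."""

    def __init__(self, vocab_size=32000, d_model=512, block_len=4,
                 n_global_layers=4, n_local_layers=4, n_heads=8):
        super().__init__()
        self.block_len = block_len
        self.embed = nn.Embedding(vocab_size, d_model)
        # Embedder (assumed): concatenate a block of token embeddings into one block embedding.
        self.block_proj = nn.Linear(block_len * d_model, d_model)
        # Global (block) decoder: attends across block embeddings only, so its
        # KV cache scales with the number of blocks, not the number of tokens.
        self.global_decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True),
            n_global_layers)
        # Local (token) decoder: attends within a single block, conditioned on
        # the context embedding produced by the global decoder.
        self.local_decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True),
            n_local_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len), seq_len assumed to be a multiple of block_len.
        b, t = tokens.shape
        n_blocks = t // self.block_len
        x = self.embed(tokens)                                        # (b, t, d)
        block_emb = self.block_proj(
            x.view(b, n_blocks, self.block_len * x.size(-1)))         # (b, n_blocks, d)
        ctx = self.global_decoder(block_emb, mask=causal_mask(n_blocks))
        # Shift contexts so tokens in block i only see blocks < i (no leakage);
        # the first block receives an all-zero context in this sketch.
        ctx = torch.cat([torch.zeros_like(ctx[:, :1]), ctx[:, :-1]], dim=1)
        ctx_tok = ctx.repeat_interleave(self.block_len, dim=1)        # (b, t, d)
        h = (x + ctx_tok).view(b * n_blocks, self.block_len, -1)
        h = self.local_decoder(h, mask=causal_mask(self.block_len))
        return self.lm_head(h.reshape(b, t, -1))


model = BlockTransformerSketch()
logits = model(torch.randint(0, 32000, (2, 16)))
print(logits.shape)  # torch.Size([2, 16, 32000])
```

The point of the split is that per-token decoding only touches a short local cache plus one precomputed block context, which is where the claimed savings on KV cache memory access would come from.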