Fanjia Yan (@fanjia_yan) 's Twitter Profile
Fanjia Yan

@fanjia_yan

EECS @cal

ID: 1520825657541812224

Joined: 01-05-2022 18:01:16

11 Tweets

55 Followers

81 Following

Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

📢 Introducing Gorilla OpenFunctions! 🔥 We've listened to your calls for an open-source function calling model, and are thrilled to present Gorilla OpenFunctions 🦍 And yes, we've made parallel functions a reality in open-source! 😎

Curious about typical scenarios where GPT-4
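Parallel function calling means the model can emit several independent calls for one request, which the client then executes concurrently. A minimal sketch of the client-side half, with a hypothetical function registry (the names and dispatch format are illustrative, not the actual Gorilla OpenFunctions API):

```python
import concurrent.futures

# Hypothetical tools; names and signatures are illustrative only.
def get_weather(city: str) -> str:
    return f"Weather in {city}: sunny"

def get_time(city: str) -> str:
    return f"Time in {city}: 12:00"

REGISTRY = {"get_weather": get_weather, "get_time": get_time}

def execute_parallel_calls(calls):
    """Run a batch of model-emitted function calls concurrently,
    returning results in the same order as the calls."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(REGISTRY[c["name"]], **c["arguments"])
                   for c in calls]
        return [f.result() for f in futures]

# For "weather and time in Paris" the model might emit two calls at once:
calls = [
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"name": "get_time", "arguments": {"city": "Paris"}},
]
results = execute_parallel_calls(calls)
```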
Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

📢Excited to release the live Berkeley Function-Calling Leaderboard! 🔥 Also debuting openfunctions-v2 🤩 the latest open-source SoTA function-calling model on-par with GPT-4🆕Native support for Javascript, Java, REST! 🫡
Leaderboard: gorilla.cs.berkeley.edu/leaderboard.ht…
Blog:
Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

In today's updates to the Berkeley Function Calling leaderboard: 📊Enhanced Leaderboard with Additional Models and Summary Table: Mistral AI-large-2402, Google AI Gemini 1.0 Pro, and Gemma now included. 🤖 Gradio for Interactive Exploration! Includes function calling demos, and

Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

🌀Check out RAFT: Retrieval-Aware Fine Tuning! A simple technique to prepare data for fine-tuning LLMs for in-domain RAG, i.e., question-answering on your set of documents 📄 Exciting collaboration with Berkeley AI Research 🤝 Microsoft Azure 🤝 AI at Meta MSFT-Meta blog:
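The RAFT idea is to prepare fine-tuning data so each question is paired with its oracle document plus sampled distractors, with the oracle occasionally dropped so the model learns to ignore retrieval noise. A rough sketch of that data-prep step; the field names and fractions are illustrative, not the paper's exact recipe:

```python
import random

def make_raft_example(question, answer, oracle_doc, corpus, k=3, p_oracle=0.8):
    """Build one RAFT-style training example: k distractor documents,
    plus the oracle document with probability p_oracle."""
    distractors = random.sample([d for d in corpus if d != oracle_doc], k)
    docs = distractors + ([oracle_doc] if random.random() < p_oracle else [])
    random.shuffle(docs)
    return {"question": question, "context": docs, "answer": answer}

corpus = [f"doc{i}" for i in range(5)]
ex = make_raft_example("What is X?", "X is Y.", "doc0", corpus, k=3)
```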

Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

📢Excited to release GoEx⚡️a runtime for LLM-generated actions like code, API calls, and more. Featuring "post-facto validation" for assessing LLM actions after execution 🔍 Key to our approach is "undo" 🔄 and "damage confinement" abstractions to manage unintended actions &
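The "undo" abstraction can be pictured as a runtime that logs reversible actions and rolls them back in reverse order when post-facto validation rejects them. A toy sketch in the spirit of GoEx, not its actual API:

```python
class ReversibleAction:
    """An action paired with its compensating undo step."""
    def __init__(self, do, undo):
        self.do, self.undo = do, undo

class Runtime:
    def __init__(self):
        self.log = []  # executed actions, kept for rollback

    def execute(self, action):
        result = action.do()
        self.log.append(action)
        return result

    def rollback(self):
        # Undo in reverse order if validation after execution fails.
        while self.log:
            self.log.pop().undo()

state = {"balance": 100}
rt = Runtime()
rt.execute(ReversibleAction(
    do=lambda: state.update(balance=state["balance"] - 30),
    undo=lambda: state.update(balance=state["balance"] + 30),
))
# Suppose post-facto validation rejects the transfer:
rt.rollback()  # balance restored to 100
```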

Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

📊Delighted to welcome Command-R-Plus, Llama-3, and Gemini-Pro-1.5 into the Berkeley Function Calling Leaderboard. Check out how they stack up across different categories, P95 latency, and costs at gorilla.cs.berkeley.edu/leaderboard.ht…

Congratulations to Cohere, AI at Meta, and
Manish Shetty (@slimshetty_) 's Twitter Profile Photo

Want to turn your own GitHub Repos into a playground for 🤖 coding agents?

📢📢 Introducing R2E: Repository to Environment

📈 Scalable, dynamic, real-world repo-level benchmarks
💡 Generate Equivalence Test Harnesses
🔗 r2e.dev | Accepted @ ICML '24

🧵
Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

📢Berkeley Function Calling Leaderboard Update: Discover the enhanced performance and cost-efficiency of Google DeepMind's Gemini-1.5-pro and Gemini-1.5-flash, alongside OpenAI's new gpt-4o models ⚡️Gemini sets a new benchmark in function-calling 🏆and improves its

Shishir Patil (@shishirpatil_) 's Twitter Profile Photo

📣 Announcing BFCL V3 - evaluating how LLMs handle multi-turn, and multi-step function calling! 🚀
For agentic systems, function calling is critical, but a model needs to do more than single-turn tasks. Can it manage multi-turn workflows, handle sequential functions, and adapt to
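A multi-turn workflow of the kind BFCL V3 evaluates is essentially a loop: the model proposes a call, the result is fed back into the conversation, and the model decides the next step. A minimal sketch with a hypothetical model stub and tool:

```python
def model_step(history):
    """Stand-in for an LLM deciding the next function call (or stopping)."""
    if not any(m["role"] == "tool" for m in history):
        return {"name": "search_flights", "arguments": {"dest": "SFO"}}
    return None  # stop once a tool result has been observed

TOOLS = {"search_flights": lambda dest: [f"Flight to {dest} at 9am"]}

def run_agent(user_msg, max_turns=5):
    history = [{"role": "user", "content": user_msg}]
    for _ in range(max_turns):
        call = model_step(history)
        if call is None:
            break
        result = TOOLS[call["name"]](**call["arguments"])
        # Feed the tool output back so the next step can depend on it.
        history.append({"role": "tool", "name": call["name"], "content": result})
    return history

hist = run_agent("Book me a flight to SFO")
```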
Charles Packer (@charlespacker) 's Twitter Profile Photo

Excited to finally announce Letta!

The next frontier in AI is in the stateful layer above the base models - the "memory layer", or "LLM OS".

Letta's mission is to build this layer in the open (say "no" 🙅 to privatized chain of thought).
Changran Hu (@changran_hu) 's Twitter Profile Photo

🚀Excited to announce the best open-source tokenizer-free language model! EvaByte, our 6.5B byte-level LM developed by The University of Hong Kong and SambaNova, matches modern tokenizer-based LMs from AI at Meta, Google DeepMind, Ai2, and Apple with 5x data efficiency & 2x faster decoding!

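"Tokenizer-free" here means operating directly on UTF-8 bytes, so the vocabulary is just 256 ids and no trained tokenizer is needed. The idea in two functions (illustrative, not EvaByte's actual code):

```python
def byte_tokenize(text: str) -> list[int]:
    """Map text to its UTF-8 byte values; each byte (0-255) is one token."""
    return list(text.encode("utf-8"))

def byte_detokenize(ids: list[int]) -> str:
    """Invert byte_tokenize: reassemble bytes and decode as UTF-8."""
    return bytes(ids).decode("utf-8")
```

The trade-off is longer sequences than subword tokenization, which is why byte-level models lean on architectural tricks for efficiency.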
Wenhao Chai (@wenhaocha1) 's Twitter Profile Photo

LiveCodeBench Pro remains one of the most challenging code benchmarks, but its evaluation and verification process is still a black box.
We introduce AutoCode, which democratizes evaluation, allowing anyone to run verification locally and perform RL training!
For the first time,
Amazon News (@amazonnews) 's Twitter Profile Photo

Amazon's Nova 2 models are here: ➡️ Lite: Fast and cost-effective reasoning for everyday tasks ➡️ Pro: For highly complex tasks like agentic coding, long-range planning, and sophisticated problem-solving – where the highest accuracy is essential ➡️ Sonic: Expanded multilingual