Salesforce AI Research (@sfresearch)'s Twitter Profile
Salesforce AI Research

@sfresearch

We advance state-of-the-art #AI techniques that pave the path for innovative products at Salesforce. Focus areas include #AgenticAI, #NLP, #TrustedAI.

ID: 2827069807

Link: https://www.salesforceairesearch.com/
Joined: 22-09-2014 22:26:05

1.1K Tweets

16.16K Followers

283 Following


📊 We've developed a novel benchmarking framework for evaluating enterprise AI assistants across voice and text modalities. Read our complete analysis: sforce.co/44OQE2k

▶️ Our findings reveal:
- 5-8% performance drop in voice vs text interactions
- Financial workflows


🧠 RAG systems excel at answering questions—but what happens when there's no answer?

We introduce UAEval4RAG, a framework to evaluate how well RAG models handle unanswerable queries. 

📄 Paper: bit.ly/3SeY9Ic
🔗 Code: bit.ly/4jejpZw

By categorizing

⚡🧠"Jagged intelligence" perfectly captures why LLMs can pass the bar exam but fail simple riddles. @Silviocinguetta's #EGI framework (Enterprise General Intelligence) focuses on what enterprises actually need: capability + consistency, not just raw intelligence. New piece

🚨Introducing "Elastic Reasoning"🚨  

Our novel framework solves LLM inference budget constraints without sacrificing performance. Open and available to the research community:   

📄 Paper: bit.ly/4kygc8p 
💻 Code: bit.ly/3ZwjFfo
🤗 Models:

🚨NEW MODEL: BLIP3-o 🚨

🔬 Researchers from Salesforce AI Research + UMD Center for Machine Learning introduce BLIP3-o: solving AI's dual challenge of building ONE model that both understands AND generates images at SOTA level.

💡 Key innovation: dual-stage training with frozen autoregressive backbone prevents

🌊Tried BLIP3-o? Our family of unified multimodal models is making waves, now open-sourced for the AI Research community. 

🔓 Github Repo: bit.ly/4muUBzm 
🤗 Models: bit.ly/4kB9oXK 
🪧 Demo: bit.ly/4jb0YVD 
📰 News: bit.ly/3Z1tuC8
✍️ Blog:

⚡ At Salesforce AI Research, we don't just theorize — we DELIVER. What makes us different? Co-creating with customers to turn bold AI visions into market-ready solutions that drive real impact for businesses around the world. Ready to see #EnterpriseAI in action? 👇


🚨 Introducing CRMArena-Pro: The first multi-turn, enterprise-grade benchmark for LLM agents

✍️Blog: sforce.co/4dKBRIq
🖇️Paper: bit.ly/3T0AY4E
🤗Dataset: bit.ly/4kiRlG3
🖥️Code: bit.ly/4fkrZVM

Most AI benchmarks test isolated, single-turn tasks.

📣 NEW RESEARCH! 📣
Introducing "Beyond ‘Aha!’: Toward Systematic Meta-Abilities Alignment in Large Reasoning Models"

 🖇️ Paper: bit.ly/43RmURb
 💻 Code: bit.ly/43IiE5n

🧠 No more waiting for random "Aha!" moments in AI reasoning. Our new research shows how

⚡ NEW COMPUTER-USE AI RESEARCH ⚡

Introducing:
 
1️⃣ Our paper, Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
2️⃣ OSWORLD-G benchmark covering fine-grained manipulation and layout understanding
3️⃣ JEDI dataset, our GUI grounding dataset series with

CRMArena-Pro reveals why enterprise AI deployment remains challenging—many top-performing agents struggle significantly on real-world business tasks. 👇Full technical breakdown from our research lead Kung-Hsiang Steeve Huang below. #EnterpriseAI #AgenticAI #EGI