four (@four) 's Twitter Profile
four

@four

ID: 1331131

17-03-2007 03:04:50

61 Tweets

1.1K Followers

175 Following

SpaceX (@spacex) 's Twitter Profile Photo

Starship and Super Heavy are ready at the launch pad in Starbase, Texas. Targeting Saturday, November 18 for Starship’s second integrated flight test → spacex.com/launches

Yi Zeng 曾祎 (@easonzeng623) 's Twitter Profile Photo

🚨 [New Paper] If you're involved in AI safety or jailbreaking, you don't want to miss this: Techniques from human communication now effectively breach aligned LLMs (Llama-2 Chat, GPT-3.5, GPT-4) with over 92% attack success rate. 👇🧵(1/7 - page link: chats-lab.github.io/persuasive_jai…)

Dino A. Dai Zovi (@dinodaizovi) 's Twitter Profile Photo

The number one reason why good security is hard is that the feedback loop on decisions is long and the signal is low fidelity. It's not clear how many incidents were prevented or mitigated from which foundational decisions years prior. This wrecks the incentives to be proactive.

SpaceX (@spacex) 's Twitter Profile Photo

Starship completed its rehearsal for launch, loading more than 10 million pounds of propellant on Starship and Super Heavy and taking the flight-like countdown to T-10 seconds

will depue (in singapore for ICLR) (@willdepue) 's Twitter Profile Photo

announcing... starlinkmap dot org real-time map of every starlink satellite. tracks upcoming launches, other constellations, orbital updates, etc. finally launching this after a while! more details below.

lmarena.ai (formerly lmsys.org) (@lmarena_ai) 's Twitter Profile Photo

Congrats Google DeepMind on the new Gemma-2 27B & 9B release! Gemma-2 was tested in the Arena under the codename "*late-june-chatbots" and now out of stealth. Its early result matches the best open models (Llama-3-70B, Nemotron-340B) with only 27B parameters! Impressively,

Demis Hassabis (@demishassabis) 's Twitter Profile Photo

Advanced mathematical reasoning is a critical capability for modern AI. Today we announce a major milestone in a longstanding grand challenge: our hybrid AI system attained the equivalent of a silver medal at this year’s International Math Olympiad!

lmarena.ai (formerly lmsys.org) (@lmarena_ai) 's Twitter Profile Photo

Exciting News from Chatbot Arena! Google DeepMind's new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes. For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive

Logan Kilpatrick (@officiallogank) 's Twitter Profile Photo

Gemini-exp-1206, our latest Gemini iteration, (with the full 2M token context and much more) is available right now for free in Google AI Studio and the Gemini API. I hope you have enjoyed year 1 of the Gemini era as much as I have. We are just getting started : )

Jeff Dean (@jeffdean) 's Twitter Profile Photo

What a way to celebrate one year of incredible Gemini progress -- #1🥇across the board on overall ranking, as well as on hard prompts, coding, math, instruction following, and more, including with style control on. Thanks to the hard work of everyone in the Gemini team and

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

As we make progress towards AGI, developing AI needs to be both innovative and safe. ⚖️ To help ensure this, we’ve made updates to our Frontier Safety Framework - our set of protocols to help us stay ahead of possible severe risks. Find out more → goo.gle/42IuIVf

Logan Kilpatrick (@officiallogank) 's Twitter Profile Photo

Introducing Gemini 2.5 Pro, the world's most powerful model, with unified reasoning capabilities + all the things you love about Gemini (long context, tools, etc) Available as experimental and for free right now in Google AI Studio + API, with pricing coming very soon!

Logan Kilpatrick (@officiallogank) 's Twitter Profile Photo

Deep Research in the Gemini App is now powered by Gemini 2.5 Pro, and our early tests show users prefer this 2:1 vs “other products” ;) gemini.google.com

Anca Dragan (@ancadianadragan) 's Twitter Profile Photo

Per our Frontier Safety Framework, we continue to test our models for critical capabilities. Here’s the updated model card for Gemini 2.5Pro with frontier safety evaluations + explanation of how our safety buffer / alert thresholds approach applies to 2.0, 2.5, and what’s coming.