
Moran Mizrahi
@moranmiz
PhD student at @Csehuji (@HyadataLab). Interested in Natural Language Processing, Data Science, Human-Computer Interaction and Computational Creativity.
ID: 56381337
13-07-2009 14:10:55
154 Tweets
309 Followers
264 Following



Ever tried generating an image from a prompt but ended up with unexpected outputs? Check out our new paper #FollowTheFlow - tackling T2I issues like bias, failed binding, and leakage from the textual encoding side! arxiv.org/pdf/2504.01137 guykap12.github.io/guykap12.githu… 🧵[1/7]


- “I flipped a biased coin with p(Heads) = 0.55.” - “What did it land on?” What is the probability of the answer being “Heads”? Does it depend on whether the outcome is seen? Should we expect it to be 0.55? Check out our new paper! arxiv.org/abs/2505.02072 w/ Omri Abend (1/10)



I'm excited to share that our latest research, titled “Toward Reliable Proof Generation with LLMs: Leveraging Analogical Guidance and Symbolic Verification”, is now available on arXiv: arxiv.org/pdf/2505.14479 w/ Eitan Stern Hyadata Lab (Dafna Shahaf)




New paper! We present CHIMERA, a KB of 28K+ scientific idea recombinations. It captures how researchers blend concepts or take inspiration across fields, enabling: 1. Meta-science 2. Training models to predict new combos noy-sternlicht.github.io/CHIMERA-Web Findings & data:

1/5 New paper alert! StressTest: Can YOUR Speech LM Handle the Stress? Sentence stress = emphasis on words to signal intent, contrast, or new info. We built StressTest, a benchmark for testing stress reasoning. Then, meet StresSLM who finally gets it! Insights & Links

New Paper: "Time to Talk"! We built an LLM agent that doesn't just decide WHAT to say, but also WHEN to say it! Introducing "Time to Talk" - LLM agents for asynchronous group communication, tested in real Mafia games with human players. niveck.github.io/Time-to-Talk 🧵1/7



For years [...] in Paris [...] small mosaics [...] on building walls. Yesterday, on a tour with Nadav, I learned [...] that there are more than 1500 of them across the city (!), and that there is even an app that lets you hunt them and collect points! Who's joining the chase [...]? NadavBas 🇪🇺🇫🇷 Efrat Frid


Old news: Single-prompt eval is unreliable 🤯 New news: PromptSuite - an easy way to augment your benchmark with thousands of paraphrases ➡️ robust eval, zero sweat! - Works on any dataset! - Python API + web UI Eliya Habba, Gili Lior, Gabriel Stanovsky eliyahabba.github.io/PromptSuite/

Proud to share that "Debatable Intelligence" has now been accepted to #EMNLP2025 (Main Conference)! noy-sternlicht.github.io/Debatable-Inte… Huge thanks to my amazing collaborators Ariel Gera, Roy Bar Haim, Tom Hope, Noam Slonim
