Cláudia Mamede (@claudiarmamede)'s Twitter Profile
Cláudia Mamede

@claudiarmamede

SE PhD student at Carnegie Mellon University and University of Porto

ID: 1233180510760919040

28-02-2020 00:02:11

5 Tweets

15 Followers

16 Following

Daniel Ramos (@danieltrt7)

🚨 Are Large Language Models Memorizing Bug Benchmarks? 🚨 There’s growing concern that LLMs for SE are prone to data leakage, but no one has quantified it... until now. 🕵️‍♂️ We measured leakage in benchmarks like Defects4J and SWEBenchLite. arxiv.org/pdf/2411.13323 Findings👇
