Toufique Ahmed (@toufique_ahmed_)'s Twitter Profile
Toufique Ahmed

@toufique_ahmed_

IBM Research; Postdoctoral Scholar & former PhD student at UC Davis; former research intern at Microsoft Research

ID: 1089453869568974848

Link: https://toufiqueparag.github.io/toufique.github.io/
Joined: 27-01-2019 09:23:38

93 Tweets

280 Followers

201 Following

Maliheh (Mali) Izadi (@malihehizadi)

. Ali Al-Kaswan 🍉 presenting our recent work (w Arie van Deursen பேராசிரியர் Prem Devanbu Toufique Ahmed & @anandsaw) on extending pre-trained models for source code to summarise decompiled binaries, happening now at IEEE SANER ‘23 in Macau. A collaboration between TU Delft and UC Davis. SERG TU Delft

பேராசிரியர் Prem Devanbu (@devanbu)

Codex and other LLMs generate a lotta code, for a lotta people. But do they help avoid coding errors? We study this question (with Kevin Jesse, Toufique Ahmed, and Emily Morgan) in our MSR 2023 paper, using the SStubs4J dataset... we find...

Kevin Jesse (@kjnlp)

Check out our MSR 2023 paper: Large Language Models and Simple, Stupid Bugs! In this work with Toufique Ahmed, பேராசிரியர் Prem Devanbu, and Emily Morgan, we explore the code completions of Codex and other LLMs in the context of bug-prone prompts. While Codex avoids some simple, stupid bugs (SStuBs)

Microsoft Research (@msftresearch)

Building reliable hyper-scale cloud services can be challenging. The M365 System Innovation research group advances the understanding of production incidents and uses state-of-the-art AI/ML technologies to help automate cloud operations. Learn more: msft.it/6017gMI7S

Toufique Ahmed (@toufique_ahmed_)

I am pleased to announce that our paper, titled 'Better patching using LLM prompting, via Self-Consistency,' co-authored by பேராசிரியர் Prem Devanbu and myself, has been accepted for presentation at ASE 2024 NIER track. Stay tuned for the pre-print.

பேராசிரியர் Prem Devanbu (@devanbu)

Happy to summarize a new ASE-NIER paper, with Toufique Ahmed on using “self-consistency” to boost the defect-patching performance of LLMs for Code. 1/ 🧵 arxiv.org/pdf/2306.00108…

பேராசிரியர் Prem Devanbu (@devanbu)

LLM-generated code saves time, but using this code directly risks injecting errors. So what should devs do? Reviewing it all super-carefully is likely too expensive, so… when to review, and when to just accept? Our ICSE 25 paper is on just this: 1/5 arxiv.org/pdf/2402.02047

பேராசிரியர் Prem Devanbu (@devanbu)

New MSR 2025 paper! SE Research often relies on human subjects to evaluate tool or process innovations. Do they actually help? Questions like “Does this tool generate a good code summary?” or “Is this a good, descriptive identifier name?” depend on subjective human opinion. 1/🧵

Jatin Ganhotra (@jatinganhotra)

Recent SWE agents generate code to resolve issues. While great for productivity, such systems make good tests even more important. To help with that, we present "Otter: Generating Tests from Issues to Validate SWE Patches" & open-source a benchmark, TDD-Bench Verified. 1/5

Niels Mündler (@nielstron)

SWT-Verified ✅ is now released, and Otter establishes a new SOTA 🦦 We have created SWT-Bench Verified, a set of 433 human-verified tasks based on SWE-Bench Verified. "Otter", a new testing agent by IBM Research, establishes a new SOTA here, outperforming @allhands_ai!
