Toufique Ahmed (@toufique_ahmed_) Twitter Tweets • TwiCopy

Toufique Ahmed

@toufique_ahmed_

IBM Research; Postdoctoral Scholar & former PhD student at UC Davis; former research intern at Microsoft Research

ID: 1089453869568974848

Link: https://toufiqueparag.github.io/toufique.github.io/

Joined: 27-01-2019 09:23:38

93 Tweets

280 Followers

201 Following

Maliheh (Mali) Izadi

@malihehizadi

3 years ago

Ali Al-Kaswan 🍉 presenting our recent work (w/ Arie van Deursen, பேராசிரியர் Prem Devanbu, Toufique Ahmed & @anandsaw) on extending pre-trained models for source code to summarise decompiled binaries, happening now at IEEE SANER '23 in Macau. A collaboration between TU Delft and UC Davis. SERG TU Delft

Likes: 23 · Replies: 0 · Retweets: 7

Codex and other LLMs generate a lotta code, for a lotta people. But do they help avoid coding errors? We study this question (with Kevin Jesse, Toufique Ahmed, and Emily Morgan) in our MSR 2023 paper, using the SStubs4J dataset... we find...

Likes: 36 · Replies: 1 · Retweets: 5

Kevin Jesse

@kjnlp

3 years ago

Check out our MSR 2023 paper: Large Language Models and Simple, Stupid Bugs! In this work with Toufique Ahmed, பேராசிரியர் Prem Devanbu, and Emily Morgan, we explore Codex and other LLMs' code completions in the context of bug-prone prompts. While Codex avoids some simple, stupid bugs (SStuBs)

Likes: 12 · Replies: 1 · Retweets: 3

Microsoft Research

@msftresearch

3 years ago

Building reliable hyper-scale cloud services can be challenging. The M365 System Innovation research group advances the understanding of production incidents and uses state-of-the-art AI/ML technologies to help automate cloud operations. Learn more: msft.it/6017gMI7S

Likes: 22 · Replies: 1 · Retweets: 8

Toufique Ahmed

@toufique_ahmed_

2 years ago

I am pleased to announce that our paper, titled 'Better patching using LLM prompting, via Self-Consistency,' co-authored by பேராசிரியர் Prem Devanbu and myself, has been accepted for presentation at the ASE 2024 NIER track. Stay tuned for the pre-print.

Likes: 25 · Replies: 1 · Retweets: 1

பேராசிரியர் Prem Devanbu

@devanbu

2 years ago

Happy to summarize a new ASE-NIER paper, with Toufique Ahmed, on using “self-consistency” to boost the defect-patching performance of LLMs for Code. 1/ 🧵 arxiv.org/pdf/2306.00108…

Likes: 8 · Replies: 1 · Retweets: 1

Toufique Ahmed

@toufique_ahmed_

2 years ago

It is really cool!

Likes: 3 · Replies: 0 · Retweets: 1

Toufique Ahmed

@toufique_ahmed_

2 years ago

Retweet appreciated!

Likes: 19 · Replies: 1 · Retweets: 11

Matthew Levis

@matthewlevis4

2 years ago

Beyond Boundaries: A Conversation with Premkumar Devanbu, Harlan D. Mills Award Recipient bit.ly/3T3i2Dk

Likes: 3 · Replies: 0 · Retweets: 2

பேராசிரியர் Prem Devanbu

@devanbu

a year ago

LLM-generated code saves time, but using this code directly risks injecting errors. So what should devs do? Reviewing it all super-carefully is likely too expensive, so… when to review, and when to just accept? Our ICSE 25 paper is on just this: 1/5 arxiv.org/pdf/2402.02047

Likes: 40 · Replies: 3 · Retweets: 6

பேராசிரியர் Prem Devanbu

@devanbu

a year ago

New MSR 2025 paper! SE research often relies on human subjects to evaluate tool or process innovations. Do they actually help? Questions like “Does this tool generate a good code summary?” or “Is this a good, descriptive identifier name?” depend on subjective human opinion. 1/🧵

Likes: 34 · Replies: 2 · Retweets: 3

Jatin Ganhotra

@jatinganhotra

10 months ago

Recent SWE agents generate code to resolve issues. While great for productivity, such systems make good tests even more important. To help with that, we present "Otter: Generating Tests from Issues to Validate SWE Patches" & open-source a benchmark, TDD-Bench Verified. 1/5

Likes: 4 · Replies: 1 · Retweets: 1

Niels Mündler

@nielstron

9 months ago

SWT-Verified ✅ is now released, and Otter establishes a new SOTA 🦦 We have created SWT-Bench Verified, a set of 433 human-verified tasks based on SWE-Bench Verified. "Otter", a new testing agent by IBM Research, establishes a new SOTA here, outperforming @allhands_ai!

Likes: 9 · Replies: 3 · Retweets: 3