berk atıl (@berkatilgs) 's Twitter Profile
berk atıl

@berkatilgs

PhD student at Penn State University.
NLP Researcher

ID: 169622571

Joined: 22-07-2010 19:47:00

181 Tweets

105 Followers

481 Following

PSU NLP LAB (@nlp_pennstate) 's Twitter Profile Photo

Congratulations to Dr. Zhaohui Li (ZhaohuiLee) for completing his dissertation with flying colors and winning the "Best Doctoral Dissertation" award. His impactful works on AI in education are already having real-world impacts. Wishing you all the best for the future!!🌟

Ryo Kamoi (@ryokamoi) 's Twitter Profile Photo

📢 New survey on Self-Correction of LLMs!
😢 LLMs often cannot correct their mistakes by prompting themselves
😢 Many studies conduct unfair experiments
😃 We analyze requirements for self-correction🧵
Yusen Zhang✈️@COLM’24 Nan Zhang Jiawei Han Rui Zhang
arxiv.org/abs/2406.01297

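The "intrinsic" setting the survey examines can be sketched as a simple loop: the model critiques its own answer with no external feedback. This is only an illustrative sketch; `self_correct`, the prompts, and the "OK" stopping convention are made up here, not the survey's code.

```python
def self_correct(model, question, max_rounds=2):
    """Intrinsic self-correction: the model critiques and revises its
    own answer with no external feedback -- the setting the survey
    argues LLMs often fail at when evaluated fairly."""
    answer = model(f"Q: {question}\nA:")
    for _ in range(max_rounds):
        critique = model(f"Q: {question}\nProposed answer: {answer}\n"
                         "Reply OK if correct, else give a revised answer:")
        if critique.strip() == "OK":
            break
        answer = critique          # adopt the model's own revision
    return answer

# toy stand-in model that follows a canned script, so the loop runs
script = iter(["4", "OK"])
answer = self_correct(lambda prompt: next(script), "2+2?")
# answer == "4"
```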
Rui Zhang (@ruizhang_nlp) 's Twitter Profile Photo

We present Chain of Agents, a novel framework that harnesses multi-agent collaboration through natural language to enable information aggregation and context reasoning across various LLMs over long-context tasks! Led by Yusen Zhang✈️@COLM’24 from his internship at Google
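The framework's structure — worker agents reading chunks in sequence and passing a running "communication unit" to a manager — can be sketched as below. The function and the toy agents are hypothetical stand-ins for LLM calls, not the paper's implementation.

```python
def chain_of_agents(worker, manager, query, chunks):
    """Worker agents read consecutive chunks of a long input, each
    passing an updated communication unit (a running summary) to the
    next; a manager agent answers the query from the final unit."""
    unit = ""                      # communication unit passed along the chain
    for chunk in chunks:
        unit = worker(query, chunk, unit)
    return manager(query, unit)

# toy agents: the worker keeps chunks relevant to the query, the
# manager just returns the accumulated evidence
worker = lambda q, chunk, unit: unit + chunk if q in chunk else unit
manager = lambda q, unit: unit
out = chain_of_agents(worker, manager, "Paris",
                      ["Alice went home.", "Paris is in France.", "Bob slept."])
# out == "Paris is in France."
```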

Nan Zhang (@nanzhangnlp) 's Twitter Profile Photo

📢Training efficient LLMs for a knowledge-intensive domain? Consider domain-specific, task-agnostic compression! Code: github.com/psunlpgroup/D-… Paper: arxiv.org/abs/2405.06275 Excited to present our #NAACL findings paper on LLM pruning during Virtual Poster Session 2!🧵

Rui Zhang (@ruizhang_nlp) 's Twitter Profile Photo

⚡️We present D-Pruner for LLM pruning to create domain-specific, task-agnostic LLMs by jointly identifying LLM weights that are pivotal for general capabilities and domain-specific knowledge! NAACL 2024 Findings led by Nan Zhang
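The joint-scoring idea can be sketched schematically: combine each weight's importance for general capabilities with its importance for domain knowledge, then prune the lowest-scoring fraction. `joint_prune_mask`, the weighting `lam`, and the toy scores are illustrative assumptions, not D-Pruner's exact objective.

```python
def joint_prune_mask(general_imp, domain_imp, sparsity, lam=0.5):
    """Score each weight by a weighted sum of its importance for
    general capabilities and for domain-specific knowledge, then
    mark the lowest-scoring `sparsity` fraction for pruning."""
    score = [lam * g + (1 - lam) * d for g, d in zip(general_imp, domain_imp)]
    k = int(len(score) * sparsity)              # number of weights to prune
    cutoff = sorted(score)[k]
    return [s >= cutoff for s in score]         # True = keep the weight

# toy per-weight importance scores (in practice these would come from
# gradients of a general calibration loss and a domain loss)
general = [9, 1, 5, 2, 8, 0]
domain  = [3, 9, 6, 1, 7, 1]
mask = joint_prune_mask(general, domain, sparsity=0.5)
# mask == [True, False, True, False, True, False]
```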

Vipul Gupta (@vipul_1011) 's Twitter Profile Photo

Finally, on the 5th (maybe 6th?) attempt over the past 16 months, my paper has been accepted! It's been a marathon, but now I can peacefully focus on other projects. *sigh of relief*

Wenpeng_Yin (@wenpeng_yin) 's Twitter Profile Photo

Our latest work (link: arxiv.org/pdf/2406.16203) is online. We found that LLMs' seemingly super-human performance is due to inappropriate evaluation. For example, when the gold label is unavailable, LLMs still choose from incorrect labels. We define and benchmark this new problem.

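The failure mode described — committing to an option even when the gold label has been removed — can be measured with a simple harness. This sketch is an assumption about the setup; `gold_absent_rate`, the abstention token `"none"`, and the toy data are invented for illustration.

```python
def gold_absent_rate(model, questions):
    """Fraction of questions where the model still commits to one of
    the candidate labels even though the gold label was removed from
    the options. `model` returns a chosen option or "none" to abstain."""
    forced = 0
    for q in questions:
        options = [o for o in q["options"] if o != q["gold"]]  # drop gold
        if model(q["text"], options) != "none":
            forced += 1
    return forced / len(questions)

# toy model that always picks the first option and never abstains
qs = [{"text": "2+2?", "options": ["3", "4", "5"], "gold": "4"},
      {"text": "capital of France?", "options": ["Paris", "Rome"], "gold": "Paris"}]
rate = gold_absent_rate(lambda text, opts: opts[0], qs)
# rate == 1.0
```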
Vipul Gupta (@vipul_1011) 's Twitter Profile Photo

🚨There is a serious lack of robustness in MMLU! In our new work, we find that “Changing Answer Order Can Decrease MMLU Accuracy”: the accuracy of top models can drop by 10-20%📉 This means leaderboards might not be as reliable as we thought! 📄arxiv.org/abs/2406.19470 (1/N)

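The evaluation perturbation is easy to reproduce in miniature: shuffle each question's options, remap the gold index, and re-score. The harness below is a hedged sketch, not the paper's code; the toy content-aware model keeps its accuracy under shuffling, whereas a position-biased model (e.g. one that always answers index 0) would fluctuate, which is the fragility the paper measures.

```python
import random

def shuffled_accuracy(model, questions, seed=0):
    """Re-score multiple-choice questions after shuffling the answer
    options and remapping the gold index. A robust model should score
    the same either way."""
    rng = random.Random(seed)
    correct = 0
    for q in questions:
        order = list(range(len(q["options"])))
        rng.shuffle(order)
        shuffled = [q["options"][i] for i in order]
        gold = order.index(q["gold"])          # gold answer's new position
        if model(q["text"], shuffled) == gold:
            correct += 1
    return correct / len(questions)

# toy model that finds the right option by content, so shuffling
# cannot hurt it
answers = {"2+2?": "4", "3*3?": "9"}
model = lambda text, opts: opts.index(answers[text])
qs = [{"text": "2+2?", "options": ["3", "4", "5"], "gold": 1},
      {"text": "3*3?", "options": ["6", "9", "12"], "gold": 1}]
acc = shuffled_accuracy(model, qs)
# acc == 1.0
```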
Vipul Gupta (@vipul_1011) 's Twitter Profile Photo

Happy to share that this work got accepted at the Conference on Language Modeling! We introduce a new dataset and methodology for robust and reliable measurement of biases in LMs. Kudos to the organizers for constructive reviews and the rebuttal phase. See you all in Philly!

Wenpeng_Yin (@wenpeng_yin) 's Twitter Profile Photo

👏How to reduce the uncertainty of LLM generation? Beyond Self-Consistency, we can ask i) a direct prompt, "Which one is correct?", and ii) an inverse prompt, "Which one is incorrect?". Humans are consistent across the two; how about LLMs? Joint work w/ Lu Cheng Rui Zhang

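The direct/inverse consistency check can be sketched as a filter: an answer endorsed under "which is correct?" must not also be the one flagged under "which is incorrect?". The function, prompts, and toy model below are illustrative assumptions, not the paper's implementation.

```python
def consistency_filter(model, question, candidates):
    """Keep only candidates the model endorses under a direct prompt
    and does not flag under an inverse prompt -- a consistent model
    should never name the same candidate in both."""
    direct = model(f"{question} Which one is correct? {candidates}")
    inverse = model(f"{question} Which one is incorrect? {candidates}")
    return [c for c in candidates if c == direct and c != inverse]

# toy model that is consistent on this question; it checks for
# "incorrect" first, since "correct" is a substring of it
replies = {"correct": "4", "incorrect": "5"}
model = lambda p: replies["incorrect"] if "incorrect" in p else replies["correct"]
kept = consistency_filter(model, "2+2?", ["4", "5"])
# kept == ["4"]
```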
Rıza Özçelik (@rza_ozcelik) 's Twitter Profile Photo

🥳🥳I'm thrilled that our work introducing state space models (SSMs) to de novo design is now published in Nature Communications 🎉 🎉 nature.com/articles/s4146… w/ Sarah de Ruiter, Emanuele Criscuolo, and Francesca Grisoni Molecular Machine Learning