Atakan Tekparmak (@atakantekparmak) 's Twitter Profile
Atakan Tekparmak

@atakantekparmak

BSc Artificial Intelligence graduate, University of Groningen. I keep (re)posting daily AI news, papers and threads (with a focus on LLMs).

ID: 1345374773916921856

Link: https://atakantekparmak.github.io/ | Joined: 02-01-2021 14:22:05

1.1K Tweets

497 Followers

578 Following

Yu Su @ACL (@ysu_nlp) 's Twitter Profile Photo

People into agents, let me pitch something to you:

🌟 An agent that works across every platform (web, desktop & mobile)
🌟 Visual perception only, no messy & often incomplete HTML or a11y tree
🌟 SOTA performance across 6 agent benchmarks

Sounds too good to be true? Continue ⬇️
Pliny the Liberator 🐉 (@elder_plinius) 's Twitter Profile Photo

AI RED-TEAMING PLINY-AGENTS HAVE ARRIVED! 🦾 Here's the video of liberated Claude autonomously jailbreaking Perplexity to produce a meth synthesis recipe! And it was all done from a SINGLE PROMPT in less than 10 minutes 😻 What a time to be alive!

Maziyar PANAHI (@maziyarpanahi) 's Twitter Profile Photo

The base model for Pixtral just dropped on Hugging Face! 🔥 And here’s the big news: it’s licensed under Apache 2.0! 🚀 huggingface.co/mistralai/Pixt…

Yuling Gu (@gu_yuling) 's Twitter Profile Photo

⚠️ Introducing SimpleToM, exposing a jarring gap in the Theory-of-Mind capabilities of current frontier LLMs:
😲 They fail to implicitly apply mental state inferences, even when they can easily infer these states for two-sentence stories. 😲
📜 arxiv.org/abs/2410.13648
1/
Niels Rogge (@nielsrogge) 's Twitter Profile Photo

A new video LLM by Meta dropped on the hub, and it's the new SOTA for open-source video understanding

> builds on top of SigLIP/DINOv2 and Qwen2/Llama 3.2
> includes a 3B parameter model for on-device use cases

Weights: huggingface.co/collections/Vi…
Demo: huggingface.co/spaces/Vision-…
Atakan Tekparmak (@atakantekparmak) 's Twitter Profile Photo

Gemini-1.5-flash is a gift from Google. What else offers 1,500 free requests per day for you to try, and is this good for the (rumoured) size and price?

Varun (@varun_mathur) 's Twitter Profile Photo

This is a game-changer announcement by Apple around cryptography. It is the “HTTPS moment for AI” in some ways.

Here is what this means: your private confidential data can be pooled with other data sources and used to securely improve your UX and that of the wider community
Alexander Doria (@dorialexander) 's Twitter Profile Photo

Releasing my detailed commented introduction to LLM sampling colab.research.google.com/drive/18-2Z4TM… We get back to the basics and slowly build up to a reproduction of the adaptive temperature strategy from "Softmax is not enough" (from Petar Veličković et al.)

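For readers who want the gist before opening the notebook: below is a minimal sketch (not the notebook's code) of the adaptive-temperature idea from "Softmax is not enough" — sharpen the distribution when its Shannon entropy is high. The entropy threshold and the entropy-to-inverse-temperature map here are illustrative placeholders, not the paper's fitted polynomial.

```python
import torch
import torch.nn.functional as F

def adaptive_temperature_sample(logits: torch.Tensor,
                                entropy_threshold: float = 0.5,
                                alpha: float = 1.0) -> int:
    """Sample a token, sharpening the softmax when its entropy is high.

    Illustrative sketch: the paper fits a polynomial from entropy to an
    inverse temperature; here a simple linear map (alpha) stands in for it.
    """
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum()

    if entropy > entropy_threshold:
        # Higher entropy -> larger inverse temperature beta >= 1 -> sharper distribution.
        beta = 1.0 + alpha * (entropy - entropy_threshold)
        probs = F.softmax(logits * beta, dim=-1)

    return torch.multinomial(probs, num_samples=1).item()

# Toy usage with random logits over a 50k-token vocabulary.
token_id = adaptive_temperature_sample(torch.randn(50_000))
```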
Marcel Binz (@marcel_binz) 's Twitter Profile Photo

Excited to announce Centaur -- the first foundation model of human cognition. Centaur can predict and simulate human behavior in any experiment expressible in natural language. You can readily download the model from Hugging Face and test it yourself: huggingface.co/marcelbinz/Lla…
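
The Hugging Face link above is truncated, so the repo id below is a hypothetical placeholder; with the transformers library, downloading and testing the released checkpoint would look roughly like this (assuming it is a Llama-style causal LM, as the link suggests).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical placeholder -- substitute the actual model id from the (truncated) link above.
MODEL_ID = "marcelbinz/<centaur-model-id>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Describe an experiment in natural language and let the model predict the participant's choice.
prompt = ("You are choosing between two gambles. Option A: 50% chance of $100. "
          "Option B: $40 for sure. You choose Option")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```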

TuringPost (@theturingpost) 's Twitter Profile Photo

Google DeepMind, Google AI and KAIST AI introduce new methods to turn large LLMs into smaller models:

- Recursive Transformers that reuse layers multiple times
- Relaxed Recursive Transformers with LoRA
- Continuous Depth-wise Batching for speeding up processing

Details 🧵
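
As a rough illustration of the first bullet (not the paper's implementation): a recursive transformer keeps only a few unique blocks and loops over them several times, so effective depth grows without adding parameters. The "relaxed" variant would additionally attach small per-loop LoRA adapters to the shared layers; that part is only noted in a comment here.

```python
import torch
import torch.nn as nn

class RecursiveTransformer(nn.Module):
    """Toy sketch: reuse a small stack of unique layers `n_loops` times."""

    def __init__(self, d_model=256, n_heads=4, n_unique_layers=3, n_loops=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_unique_layers)
        )
        self.n_loops = n_loops  # effective depth = n_unique_layers * n_loops

    def forward(self, x):
        for _ in range(self.n_loops):
            # A "relaxed" recursive transformer would add a per-loop LoRA delta
            # to each shared layer here, so iterations are not forced to be identical.
            for layer in self.layers:
                x = layer(x)
        return x

model = RecursiveTransformer()
out = model(torch.randn(2, 16, 256))  # (batch, seq_len, d_model)
print(out.shape)
```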
jack morris (@jxmnop) 's Twitter Profile Photo

just open-sourced the training and evaluation code for cde, our state-of-the-art small text embedding model

includes code for lots of hard stuff:
* efficient clustering large datasets
* contrastive training for SOTA retrieval models
* our custom two-stage model architecture that
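
For the contrastive-training bullet, here is a generic in-batch-negatives (InfoNCE-style) loss of the kind retrieval models are typically trained with; this is a standard sketch, not the cde repo's actual loss.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(query_emb, doc_emb, temperature=0.05):
    """In-batch-negatives contrastive loss for retrieval training.

    query_emb, doc_emb: (batch, dim). Row i of doc_emb is the positive for row i
    of query_emb; every other row in the batch serves as a negative.
    """
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    scores = q @ d.T / temperature               # (batch, batch) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(scores, labels)       # diagonal entries are the positives

loss = in_batch_contrastive_loss(torch.randn(32, 768), torch.randn(32, 768))
```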
MetaGPT (@metagpt_) 's Twitter Profile Photo

🌟 Excited to open-source SELA, a powerful experimentation system integrating MCTS with LLM agents. Across 20 datasets, SELA achieves a 75% win rate against AIDE (OpenAI's top pick in MLE-Bench) and beats traditional AutoML methods developed over years.
💻 Code:
Vaibhav (VB) Srivastav (@reach_vb) 's Twitter Profile Photo

🚨Meta released MobileLLM - 125M, 350M, 600M, 1B model checkpoints! 🔥

Notes on the release:

Depth vs. Width: Contrary to the scaling law (Kaplan et al., 2020), depth is more critical than width for small LLMs, enhancing abstract concept capture and final performance

Embedding
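
To make the depth-vs-width point concrete, here is a back-of-the-envelope parameter count (attention plus a 4x FFN per block, ignoring embeddings, biases and norms): a deeper-narrower configuration and a shallower-wider one can land at roughly the same budget, and MobileLLM's finding is that the deeper one tends to perform better at these small scales. The specific widths and depths below are illustrative, not MobileLLM's actual configs.

```python
def block_params(d_model: int) -> int:
    """Rough per-block parameter count: 4*d^2 (attention) + 8*d^2 (FFN with 4x expansion)."""
    return 12 * d_model * d_model

def model_params(d_model: int, n_layers: int) -> int:
    return block_params(d_model) * n_layers

# Two roughly equal-budget configs: deep-and-narrow vs shallow-and-wide (illustrative numbers).
deep_narrow = model_params(d_model=512, n_layers=30)    # ~94M block params
shallow_wide = model_params(d_model=800, n_layers=12)   # ~92M block params
print(f"deep-narrow:  {deep_narrow / 1e6:.1f}M")
print(f"shallow-wide: {shallow_wide / 1e6:.1f}M")
```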
Logan Kilpatrick (@officiallogank) 's Twitter Profile Photo

Say hello to Grounding with Google Search, available in the Gemini API + Google AI Studio! You can now access real time, fresh, up to date information from Google Search when building with Gemini by enabling the Grounding tool. developers.googleblog.com/en/gemini-api-…
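
A minimal sketch of enabling the Grounding tool over the Gemini REST API with the requests library. The tool field name (google_search_retrieval) and the response layout follow the announcement as I understand it and should be treated as assumptions; check the linked blog post for the exact schema.

```python
import os
import requests

API_KEY = os.environ["GEMINI_API_KEY"]
URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/gemini-1.5-flash:generateContent?key={API_KEY}"
)

# Assumed request shape: a normal prompt plus the Google Search grounding tool.
payload = {
    "contents": [{"parts": [{"text": "Who won the most recent F1 grand prix?"}]}],
    "tools": [{"google_search_retrieval": {}}],
}

resp = requests.post(URL, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```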

Vaibhav (VB) Srivastav (@reach_vb) 's Twitter Profile Photo

Fuck it - it’s raining smol LMs - SmolLM2 1.7B - beats Qwen 2.5 1.5B & Llama 3.2 1B, Apache 2.0 licensed, trained on 11 Trillion tokens 🔥

> 135M, 360M, 1.7B parameter model
> Trained on FineWeb-Edu, DCLM, The Stack, along w/ new mathematics and coding datasets
> Specialises in