anton (@abacaj)'s Twitter Profile
anton

@abacaj

Software engineer. Hacking on large language models

ID:70514287

Joined: 31-08-2009 22:06:04

10.8K Tweets

36.1K Followers

518 Following

AK (@_akhaliq):

From Words to Numbers

Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples

We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3) can do linear and non-linear regression when given in-context examples,

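The in-context regression setup described above can be sketched as a simple prompt builder. This is a minimal illustration, not code from the paper; the function name and "Input:/Output:" formatting are assumptions.

```python
def regression_prompt(examples, query_x):
    """Format (x, y) training pairs as in-context examples, then
    ask the model to complete the output for a new input."""
    lines = [f"Input: {x:.2f}\nOutput: {y:.2f}" for x, y in examples]
    lines.append(f"Input: {query_x:.2f}\nOutput:")
    return "\n\n".join(lines)

# Noise-free linear data y = 2x + 1; a capable model should
# complete the final "Output:" with a value near 9.00.
prompt = regression_prompt([(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)], 4.0)
```

The model's completion for the final line is then parsed as its regression prediction.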
anton (@abacaj):

one thing I realized (which wasn't so obvious to me) is that there are plenty of people who don't really want to prompt models like gpt-4/claude, even though the models are usable on their own (without a wrapper)

people would rather have a guided workflow (questions) that then…

Aran Komatsuzaki (@arankomatsuzaki):

Google presents Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

A 1B model fine-tuned on passkey instances of up to 5K sequence length solves the 1M-length problem

arxiv.org/abs/2404.07143

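A minimal sketch of the compressive-memory recurrence at the core of Infini-attention, assuming the linear-attention form the paper uses (with σ = ELU + 1). The function name, shapes, and epsilon are illustrative, and the real model combines this retrieved state with local attention via a learned gate, which is omitted here.

```python
import numpy as np

def infini_memory_step(M, z, Q, K, V, eps=1e-8):
    """One segment of compressive-memory retrieval and update.
    M: (d, d) memory matrix, z: (d,) normalizer,
    Q, K, V: (seq, d) projections for the current segment."""
    sigma = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # ELU(x) + 1
    # Retrieve from memory written by earlier segments.
    retrieved = (sigma(Q) @ M) / ((sigma(Q) @ z) + eps)[:, None]
    # Write this segment's key-value associations into memory.
    M_new = M + sigma(K).T @ V
    z_new = z + sigma(K).sum(axis=0)
    return retrieved, M_new, z_new
```

Because M and z are fixed-size regardless of how many segments have been processed, the context the memory summarizes can grow without bound at constant cost per step.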
anton (@abacaj):

anyone have success with really long context (gemini 1M tokens)? I think just throwing in extra context without really improving model performance isn't very useful. LLMs are known to be thrown off by more text; the more context you have, the higher the chance you'll put in irrelevant…

Bob (@futuristfrog):

Here is how I solved Taelin's A::B Challenge for 10k

twitter.com/VictorTaelin/s…

1. I referenced kenshin9000's prompt as a starting point
platform.openai.com/playground/p/O…

2. The first thing I tried was swapping the # for tags, e.g. A# to <A, so it can form pairs like <A A> (Hopefully…
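For reference, the A::B rewrite system the challenge asks the model to execute can be solved directly in a few lines (rules as stated in Taelin's challenge: equal-facing symbols annihilate, different ones swap past each other; the function name is mine):

```python
# Rewrite rules: applied to any adjacent pair X# #Y.
RULES = {
    ("A#", "#A"): [],
    ("B#", "#B"): [],
    ("A#", "#B"): ["#B", "A#"],
    ("B#", "#A"): ["#A", "B#"],
}

def reduce_ab(tokens):
    """Apply rewrite rules until no adjacent pair matches."""
    tokens = list(tokens)
    changed = True
    while changed:
        changed = False
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            if pair in RULES:
                tokens[i:i + 2] = RULES[pair]
                changed = True
                break
    return tokens

reduce_ab("B# A# #B #A B#".split())  # -> ["B#"]
```

The challenge, of course, was to get an LLM to perform this reduction reliably via prompting alone, not to write the solver.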
