The Full Stack (@full_stack_dl)'s Twitter Profile

News, community, and courses for people building AI-powered products.

ID: 1085303366253633536

Link: https://fullstackdeeplearning.com
Joined: 15-01-2019 22:31:01

1.1K Tweets

22.22K Followers

177 Following

Sergey Karayev (@sergeykarayev):

By what year will there be an AI that is more capable than most humans in most domains of digital work (e.g. you can tell it to do anything you currently hire a white collar professional to do, and it does the job better than the median human)?

The Full Stack (@full_stack_dl):

This is sadly true! If you want the latest version, come join us in November for our in-person workshop with AI by the Bay scale.bythebay.io/llm-workshop

Sergey Karayev (@sergeykarayev):

Is there a service I can use to pipe my GPT-4 calls through, and it automatically finetunes GPT-3.5 (or whatever) on all of them, and lets me know when it's up to par?

AI by the Bay (@scalebythebay):

We bring in The Full Stack, a venerable boot camp crew that pioneered technical deep dives into deep learning where people fly in from around the world. 🥞 Their #LLM Bootcamp in the spring was sold out and this is your chance to attend the ➡️ version. 👉 scale.bythebay.io/register

Sergey Karayev (@sergeykarayev):

Let's say that a US-based research company has developed an AGI model that was able to use the browser, pass captchas, hire people on Upwork, and lie about its intentions. What should they do after observing this?

Sergey Karayev (@sergeykarayev):

Has anyone had good experiences with GPT-powered code generation for complete web app features? As in, you describe what should exist, and GPT actually provides the source of all the necessary files and where they should go. Ideally in the context of Ruby on Rails.

Sergey Karayev (@sergeykarayev):

Which set of statements do you agree with? 1. AGI is as much or more of a risk to human flourishing as nuclear weapons 2. I have a good idea for what should be done about that

Sergey Karayev (@sergeykarayev):

Has anyone done comprehensive testing of gpt-4-vision-preview? I want to know stuff like the minimum text size it can read, the radius of the smallest circle it can locate in an image, the number of circles it can count, etc. Could be an automated benchmark for other models too

Sergey Karayev (@sergeykarayev):

What percentage of your Twitter feed (the stuff you actually read, not just scroll past) do you believe is currently written by AI?

Sergey Karayev (@sergeykarayev):

Several agents plus three simple baselines were tested on HumanEval.

Agents were mostly worse and always more expensive than the baselines.

The good:
· Evaluating the Pareto frontier
· Strong simple baselines (just repeated calls!)

The bad:
· Clearly saturating the benchmark
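
For readers unfamiliar with the term, here is a minimal sketch of what "evaluating the Pareto frontier" could look like for cost versus accuracy. The `Result` type, the system names, and all numbers are hypothetical placeholders for illustration, not figures from the evaluation mentioned above.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    cost: float      # e.g. dollars per benchmark run (lower is better)
    accuracy: float  # fraction of problems solved (higher is better)


def pareto_frontier(results: List[Result]) -> List[Result]:
    """Keep only results that no other result beats on both cost and accuracy."""
    frontier = []
    best_accuracy = float("-inf")
    # Walk from cheapest to most expensive; at equal cost, most accurate first.
    for r in sorted(results, key=lambda r: (r.cost, -r.accuracy)):
        # A result survives only if it is strictly more accurate than
        # everything that costs the same or less.
        if r.accuracy > best_accuracy:
            frontier.append(r)
            best_accuracy = r.accuracy
    return frontier


# Hypothetical numbers, made up purely to exercise the function.
results = [
    Result("baseline_a", 1.0, 0.80),
    Result("baseline_b", 2.0, 0.90),
    Result("agent_x", 5.0, 0.85),
    Result("agent_y", 8.0, 0.92),
]
frontier_names = [r.name for r in pareto_frontier(results)]
print(frontier_names)
```

In this made-up data, `agent_x` drops off the frontier because `baseline_b` is both cheaper and more accurate, which is the kind of comparison the tweet describes.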

Sergey Karayev (@sergeykarayev):

Is Claude Code still the best coding agent on the market? You can now easily find out by launching Claude, Codex, Gemini, and Amp on every ticket in your codebase: