Ethan Mollick (@emollick)'s Twitter Profile
Ethan Mollick

@emollick

Professor @Wharton studying AI, innovation & startups. Democratizing education using tech
Book: https://t.co/CSmipbJ2jV
Substack: https://t.co/UIBhxu4bgq

ID: 39125788

Link: https://mgmt.wharton.upenn.edu/profile/emollick/ · Joined: 10-05-2009 22:33:52

26.4K Tweets

209.8K Followers

551 Following

Ethan Mollick (@emollick)'s Twitter Profile Photo

One of the best ways to improve LLM performance is to ask it to “think aloud” (there are various techniques for doing this, including Chain of Thought). This also helps clearly establish the AI's plans.

This paper suggests that, in some cases, the AI can plan without revealing it
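
A minimal sketch of the “think aloud” idea using the OpenAI Python client; the model name, question, and prompt wording are illustrative assumptions, not from the tweet:

```python
# Minimal Chain-of-Thought sketch (illustrative: the model name, question,
# and prompt wording are assumptions, not from the tweet).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "A train covers 60 miles in 1.5 hours. At that speed, how long will 100 miles take?"

# Baseline: ask for the answer directly.
direct = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": question}],
)

# "Think aloud": ask the model to lay out its reasoning before answering,
# which also surfaces its plan so you can inspect it.
cot = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": question
        + "\n\nThink step by step, showing your reasoning, then give the final answer on its own line.",
    }],
)

print(direct.choices[0].message.content)
print(cot.choices[0].message.content)
```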

Ethan Mollick (@emollick)'s Twitter Profile Photo

Been experimenting with Devika, the open source effort to build an AI agent like Devin.

It is a very interesting start, but not close to Devin, yet. It struggles with executing on plans (the most critical feature for an AI agent). But I suspect that it will improve over time.

Ethan Mollick (@emollick)'s Twitter Profile Photo

I think too many people think only of AI use in places where errors are not tolerated. It isn't good at that. LLMs hallucinate & make mistakes.

But for a huge amount of work, human error is tolerated, and the question is whether AIs (working with humans) make more or less…

Ethan Mollick (@emollick)'s Twitter Profile Photo

In the executive MBA class I taught today, a student said something very useful for understanding AI:

They said you have to approach working with GPT-4 as a manager, and if it doesn't do something right, you need to provide more direction, rules, or instructions. That often helps
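
A sketch of that manager's move in code, again with the OpenAI Python client; the memo task and the three rules are invented for illustration:

```python
# "Manage the model": when the first draft misses, add direction, rules,
# or instructions rather than starting over. (The task and rules here are
# invented for illustration.)
from openai import OpenAI

client = OpenAI()

messages = [{"role": "user", "content": "Summarize this memo for the board: ..."}]
first = client.chat.completions.create(model="gpt-4o", messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# First draft too long and informal? Give concrete rules, as a manager
# would with a new employee, and ask for a redo.
messages.append({
    "role": "user",
    "content": (
        "Redo this with the following rules:\n"
        "1) At most five bullet points.\n"
        "2) Formal tone, no first person.\n"
        "3) End with one recommended action."
    ),
})
second = client.chat.completions.create(model="gpt-4o", messages=messages)
print(second.choices[0].message.content)
```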

Ethan Mollick (@emollick)'s Twitter Profile Photo

Your R&D team for using AI is your employees; they can experiment with and judge the results of AI co-intelligence better than anyone in a central research organization.

Paying for frontier AI access for employees, and giving people time and incentives to experiment, is an R&D cost.

Ethan Mollick (@emollick)'s Twitter Profile Photo

So much debate over the utility of AI is based on vibes alone; most of it doesn't engage with the increasing number of controlled experiments that show strong real-world abilities of AI across a variety of fields.

Implementation takes time, of course, but this isn't just speculation.

Ethan Mollick (@emollick)'s Twitter Profile Photo

Fascinating thread in two ways. The first is the paper and its findings, explaining how LLMs actually deal with context windows. The second is the fact that this is a “discovery”: LLMs have unexpectedly developed retrieval heads, which were not explicitly coded for by their creators
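
A toy way to see what a “retrieval head” means in practice: plant a needle in the context and score each attention head by how much the final token position attends to it. This is a rough proxy, not the paper's detection method, and the model, prompt, and scoring heuristic are all assumptions:

```python
# Toy probe for retrieval-like attention heads (a rough proxy, NOT the
# paper's method): plant a "needle" in the context and score each head by
# the attention mass the final position puts on the needle's tokens.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

filler = " The weather was mild and nothing much happened that day." * 8
needle = " The secret passcode is zebra-42."
prompt = ("Read carefully." + filler + needle + filler
          + " Question: what is the secret passcode? Answer: The secret passcode is")

enc = tok(prompt, return_tensors="pt")
ids = enc["input_ids"][0].tolist()

# Locate the needle's token positions inside the full prompt.
needle_ids = tok(needle, add_special_tokens=False)["input_ids"]
start = next(i for i in range(len(ids) - len(needle_ids) + 1)
             if ids[i:i + len(needle_ids)] == needle_ids)
span = list(range(start, start + len(needle_ids)))

with torch.no_grad():
    out = model(**enc, output_attentions=True)

# out.attentions: one (batch, heads, seq, seq) tensor per layer.
scores = {}
for layer, attn in enumerate(out.attentions):
    for head in range(attn.shape[1]):
        scores[(layer, head)] = attn[0, head, -1, span].sum().item()

# Heads that concentrate attention on the needle behave "retrieval-like".
for (layer, head), s in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"layer {layer:2d} head {head:2d}: attention on needle = {s:.3f}")
```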

Ethan Mollick (@emollick)'s Twitter Profile Photo

I see “AI won’t take your job, someone using AI will take your job.” I don’t like this frame because:
1) “Your job” is more likely to transform over time than be “taken”
2) Some jobs will really disappear with AI
3) Using AI is going to get easier; it isn’t a secret priesthood

Ethan Mollick (@emollick)'s Twitter Profile Photo

Field experiments are great, but they can go wrong in so many ways because the real world is messy

I appreciate this honest write-up of the difficulties of measuring the impact of Copilot on coding performance. It looks like it lifted productivity, though. mit-genai.pubpub.org/pub/v5iixksv/r…
