Shreya Shankar (@sh_reya)'s Twitter Profile
Shreya Shankar

@sh_reya

I study ML & AI engineers and try to make their lives a little better. PhD-ing in databases & HCI @Berkeley_EECS @UCBEPIC and MLOps-ing around town. She/they.

ID:2286218053

http://www.sh-reya.com
11-01-2014 06:46:16

4.1K Tweets

39.4K Followers

594 Following

Shreya Shankar (@sh_reya)

i am giving a guest lecture on scaling up vibe checks for your custom LLM pipelines, which is essential to constructing a good fine-tuning dataset!! super stoked for it, and excited to be in the company of the awesome folks lecturing ☺️

sign up here: maven.com/s/course/76981…

Ian Arawjo (@ianarawjo@hci.social) (@IanArawjo)

Ethan Mollick Why does everyone see these tools then say 'prompt engineering is dead'? It's just been emphasized even more! And how do we evaluate the prompt is better? 'Automatically generates good prompts' --how do we know it's good? It 'works pretty well' --what tests did you perform?

Shreya Shankar (@sh_reya)

Agreed. daniel bashir is a fantastic interviewer. I remember being so impressed by all the preparation that went into my episode. I’ve never seen anything like it

Shreya Shankar (@sh_reya)

i hate when i have a UX idea that i think is really good on a sunday night so i rush to implement it & get bogged down with debugging my bad typescript & emerge with a working solution hours later only to find that it was a bad idea & i should have just gone to bed early

Ian Arawjo (@ianarawjo@hci.social) (@IanArawjo)

We will talk about our work, “Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences”, at HEAL (well, at least I will be there 😎). Excited to chat with all the amazing people working in this space!

Ivan Leo (@ivanleomk)

Elicit Grant Sanderson Ian Goodfellow Vik Paruchuri Shishir Patil 9/ Scaling up Vibe Checks by Shreya Shankar

I think she highlights an important fact that evals are going to be tricky because these models can freely improvise responses. I liked the interface and designs that she mentioned at the back and I think manual evaluation of generated…
