Do you want SOTA image-text datasets with a simple & transparent dataset construction process? Most datasets start with unfiltered web data and filter it down with complex heuristics. Alex Fang and I have an alternative solution that we are quite excited about!
🧵
#AWS is putting the “art” back into “artificial intelligence” with the new Amazon Titan Image Generator. 🎨✨🖼️ #MachineLearning #AI
Learn more. 🔗 go.aws/3uJsDtb
Check out DataComp for language models! Open data, open code, open training recipe, and close to Llama3-8B performance. This has been a labor of love over the last year. A huge thanks to all the collaborators for helping make this happen!
I've become a bit obsessed with Waymo depots in the last few months. The reason? I believe one of the most important open problems in autonomous driving is real estate acquisition. ⬇️