Shashank Rajput
@shashank_r12
LLM Pretraining @DbrxMosaicAI
ID: 1955982469
https://shashankrajput.github.io/
12-10-2013 06:44:48
191 Tweets
761 Followers
599 Following
I guess I should probably include some images, since this is an image generation model. I'm so proud of Cory Stephenson, Landan Seguin, Austin Jacobson, jasmine collins, and our extraordinary collaborators at Shutterstock.
Today we're announcing our Databricks Mosaic Research x Shutterstock partnership, and a new text-to-image diffusion model: ✨ImageAI!!✨ This model is geared toward enterprise use cases and is trained exclusively on Shutterstock's trusted data catalog! databricks.com/company/newsro…
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities by Finetuning on Synthetic Data. TLDR: Finetuning on randint key-value retrieval tasks improves LLM performance on real retrieval tasks. arxiv.org/abs/2406.19292 Great project led by Zheyang Xiong & Vasilis Papageorgiou
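The synthetic task the tweet describes is easy to picture as data-generation code. Below is a minimal sketch of one such finetuning example, assuming a JSON-dictionary prompt format; the dictionary size, key range, and exact prompt template here are illustrative assumptions, not the paper's actual recipe.

```python
# Sketch of the kind of "randint key-value retrieval" finetuning example the
# tweet describes. Sizes, key range, and prompt wording are assumptions.
import json
import random


def make_kv_retrieval_example(num_pairs=75, key_range=10**8, seed=None):
    """Build one synthetic example: a dictionary of random integer key-value
    pairs plus a question asking for the value stored under one key."""
    rng = random.Random(seed)
    keys = rng.sample(range(key_range), num_pairs)
    kv = {k: rng.randrange(key_range) for k in keys}
    target_key = rng.choice(keys)

    prompt = (
        "Below is a JSON dictionary of integer keys and values.\n"
        f"{json.dumps(kv)}\n"
        f"What is the value associated with key {target_key}?"
    )
    return {"prompt": prompt, "answer": str(kv[target_key])}


if __name__ == "__main__":
    example = make_kv_retrieval_example(seed=0)
    print(example["prompt"][:200], "...")
    print("answer:", example["answer"])
```

Finetuning on many such purely numerical examples is the "artificial needles" part; the paper's claim is that the retrieval skill transfers to real long-context tasks.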
Big shoutout to Nikhil and Jacob Portes for spearheading this work on scaling laws that account for inference costs. Come say hi to us at ICML :)))
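For context, the trade-off that inference-aware scaling laws formalize can be sketched with the standard FLOP approximations (~6ND for training, ~2N per generated token); the paper's actual objective and constants may differ from this back-of-the-envelope version.

```python
# Rough sketch of the quantity inference-aware scaling laws trade off:
# lifetime compute for training plus serving. Uses the common 6ND / 2ND
# FLOP approximations; the paper's formulation may differ.
def total_flops(n_params: float, train_tokens: float, inference_tokens: float) -> float:
    """Approximate lifetime compute: ~6*N*D_train FLOPs for training
    plus ~2*N FLOPs per token served at inference time."""
    return 6 * n_params * train_tokens + 2 * n_params * inference_tokens


# Example: a 7B-parameter model trained on 2T tokens, expected to serve 5T tokens.
print(f"{total_flops(7e9, 2e12, 5e12):.3e} FLOPs")
```

Once expected inference volume is large, the cheapest way to hit a target quality can shift toward smaller models trained on more tokens than the training-only optimum would suggest.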
Congratulations, Dimitris Papailiopoulos! Looking forward to the amazing research that's going to happen at MSFTResearch now that you'll be there!
At Databricks, we want to help customers build more #inference-friendly #LLMs. With the MixAttention architecture, you can maintain model quality while improving inference speed and reducing the memory footprint: databricks.com/blog/mixattent…
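As a rough mental model of what an architecture like this trades off, here is an illustrative-only layer map that mixes standard attention, sliding-window attention, and cross-layer KV-cache reuse. The specific layer counts, window size, and sharing pattern below are assumptions for the sketch, not the recipe from the linked blog post.

```python
# Illustrative sketch of a MixAttention-style layer map: most layers use
# sliding-window attention, and some layers reuse an earlier layer's KV cache
# instead of storing their own. All numbers here are made up for illustration.
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttnLayerConfig:
    layer_idx: int
    sliding_window: Optional[int]   # None = standard full attention
    kv_share_with: Optional[int]    # reuse the KV cache of this earlier layer


def build_layer_map(n_layers: int = 24, window: int = 1024) -> list[AttnLayerConfig]:
    layers = []
    for i in range(n_layers):
        full_attn = (i % 8 == 0)  # keep a few standard-attention layers for quality
        # Pair consecutive sliding-window layers so the second reuses the
        # first's KV cache (skipping layers right after a full-attention layer).
        share_src = i - 1 if (not full_attn and i % 2 == 1 and (i - 1) % 8 != 0) else None
        layers.append(AttnLayerConfig(i, None if full_attn else window, share_src))
    return layers


if __name__ == "__main__":
    for cfg in build_layer_map()[:8]:
        print(cfg)
```

The memory win comes from two places: sliding-window layers cap their KV cache at the window length, and layers that share a cache store nothing extra at all.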