Bo Pang (@bo_pang0)'s Twitter Profile
Bo Pang

@bo_pang0

Representation Learning, Generative Modeling and NLP. Research Scientist @SFResearch. Ph.D. from @UCLA.

ID: 1046308657258934272

Joined: 30-09-2018 07:59:58

15 Tweets

105 Followers

257 Following

Russ Salakhutdinov (@rsalakhu)'s Twitter Profile Photo

#nips2018 paper: GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations: Learning generic latent relational graphs between words, pixels from unlabeled data & transferring the graphs to downstream tasks: arxiv.org/abs/1806.05662 w/t Z. Yang, J. Zhao et al.

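A rough sketch of the core idea described above (my reading of the abstract, not the authors' code): a graph predictor turns unlabeled token embeddings into a pairwise affinity matrix, and a downstream task reuses that matrix to propagate its own features. All weights and shapes below are hypothetical.

```python
import torch
import torch.nn.functional as F

def predict_graph(embeddings, w_q, w_k):
    """Affinity matrix A[i, j]: how strongly token j should inform token i."""
    q = embeddings @ w_q                       # (T, d)
    k = embeddings @ w_k                       # (T, d)
    scores = q @ k.t() / k.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1)           # each row sums to 1

def transfer(graph, downstream_features):
    """Mix downstream features along the transferred graph."""
    return graph @ downstream_features

T, d = 8, 16
emb = torch.randn(T, d)                          # embeddings from unsupervised pretraining
w_q, w_k = torch.randn(d, d), torch.randn(d, d)  # hypothetical graph-predictor weights
A = predict_graph(emb, w_q, w_k)                 # (T, T) latent relational graph
task_feats = torch.randn(T, 32)                  # features from a downstream model
mixed = transfer(A, task_feats)                  # (T, 32) graph-informed features
```
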
Danilo J. Rezende (@danilojrezende)'s Twitter Profile Photo

Taming VAEs: A theoretical analysis of their properties and behaviour in the high-capacity regime. We also argue for a different way of training these models for robust control of key properties. It was fun thinking about this with Fabio Viola arxiv.org/abs/1810.00597

Chelsea Finn (@chelseabfinn)'s Twitter Profile Photo

CACTUs: an unsupervised learning algorithm that learns to learn tasks constructed from unlabeled data. Leads to significantly more effective downstream learning & enables few-shot learning *without* labeled meta-learning datasets arxiv.org/abs/1810.02334 w/ Kyle Hsu, Sergey Levine

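A rough sketch of the task-construction step described above (how I read the abstract, not the authors' code): cluster unlabeled embeddings, then treat cluster assignments as pseudo-labels from which N-way K-shot tasks are sampled for meta-learning. The clustering model and sizes below are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 64))   # stand-in for unsupervised embeddings

n_clusters, n_way, k_shot, k_query = 50, 5, 1, 5
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)

def sample_task():
    """Build one N-way K-shot task from cluster-based pseudo-labels."""
    sizes = np.bincount(labels, minlength=n_clusters)
    eligible = np.flatnonzero(sizes >= k_shot + k_query)   # clusters large enough to sample from
    classes = rng.choice(eligible, size=n_way, replace=False)
    support, query = [], []
    for task_label, c in enumerate(classes):
        idx = rng.choice(np.flatnonzero(labels == c), size=k_shot + k_query, replace=False)
        support += [(i, task_label) for i in idx[:k_shot]]
        query += [(i, task_label) for i in idx[k_shot:]]
    return support, query   # indices into `embeddings` plus task-local labels

support, query = sample_task()   # feed into any meta-learner (e.g. MAML, ProtoNets)
```
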
Sam Finlayson (@iamsamfin)'s Twitter Profile Photo

As a budding student of ML, I often find myself re-googling things I've learned/forgotten many times. This afternoon, I decided to toss some favorite resources into one doc for speedy reference. Making it for my own use, but figured why not share the link sgfin.github.io/learning-resou…

Oriol Vinyals (@oriolvinyalsml)'s Twitter Profile Photo

Happy that we could share #AlphaStar progress with you all! Good Games @LiquidTLO and Grzegorz Komincz, and Dan Stemkoski and Kevin van der Kooi 🇺🇦 for a great show! You can see all the details in the blog. deepmind.com/blog/alphastar…

Salesforce AI Research (@sfresearch)'s Twitter Profile Photo

Our CodeGen models are now available at Hugging Face! (Model size variants: 350M, 2B, 6B, and 16B.) Clone the latest transformers repository and try it out!
Paper: arxiv.org/abs/2203.13474
Models: huggingface.co/models?search=…

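A minimal way to try one of the released checkpoints with 🤗 Transformers (a sketch assuming the smallest Python-tuned variant, Salesforce/codegen-350M-mono; swap in the 2B/6B/16B names for the larger models):

```python
# Sample left-to-right code generation from a CodeGen checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Salesforce/codegen-350M-mono"   # smallest Python-tuned variant
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "def hello_world():"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
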
Tian Han (@tianhan10)'s Twitter Profile Photo

Excited to share our NeurIPS 2022 oral, "Adaptive Multi-stage Density Ratio Estimation for Learning Latent Space Energy-based Model": arxiv.org/abs/2209.08739. The latent EBM is learned through multiple stages of density ratio estimation via NCE (no MCMC). #NeurIPS2022 #ebms

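A minimal illustration of the density-ratio-by-classification idea the tweet refers to (the basic single-stage NCE estimator, not the paper's multi-stage procedure; the distributions below are toy stand-ins): a classifier trained to separate target samples from noise samples recovers, in its logit, an estimate of log p_target(z) / p_noise(z).

```python
import torch
import torch.nn as nn

ratio_logit = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(ratio_logit.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

target = torch.randn(5000, 2) * 0.5 + 1.0   # toy "target" latent samples
noise = torch.randn(5000, 2)                # noise / reference distribution

for _ in range(200):
    z = torch.cat([target, noise])
    y = torch.cat([torch.ones(len(target)), torch.zeros(len(noise))])
    loss = bce(ratio_logit(z).squeeze(-1), y)   # label 1 = target, 0 = noise
    opt.zero_grad()
    loss.backward()
    opt.step()

# With equal numbers of target and noise samples, ratio_logit(z) approximates
# log p_target(z) - log p_noise(z); adding log p_noise(z) gives an unnormalized
# log-density (negative energy) estimate, with no MCMC sampling involved.
```
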
Bo Pang (@bo_pang0)'s Twitter Profile Photo

Thank you AK for highlighting our work! We show that it's possible to bootstrap long chain-of-thought reasoning without distillation from o1-like models. We detail the data curation procedures and training methods. Stay tuned for the release of our data and models.

Rui Zhang (@ruizhang_nlp)'s Twitter Profile Photo

🚀If you're looking for inference-time techniques to max out the reasoning ability of your local LLMs, check our #ICLR2025 paper GreaTer for gradient-based prompt optimization! We generate fluent & strategic prompts outperforming APO/APE/PE2/TextGrad on multiple reasoning

Rui Zhang (@ruizhang_nlp)'s Twitter Profile Photo

📢 GreaterPrompt is Now Live! We're excited to introduce GreaterPrompt, a unified, customizable, and high-performance open-source toolkit for prompt optimization.

🔍 Key Features:
 - 5 Optimization Methods: APO, APE, PE2, GReaTer, and TextGrad
 - 4 Model Families: GPT, Mistral,

Caiming Xiong (@caimingxiong)'s Twitter Profile Photo

Meet 🔥xGen-Small 🔥– our family of small LMs with long context is now open-sourced!

 - 128K token context
 - Outperforms Gemma3, Llama3.2, and Qwen2.5 models of similar size.
 - 95.3% GSM8K & 91.6% MATH reasoning & 50.6% LiveCodeBench code generation

🤗Try now: huggingface.co/Salesforce/xge…