Bo Pang (@bo_pang0)'s Twitter Profile
Bo Pang

@bo_pang0

Representation Learning, Generative Modeling and NLP. Research Scientist @SFResearch. Ph.D. from @UCLA.

ID: 1046308657258934272

Joined: 30-09-2018 07:59:58

15 Tweets

105 Followers

257 Following

Russ Salakhutdinov (@rsalakhu)'s Twitter Profile Photo

#nips2018 paper: GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations: Learning generic latent relational graphs between words, pixels from unlabeled data & transferring the graphs to downstream tasks: arxiv.org/abs/1806.05662 w/t Z. Yang, J. Zhao et al.

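A rough sketch of the core idea described above (my reading of the abstract, not the authors' code): a graph predictor turns unlabeled token embeddings into a pairwise affinity matrix, and a downstream task reuses that matrix to propagate its own features. All weights and shapes below are hypothetical.

```python
import torch
import torch.nn.functional as F

def predict_graph(embeddings, w_q, w_k):
    """Affinity matrix A[i, j]: how strongly token j should inform token i."""
    q = embeddings @ w_q                       # (T, d)
    k = embeddings @ w_k                       # (T, d)
    scores = q @ k.t() / k.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1)           # each row sums to 1

def transfer(graph, downstream_features):
    """Mix downstream features along the transferred graph."""
    return graph @ downstream_features

T, d = 8, 16
emb = torch.randn(T, d)                          # embeddings from unsupervised pretraining
w_q, w_k = torch.randn(d, d), torch.randn(d, d)  # hypothetical graph-predictor weights
A = predict_graph(emb, w_q, w_k)                 # (T, T) latent relational graph
task_feats = torch.randn(T, 32)                  # features from a downstream model
mixed = transfer(A, task_feats)                  # (T, 32) graph-informed features
```
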
Danilo J. Rezende (@danilojrezende)'s Twitter Profile Photo

Taming VAEs: A theoretical analysis of their properties and behaviour in the high-capacity regime. We also argue for a different way of training these models for robust control of key properties. It was fun thinking about this with Fabio Viola arxiv.org/abs/1810.00597

Chelsea Finn (@chelseabfinn)'s Twitter Profile Photo

CACTUs: an unsupervised learning algorithm that learns to learn tasks constructed from unlabeled data. Leads to significantly more effective downstream learning & enables few-shot learning *without* labeled meta-learning datasets arxiv.org/abs/1810.02334 w/ Kyle Hsu, Sergey Levine

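A rough sketch of the task-construction step described above (how I read the abstract, not the authors' code): cluster unlabeled embeddings, then treat cluster assignments as pseudo-labels from which N-way K-shot tasks are sampled for meta-learning. The clustering model and sizes below are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 64))   # stand-in for unsupervised embeddings

n_clusters, n_way, k_shot, k_query = 50, 5, 1, 5
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)

def sample_task():
    """Build one N-way K-shot task from cluster-based pseudo-labels."""
    sizes = np.bincount(labels, minlength=n_clusters)
    eligible = np.flatnonzero(sizes >= k_shot + k_query)   # clusters large enough to sample from
    classes = rng.choice(eligible, size=n_way, replace=False)
    support, query = [], []
    for task_label, c in enumerate(classes):
        idx = rng.choice(np.flatnonzero(labels == c), size=k_shot + k_query, replace=False)
        support += [(i, task_label) for i in idx[:k_shot]]
        query += [(i, task_label) for i in idx[k_shot:]]
    return support, query   # indices into `embeddings` plus task-local labels

support, query = sample_task()   # feed into any meta-learner (e.g. MAML, ProtoNets)
```
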
Sam Finlayson (@iamsamfin)'s Twitter Profile Photo

As a budding student of ML, I often find myself re-googling things I've learned/forgotten many times. This afternoon, I decided to toss some favorite resources into one doc for speedy reference. Making it for my own use, but figured why not share the link sgfin.github.io/learning-resou…

Oriol Vinyals (@oriolvinyalsml)'s Twitter Profile Photo

Happy that we could share #AlphaStar progress with you all! Good Games @LiquidTLO and Grzegorz Komincz, and Dan Stemkoski and Kevin van der Kooi 🇺🇦 for a great show! You can see all the details in the blog. deepmind.com/blog/alphastar…

Salesforce AI Research (@sfresearch)'s Twitter Profile Photo

Our CodeGen models are now available at Hugging Face! (Model size variants: 350M, 2B, 6B, and 16B.) Clone the latest transformers repository and try it out!
Paper: arxiv.org/abs/2203.13474
Models: huggingface.co/models?search=…

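A minimal way to try one of the released checkpoints with 🤗 Transformers (a sketch assuming the smallest Python-tuned variant, Salesforce/codegen-350M-mono; swap in the 2B/6B/16B names for the larger models):

```python
# Sample left-to-right code generation from a CodeGen checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Salesforce/codegen-350M-mono"   # smallest Python-tuned variant
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "def hello_world():"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
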
Tian Han (@tianhan10)'s Twitter Profile Photo

Excited to share our NeurIPS 2022 oral, "Adaptive Multi-stage Density Ratio Estimation for Learning Latent Space Energy-based Model": arxiv.org/abs/2209.08739. The latent EBM is learned through multiple stages of density ratio estimation via NCE (no MCMC). #NeurIPS2022 #ebms

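A minimal illustration of the density-ratio-by-classification idea the tweet refers to (the basic single-stage NCE estimator, not the paper's multi-stage procedure; the distributions below are toy stand-ins): a classifier trained to separate target samples from noise samples recovers, in its logit, an estimate of log p_target(z) / p_noise(z).

```python
import torch
import torch.nn as nn

ratio_logit = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(ratio_logit.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

target = torch.randn(5000, 2) * 0.5 + 1.0   # toy "target" latent samples
noise = torch.randn(5000, 2)                # noise / reference distribution

for _ in range(200):
    z = torch.cat([target, noise])
    y = torch.cat([torch.ones(len(target)), torch.zeros(len(noise))])
    loss = bce(ratio_logit(z).squeeze(-1), y)   # label 1 = target, 0 = noise
    opt.zero_grad()
    loss.backward()
    opt.step()

# With equal numbers of target and noise samples, ratio_logit(z) approximates
# log p_target(z) - log p_noise(z); adding log p_noise(z) gives an unnormalized
# log-density (negative energy) estimate, with no MCMC sampling involved.
```
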
Bo Pang (@bo_pang0)'s Twitter Profile Photo

Thank you AK for highlighting our work! We show that it's possible to bootstrap long chain-of-thought reasoning without distillation from o1-like models. We detail the data curation procedures and training methods. Stay tuned for the release of our data and models.

Rui Zhang (@ruizhang_nlp)'s Twitter Profile Photo

🚀If you're looking for inference-time techniques to max out the reasoning ability of your local LLMs, check our #ICLR2025 paper GreaTer for gradient-based prompt optimization! We generate fluent & strategic prompts outperforming APO/APE/PE2/TextGrad on multiple reasoning

Rui Zhang (@ruizhang_nlp)'s Twitter Profile Photo

📢 GreaterPrompt is Now Live! We're excited to introduce GreaterPrompt, a unified, customizable, and high-performance open-source toolkit for prompt optimization.

🔍 Key Features:
 - 5 Optimization Methods: APO, APE, PE2, GReaTer, and TextGrad
 - 4 Model Families: GPT, Mistral,

Caiming Xiong (@caimingxiong)'s Twitter Profile Photo

Meet 🔥xGen-Small 🔥– our family of small LMs with long context is now open-sourced!

 - 128K token context
 - Outperforms Gemma3, Llama3.2, and Qwen2.5 models of similar size.
 - 95.3% GSM8K & 91.6% MATH reasoning & 50.6% LiveCodeBench code generation

🤗Try now: huggingface.co/Salesforce/xge…