Mohammad Rastegari (@morastegari) 's Twitter Profile
Mohammad Rastegari

@morastegari

Distinguished AI Scientist at Meta. Affiliate Assistant Professor at University of Washington.

ID: 826924323113807872

Joined: 01-02-2017 22:44:57

109 Tweets

1.1K Followers

114 Following

arXiv Daily (@arxiv_daily) 's Twitter Profile Photo

DKM: Differentiable K-Means Clustering Layer for Neural Network Compression deepai.org/publication/dk… by Minsik Cho et al. including Mohammad Rastegari #KMeans #NaturalLanguageProcessing
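The core idea of DKM is to make the cluster assignment differentiable, so weight clustering can be trained jointly with the task loss instead of applied after the fact. A minimal NumPy sketch of one soft k-means iteration (illustrative only; the names and the simplified formulation here are assumptions, and the paper's attention-based layer and training integration differ):

```python
import numpy as np

def soft_kmeans_step(weights, centroids, temperature=0.1):
    """One differentiable k-means iteration: soft-assign each weight to
    the centroids via a softmax over negative squared distances, then
    update centroids as assignment-weighted means (all ops differentiable)."""
    # Pairwise squared distances, shape (n_weights, n_centroids)
    d = (weights[:, None] - centroids[None, :]) ** 2
    # Soft assignment; lower temperature -> harder (closer to argmin)
    logits = -d / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)
    # Assignment-weighted centroid update
    new_centroids = (a * weights[:, None]).sum(axis=0) / a.sum(axis=0)
    # Each weight is replaced by its soft mix of centroids
    compressed = a @ new_centroids
    return new_centroids, compressed

rng = np.random.default_rng(0)
# Toy weights drawn from two clusters near -1 and +1
w = np.concatenate([rng.normal(-1, 0.05, 50), rng.normal(1, 0.05, 50)])
c = np.array([-0.5, 0.5])
for _ in range(10):
    c, w_hat = soft_kmeans_step(w, c)
print(np.round(c, 2))  # centroids drift toward the two weight clusters
```

Because the assignment is a softmax rather than a hard argmin, gradients flow through both the centroids and the weights, which is what lets the clustering live inside the training loop as a layer.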

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

Deploying ML models on device cannot be static. ML models should adapt themselves to the available resources. In our recent research at #Apple, we learn ML models that can be dynamically compressed to any arbitrary sparsity or quantization level at inference time.
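As a rough illustration of what inference-time compression to an arbitrary sparsity and bit-width means, here is a hedged NumPy sketch; the `compress` helper and its strategy (magnitude pruning plus uniform symmetric quantization) are illustrative assumptions, not the method from the research above:

```python
import numpy as np

def compress(weights, sparsity=0.5, bits=8):
    """Compress a weight tensor at inference time to an arbitrary
    sparsity level (magnitude pruning) and bit-width (uniform symmetric
    quantization). Illustrative sketch only."""
    w = weights.copy()
    # Magnitude pruning: zero out the smallest-|w| fraction
    k = int(sparsity * w.size)
    if k > 0:
        thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
        w[np.abs(w) <= thresh] = 0.0
    # Uniform symmetric quantization to 2^bits levels
    # (the `or 1.0` guards against an all-zero tensor)
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1) or 1.0
    q = np.round(w / scale).astype(np.int32)
    return q * scale  # dequantized values for use in a float matmul

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02, 0.6])
w_hat = compress(w, sparsity=0.5, bits=4)
print(w_hat)  # half the entries pruned to 0, the rest snapped to a 4-bit grid
```

The point of the dynamic setting is that `sparsity` and `bits` become runtime knobs: the same trained model can be squeezed harder or relaxed depending on the device's current memory and latency budget.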

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

Yes, we are releasing code at Apple to promote effective contribution in our research community. #apple_ml_research, #apple, #DeepLearning

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

Wow, deep attacks seem to be a very serious problem in our ML modeling approach. Perhaps something is fundamentally wrong in our models!!! arxiv.org/abs/1910.00744 #deeplearning

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

Great progress in self-supervised training. Next milestone: matching supervised SOTA on ImageNet-1k with self-supervision, without end-to-end fine-tuning.

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

These CVPR policies are frustrating. Given all the randomness in the review process, I feel there is no point submitting papers to conferences anymore. By the law of large data (a large number of papers and readers), just submitting to arXiv will be enough for a good paper to shine.

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

A Transformer that can be as efficient as a CNN yet maintains high performance when training in the large-data regime. #EfficientTransformer, #AppleMI

Anurag Ranjan (@anuragranj) 's Twitter Profile Photo

We provide an empirical analysis of different sharing strategies in isotropic networks and how they can make large networks memory-efficient. Joint work with Chien-Yu Lin, Anish Prabhu, Thomas Merth, Sachin, Maxwell Horton, Mohammad Rastegari. Code: github.com/apple/ml-spin.

Anurag Ranjan (@anuragranj) 's Twitter Profile Photo

Introducing NeuMan, a NeRF representation of a human together with the scene. From a single clip (<100 frames), NeuMan can perform view synthesis of the scene with or without the human, in novel poses. (1/4) project page: machinelearning.apple.com/research/neura…… code: github.com/apple/ml-neuman

Oncel Tuzel (@onceltuzel) 's Twitter Profile Photo

NeuMan is a new #ECCV2022 paper from our research team at Apple. Using a short (~10s) clip, we reconstruct human and scene radiance fields and re-render them with novel human poses and views. Paper/code/videos: machinelearning.apple.com/research/neura… w/ Wei Jiang, G. Samei, Kwang Moo Yi, Anurag Ranjan

Oncel Tuzel (@onceltuzel) 's Twitter Profile Photo

“Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement” is an #iccv2023 paper from #Apple. By just swapping the ImageNet dataset for the “reinforced” ImageNet+ dataset, a model can be trained up to 7x faster to reach the same accuracy.

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

Accurate training-aware weight quantization was computationally intractable for LLMs. But now, in Apple MIND, we have developed a method that solves the problem very efficiently and pushes the boundary to 3-bit quantization. eDKM: arxiv.org/abs/2309.00964 #LLM, #LLMOptimization

Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

This has been one of my favorite directions on enabling #llms to run effectively on device. Thanks to the great team who are pushing state-of-the-art in this direction. In the Apple MIND team, we try to attack research problems that move us to the next level of experiencing AI.

AK (@_akhaliq) 's Twitter Profile Photo

Apple presents Speculative Streaming: Fast LLM Inference without Auxiliary Models. Speculative decoding is a prominent technique to speed up the inference of a large target language model based on predictions of an auxiliary draft model. While effective, in application-specific

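For context, the classic draft-and-verify loop described above — the auxiliary-draft-model dependency that Speculative Streaming removes — can be sketched with toy greedy models. `speculative_step`, `target_next`, and `draft_next` are illustrative names, not from the paper:

```python
def speculative_step(target_next, draft_next, seq, gamma=4):
    """One greedy speculative-decoding step: the cheap draft model
    proposes `gamma` tokens; the target model checks each proposal (in a
    real system this verification is a single batched forward pass) and
    keeps the longest agreeing prefix plus its own correction token."""
    # Draft phase: propose gamma tokens autoregressively
    draft, s = [], list(seq)
    for _ in range(gamma):
        t = draft_next(s)
        draft.append(t)
        s.append(t)
    # Verify phase: compare the target's greedy choice at each position
    accepted, s = [], list(seq)
    for t in draft:
        tt = target_next(s)
        if tt == t:
            accepted.append(t)      # draft token accepted
            s.append(t)
        else:
            accepted.append(tt)     # target overrides; stop here
            break
    else:
        accepted.append(target_next(s))  # all accepted: free bonus token
    return seq + accepted

# Toy models over a 10-token vocabulary: the target cycles 0,1,2,3,...;
# the draft agrees except it wrongly predicts 9 after token 2.
target_next = lambda s: (s[-1] + 1) % 10
draft_next = lambda s: 9 if s[-1] == 2 else (s[-1] + 1) % 10
out = speculative_step(target_next, draft_next, [0], gamma=4)
print(out)  # the agreeing prefix is kept, then the target corrects
```

The speed-up comes from the target verifying several positions per forward pass instead of one; Speculative Streaming's contribution is getting this effect from a single model, without the separate draft network this sketch assumes.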
Mohammad Rastegari (@morastegari) 's Twitter Profile Photo

This work was one of the last projects done by my team while I was at Apple. A lot of credit goes to Sachin, whose dedication was the key to this project. The main point here is to show that, as contributors to the AI community, we play our part in being fully open.