Ziji Shi (@shi23steven) 's Twitter Profile
Ziji Shi

@shi23steven

Ph.D. student at @NUScomputing advised by @JialinLiNUS, intern @Google XLA. Ex- @Apple @alibaba_cloud & @sensetime_ai. I build highly efficient systems for ML.

ID: 1109155836981465089

Website: http://zijishi.xyz · Joined: 22-03-2019 18:12:13

33 Tweets

199 Followers

290 Following

Andrew Ng (@andrewyng) 's Twitter Profile Photo

RIP to my friend, colleague, and AI visionary Nils Nilsson. Your work on the A* algorithm has improved countless lives (this is how we find the shortest path from A to B). I will always remember your work, but even more importantly your kindness. ai.stanford.edu/~nilsson/
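For context on the algorithm mentioned above, here is a minimal A* sketch (not Nilsson's original code; the toy graph and zero heuristic are illustrative, and with h ≡ 0 this reduces to Dijkstra's algorithm):

```python
import heapq

def a_star(graph, start, goal, h):
    """A* search: expands nodes by f(n) = g(n) + h(n), where g is the
    cost so far and h is an admissible heuristic estimate to the goal."""
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nbr, cost in graph[node]:
            ng = g + cost
            if ng < best_g.get(nbr, float("inf")):
                best_g[nbr] = ng
                heapq.heappush(frontier, (ng + h(nbr), ng, nbr, path + [nbr]))
    return None

# Toy graph: A->B->D costs 6, A->C->D costs 5.
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
print(a_star(graph, "A", "D", lambda n: 0))  # (5, ['A', 'C', 'D'])
```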

Ziji Shi (@shi23steven) 's Twitter Profile Photo

Our paper on the model-parallel framework #EasyParallelLibrary (#EPL) was accepted to #ATC’22! Many thanks to the reviewers and co-authors. Stay tuned for exciting updates! GitHub: github.com/alibaba/EasyPa…

Ziji Shi (@shi23steven) 's Twitter Profile Photo

Saddened that, due to a visa denial, I won’t be able to attend the #ATC conference to present our work. That said, I am very thankful to the USENIX Association and Noa Zilberman for helping me with this matter. Hope to see you next year!

Ziji Shi (@shi23steven) 's Twitter Profile Photo

Great explanation! I also had the same question regarding Tensor Cores. Btw, the TPU also uses a systolic array, but some argue that systolic arrays are not ideal due to their lack of flexibility. Maybe the future belongs to RISC-V archs😄
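A cycle-by-cycle sketch of the systolic-array idea mentioned above (a toy output-stationary array, not any real TPU microarchitecture): PE (i, j) accumulates C[i][j], and operands arrive along skewed wavefronts, so A[i][k] and B[k][j] meet at cycle t = i + j + k.

```python
def systolic_matmul(A, B):
    """Simulate an n x n output-stationary systolic array computing C = A @ B.
    A streams in from the left, B from the top, each row/column delayed
    by one cycle, so PE (i, j) multiplies A[i][k] * B[k][j] at cycle i+j+k."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for t in range(3 * n - 2):          # total cycles for the full wavefront
        for i in range(n):
            for j in range(n):
                k = t - i - j           # which operand pair reaches PE (i, j) now
                if 0 <= k < n:
                    C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))  # [[19, 22], [43, 50]]
```

The fixed dataflow is exactly the flexibility trade-off in question: the schedule is baked into the wiring, which is great for dense matmul but hard to repurpose.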

cs.LG Papers (@arxiv_cs_lg) 's Twitter Profile Photo

TAP: Accelerating Large-Scale DNN Training Through Tensor Automatic Parallelisation. Ziji Shi, Le Jiang, Ang Wang, Jie Zhang, Xianyan Jia, Yong Li, Chencan Wu, Jialin Li, and Wei Lin arxiv.org/abs/2302.00247

Ziji Shi (@shi23steven) 's Twitter Profile Photo

#ChatGPT has been phenomenal, but have you ever wondered how it was trained? In fact, finding the optimal parallel strategy for such an LLM is very challenging, as the candidate space grows exponentially w.r.t. model size. (1/2)
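A back-of-the-envelope illustration of why the candidate space explodes (the figure of 3 sharding choices per layer is an assumption for illustration, not a number from the paper): if each layer independently picks one strategy, the joint space is a Cartesian product.

```python
def candidate_count(num_layers, strategies_per_layer):
    # Each layer independently picks one sharding strategy
    # (e.g. row-split, column-split, or replicated), so the
    # joint strategy space is the Cartesian product over layers.
    return strategies_per_layer ** num_layers

for layers in (12, 24, 48, 96):
    print(layers, candidate_count(layers, 3))
```

Already at 96 layers and 3 choices, exhaustive search is hopeless (3^96 ≈ 10^45 candidates), which is why automatic strategy search matters.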

Ziji Shi (@shi23steven) 's Twitter Profile Photo

We recently uploaded our work with Alibaba on *quickly* and *automatically* finding the optimal #tensorparallel strategy for #LLM. Compared to SoTA approaches, we are ~20-160x faster. Comments are welcome! Arxiv: arxiv.org/abs/2302.00247

Zijiao Chen (@zijiaoc) 's Twitter Profile Photo

🧵🧠 We're witnessing incredible scientific progress in image & text reconstruction from fMRI nowadays. But what about reconstructing video from fMRI? Allow me to introduce our recent preprint: Mind-Video arxiv.org/abs/2305.11675 mind-video.com drive.google.com/drive/folders/…

Ziji Shi (@shi23steven) 's Twitter Profile Photo

(1/2) As large-scale models continue to evolve, the need for associated foundational systems is also growing. We've set up an MLSys discussion group (mlsys-sg.org), planning to host bi-weekly discussions on academic papers or updates on cutting-edge advancements.

Ziji Shi (@shi23steven) 's Twitter Profile Photo

(2/2) We warmly welcome all professionals in the field to join us, engage in enriching conversations, and contribute to our vibrant community! #LLM #Singapore #MLSys #AI #CommunityBuilding 🤝

Ziji Shi (@shi23steven) 's Twitter Profile Photo

I’m attending #ISCA co-located with #FCRC 🎉 We will present two papers at the MLArchSys and ASSYST workshops on #LLM and #GAN at Canary 2. Feel free to drop by and say hi!

Fuzhao Xue (Frio) (@xuefz) 's Twitter Profile Photo

1/ Announcing the development of OpenMoE project! 🚀 Open Mixture-of-Experts Language Models! MoE + UL2 objective + umT5 tokenizer + 50% code data mix. GitHub: github.com/XueFuzhao/Open… Blog: xuefuzhao.notion.site/Aug-2023-OpenM…

HPC Papers (@hpcpapers) 's Twitter Profile Photo

ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks by Ziji Shi et al. arxiv.org/abs/2411.03999…

DeepSpeed (@deepspeedai) 's Twitter Profile Photo

Introducing Domino: a novel zero-cost communication tensor parallelism (TP) training engine for both single node and multi-node settings.

- Near-complete communication hiding
- Novel multi-node scalable TP solution 

Blog: github.com/microsoft/Deep…
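A toy sketch of the communication-hiding idea (plain Python threads standing in for CUDA streams/NCCL; `compute` and `communicate` are illustrative stand-ins, not Domino's API): while micro-batch i computes on the main thread, micro-batch i-1's gradient all-reduce runs in the background, so communication latency is overlapped with compute.

```python
import threading
import time

log = []

def compute(i):
    time.sleep(0.005)              # stand-in for a GEMM on micro-batch i
    log.append(("compute", i))

def communicate(i):
    time.sleep(0.005)              # stand-in for the all-reduce of micro-batch i
    log.append(("comm", i))

def train_step(n):
    """Overlap: launch micro-batch i's all-reduce in the background,
    then immediately start computing micro-batch i+1."""
    comm = None
    for i in range(n):
        compute(i)
        if comm:
            comm.join()            # previous all-reduce finished behind the compute
        comm = threading.Thread(target=communicate, args=(i,))
        comm.start()
    comm.join()

train_step(3)
print(log)
```

Each all-reduce can only start after its own micro-batch's compute finishes, but it costs ~zero wall-clock time because the next compute runs concurrently.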