OpenGVLab (@opengvlab) 's Twitter Profile
OpenGVLab

@opengvlab

Shanghai AI Lab, General Vision Team. We created InternImage, BEVFormer, VideoMAE, LLaMA-Adapter, Ask-Anything, and many more! [email protected]

ID: 1610948392489979904

Link: https://github.com/OpenGVLab
Joined: 05-01-2023 10:36:53

133 Tweets

1.1K Followers

87 Following

OpenGVLab (@opengvlab) 's Twitter Profile Photo

CharXiv is <a href="/zwcolin/">Zirui "Colin" Wang</a> 's excellent work in evaluating the chart understanding ability of #mllm. InternVL2-Llama3-76B is the best open-source model for this domain. BTW the song that summarizes the key findings is creative!  I love it!
👍CharXiv leaderbord and the song:
OpenGVLab (@opengvlab) 's Twitter Profile Photo

Flexible photo-realistic image and vision-language generalist using a simple decoder-only transformer! #GenAI model LUMINA-mGPT's demo video is on YouTube!
📺youtu.be/YqNc8Y-cCs0?si…
🚀Code: github.com/Alpha-VLLM/Lum…
Paper: arxiv.org/abs/2408.02657

OpenGVLab (@opengvlab) 's Twitter Profile Photo

Here comes Mini-InternVL 2.0! 🚀With just 5% of the parameters, it delivers 90% of the performance!
arxiv👏: arxiv.org/abs/2410.16261
repos👉: github.com/OpenGVLab/Inte…
1B version🤗: huggingface.co/OpenGVLab/Inte…
2B version🤗: huggingface.co/OpenGVLab/Inte…
4B version🤗:

OpenGVLab (@opengvlab) 's Twitter Profile Photo

The tech report is worth reading. It reveals many details about how InternVL 1.5, InternVL 2.0, and now InternVL 2.5 have remained the best open-source #vlm foundation models over time. huggingface.co/papers/2412.05…

OpenGVLab (@opengvlab) 's Twitter Profile Photo

🥳We have released InternVL2.5, ranging from 1B to 78B, on <a href="/huggingface/">Hugging Face</a> .

😉InternVL2_5-78B is the first open-source #MLLM to achieve over 70% on the MMMU benchmark, matching the performance of leading closed-source commercial models like GPT-4o.

🤗HF Space:
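
For readers who want to try the release: a minimal sketch (not an official snippet) of loading one of the smaller InternVL2.5 checkpoints from Hugging Face for a text-only chat turn, following the usage pattern shown on the OpenGVLab model cards. The exact model ID, dtype, and `chat` signature should be checked against the card of the checkpoint you pick.

```python
# Minimal sketch, assuming the Hugging Face model ID "OpenGVLab/InternVL2_5-8B"
# and the custom `chat` helper documented on the InternVL model cards.
import torch
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL2_5-8B"  # smaller sibling of the 78B flagship
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,   # bf16 keeps memory manageable on GPU
    low_cpu_mem_usage=True,
    trust_remote_code=True,       # InternVL ships its own modeling/chat code
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

generation_config = dict(max_new_tokens=256, do_sample=False)
question = "Hello, who are you?"
# pixel_values=None -> text-only turn; pass preprocessed image tensors for VQA.
response, history = model.chat(
    tokenizer, None, question, generation_config, history=None, return_history=True
)
print(response)
```
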
OpenGVLab (@opengvlab) 's Twitter Profile Photo

We have reached a milestone by exceeding human performance on the R2R dataset in vision-language navigation for the very first time.

OpenGVLab (@opengvlab) 's Twitter Profile Photo

People are paying more and more attention to the quality and details of generated videos. Use a single hand-tuned temperature parameter to enhance your generated videos for free! Nice work with our amazing friends Yang Luo, Xuanlei Zhao, Wenqi Shaw, Victor.Kai Wang, VITA Group,
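
The work referenced here hand-tunes a single temperature on the video model's temporal attention. As a rough, hypothetical illustration of what such a knob does (not the authors' actual implementation), here is a small NumPy sketch of temperature-scaled attention over per-frame features:

```python
# Toy sketch of temperature-scaled attention over frame features.
# This only illustrates the general idea of a temperature knob; the referenced
# work applies it inside a video generation model's temporal attention layers.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temperature_scaled_attention(q, k, v, temperature=1.0):
    """Dot-product attention with softmax(logits / temperature).

    temperature < 1 sharpens the attention (each frame attends to fewer frames);
    temperature > 1 smooths it (each frame mixes information from more frames).
    """
    d = q.shape[-1]
    logits = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    weights = softmax(logits / temperature, axis=-1)
    return weights @ v

# Toy usage: 8 "frames" with 16-dimensional features attending to each other.
rng = np.random.default_rng(0)
frames = rng.standard_normal((8, 16))
out = temperature_scaled_attention(frames, frames, frames, temperature=2.0)
print(out.shape)  # (8, 16)
```
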

OpenGVLab (@opengvlab) 's Twitter Profile Photo

🥳Mini-InternVL has been accepted by Visual Intelligence! The Mini-InternVL series of #MLLMs, with parameters ranging from 1B to 4B, achieves 90% of the full-scale model's performance using only 5% of the parameters. This significant efficiency and performance boost makes our model more accessible

OpenGVLab (@opengvlab) 's Twitter Profile Photo

🚀 Introducing #InternVideo 2.5 - The Video Multimodal AI That Sees Longer & Smarter!
✨ Handles videos 6x longer than predecessors
✨ Pinpoints objects/actions with surgical precision
✨ Trained on 300K+ hours of diverse video data
📈 Outperforms SOTA on multiple benchmarks &
OpenGVLab (@opengvlab) 's Twitter Profile Photo

🚀 Introducing MM-Eureka Series - A Breakthrough in Multimodal Reasoning with Visual Aha Moments!
✨ Reproduced R1-Zero and Visual Aha-Moment Phenomena
🧠 Trained on only 0.05% of the data used for base models, it achieves comparable benchmark math reasoning performance to
OpenGVLab (@opengvlab) 's Twitter Profile Photo

🥳We have released #InternVL3, an advanced #MLLM series ranging from 1B to 78B, on <a href="/huggingface/">Hugging Face</a>.

😉InternVL3-78B achieves a score of 72.2 on the MMMU benchmark, setting a new SOTA among open-source MLLMs.

☺️Highlights:
- Native multimodal pre-training: Simultaneous language and