Papers with Datasets (@paperswithdata)'s Twitter Profile
Papers with Datasets

@paperswithdata

Keep up with the latest machine learning datasets from @paperswithcode. Follow for daily updates.

ID: 1381544381509947393

Link: https://paperswithcode.com/datasets
Joined: 12-04-2021 09:47:43

284 Tweets

16.16K Followers

1 Following


👀Perception Test: a benchmark for evaluating the perception and reasoning skills of multimodal models. It uses real-world videos to define tasks that require understanding of memory, patterns, physics, and semantics across visual, audio, and text modalities. paperswithcode.com/dataset/percep…


🌄Diffusion DB: a large text-to-image prompt dataset containing 2M images generated by Stable Diffusion using users' prompts and hyperparameters. It opens up research in prompt engineering, deepfake detection, and designing public AI tools. paperswithcode.com/dataset/diffus…


🏺Breaking Bad: a large-scale dataset of fractured objects. It has 10k meshes, each with 100 fracture modes, totalling 1M breakdown patterns. It serves as a benchmark for fractured object reassembly and poses new challenges for geometric shape understanding. paperswithcode.com/dataset/breaki…


💻The Stack: a dataset for pre-training Code LLMs. It contains 3TB of permissively licensed code in 30 programming languages. It was created as part of the BigCode Project, an open scientific collaboration working on the responsible development of Code LLMs. paperswithcode.com/dataset/the-st…


{}Code Syntax: a dataset of programs annotated with the syntactic relationships in their syntax trees, designed for code syntax understanding tasks. It contains 30K+ code samples annotated with 2M+ relation edges in 43 relation types for Python and Java. paperswithcode.com/dataset/codesy…


➗Lila: a mathematical reasoning benchmark consisting of 23 tasks covering mathematical abilities, language format, language diversity, and external knowledge. It extends 20 existing datasets by collecting task instructions and solutions in the form of Python programs. paperswithcode.com/dataset/lila


🏞ImageNet-X: a set of annotations of factors like pose, background, or lighting for the ImageNet validation set and 12K training images. It is meant for studying the types of mistakes a model makes as a function of its architecture, learning paradigm, and training procedures. paperswithcode.com/dataset/imagen…


🦿QDax: a Quality-Diversity benchmark suite for Deep Neuroevolution in the Reinforcement Learning domain for robot control. It specifies different benchmarks based on the complexity of both the task and the agent controlled by a deep neural network. paperswithcode.com/dataset/qualit…

Papers with Code (@paperswithcode)

🪐 Introducing Galactica. A large language model for science. Can summarize academic literature, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more. Explore and get weights: galactica.org