Sumit (@_reachsumit)'s Twitter Profile
Sumit

@_reachsumit

Senior ML Engineer @Meta | prev: @TikTok_us, @Amazon, @Samsung | UChicago Alum

blog.reachsumit.com

🇮🇳→🇰🇷→🇦🇺→🇨🇦→🇺🇲

ID: 129969370

Link: https://recsys.substack.com/
Joined: 05-04-2010 23:22:47

7.7K Tweets

2.2K Followers

438 Following

OneRec Technical Report
Kuaishou presents an end-to-end generative recommendation system that achieves 23.7% and 28.8% MFU on flagship GPUs, reducing OPEX to only 10.6% of traditional pipelines while improving App Stay Time by 0.54% and 1.24%.
📝 arxiv.org/abs/2506.13695

Hierarchical Group-wise Ranking Framework for Recommendation Models
Credit Karma proposes a framework using residual vector quantization to create hierarchical user clusters for improved negative sampling in CTR/CVR models.
📝 arxiv.org/abs/2506.12756
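
As a rough illustration of how residual vector quantization can yield hierarchical clusters, here is a minimal sketch. It is not the paper's implementation: the codebooks are hand-picked (in practice they would be learned, for example by clustering the residuals at each level), and the names `rvq_encode` and `group_by_prefix` are hypothetical. Each level quantizes what the previous level left over, so a shared code prefix marks a shared coarse cluster.

```python
def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(vec, codebooks):
    """Return one code per level; each level quantizes the previous level's residual."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(codes)


def group_by_prefix(users, codebooks, depth):
    """Group user names by the first `depth` codes (deeper prefix = finer cluster)."""
    groups = {}
    for name, vec in users.items():
        prefix = rvq_encode(vec, codebooks)[:depth]
        groups.setdefault(prefix, []).append(name)
    return groups


# Hypothetical, hand-picked codebooks: level 1 is coarse, level 2 refines the residual.
CODEBOOKS = [
    [[0.0, 0.0], [10.0, 10.0]],
    [[-1.0, -1.0], [1.0, 1.0]],
]

# Hypothetical user embeddings.
USERS = {"a": [0.8, 1.1], "b": [0.9, 1.3], "c": [-1.2, -0.7], "d": [9.2, 8.9]}

if __name__ == "__main__":
    print(group_by_prefix(USERS, CODEBOOKS, depth=1))
    print(group_by_prefix(USERS, CODEBOOKS, depth=2))
```

Users that share a longer code prefix sit in the same finer cluster, which is one plausible way to pick progressively closer users when sampling negatives.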

Device-Cloud Collaborative Correction for On-Device Recommendation
Balances real-time performance and accuracy in on-device recommendation using self-correction networks on devices and global correction networks in the cloud.
📝 arxiv.org/abs/2506.12687
👨🏽‍💻 github.com/Yuzt-zju/CoCor…

FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation
Presents an open-source framework supporting text-based, multimodal, and web-based RAG with asynchronous processing and persistent caching.
📝 arxiv.org/abs/2506.12494
👨🏽‍💻 github.com/ictnlp/FlexRAG

Maximally-Informative Retrieval for State Space Model Generation
Proposes a retrieval method that uses gradients from LLMs to learn optimal document mixtures for answer generation, minimizing model uncertainty through direct feedback.
📝 arxiv.org/abs/2506.12149

T2-RAGBench: Text-and-Table Benchmark for Evaluating Retrieval-Augmented Generation
Presents a benchmark with 32,908 question-context-answer triples from financial documents to evaluate RAG methods on mixed text-and-table data.
📝 arxiv.org/abs/2506.12071
👨🏽‍💻 anonymous.4open.science/r/g4kmu-paper-…

Are We Really Measuring Progress? Transferring Insights from Evaluating Recommender Systems to Temporal Link Prediction
Identifies critical evaluation issues in temporal link prediction benchmarks.
📝 arxiv.org/abs/2506.12588

EnhanceGraph: A Continuously Enhanced Graph-based Index for High-dimensional Approximate Nearest Neighbor Search
Leverages search and construction logs to continuously improve graph-based ANNS indexes.
📝 arxiv.org/abs/2506.12071…
👨🏽‍💻 github.com/antgroup/vsag

Towards Building General Purpose Embedding Models for Industry 4.0 Agents
IBM presents a framework for building specialized embedding models for industrial asset maintenance, integrating with ReAct agents to guide engineer decisions.
📝 arxiv.org/abs/2506.12607

A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications
Presents a survey examining AI-powered research automation systems, analyzing 80+ implementations and proposing a hierarchical taxonomy for Deep Research capabilities.
📝 arxiv.org/abs/2506.12594

Similarity = Value? Consultation Value Assessment and Alignment for Personalized Search
Introduces a consultation value assessment framework that evaluates historical consultations for improved personalized search.
📝 arxiv.org/abs/2506.14437
👨🏽‍💻 anonymous.4open.science/r/VAPS-to-go/

InsertRank: LLMs can reason over BM25 scores to Improve Listwise Reranking
Presents a listwise reranking method that leverages BM25 scores during reranking to improve retrieval effectiveness on reasoning-centric queries across multiple LLM families.
📝 arxiv.org/abs/2506.14086
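
To make the idea of passing BM25 scores into a listwise reranking prompt concrete, here is a minimal sketch. It is an assumption-laden illustration, not the paper's method: the BM25 variant (a common Okapi form with `k1=1.5`, `b=0.75`), the whitespace tokenizer, the prompt wording, and the function names `bm25_scores` and `build_rerank_prompt` are all hypothetical choices.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a common Okapi BM25 variant."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n_docs
    q_terms = query.lower().split()
    # Document frequency of each distinct query term.
    df = {t: sum(1 for tokens in tokenized if t in tokens) for t in set(q_terms)}
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for t in q_terms:
            if tf[t] == 0:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1)
            norm = tf[t] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def build_rerank_prompt(query, docs):
    """Build a listwise reranking prompt that shows each passage's BM25 score."""
    scores = bm25_scores(query, docs)
    lines = [
        f"Query: {query}",
        "Rank the passages by relevance. BM25 scores are given as a hint.",
    ]
    for i, (doc, score) in enumerate(zip(docs, scores), start=1):
        lines.append(f"[{i}] (BM25: {score:.2f}) {doc}")
    lines.append("Answer with the passage numbers in ranked order, e.g. [2] > [1].")
    return "\n".join(lines)


if __name__ == "__main__":
    passages = [
        "the cat sat on the mat",
        "dogs chase cats",
        "bm25 is a ranking function",
    ]
    print(build_rerank_prompt("ranking function", passages))
```

The resulting string would then be sent to whichever LLM performs the reranking; that call is left out here because it depends on the model and client in use.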

XGraphRAG: Interactive Visual Analysis for Graph-based Retrieval-Augmented Generation
Proposes a visual analysis framework to help developers identify critical recalls and trace them through the GraphRAG pipeline.
📝 arxiv.org/abs/2506.13782
👨🏽‍💻 github.com/Gk0Wk/XGraphRAG

Knowledge Compression via Question Generation: Enhancing Multihop Document Retrieval without Fine-tuning
Eponon Anvi Alex et al. present a question-based knowledge encoding approach to enhance RAG performance without model fine-tuning.
📝 arxiv.org/abs/2506.13778
👨🏽‍💻 github.com/anvix9/llama2-…

HARMONY: A Scalable Distributed Vector Database for High-Throughput Approximate Nearest Neighbor Search
Introduces a distributed ANNS system that combines dimension-based and vector-based partitioning with early-stop pruning.
📝 arxiv.org/abs/2506.14707

DiscRec: Disentangled Semantic-Collaborative Modeling for Generative Recommendation
Introduces a framework that separates semantic and collaborative signals in generative recommendation.
📝 arxiv.org/abs/2506.15576
👨🏽‍💻 github.com/Ten-Mao/DiscRec

Multi-Interest Recommendation: A Survey
Provides a comprehensive survey of multi-interest recommendation methods, systematically reviewing approaches that model users' multifaceted preferences.
📝 arxiv.org/abs/2506.15284
👨🏽‍💻 github.com/WHUIR/Multi-In…

Next-User Retrieval: Enhancing Cold-Start Recommendations via Generative Next-User Modeling
ByteDance presents a transformer-based approach for cold-start recommendation that generates next potential users to address item cold-start challenges.
📝 arxiv.org/abs/2506.15267

Advancing Loss Functions in Recommender Systems: A Comparative Study with a Rényi Divergence-Based Solution
Proposes a new loss function for recommender systems that combines the advantages of Softmax and Cosine Contrastive Losses.
📝 arxiv.org/abs/2506.15120
👨🏽‍💻 github.com/cynthia-shengj…

cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
Presents AST-based chunking for code RAG that preserves syntactic structure, improving retrieval and generation performance.
📝 arxiv.org/abs/2506.15655
👨🏽‍💻 github.com/yilinjz/astchu…
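
A minimal sketch of what AST-based chunking can look like, using Python's standard `ast` module on Python source. This is a simplified illustration and not the paper's algorithm: it only cuts at top-level statement boundaries and greedily merges neighbours under a character budget, and the function name `ast_chunks` and the `max_chars` parameter are hypothetical. A single top-level node larger than the budget is kept whole here rather than split further.

```python
import ast


def ast_chunks(source, max_chars=200):
    """Split source at top-level AST node boundaries, then greedily merge
    neighbouring pieces while the merged chunk stays within max_chars."""
    lines = source.splitlines(keepends=True)
    tree = ast.parse(source)

    # One piece per top-level node, including any blank or comment lines before it.
    pieces = []
    start = 0
    for node in tree.body:
        end = node.end_lineno  # 1-based and inclusive, so usable as a slice end
        pieces.append("".join(lines[start:end]))
        start = end
    if start < len(lines):
        pieces.append("".join(lines[start:]))  # trailing comments or blank lines

    # Greedy merge: never cut inside a function, class, or statement.
    chunks = []
    current = ""
    for piece in pieces:
        if current and len(current) + len(piece) > max_chars:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    code = (
        "import os\n"
        "\n"
        "def f(x):\n"
        "    return x + 1\n"
        "\n"
        "def g(y):\n"
        "    return y * 2\n"
    )
    for i, chunk in enumerate(ast_chunks(code, max_chars=40)):
        print(f"--- chunk {i} ---")
        print(chunk, end="")
```

Because cuts only fall between complete top-level statements, every chunk remains parseable on its own and concatenating the chunks reproduces the original file.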