Prompt Engineer (@prompt48)'s Twitter Profile
Prompt Engineer

@prompt48

A Prompt Engineer: Man-Machine Interaction

ID: 1645489960487895041

Joined: 10-04-2023 18:13:12

707 Tweets

183 Followers

115 Following

Prompt Engineer (@prompt48):

🚀 Cloud GPU. Local LLM. One Secure Tunnel. I just dropped a video showing how I run OpenClaw on a RunPod VPS while using LLMs running locally on my Windows machine via Ollama — connected using reverse SSH tunneling. OpenClaw🦞 ollama Runpod youtu.be/GYW4S41li64
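
The mechanics of this setup are worth sketching. A reverse SSH tunnel opened from the Windows machine makes the local Ollama port (11434 by default) appear on the VPS's loopback interface. A minimal sketch, assuming a generic SSH-reachable RunPod host; the user and hostname are placeholders, not the ones from the video:

# On the Windows machine: publish local port 11434 on the VPS's localhost.
# -N opens no remote shell; the session exists only to carry the tunnel.
ssh -N -R 11434:localhost:11434 root@your-runpod-host

# On the VPS: requests to localhost:11434 now reach the local Ollama server.
curl http://localhost:11434/api/tags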

OpenAI (@openai):

We’re starting to roll out a test for ads in ChatGPT today to a subset of free and Go users in the U.S.

Ads do not influence ChatGPT’s answers. Ads are labeled as sponsored and visually separate from the response.

Our goal is to give everyone access to ChatGPT for free with

Z.ai (@zai_org):

Introducing GLM-5: From Vibe Coding to Agentic Engineering

GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5, it scales from 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens.

Unsloth AI (@unslothai):

Z.ai Congrats on the release & thank you for supporting open-source! 👏 🥰 We uploaded GLM-5 GGUFs so people can run it locally: huggingface.co/unsloth/GLM-5-…

ollama (@ollama):

MiniMax M2.5 is on Ollama's cloud! 

ollama run minimax-m2.5:cloud 

Use MiniMax M2.5 with OpenCode, Claude Code, Codex, OpenClaw via ollama launch!

OpenCode:
ollama launch opencode --model minimax-m2.5:cloud

Claude:  
ollama launch claude --model minimax-m2.5:cloud
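
A hedged aside, not from the tweet: Ollama also exposes an OpenAI-compatible HTTP API, so once the cloud model is available through a local Ollama, it should be reachable with a plain curl call. A minimal sketch:

# Chat with the cloud-hosted model via Ollama's OpenAI-compatible endpoint.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-m2.5:cloud",
    "messages": [{"role": "user", "content": "Summarize reverse SSH tunneling in one line."}]
  }'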

Logan Kilpatrick (@officiallogank):

Gemini Deep Think 3 is the world's most capable model by many measures, huge amounts of progress on reasoning benchmarks and more.

Available right now via the Gemini App for Ultra subscribers and in the API soon : )

OpenAI Developers (@openaidevs):

Introducing GPT-5.3-Codex-Spark, our ultra-fast model purpose-built for real-time coding. We’re rolling it out as a research preview for ChatGPT Pro users in the Codex app, Codex CLI, and IDE extension.

Joon Sung Park (@joon_s_pk):

Introducing Simile. Simulating human behavior is one of the most consequential and technically difficult problems of our time. We raised $100M from Index, Hanabi, A*, BCV, Andrej Karpathy, Fei-Fei Li, Adam D'Angelo, Guillermo Rauch, Scott Belsky, among others.

Akshay 🚀 (@akshay_pachaar):

Meta just solved the biggest problem in RAG!

Most RAG systems waste your money. They retrieve 100 chunks when you only need 10. They force the LLM to process thousands of irrelevant tokens. You pay for compute you don't need.

Meta AI just solved this.

They built REFRAG, a new

elvis (@omarsar0):

Just incredible that this is possible today. One of my favorite MCP tools as of late. Just prompt to generate beautiful excalidraw diagrams.

Unsloth AI (@unslothai):

You can now run MiniMax-2.5 locally! 🚀

At 230B parameters, MiniMax-2.5 is the strongest LLM under 700B params, delivering SOTA agentic coding & chat.

Run Dynamic 3/4-bit on a 128GB Mac for 20 tokens/s.

Guide: unsloth.ai/docs/models/mi…
GGUF: huggingface.co/unsloth/MiniMa…
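
The memory claim is plausible back-of-the-envelope: at a mixed "dynamic" 3/4-bit quantization averaging roughly 3.5 bits per weight, 230B parameters come to about 230e9 × 3.5 / 8 ≈ 100 GB, which fits in 128 GB of unified memory. A minimal local-run sketch with llama.cpp; the repo and file names below are placeholders, since the tweet's links are truncated:

# Fetch a dynamic-quant GGUF (placeholder repo and filename pattern).
huggingface-cli download unsloth/MiniMax-M2.5-GGUF --include "*Q3_K*" --local-dir .

# Serve it with llama.cpp; -ngl 99 offloads all layers to Metal/GPU.
llama-server -m MiniMax-M2.5-Q3_K.gguf -ngl 99 -c 8192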

Sam Altman (@sama):

Peter Steinberger is joining OpenAI to drive the next generation of personal agents. He is a genius with a lot of amazing ideas about the future of very smart agents interacting with each other to do very useful things for people. We expect this will quickly become core to our

Chao Huang (@huang_chao4969):

Introducing ClawWork 🚀: Transform your openclaw/nanobot from AI assistant into a money-earning AI coworker. Watch it earn 💰$10K+ in just 7 hours by completing real professional tasks across 44+ industries — from Technology & Engineering to Business & Finance, Healthcare &

LMSYS Org (@lmsysorg):

🎉 Meet Qwen3.5-397B-A17B from Qwen, 397B total params (17B active), built for real-world multimodal intelligence — day-0 support is now live in SGLang!

👁️ Unified vision-language foundation (early fusion): stronger reasoning, coding & agents
⚡ Gated DeltaNet + sparse
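
For readers wondering what "day-0 support" looks like in practice: it typically means the checkpoint can be served straight from the Hub with a one-line launch. A minimal SGLang sketch; the model path and GPU count are assumptions, not taken from the tweet:

# Launch an SGLang server for the MoE checkpoint (placeholder model path).
# --tp-size shards the 397B weights across 8 GPUs with tensor parallelism.
python -m sglang.launch_server \
  --model-path Qwen/Qwen3.5-397B-A17B \
  --tp-size 8 \
  --port 30000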

vLLM (@vllm_project):

🎉 Congrats to Qwen on releasing Qwen3.5 on Chinese New Year's Eve — day-0 support is ready in vLLM!

Qwen3.5 is a multimodal MoE with Gated Delta Networks architecture — 397B total params, only 17B active.

What makes it interesting for inference:

🧠 Gated Delta
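
The same hedged caveat applies here: a minimal vLLM serving sketch, with the model name and GPU count assumed rather than taken from the tweet. Only ~17B parameters are active per token, which keeps per-token compute down, but all 397B weights still have to fit in GPU memory, hence the tensor parallelism:

# Serve the checkpoint with vLLM's OpenAI-compatible server (placeholder name).
vllm serve Qwen/Qwen3.5-397B-A17B --tensor-parallel-size 8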

Hasan Toor ✪ (@hasantoxr):

🚨BREAKING: The "Ollama for voice cloning" just dropped.

It's called Voicebox and it clones any voice from just a few seconds of audio entirely on your machine.

No ElevenLabs subscription. No cloud uploads. No voice data leaving your device.

It's powered by Qwen3-TTS,