Author: Stark, Tony

  • ✨ Alibaba Drops a Game-Changer! The Ultimate Desktop AI Assistant for Every Worker Is Here!! 💻💫

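    Folks, I just came across 【QoderWork】, freshly released by Alibaba,
    and this productivity slacker's jaw hit the floor 🤯
    No coding! No uploading to the cloud!
    Just tell your computer "tidy up the sales sheet + make a PPT"
    and it! really! gets! it! done! in! seconds!! (Imagined UI attached 👉 an ultra-clean chat box + a progress bar racing ahead)

    🔥 Why am I so smitten?
    ✅ Runs locally for peace of mind! Your files stay on your own computer 💻
    (No more worrying about confidential spreadsheets flying around! Office-worker security maxed out 🔒)
    ✅ So smart it feels like mind reading 🧠
    "Analyze the Excel file, find the top 5 sellers + draw a trend chart + write a report"
    It breaks the job into steps on its own: clean → compute → generate charts → output a PPT with conclusions!
    ✅ Audio turns into viral-ready material in seconds 🎤
    Toss in an interview recording 👉 it auto-generates a Xiaohongshu note + subtitles + a long-form WeChat official account article!
    (Bloggers and ops people will be eternally grateful!! It saves 3 hours of editing 😭)
    ✅ You can add your own "skills" too ✨
    Lots of built-in tools, plus custom workflows of your own~ the more you use it, the better it knows you!

    💬 A line from an Alibaba exec that hit me right in the feels:
    "Let AI step out of the chat box and actually do the work for you!"
    This office worker is tearfully hitting like: no more bouncing back and forth in Excel, aaah!!

    🌟 How I imagine using it:
    Morning coffee still warm ☕️
    It has already written the weekly report + added charts + highlighted the key points in red
    Me: ??? Is this really my computer?? (So touched I want to award it a pennant 🇨🇳)

    ⚠️ Key point:
    It is currently in invite-only testing! First come, first served!!
    (If you get in, shout at me in the comments!! Looking for a squad to rush in with 🏃‍♀️💨)
    👉 Where to go: Alibaba's official Qoder website (remember to search for "QoderWork"!)

    💬 Interaction time:
    What do you most want it to do for you?
    ▫️ Auto-reply to emails? ▫️ Organize your phone's photo album? ▫️ Write Xiaohongshu copy?
    👇 Drop your requests in the comments! If this passes 100 likes, I'll ask Alibaba for more tutorials!!

    #OfficeWorkerLifesaver #AIOfficeCeiling #AlibabaTechIsGreat #EfficiencyHackDiary #WorkingMomEssentials

    ✨ Follow me and I'll dig up every amazing tool that makes life sweeter! ✨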

  • 🦞 Went viral overnight! Is this "JARVIS living inside your computer" the true AI gateway?

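    Folks, an open-source project has been going viral in North American tech circles lately: Clawdbot (it was just renamed Moltbot, but most people still call it Clawdbot), which countless people are calling the "local JARVIS".

    Unlike the AI chat apps on your phone, it lives directly on your Mac or server and talks to you through the chat apps you already use every day, such as Telegram, Slack, and iMessage. It can also operate your files, terminal, and browser... and even fill in expense reports for you! 🤯

    Today I'll walk you through getting on board and break down why it is so special 👇


    🔍 What exactly is it?

    Clawdbot is an open-source project started by well-known developer Peter Steinberger (@steipete), positioned as a Personal AI Assistant.

    What makes it stand out:
    ✅ Local-first: all data lives on your own computer
    ✅ No standalone app: you interact through chat tools you already have (such as Telegram)
    ✅ It gets things done: it doesn't just chat, it carries out real tasks!

    In short: it is not "yet another AI chat window" but a digital employee inside your computer 💼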


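    💻 Step-by-step deployment guide (very detailed!)

    ✅ Prerequisites

    • A Mac, Linux, or WSL2 machine
    • Node.js v20 or later (recommended)
    • Bun (optional but very fast, and recommended by the author ✨)
    • An LLM API key (Claude, GPT, or Gemini all work)
    • Starting with Telegram is recommended, since it is the simplest to configure

    🚀 Three steps to launch

    # 1. Clone the project
    git clone https://github.com/clawdbot/clawdbot.git
    cd clawdbot

    # 2. Install dependencies (Bun is faster!)
    bun install

    # 3. Create the .env file
    touch .env

    Fill in .env with:

    ANTHROPIC_API_KEY=your Claude key
    TELEGRAM_BOT_TOKEN=the token you got from @BotFather
    TELEGRAM_ALLOWED_USER_IDS=your Telegram ID (keeps freeloaders out!)

    Then run:

    bun run dev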

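    📱 Remote control from your phone?

    Yes! From Telegram on your phone, send the bot a message:

    "ping"

    If it replies "pong", congratulations! The connection works 🎉
    From now on you can use your phone to put the Mac at home to work.

    ⚠️ iMessage and WhatsApp can be connected too, but they need extra configuration, so beginners should start with Telegram.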


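    🦾 Unlocking its "hands and feet": the skill system is great!

    Clawdbot itself is only the "torso"; the really powerful part is its Skills (skill packs).

    For example, you can ask it to:

    "List every file on my desktop that contains 'Confidential'"
    "Open Chrome, log in to Notion, and send me a screenshot"
    "Check the Git status and tell me whether there are uncommitted changes"

    These capabilities come from plug-ins under the skills/ directory, such as:

    • filesystem: read and write local files
    • browser: control the browser
    • fetch: call APIs

    The community keeps contributing new skills too, so it gets more capable the more you use it 💪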
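
    To make the "torso plus skills" idea concrete, here is a minimal Python sketch of a skill registry with one file-search skill. It is an illustration only and is not Clawdbot's actual plug-in API: the `SKILLS` registry, the `skill` decorator, the skill name `filesystem.find`, and the choice to match on file names (rather than file contents) are all assumptions made for this example.

```python
from pathlib import Path
from typing import Callable

# Hypothetical registry mapping skill names to functions.
# This is NOT Clawdbot's real plug-in interface.
SKILLS: dict[str, Callable[..., object]] = {}


def skill(name: str):
    """Decorator that registers a function under a skill name."""
    def register(func):
        SKILLS[name] = func
        return func
    return register


@skill("filesystem.find")
def find_files(root: str, keyword: str) -> list[str]:
    """Return sorted names of files under root whose name contains keyword."""
    return sorted(
        p.name
        for p in Path(root).rglob("*")
        if p.is_file() and keyword.lower() in p.name.lower()
    )


def run_skill(name: str, **kwargs):
    """Look up a skill by name and call it; unknown names raise KeyError."""
    if name not in SKILLS:
        raise KeyError(f"unknown skill: {name}")
    return SKILLS[name](**kwargs)
```

    With this shape, a request such as "list files containing 'Confidential'" would reduce to `run_skill("filesystem.find", root="/path/to/Desktop", keyword="Confidential")`, and adding a capability means registering one more function.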


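    ⚠️ Pitfalls and safety reminders (must read!)

    • Where is the memory stored? → In ~/.clawd by default; delete it and the bot "loses its memory"
    • You need a stable network! → It calls the Claude / OpenAI APIs frequently
    • Always set an allowlist! → Without ALLOWED_USER_IDS, anyone in the world can use your bot! Besides an exploding API bill, you could also get hacked!

    🔒 It has a keyboard, a screen, and an identity. If it gets out of control, the consequences are serious!
    Treat it as a "high-risk experiment" and use it with caution!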
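
    The allowlist advice above can be sketched in a few lines of Python. This is a hedged illustration of the idea, not Clawdbot's real implementation: the function names are hypothetical, the comma-separated format of `TELEGRAM_ALLOWED_USER_IDS` is an assumption, and the sketch deliberately fails closed (an empty allowlist admits nobody), which may differ from how the project behaves.

```python
import os
from typing import Optional


def parse_allowed_ids(raw: Optional[str]) -> set[int]:
    """Parse a comma-separated list of numeric user IDs, ignoring blanks."""
    if not raw:
        return set()
    return {int(part) for part in raw.split(",") if part.strip()}


def is_allowed(user_id: int, allowed: set[int]) -> bool:
    """Fail closed: an empty allowlist admits nobody."""
    return user_id in allowed


def handle_message(user_id: int, text: str, allowed: set[int]) -> Optional[str]:
    """Reply only to allowlisted users; 'ping' gets 'pong'."""
    if not is_allowed(user_id, allowed):
        return None  # silently drop messages from strangers
    if text.strip().lower() == "ping":
        return "pong"
    return f"received: {text}"


if __name__ == "__main__":
    # Assumed variable name, taken from the .env example in this post.
    allowed = parse_allowed_ids(os.environ.get("TELEGRAM_ALLOWED_USER_IDS"))
    print(handle_message(12345, "ping", allowed))
```

    Checking the sender before doing anything else means a stranger's message never reaches the model or the tools, which is what protects both the API bill and the machine.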


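    💡 Why does it deserve attention?

    1️⃣ The best UI is no UI

    It doesn't force you to open a new app; it blends into your existing workflow. When you chat on Telegram, it replies on Telegram; when you're in a meeting on Slack, it looks things up for you on Slack.

    Perhaps the AI of the future shouldn't be a "destination" but an intelligent layer.

    2️⃣ From "chatting" to "doing"

    Traditional AI only answers questions, whereas Clawdbot can carry out tasks. That is what a real "agent" is.

    Its strength comes not from how powerful the model is but from a tool ecosystem that connects to the real world.

    3️⃣ Your data stays in your hands

    All memories and configuration are stored locally as Markdown, so you can view, back up, or delete them at any time.

    Private + transparent + long-lasting = an AI assistant that truly belongs to you!


    ❤️ Summary

    Clawdbot has no flashy new algorithm, but it uses product thinking to answer a key question:

    Users don't need a new entry point; they just need AI capabilities embedded seamlessly into the tools they already use.

    Why is WeChat so hard to replace? Because it is simple enough and used often enough.
    Rather than building an "all-in-one AI app", let AI become an enhancement plug-in for the life you already have.

    And that may well be the ultimate form of the AI gateway.


    🔗 Useful resources (bookmark these!)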

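  • 5,513+ Stars in One Week! An Open-Source "Claude Cowork" Bursts onto the Scene, Ushering in a New Era of AI Office Work

    Recently, Claude Cowork from Anthropic has stirred up heated discussion in tech circles. The most striking thing about this general-purpose agent built for work is not its powerful features but how it was made: in just 10 days, with all of the code generated by Claude Code!
    One netizen joked: "If a Claude version of Manus was 'knocked together' in only 10 days, was Zuck a sucker for spending 14 billion on Facebook back then?"

    How powerful is Claude Cowork?

    Permanent memory: the AI remembers everything about you

    Conversations with AI used to be forgotten as soon as they ended. Now Claude Cowork can retain your work habits, project background, and even unfinished document drafts over the long term, achieving real "context awareness".

    Cowork mode: not a chatbot, but a digital colleague

    It no longer needs you to keep asking follow-up questions. Just assign a task in one sentence:

    • "Analyze this financial report"
    • "Add an electronic signature to this PDF"
    • "Help me track down the bug in this code"

    The AI breaks the task down, runs the workflow, and hands the result straight to you. The development team even humblebragged that the whole thing took only 10 days from planning to launch, with all code generated by Claude Code. The humans' role? Setting direction and accepting the results.

    Unfortunately, Claude Cowork is open only to paying users. Just as everyone was lamenting this, the open-source community once again showed remarkable energy.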


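    Enter Eigent: the free, local, hackable "king of alternatives"

    On the very day Claude Cowork was released, an open-source project named Eigent quietly went live and announced that it was 100% open source. In less than a week its GitHub star count passed 5,513, a weekly growth rate second only to anomalyco/opencode and obra/superpowers. A rocket-like rise.

    🔥 It has now collected 10.8k+ stars and bills itself as the world's first open-source multi-agent workflow desktop application.


    🌐 Eigent at a glance

    | Item | Details |
    | --- | --- |
    | GitHub repository | https://github.com/eigent-ai/eigent |
    | Open-source license | Eigent open-source license (based on Apache 2.0 plus additional terms) |
    | Tech stack | Backend: FastAPI + Uvicorn. Frontend: React + Electron + TypeScript. Main languages: TypeScript (61.7%), Python (32.1%), JavaScript (4.4%) |
    | Framework | Built on the well-known open-source framework CAMEL-AI |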

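    ✨ What can Eigent do?

    Eigent positions itself as the world's first workflow desktop application built on multi-agent collaboration. "Multi-agent" means you are no longer hiring a single AI assistant but assembling an AI team:

    • Developer Agent: writes code, runs commands, debugs
    • Search Agent: crawls the web for material and extracts key information
    • Document Agent: writes reports and manages file structure
    • Multimodal Agent: recognizes images and processes audio

    💡 Core highlights:

    ✅ Custom model support
    Whether you want Claude, GPT, or a locally deployed Llama, you can plug it in seamlessly.

    ✅ MCP tool integration
    Equip the AI team with "gear": browsers, Notion, Google Workspace, Slack, and even internal enterprise APIs can all be connected.

    ✅ Human-in-the-loop mechanism
    When the AI is unsure, it automatically asks a human to step in, avoiding disasters such as "accidentally deleting the database".

    ✅ Covers everyday office essentials

    • Cleaning up duplicate files
    • Adding signatures to PDFs
    • Generating reports from bank statements
    • Industry research, trip planning...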
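
    The human-in-the-loop idea above can be sketched as a small approval gate. This is a minimal illustration under stated assumptions, not Eigent's actual mechanism: the keyword list, the function names, and the rule "ask whenever a risky keyword appears" are all hypothetical, and a real system would likely decide based on model uncertainty and richer policies.

```python
from typing import Callable

# Hypothetical markers of destructive commands; a real list would be broader.
RISKY_KEYWORDS = ("drop table", "rm -rf", "delete from", "truncate")


def needs_approval(command: str) -> bool:
    """Return True when the command contains a known risky keyword."""
    lowered = command.lower()
    return any(keyword in lowered for keyword in RISKY_KEYWORDS)


def execute_with_gate(
    command: str,
    run: Callable[[str], str],
    ask_human: Callable[[str], bool],
) -> str:
    """Run the command, pausing for human approval when it looks risky."""
    if needs_approval(command) and not ask_human(command):
        return "skipped: human declined"
    return run(command)
```

    Passing `run` and `ask_human` in as callables keeps the gate independent of any particular agent framework: the same check can sit in front of a shell, a database client, or a file tool.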

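    🚀 Easy to get started: three deployment options to choose from

    Just state what you need, and the AI team breaks the task down and carries it out.

    1️⃣ Cloud version (recommended for beginners)

    Go to eigent.ai and register an account; all models, APIs, and storage are hosted by the vendor. Suitable for individuals or small teams who want a quick trial.

    👉 Register → log in → start using it. Three steps and you're done.

    2️⃣ Self-hosted (community edition): full control over your data

    Suitable for users or companies with strict privacy requirements. Installation is simple and takes only a few commands:

    git clone https://github.com/eigent-ai/eigent.git
    cd eigent
    npm install
    npm run dev

    ⚠️ Prerequisites:

    • Node.js version 18 to 22
    • A Python environment (run uv sync to update dependencies)

    👉 See the README for the full installation guide; it walks you through step by step, so beginners can follow along too.

    3️⃣ Enterprise edition (custom service)

    Need SSO, access control, or SLA guarantees? Contact the sales team directly for enterprise-grade support and service.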

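    🔍 Compared with Claude Cowork: which comes out ahead?

    | Dimension | Claude Cowork | Eigent |
    | --- | --- | --- |
    | Price | Paid subscription | Completely free |
    | Deployment | Cloud only | Cloud / local / hybrid |
    | Data privacy | Data is uploaded to Anthropic | Fully self-controlled, kept local |
    | Model choice | Claude only | Any model, freely switchable |
    | Customization | Closed product, cannot be modified | Fully open-source code, deep modification supported |
    | Update pace | Official releases | Community-driven, faster response |

    ✅ In one sentence: the core features are on par, but Eigent is free, local, and hackable, so the value for money is outstanding!

    ⚠️ Where Eigent falls short

    Powerful as it is, Eigent still has some limitations:

    • ❌ No "permanent memory" yet; it relies on the model's native context capability
    • ❌ It iterates quickly, but stability is slightly behind the official product
    • ❌ Advanced features such as voice mode and Pixelate are not fully in sync yet

    That said, these gaps are closing fast. Thanks to an active open-source community, Eigent is developing far faster than expected.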


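    🏁 Closing: open source proves once again that "good things should be for everyone"

    Anthropic got AI office work off to a good start with Claude Cowork, but the open-source community has shown through action that real innovation shouldn't be locked behind a paywall.

    Eigent is not only strong on features and growing fast, it also has a very low barrier to entry. Whether you want to experience the future of AI office work, deploy a secure version on an internal network, or build something new on top of it, it can deliver.

    🌟 GitHub repository: https://github.com/eigent-ai/eigent
    🚀 Try it now and start your new era of AI collaboration!


    📌 Keywords: AI office work, multi-agent, open-source project, Claude Cowork, Eigent, CAMEL-AI, automated workflows, AI assistant, local deployment, controllable privacy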

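  • An Open-Source Claude Cowork: The Free Desktop AI Assistant That Breaks the Official Monopoly

    Anthropic recently launched a research preview of Claude Cowork, promising to equip office workers with an "all-round digital assistant". In the official demos, features such as file organization, weekly report generation, and automatic archiving are impressive, but the barriers are just as notable: it is limited to Claude Max premium subscribers and supports only macOS. Windows users and non-subscribers can only look on.

    The open-source community reacted at lightning speed. Within days of the official release, an open-source project named Claude-Cowork (led by the DevAgentForge team) appeared on GitHub. It not only reproduces the core idea of a "desktop AI assistant" but also makes a key breakthrough by being completely open source and free.

    I. Core positioning: a modern GUI shell for Claude Code

    It targets two pain points of using Claude Code on its own:

    1. Terminal-only operation, which is unfriendly to non-technical users
    2. No multi-task session management, so context is lost when the window closes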

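    II. Four core advantages (technical breakdown)

    | Advantage | Official approach | Open-source Claude-Cowork | User value |
    | --- | --- | --- | --- |
    | Interaction | Black terminal window | Modern desktop app | ✅ Real-time streaming output (ChatGPT-like experience); ✅ smart code highlighting; ✅ visual status indicators |
    | API compatibility | Anthropic models only | Supports third-party models that speak the Anthropic protocol | 💡 Reuses the ~/.claude/settings.json configuration; 💡 developers in China can connect local or third-party models (bypassing network restrictions) |
    | Platform coverage | macOS only | All platforms (macOS/Windows/Linux) | ⚙️ Cross-platform via the Electron framework; 💡 Windows users can compile and run it themselves |
    | Session management | (not stated) | Built-in session management backed by a SQLite database | 🔁 Independent sessions per project; 🔁 pause / resume / switch supported |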
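
    As a rough picture of what SQLite-backed session management with per-project sessions and pause/resume might look like, here is a minimal Python sketch using the standard-library `sqlite3` module. The schema, table names, status values, and function names are all assumptions for illustration; the project's real storage layout is not described in this article.

```python
import sqlite3

# Hypothetical schema: one row per session, plus its message history.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    project TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'active'
);
CREATE TABLE IF NOT EXISTS messages (
    session_id INTEGER NOT NULL REFERENCES sessions(id),
    role       TEXT NOT NULL,
    content    TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the session database and ensure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def create_session(conn: sqlite3.Connection, project: str) -> int:
    """Start a new session for a project and return its id."""
    cur = conn.execute("INSERT INTO sessions (project) VALUES (?)", (project,))
    conn.commit()
    return cur.lastrowid


def set_status(conn: sqlite3.Connection, session_id: int, status: str) -> None:
    """Mark a session as e.g. 'active' or 'paused'."""
    conn.execute("UPDATE sessions SET status = ? WHERE id = ?", (status, session_id))
    conn.commit()


def add_message(conn: sqlite3.Connection, session_id: int, role: str, content: str) -> None:
    """Append one message to a session's history."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def history(conn: sqlite3.Connection, session_id: int) -> list[tuple[str, str]]:
    """Return (role, content) pairs for a session in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY rowid",
        (session_id,),
    )
    return rows.fetchall()
```

    Because the history lives in a database file rather than in the window's memory, closing the app does not lose context, which is the pain point the GUI shell is meant to address.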

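    III. Typical use cases

    | Scenario | Capability |
    | --- | --- |
    | Programming | Full-stack code generation and refactoring, running system commands (tests/builds) |
    | File management | Creating and moving folders, optimizing directory structure |
    | Knowledge Q&A | Precise answers grounded in the local codebase |

    IV. Usage guide (technical)

    Prerequisite: the official Claude Code must be installed and configured first (the underlying capabilities depend on it)
    Installation

    # Option 1: download the prebuilt installer (Mac first)
    # Option 2: build from source (all platforms)
    git clone https://github.com/DevAgentForge/Claude-Cowork.git
    cd Claude-Cowork
    bun install
    bun run dev  # development mode

    V. A sober look: open-source version vs official version

    | Dimension | Official Claude Cowork | Open-source Claude-Cowork |
    | --- | --- | --- |
    | Core positioning | Optimized for non-technical tasks (Excel/documents) | Enhanced programming scenarios (built on Claude Code) |
    | Model dependency | Anthropic's own models | Compatible with third-party models (the key breakthrough) |
    | User value | General office assistant | Developer-friendly and breaks the monopoly |

    Key takeaway:

    Although it is not the official "all-round assistant", the open-source version hits developers' pain points precisely: it offers a friendlier programming interface, breaks through platform and model restrictions, and significantly lowers the barrier to operating AI. For developers in China it is not just a "substitute" but a lifeline around network and payment barriers.

    Closing

    When the official product is limited by platform and paywall, the open-source community proves with code that demand is the driving force. The birth of Claude-Cowork is not only a technical reproduction but also a strong exercise in "democratizing AI tools": no waiting required, the code is the solution.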

  • Usable in China without a VPN! A Replacement for Manus, Acquired by Meta for $2 Billion? This AI Agent, Aipy, Is Awesome!

    Guys! The news that Meta acquired Manus for $2 billion has caused quite a stir recently, but it can't be used in China?

    Don't worry. Today I'm recommending a free, local AI agent from China that doesn't require a VPN: Aipy! It is open source, runs locally, needs no coding, and is practical across many scenarios, which makes it a great labor-saving AI tool for ordinary people 🎉

    🌟 Core Advantages

    ✅ Usable in China without a VPN: it runs locally, so no VPN is needed.

    ✅ Completely free: register with the invitation code to get 3.5 million tokens (invitation code: 4zfb).

    ✅ Zero-code operation: describe your needs in plain language, and the AI automatically generates and executes the code.

    ✅ Agent Marketplace: tools for quantitative research, photo editing, PPT generation, and more can be installed with one click.

    Aipy combines large AI models with the Python ecosystem. You don't need to know how to code at all. Just describe your needs in plain language, and it automatically generates, debugs, and executes programs in the background, then hands you the finished result.

    Aipy's interface is very simple: enter your request in the chat box on the left, and the right side runs it and displays the results in real time. You only say what you want done, and it generates and executes the code, closing the loop from instruction to result.

    💡 Very Practical in Real-World Tests

    Quantitative research: free access to historical data for A-shares, US stocks, and Hong Kong stocks. Enter a stock name and it automatically generates a technical analysis report.

    Most stock analysis tools on the market require payment. Aipy, however, has built-in historical market data for all listed companies in the A-share, US, and Hong Kong markets, and it is free to use.

    Install "Quantitative Research" in the Agent Marketplace, click "Go to Use", and tell it which stock you want to analyze. It returns a comprehensive analysis covering technical indicators, valuation levels, trend status, and more.

    It should be emphasized that the AI's analysis is best treated as a reference and learning tool; the final investment decision is still yours to make.

    Batch photo editing: upload a photo folder and have the AI batch-edit the photos with one sentence.

    First, install "Image Generation" in the Agent Marketplace, then click "Go to Use".

    I asked Aipy to batch-edit the puppies in the folder into the style I wanted. It understands natural language easily, with no complex prompts needed.

    It finished the task in minutes without me writing a single line of code, and the generated pictures looked very good.

    PPT generation: a one-sentence request plus Internet search, and a well-structured PPT is ready in minutes.

    Aipy is also excellent at PPT generation. Install "PPT Generation" in the Agent Marketplace, click "Go to Use", and state your requirement in one sentence.

    For example, I wanted an introduction to the Xiaomi 17 Ultra. Because it is a newly released product, the AI's knowledge base may not have the relevant information, so we can turn on Internet search to let it fetch information in real time.

    After a short wait, a PPT with a clear structure, complete content, and a clean layout is generated. From organizing the information to presenting the pages, it is done in one go, and the efficiency is remarkable.

    Material download: give it a link and it can batch-download a website's pictures and automatically classify and name them.

    Aipy can also handle more "hands-on" chores. For example, to batch-download pictures from a website to your local device, just give it the link and state what you need.

    It handles the download, classification, and naming automatically, and the files keep their original resolution.

    If you're not satisfied with the result of a task, you can manually select a more advanced model to run it. Aipy has multiple large models built in and can switch between them flexibly for different scenarios.

    Beyond the functions above, Aipy's Agent Marketplace also includes many practical tools such as short-video copywriting, browser control, contract review, video generation, resume screening, and enterprise information analysis, and its capabilities are still expanding.

    Aipy can help us not only think but also work. If you're looking for an AI tool that can stay with you at work for the long term, Aipy is worth trying.

    🎁 Exclusive Benefits

    Do you want to try the new-generation super AI assistant Aipy?

    Register now and fill in the invitation code 👉RPF2👈 to get 3.5 million tokens for free!

    Here is how:

    ① Visit the Aipy official website, https://www.aipyaipy.com/, and download the latest version of the Aipy client.

    ② Fill in the invitation code above when registering and logging in.

    Official website: www.aipyaipy.com

    Open-source repository: github.com/knownsec/aipyapp

  • Microsoft Copilot Upgrades to GPT-5.2 for Free! Expert-Level Workflows Soar. Is It Even Better Than Professionals?

    Guys! Microsoft Copilot is making big news again 🎉 Today it officially rolls out OpenAI's most powerful model, GPT-5.2, and it's a free upgrade! This ushers in a new era of "expert-level" workflows and pushes office efficiency to the limit.

    🌟 Two Models Coexist, and the Thinking Variant Is More Powerful

    GPT-5.2 and GPT-5.1 are both available. The Plus version is a "thinking" variant; simply put, it's better at in-depth reasoning! It is very fast at working with tables, writing and reviewing code, and processing long documents. It can also handle complex tool calls and image analysis.

    🚀 Performance Nearly Doubles, Rivaling Professionals

    Across tests covering 44 professional tasks, GPT-5.2 Thinking beat or matched industry experts 70.9% of the time (GPT-5 previously managed only 38.8%)! Whether it's creating PPTs, scheduling, or producing professional deliverables, it's more reliable than the consultants you hire, taking office automation to a new level.

    🔧 Top Marks in Rigorous Tests, Mastering Programming and Math

    • In programming: it set a new record on SWE-Bench Pro, far outperforming GPT-5.1 Thinking;
    • In math competitions: it scored a perfect 100% on AIME 2025 and 92.4% on the GPQA Diamond test;
    • In logic and science: it improved significantly on CharXiv reasoning and ARC-AGI-2, evolving from a basic assistant into a "digital intelligence entity".

    It is now available on the web, Windows, and mobile devices. Experience expert-level AI for free! Have you tried Copilot's new features? Share your office efficiency tools in the comments below 👇

  • The Copilot Usage Report 2025

    So as 2025 wraps up, we’ve gone headfirst into a mountain of de-identified data, searching for the quirks, surprises, and secret patterns that shape everyday life with Copilot. We’re finding out just how far it fits into people’s daily rhythms, and how human its uses have become: we often turn to AI for the things that matter most like our health. We analyzed a sample of 37.5 million conversations to find out how people actually use it out in the world.
    (Note: our system doesn’t just de-identify conversations; it only extracts the summary of the conversation, from which we learn the topic and the intent, and maintains full privacy.)

    From health tips that never sleep, to the differences between weekday and weekend usage, to February’s annual “how do I survive Valentine’s Day?” spike, our findings show that Copilot is way more than a tool: it’s a vital companion for life’s big and small moments. And if you’ve ever pondered philosophy at 2 a.m. or needed advice on everything from wellness to winning at life, you’re in good company. So has everybody else.

    Our work shows that AI is all about people, a trusted advisor slotting effortlessly into your life and your day. It’s about your health, your work, your play, and your relationships. It meets you where you are.
    Read all about it in our paper, but here are some of our takeaways.

    Health Is Always on Our Minds—Especially on Mobile

    No matter the day, month, or time, health-related topics dominate how people use Copilot on their mobile devices. Whether it’s tracking wellness, searching for health tips, or managing daily routines, our users consistently turn to Copilot for support in living healthier lives. This trend held steady throughout the year, showing just how central health is to our everyday digital habits. When it comes to mobile, with its intimacy and immediacy, nothing tops our health.

    Most common Topic-Intent pairing conversations, on mobile.

    Health is consistently the most common topic while interestingly, language-related chats peak earlier in the year, with entertainment seeing a steady rise.

    When Programming and Gaming Cross Paths

    August brought a unique twist: programming and gaming topics started to overlap in unexpected ways. Our data showed that users were just as likely to dive into coding projects as they were to explore games, but on different days of the week! This crossover hints at a vibrant, creative community that loves to code during the week and play on the weekends in equal measure.

    August topic ranks for programming and games.

    There is a clear change in rank between programming and games through the days of the week, with programming rising from Monday to Friday, and Games shining on the weekends.

    February’s Big Moment

    February stood out for another reason: Copilot helped users navigate a significant date in their calendar year. Whether they were preparing for Valentine’s Day or facing the day itself and their relationships, we saw a spike in activity as people turned to Copilot for guidance, reminders, and support. It’s a great reminder of how digital tools can make life’s important moments a little easier to manage.

    Ranking of “Personal Growth and Wellness” and “Relationship” conversations
    February brings concerns about personal growth before Valentine’s Day, with a clear peak of relationship-related conversations on the day.

    Late-night Sessions

    The larger-than-life questions seem to rise during the early hours of the morning, with “Religion and Philosophy” climbing through the ranks. By comparison, travel conversations happen most often during commuting hours.

    Average rank of Travel and Religion and Philosophy conversations per hour of the day. Whilst people have more travel-related conversations during the day, it’s in the early hours of the morning that we see a rise of Religion and Philosophy conversations.

    Advice on the Rise

    While searching for information remains Copilot’s most popular feature, we’ve seen a clear rise in people seeking advice—especially on personal topics. Whether it’s navigating relationships, making life decisions, or just needing a bit of guidance, more users are turning to Copilot for thoughtful support, not just quick answers. This growing trend highlights how digital tools are becoming trusted companions for life’s everyday questions.

    Why These Insights Matter

    By analyzing only high-level topics and intents, we can learn all of this while preserving user privacy. Understanding these patterns helps us make Copilot even better. By seeing what matters most to our users (health, creativity, and support during key moments) we can design features that truly fit into their lives. It’s also clear from these uses that what Copilot says matters. They show why it’s so important that we hold ourselves to a high bar for quality.

  • OpenAI Updates for Voice Developers

    New audio model snapshots and broader access to Custom Voices for production voice apps.

    AI audio capabilities unlock an exciting new frontier of user experiences. Earlier this year we released several new audio models, including gpt-realtime, along with new API features to enable developers to build these experiences.

    Last week, we released new audio model snapshots designed to address some of the common challenges in building reliable audio agents by improving reliability and quality across production voice workflows, from transcription and text-to-speech to real-time, native speech-to-speech agents.

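    These updates include:

    • gpt-realtime-mini-2025-12-15
    • gpt-audio-mini-2025-12-15
    • gpt-4o-mini-tts-2025-12-15
    • gpt-4o-mini-transcribe-2025-12-15

    The new snapshots share a few common improvements: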

    With audio input:

    • Lower word-error rates for real-world and noisy audio
    • Fewer hallucinations during silence or with background noise

    With audio output:

    • More natural and stable voice output, including when using Custom Voices

    Pricing remains the same as previous model snapshots, so we recommend switching to these new snapshots to benefit from improved performance for the same price.

    If you’re building voice agents, customer support systems, or branded voice experiences, these updates will help you make production deployments more reliable. Below, we’ll break down what’s new and how these improvements show up in real-world voice workflows.

    Speech-to-speech

    We’re deploying new Realtime mini and Audio mini models that have been optimized for better tool calling and instruction following. These models reduce the intelligence gap between the mini and full-size models, enabling some applications to optimize cost by moving to the mini model.

    gpt-realtime-mini-2025-12-15

    gpt-realtime-mini model is meant to be used with the Realtime API, our API for low-latency, native multi-modal interactions. It supports features like streaming audio in and out, handling interruptions (with optional voice activity detection), and function calling in the background while the model keeps talking.

    The new Realtime mini snapshot is better suited for real-time agents, with clear gains in instruction following and tool calling. On our internal speech-to-speech evaluations, we’ve seen an improvement of 18.6 percentage points in instruction-following accuracy and 12.9 percentage points in tool-calling accuracy compared to the previous snapshot, as well as an improvement on the Big Bench Audio benchmark.

    Together, these gains lead to more reliable multi-step interactions and more consistent function execution in live, low-latency settings.

    For scenarios where agent accuracy is worth a higher cost, gpt-realtime remains our best performing model. But when cost and latency matter most, gpt-realtime-mini is a great option, performing well on real-world scenarios.

    For example, Genspark stress-tested it on bilingual translation and intelligent intent routing, and in addition to the improved voice quality, they found the latency to be near-instant, while keeping the intent recognition spot-on throughout rapid exchanges.

    gpt-audio-mini-2025-12-15

    The gpt-audio-mini model can be used with the Chat Completions API for speech-to-speech use cases where real-time interaction isn’t a requirement.

    Both new snapshots also feature an upgraded decoder for more natural sounding voices, and better maintain voice consistency when used with Custom Voices.

    Text-to-speech

    Our latest text-to-speech model, gpt-4o-mini-tts-2025-12-15, delivers a significant jump in accuracy, with substantially lower word error rates across standard speech benchmarks compared to the previous generation. On Common Voice and FLEURS, we see roughly 35% lower WER, with consistent gains on Multilingual LibriSpeech as well.

    Together, these results reflect improved pronunciation accuracy and robustness across a wide range of languages.
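
    For readers less familiar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below is a generic textbook implementation, not the evaluation code behind the numbers above; the function names are our own, and any figures you feed it are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fractional drop in WER between two systems."""
    return (old_wer - new_wer) / old_wer
```

    A statement like "roughly 35% lower WER" is a relative reduction: under this definition, going from a hypothetical WER of 0.20 to 0.13 would be a reduction of about 0.35, even though the absolute drop is only 7 percentage points.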

    Similar to the new gpt-realtime-mini snapshot, this model sounds much more natural and performs better with Custom Voices.

    Speech-to-text

    The latest transcription model, gpt-4o-mini-transcribe-2025-12-15, shows strong gains in both accuracy and reliability. On standard ASR benchmarks like Common Voice and FLEURS (without language hints), it delivers lower word error rates than prior models. We’ve optimized this model for behavior on real-world conversational settings, such as short user utterances and noisy backgrounds. In an internal hallucination-with-noise evaluation, where we played clips of real-world background noise and audio with varying speaking intervals (including silence), the model produced ~90% fewer hallucinations compared to Whisper v2 and ~70% fewer compared to previous GPT-4o-transcribe models.

    This model snapshot is particularly strong in Chinese (Mandarin), Hindi, Bengali, Japanese, Indonesian, and Italian.

    Custom Voices

    Custom Voices enable organizations to connect with customers in their unique brand voice. Whether you’re building a customer support agent or a brand avatar, OpenAI’s custom voice technology makes it easy to create distinct, realistic voices.

    These new speech-to-speech and text-to-speech models unlock improvements for custom voices such as more natural tones, increased faithfulness to the original sample, and improved accuracy across dialects.

    To ensure safe use of this technology, Custom Voices are limited to eligible customers. Contact your account director or reach out to our sales team to learn more.

    From prototype to production

    Voice apps tend to fail in the same places: long conversations, edge cases like silence, and tool-driven flows where the voice agent needs to be precise. These updates focus on those failure modes, with lower error rates, fewer hallucinations, more consistent tool use, and better instruction following. And as a bonus, we’ve improved the stability of the output audio so your voice experiences can sound more natural.

    If you’re shipping voice experiences today, we recommend moving to the new 2025-12-15 snapshots and re-running your key production test cases. Early testers have confirmed noticeable improvements without changing their instructions and simply switching to the new snapshots, but we recommend experimenting with your own use cases and adjusting your prompts as needed.

  • Agentic AI is Coming: A New Opportunity for Enterprise Transformation!

    Guys, artificial intelligence has been constantly changing the way enterprises operate. In the past, the emphasis was on intelligent assistants, but they could only respond passively. Now, Agentic AI has arrived, and this is a major evolution 🔥!

    Traditional AI assistants can only perform isolated tasks and have limitations. However, Agentic AI can make autonomous decisions, coordinate multi-step actions, actively assess the environment, initiate actions, and coordinate cross-departmental work processes. It's really amazing 👏!

    For enterprise leaders, this brings both opportunities and responsibilities. It has great potential, but also poses significant challenges in terms of governance, trust, and design. Enterprises must be able to monitor and reverse the actions of Agentic AI.

    Enterprise work processes also need to be rethought. We can no longer design processes step by step and then bolt on automation. Instead, we need to build an intelligent ecosystem, decide which decisions should be made by humans and which by agents, and ensure that agents get the right data.

    A unified platform is extremely important at this time. Without it, agents may become disjointed. A unified approach can provide standards, achieve interoperability, reduce complexity, and enable large-scale implementation.

    Trust and accountability are also indispensable. Since agents act independently, the risks increase. Trust and accountability need to be integrated from the very beginning, with clear policies to make employees believe that it is a partner.

    Enterprises should measure business value as early as possible and not let projects stall at the pilot stage. Well-designed Agentic AI can bring exponential improvements and transform enterprise performance.

    The rise of Agentic AI is not about handing over power to machines, but a new stage of enterprise transformation where humans and agents fight side by side. Leaders should first conduct pilots and then expand, invest in a unified platform and policy framework, and foster a good culture.

    Hey everyone! AI agents are transforming businesses—now is the perfect time for business leaders to step up and shine 💪!

    Keywords

    #Agentic AI #Enterprise Transformation #Work Process Remodeling #Unified Platform #Trust and Accountability

  • The Battle of AI Assistants! Who Will Emerge as the King of "Winner-Takes-All"?

    Guys, the annual blockbuster report on the consumer AI market recently released by a16z, a top venture capital firm in Silicon Valley, is really mind-blowing! 🔥 Competition among general-purpose AI assistants is extremely fierce right now. Users usually pick only one main product, and the "winner-takes-all" pattern is accelerating.

    The report shows that although the usage rate of AI has increased, users' willingness to use it across platforms is extremely low. Take ChatGPT's weekly active users as an example. Less than 10% of them will use other AI services simultaneously. Among mainstream products, only about 9% of users will pay for multiple assistants.

    Currently, OpenAI still stands out, leading with 800 to 900 million weekly active users. However, its "super-app" strategy faces challenges. Google, with its "experimental field" model, has helped Gemini catch up rapidly: its desktop users have grown 155% year on year, and its paid subscriptions are growing at nearly twice the rate of ChatGPT's. 👏

    Judging from the data, ChatGPT has a leading user volume and high user stickiness. The ratio of daily active users to monthly active users is twice that of Gemini. But Gemini is growing at an astonishing rate, especially in terms of the growth of paid users, leaving ChatGPT far behind.
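
    The "stickiness" figure mentioned above is usually computed as daily active users divided by monthly active users. A minimal sketch of that arithmetic follows; the function name is our own, and the numbers in the usage note are made up purely to show the calculation, not taken from the a16z report.

```python
def stickiness(daily_active: float, monthly_active: float) -> float:
    """DAU/MAU ratio: the share of monthly users who show up on a given day."""
    if monthly_active <= 0:
        raise ValueError("monthly_active must be positive")
    return daily_active / monthly_active
```

    For instance, with hypothetical inputs, `stickiness(40, 100)` gives 0.4 and `stickiness(20, 100)` gives 0.2, so the first product would be "twice as sticky" as the second even if both had the same monthly audience.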

    In terms of product strategies, OpenAI is like building a "walled garden", stuffing various functions into ChatGPT, but this makes the interface more complex. Google, on the other hand, adopts the "experimental field" model, allowing innovative products to develop independently, but its products are a bit scattered.

    Other players also have their own strengths. 👍 Anthropic's Claude focuses on technical users, and its programming assistant generates considerable revenue. Perplexity serves non-technical users who value efficiency. Grok, from Elon Musk's xAI, is growing extremely fast and iterating on features at a remarkable pace. It is said to be the AI product whose capabilities are evolving fastest.

    The key to the future competition of AI assistants lies in who better understands users' needs and can transform them into good business models. Guys, who do you favor more? 🤔

    #AI Assistant Competition #Winner-Takes-All #OpenAI #Google #Differentiated Breakthrough