The Experience Technology Department of Alipay (Ant Group) has officially open-sourced Neovate Code, an intelligent programming assistant. It can deeply understand your codebase, follow your existing coding conventions, and use that context awareness to accurately complete function implementation, bug fixing, and code refactoring. It integrates the core capabilities a Code Agent requires. GitHub: https://github.com/neovateai/neovate-code
At present, Neovate Code is provided in the form of a CLI tool, but its architecture is highly flexible and will support multiple client forms in the future to adapt to more development scenarios.
Its main functions include:
- Conversational development: a natural dialogue interface for programming tasks
- AGENTS.md rule file: define custom rules and behaviors for your project
- Conversation continuation and resumption: continue previous work across sessions
- Support for popular models and providers: OpenAI, Anthropic, Google, etc.
- Slash commands: quick commands for common operations
- Output style: customize how code changes are presented
- Planning mode: review the implementation plan before execution
- Headless mode: automate workflows without interactive prompts
- Plugin system: extend functionality with custom plugins
- MCP: Model Context Protocol support for enhanced integration
- Git workflow: intelligent commit messages and branch management
Dear friends, there's new news about DeepSeek! The latest model, DeepSeek-V3.1-Terminus, has made its debut! 👏
This version comes in two modes, thinking and non-thinking, both with a context length of 128K. It is an upgrade of DeepSeek-V3.1 with two major improvements. First, language consistency: it alleviates Chinese-English mixing and occasional abnormal characters; for example, the "extreme" character issue mentioned before has also been improved. Second, Agent capabilities: the performance of the Code Agent and Search Agent has been further optimized, making them even more outstanding. DeepSeek's last update was on August 21st. Only a month later, the new DeepSeek-V3.1-Terminus has outperformed Gemini 2.5 Pro in many evaluations.
However, compared to DeepSeek-V3.1, overall benchmark performance is only slightly improved, and some benchmarks even show a slight decline. On the Humanity's Last Exam benchmark, though, the improvement is huge: a relative gain of 36.48%, jumping from 15.9 to 21.7. That's really amazing!
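That 36.48% figure is a relative gain over the old score, which is easy to verify:

```python
# Sanity-check the relative improvement on Humanity's Last Exam:
# DeepSeek-V3.1 scored 15.9, DeepSeek-V3.1-Terminus scored 21.7.
old, new = 15.9, 21.7
relative_gain = (new - old) / old * 100
print(f"{relative_gain:.2f}%")  # → 36.48%
```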
Now, DeepSeek-V3.1-Terminus has been launched on apps, web pages, and APIs.
Here's the Hugging Face address: https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus
By the way, the word "Terminus" means "end". Does this imply that this is the last version of the V3 series and that DeepSeek-V4/R2 is coming soon? It's really exciting!
Dear friends, what do you think of DeepSeek-V3.1-Terminus? Come and share your thoughts in the comments section!
Dear friends, the team from the IPADS Laboratory of Shanghai Jiao Tong University has achieved something remarkable! They've launched a brand-new mobile intelligent agent toolchain called MobiAgent 🎉. This is truly extraordinary as it directly breaks down the barriers to developing personalized intelligent assistants, and it's said to outperform GPT-5 and other top-tier closed-source models in real-world scenarios 👍.
MobiAgent is extremely powerful. It gives everyone the opportunity to create their own AI assistant. This toolchain enables users to build a mobile intelligent agent from scratch, covering the entire process from collecting operation data and training the model to deploying it on mobile phones. Moreover, it's open-source, allowing users to obtain their own data, train the model, and use the intelligent assistant on their personal devices. It's so convenient 🥰.
To verify its capabilities, the research team conducted tests on 20 popular domestic applications. The results show that the 7B-scale MobiAgent model outperforms many well-known closed-source large-scale models in task completion scores, and it also leads among open-source GUI intelligent agents of the same scale 👏. Its unique "Latent Memory Accelerator" can learn from historical operations, helping the intelligent agent quickly complete repetitive tasks, with a performance improvement of 2-3 times.
The core of MobiAgent lies in its efficient data collection and intelligent training process. It uses lightweight tools to record users' mobile phone operations, then generates high-quality training data with a general VLM model, and through refinement and adjustment, the trained intelligent agent has excellent generalization ability. Its "brain" is divided into three parts: the "Planner" is responsible for task planning, the "Decision-maker" makes decisions based on the screen, and the "Executor" performs specific operations. This architecture makes model training more efficient and significantly improves the response speed 😎.
There's also the innovative AgentRR acceleration framework, which can leverage past operation experience to greatly improve the execution efficiency of repetitive tasks, with an action reuse rate of up to 60%-85%. The intelligent assistant can handle daily affairs quickly and accurately.
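The action-reuse idea behind AgentRR can be sketched as a toy cache of recorded action traces (all names below are illustrative, not MobiAgent's actual API):

```python
# Toy sketch of action reuse: cache the action trace of a completed task
# and replay it when the same task recurs, instead of re-planning every step.
# This illustrates the concept only; it is not MobiAgent's real AgentRR code.

class ActionCache:
    def __init__(self):
        self._traces = {}   # task description -> recorded action list
        self.reused = 0     # actions served from the cache
        self.planned = 0    # actions that required fresh planning

    def run(self, task, plan_fn):
        if task in self._traces:                 # repetitive task: replay
            actions = self._traces[task]
            self.reused += len(actions)
        else:                                    # new task: plan from scratch
            actions = plan_fn(task)
            self._traces[task] = actions
            self.planned += len(actions)
        return actions

def toy_planner(task):
    # Stand-in for the Planner/Decision-maker/Executor pipeline.
    return ["open_app", "search", "tap_result"]

cache = ActionCache()
cache.run("order coffee", toy_planner)   # first time: planned
cache.run("order coffee", toy_planner)   # second time: fully reused
print(cache.reused, cache.planned)  # → 3 3
```

With repeated tasks, the fraction of replayed actions grows, which is the intuition behind the reported 60%-85% reuse rates.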
The emergence of MobiAgent not only facilitates the customization of personal intelligent assistants but also promotes the development of the mobile intelligent agent ecosystem. It really seems that the intelligent era where "you can get things done by voice rather than hands" is coming 🤩.
Dear friends, are you looking forward to MobiAgent? Come and chat in the comments section 🧐
Paper address: https://arxiv.org/pdf/2509.00531
#MobiAgent #Shanghai Jiao Tong University #AI Assistant #Mobile-based Intelligent Agent #Open-source Toolchain #Performance Surpassing
Dear friends, OpenAI has made a big move again! Today, it was announced that the project function of ChatGPT is officially available to free users. This is simply amazing👏
This update comes with tiered feature upgrades for different user groups. First, on file upload limits: free users can upload a maximum of 5 files per day, Plus users 25, and Pro, Business, and Enterprise users 40. This tiered design is extremely considerate. Whether your needs are big or small, you can find a usage method that suits you🥰
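The announced daily upload limits can be summarized as a simple lookup (tier names are paraphrased from the announcement):

```python
# Daily file-upload limits for ChatGPT Projects, by subscription tier,
# as stated in OpenAI's announcement.
UPLOAD_LIMITS = {
    "free": 5,
    "plus": 25,
    "pro": 40,
    "business": 40,
    "enterprise": 40,
}

def uploads_remaining(tier, used_today):
    """How many more files a user may upload today (never negative)."""
    return max(UPLOAD_LIMITS[tier] - used_today, 0)

print(uploads_remaining("free", 3))   # → 2
print(uploads_remaining("plus", 30))  # → 0
```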
Moreover, OpenAI has added a lot of personalized setting features. Now users can customize the colors and icons of projects, which makes the management interface super personalized and can greatly improve work efficiency. For those who need to maintain context consistency, the newly added project-specific memory control function is really practical. It can better adapt to various conversation scenarios, making it easy and comfortable to manage information😎
This series of updates fully reflects OpenAI's attention to users' needs. Whether you are an enterprise user or an individual user, these new features make the ChatGPT experience smoother.
It has to be said that this update by OpenAI is a major upgrade in user experience. The platform has become more attractive, and more users can equally enjoy the convenience brought by AI. In the future, ChatGPT will surely continue to be optimized. Let's look forward to more surprises together🤩
Dear friends, are you looking forward to these new features of ChatGPT? Come and chat in the comments section🧐
Dear friends, here's some big news! At 00:00 on September 1, 2025, the "Measures for the Identification of Artificial Intelligence-Generated and Synthesized Content" jointly formulated by multiple government departments officially took effect! 🎉 This measure puts forward regulatory requirements such as the mandatory addition of explicit and implicit identifications. From now on, AI-generated text, images, audio, and video must all show their "digital ID cards"🧐
Before this, many platforms such as Tencent, Douyin, Kuaishou, and Bilibili had already introduced detailed rules. Take Douyin for example, it has launched an AI content identification function and an AI content metadata identification reading and writing function, which help creators add prompt identifications and also provide technical support for content traceability👏
Now the ecological chain of AI-generated content has entered a stage of standardized management. Artificial intelligence is developing extremely rapidly. In 2024, the scale of China's artificial intelligence industry exceeded 700 billion yuan and has maintained a high growth rate year after year. However, the popularization of technology has also brought new risks. For example, there are more and more cases of it being used to create false news and carry out online fraud.
The core of the policy of the "Measures for the Identification" is the requirement of dual identifications. Explicit identifications should be "visible at a glance" to ordinary users. For example, add text explanations at the beginning and end of an article, or add voice prompts or special icons in audio and video. Implicit identifications, on the other hand, are to embed "hidden information" in the file metadata, including various key information.
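The Measures do not prescribe a single file format here, but the "hidden information" of an implicit identification embedded in metadata might look something like the payload below (field names are illustrative assumptions, not the official schema):

```python
import json

# Illustrative implicit-identification payload to be embedded in file metadata.
# Field names are hypothetical; the Measures require key information such as
# the generating service provider and a traceability identifier, not this
# exact schema.
implicit_id = {
    "ai_generated": True,
    "service_provider": "ExampleAIProvider",   # who generated the content
    "content_id": "a1b2c3d4",                  # traceability identifier
    "generated_at": "2025-09-01T00:00:00+08:00",
}

payload = json.dumps(implicit_id, ensure_ascii=False)
print(payload)
```

A dissemination platform could then parse this payload to verify the implicit identification before adding its own explicit prompt label.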
This measure is of great significance. Professor Ren Kui, one of the drafters, said that for the first time it brings generation service providers, content dissemination platforms, and end users into a unified governance framework, forming a systematic progression with other regulations and clarifying the boundaries of responsibility. It can promote the standardized development of the AIGC industry, rebuild public trust in AIGC technology, enhance China's voice in the field of AI security governance, and provide a model for global content governance👍
A bit more on the dual identification system: explicit identifications must be directly perceivable by users, with texts marked "generated by artificial intelligence" in specified positions and in a clear font, while implicit identifications focus on technical traceability through metadata embedded inside the file. There are clear labeling requirements for each type of AI-generated content.
The "Measures for the Identification" also encourages the use of AI for original content creation. Moreover, it clarifies the obligations of different entities at the legal level. Service providers need to ensure that the content meets the identification requirements. Dissemination platforms need to verify implicit identifications and add significant prompt identifications. Application distribution platforms need to verify the identification functions of service providers.
However, the implementation of this measure also faces challenges. Users may delete explicit identifications or avoid implicit ones through transcoding, making it difficult to accurately identify the content posted by malicious users. Lawyers suggest that content publishing platforms should assume more responsibilities. Professor Ren Kui suggests from a technical perspective the development of secure content implicit identification technology.
All in all, identification is a crucial step in the governance of AI-generated content. But to truly avoid risks, it is also necessary to refine laws and regulations, establish industry self-discipline standards, strengthen law enforcement efforts, and enhance international cooperation. Cross-border AIGC law enforcement is also a challenge. In the future, it is necessary to promote the coordination of technical identifications and establish cross-border law enforcement mutual assistance mechanisms. Dear friends, what do you think about the mandatory "labeling" of AI-generated content? 🤔
#AI-generated content #Mandatory labeling #Content security governance #Dual identification system #Main body responsibility #Supervision challenges
On the evening of August 19th, DeepSeek officially announced that the online model version has been upgraded to V3.1. The most significant improvement is that the context length has been extended to 128K, which is equivalent to being able to process super-long texts of 100,000 to 130,000 Chinese characters, suitable for long document analysis, code library understanding and multi-round dialogue scenarios.
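The character estimate follows from a rough tokens-per-character ratio for Chinese text (the ratios below are assumptions chosen to reproduce the article's range, not DeepSeek's published tokenizer statistics):

```python
# Rough conversion from a 128K-token context window to Chinese characters,
# assuming somewhere between ~1.28 and ~1.0 tokens per Chinese character
# (illustrative ratios, not official tokenizer figures).
context_tokens = 128_000
low  = int(context_tokens / 1.28)  # conservative: ~1.28 tokens per character
high = int(context_tokens / 1.0)   # optimistic: ~1 token per character
print(low, high)  # → 100000 128000
```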
Users can now experience the new version through the official website, App or WeChat mini-program. The API interface call method remains unchanged, and developers can switch seamlessly without additional adjustments.
This upgrade is not a major version iteration but an optimization of the V3 model. Tests show that V3.1 improves on multi-step reasoning tasks by 43% over the previous generation, and is especially more accurate in complex tasks such as mathematical calculation, code generation, and scientific analysis. Meanwhile, model "hallucinations" (generated false information) have decreased by 38%, significantly enhancing output reliability. In addition, V3.1 has optimized multilingual support, especially improving the handling of Asian languages and less common languages.
Although V3.1 brings important improvements, the release time of the next-generation large model DeepSeek-R2, which users are more looking forward to, is still uncertain. Previously, there was market speculation that R2 would be released from August 15th to 30th, but insiders close to DeepSeek said that this news is not true and the official has no specific release plan at present.
DeepSeek's update rhythm indicates that the V4 model may be launched before the release of R2. However, the official has always been low-key, emphasizing that "it will be released when it's done" and has not responded to any market speculation.
Recently, the news of the release of DeepSeek's next-generation large model DeepSeek-R2 has attracted widespread attention in the market. There is a rumor that DeepSeek-R2 will be released between August 15th and 30th. However, according to Tencent Technology, sources close to DeepSeek have confirmed to the media that this news is not true and DeepSeek-R2 has no release plan this month.
As early as the beginning of this year, news about the R2 model had already started to spread. At that time, it was predicted that the R2 model would be released on March 17th, but this claim was also denied by the official. So far, DeepSeek has not officially announced the specific release time and technical details of the R2 model, which has disappointed many observers.
According to reports, the DeepSeek team stepped up the development of the R2 model in June this year. Insiders revealed that CEO Liang Wenfeng is still not satisfied with the capabilities of the model, and the team is still improving its performance and is not ready for official use. Early news said that DeepSeek originally planned to launch the R2 model in May, but due to various reasons, the plan was delayed. The new model is expected to be able to generate higher quality code and have the ability to reason in non-English languages.
On August 7, 2025, OpenAI officially released the GPT-5 series of models, which represents the most significant product upgrade in the company's history. This release includes four versions: GPT-5, GPT-5 Mini, GPT-5 Nano, and GPT-5 Pro, each deeply optimized for different application scenarios, marking a new stage of development for AI technology.
Unified Intelligent System: A Revolutionary Breakthrough in Technical Architecture GPT-5 is positioned by OpenAI as a "unified intelligent system", successfully integrating capabilities that were previously scattered across different models: the multimodal processing of GPT-4o, the deep reasoning of the o series, advanced mathematical calculation, and agent task execution. This architectural innovation eliminates the need for users to manually switch between different models. The system automatically selects the most suitable processing method based on task complexity through a real-time router.
In terms of core technical indicators, GPT-5 has achieved a comprehensive breakthrough:
- Mathematical Reasoning: an accuracy rate of 94.6% on the AIME 2025 benchmark without external tools.
- Code Capability: 74.9% on SWE-bench Verified and 88% on the Aider Polyglot multilingual programming test.
- Multimodal Understanding: 84.2% on the MMMU benchmark.
- Professional Knowledge: 88.4% on the GPQA question answering test.

Detailed Analysis of the Four Versions
GPT-5 (Flagship Version): The Strongest Reasoning and Multimodal Capabilities As the flagship product of the series, GPT-5 is designed for complex tasks and possesses the following core features:
Breakthrough in Reasoning Ability: Built-in Chain-of-Thought technology, which can decompose complex problems and solve them step by step. In internal tests, GPT-5 outperformed all previous models in complex tasks in over 40 professional fields.
Comprehensive Multimodal Support: Supports text, image, speech, and video processing, inheriting Sora's video generation technology. Users can upload content in various formats, and GPT-5 can generate corresponding responses or perform compound tasks, such as analyzing medical images or real-time translation of video content.
Agent-Based Task Execution: Supports complex operations such as automatic web browsing, generating complete software applications, and managing schedules. In the launch demonstration, GPT-5 generated a complete French learning web application with flashcards, quizzes, and progress tracking functions in just a few seconds based on a simple description.
Significant Reduction in Hallucination Rate: Through the "safe completion" technology, GPT-5's factual error rate is approximately 45% lower than that of GPT-4o, and when using the reasoning mode, the error rate is approximately 80% lower than that of the o3 model.
GPT-5 Mini: A Cost-Effective Lightweight Option
GPT-5 Mini is optimized for cost-sensitive applications, significantly reducing resource requirements while retaining core functions:
Supports chain reasoning tasks of moderate complexity. Has text, image, and speech processing capabilities, with relatively limited video processing functions. Can run on devices with lower computing power, making it suitable for small and medium-sized enterprises and individual developers.
GPT-5 Nano: Optimized for Speed and Low Resource Consumption
GPT-5 Nano is optimized for speed and low resource consumption, being the lightest version in the series:
- Extremely low-latency response, designed specifically for real-time applications.
- Can run on devices with only 16GB of memory, including a MacBook or low-end servers.
- Relatively simplified reasoning ability, mainly used for quick interaction and simple tasks.
- Performs comparably to o3-mini in general benchmark tests.

Applicable scenarios include mobile device applications, embedded systems, real-time translation, voice assistants, and other scenarios with high requirements for response speed.
GPT-5 Pro: Enhanced Version for Professional Users GPT-5 Pro is a high-performance version designed for high-end users and enterprises:
Enhanced Reasoning Mode: Supports the "GPT-5 Thinking" function, enabling in-depth reasoning on complex problems for a longer time to ensure extremely high accuracy.
Unlimited Access: Pro users have unlimited access to GPT-5 and exclusive access to GPT-5 Pro.
Professional Multimodal Capabilities: Performs excellently in tasks such as video processing and complex image analysis, scoring 46.2% in the HealthBench Hard medical benchmark test.
Deep Tool Integration: Seamlessly integrates professional tools such as search, Canvas, and code execution, providing a complete workflow experience.
Pricing Strategy: The Largest-Scale Free Release in History OpenAI has adopted an unprecedented open strategy, providing GPT-5 access to all user groups:
Free Users: Can use GPT-5 and GPT-5 Mini with usage limits. Once the limit is exceeded, the system will automatically switch to the Mini version.
Plus Users ($20/month): Enjoy higher usage limits, suitable for individual users and small teams.
Pro Users ($200/month): Have unlimited access to GPT-5 and GPT-5 Pro and can use the "GPT-5 Thinking" mode.
Enterprise and Education Users: Will gain access within one week after the release and can use the GPT-5 Pro version.
API Pricing: $1.25 per million tokens for input and $10 per million tokens for output, targeted at professional developers.
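At those rates, the API cost of a call can be estimated directly (a minimal sketch using the listed prices):

```python
# Estimate GPT-5 API cost from the published per-million-token rates:
# $1.25 per 1M input tokens, $10 per 1M output tokens.
INPUT_RATE = 1.25 / 1_000_000   # dollars per input token
OUTPUT_RATE = 10.0 / 1_000_000  # dollars per output token

def api_cost(input_tokens, output_tokens):
    """Dollar cost of one API call at the published GPT-5 rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a call with 100K input tokens and 10K output tokens.
print(f"${api_cost(100_000, 10_000):.3f}")  # → $0.225
```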
Comprehensive Upgrade of User Experience The GPT-5 series brings several user experience innovations:
Intelligent Model Selection: The system automatically selects the most suitable model version based on task complexity and user intent, eliminating the need for users to manually switch.
Personalized Interaction: Offers four preset personalities (Cynic, Robot, Listener, Nerd) and custom chat color options.
Enhanced Memory Capacity: Larger context windows can remember longer conversation histories, providing a more coherent interaction experience.
User-Friendly Design: Compared to GPT-4o, the new model reduces overly flattering expressions and uses fewer unnecessary emojis, making the interaction more natural.
Technical Architecture Innovation The GPT-5 series may adopt a Mixture of Experts (MoE) architecture, significantly improving efficiency by reducing the number of active parameters. The training data is mainly in English text, focusing on the fields of STEM, programming, and general knowledge, with the knowledge cutoff date being June 2024. The entire training process was completed on NVIDIA H100 GPUs, consuming approximately 2.1 million GPU hours.
Competitive Advantages and Market Impact In the current highly competitive AI environment, the release of GPT-5 is of great strategic significance. Facing strong competitors such as Anthropic's Claude 3.5 Sonnet, xAI's Grok 4, and Google's Gemini 2.5 Pro, OpenAI is consolidating its market position through a free opening strategy and a significant reduction in the hallucination rate.
According to statistics, there are currently 5 million paid users of ChatGPT's commercial products, including well-known institutions such as BNY Mellon, California State University, Figma, Intercom, and Morgan Stanley. The release of GPT-5 is expected to further accelerate the adoption of AI in enterprises and promote the digital transformation of various industries.
Industry Outlook and Challenges The release of the GPT-5 series represents a new milestone in the development of AI technology, but it also faces some challenges:
Privacy and Security: Multimodal capabilities involve the processing of sensitive data such as medical images and personal conversations, making data protection a key issue.
Technical Impact: The increase in automation may have an impact on traditional job positions, requiring social adaptation and adjustment.
Performance Verification: Although OpenAI claims that GPT-5 possesses "doctoral-level intelligence", the performance of its real reasoning ability in practical applications still needs time to be verified.
Conclusion The release of the GPT-5 series marks another major breakthrough for OpenAI in the field of AI. Through the differentiated layout of the four versions, OpenAI has successfully covered the entire spectrum of needs from individual users to corporate customers. This is not only a technological upgrade but also a comprehensive innovation in AI product strategy.
As GPT-5 becomes the new default model for ChatGPT, replacing previous versions such as GPT-4o and o3, users only need to open ChatGPT and enter questions, and the system will automatically process them and apply reasoning functions when necessary. The realization of this seamless experience indicates that AI technology is rapidly evolving from being a tool to an assistant, and from being auxiliary to collaborative.
At 1 a.m. (Beijing time) today, OpenAI officially released the much-anticipated GPT-5, claiming it to be the most powerful and practical AI system to date. Compared with the previous models, GPT-5 has the following major improvements: significantly enhanced capabilities in scenarios such as programming, mathematics, writing, health Q&A, and visual perception; a substantial reduction in hallucinations; stronger instruction-following capabilities; and a significant decrease in obsequious and flattering responses.
GPT-5 is open to all users. Plus subscribers have more usage quotas, and Pro subscribers can use GPT-5 Pro, which has deeper reasoning capabilities and can provide more comprehensive and accurate answers.
GPT-5 no longer distinguishes between traditional reasoning models, multimodal models, and Agent models. Instead, it integrates these capabilities under a unified architecture. The real-time router will automatically determine which model to call based on the type of conversation, the difficulty of the question, the need to call tools, and explicit user instructions (such as "Please think carefully", etc.).
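The routing behavior described above can be illustrated with a toy dispatcher (purely a sketch of the idea; OpenAI has not published the router's actual logic, and these signals and thresholds are invented for illustration):

```python
# Toy illustration of a real-time router: pick a processing mode from
# simple signals in the request. This sketches the concept only; it is
# not OpenAI's actual routing logic.

def route(prompt, needs_tools=False):
    text = prompt.lower()
    if "please think carefully" in text:       # explicit user instruction
        return "deep-reasoning"
    if needs_tools:                            # request needs tool calls
        return "agent"
    if len(prompt) > 500:                      # crude complexity proxy
        return "deep-reasoning"
    return "fast-chat"

print(route("What's the capital of France?"))             # → fast-chat
print(route("Please think carefully about this proof."))  # → deep-reasoning
print(route("Book me a flight", needs_tools=True))        # → agent
```

In the real system the router is itself learned and considers conversation type, question difficulty, and tool needs, but the dispatch pattern is the same.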
Highlighted Capabilities of GPT-5: Programming Capability: It is the most powerful code model to date, excelling in complex tasks such as front-end page generation and large code library debugging. It can generate complete, beautiful, and responsive websites/apps/games with a single round of prompting, and has an enhanced understanding of design principles such as layout, making it more suitable for developers' needs.
Creative Writing: It can transform rough ideas into texts that are structurally complete, have literary depth, and a natural rhythm. It is good at handling writing with ambiguous structures or complex forms, performs well in daily writing tasks, and is more empathetic.
Health Q&A: Its understanding in health scenarios has been greatly improved. It is the best-performing model in the HealthBench assessment. It can provide accurate, reliable, and practical health information based on various factors, actively identify potential risks, guide rational judgment, and is suitable for assisting in decision-making but does not replace medical professionals.
Innovation in Security Mechanism: It has shifted from "refusing to answer" to "safe generation" and introduced the "Safe-completion" mechanism, which can more carefully handle dual-use questions. For questions like "What is the minimum energy required to light a firework?", it will give reasonable and practical answers on the premise of ensuring safety.
#GPT5 #OpenAI #AI system #GPT5 capabilities #Security mechanism #Model upgrade
Guys, Xiaomi is making big moves again! 👏 Today, Xiaomi officially released and fully open-sourced the MiDashengLM-7B multimodal large model. This is an AI model focused on audio understanding, and it has made super significant breakthroughs in terms of performance and efficiency. 🎉
Let's talk about the technical architecture first. 🧐 It adopts an innovative dual-core architecture design, using Xiaomi Dasheng as the audio encoder and combining it with the Qwen2.5-Omni-7B Thinker as the autoregressive decoder. This design skillfully combines professional audio processing capabilities with powerful language understanding capabilities, laying a technical foundation for the model's excellent performance. Moreover, its biggest highlight is the general audio description training strategy, which breaks the limitation of traditional audio AI models that only focus on single sound processing. It can uniformly understand speech, environmental sounds, and music. Such all-domain audio understanding ability is really rare in the industry. 👍
In terms of performance, it's even more impressive. ✨ It has set new best records for multimodal large models on 22 public evaluation datasets, which is enough to prove its leading technical position in the field of audio understanding. The improvement in reasoning efficiency is also extremely dramatic. The first token latency of single-sample reasoning is only a quarter of that of advanced industry models. Under the same video memory conditions, the data throughput efficiency is more than 20 times higher than that of advanced industry models. This benefits from Xiaomi's technical accumulation in model architecture optimization and training strategy improvement, reducing computational overhead while maintaining high accuracy. 👏
MiDashengLM-7B is an important upgraded version of Xiaomi's Dasheng series of models. The Xiaomi Dasheng audio encoder has gone through several generations of technical iteration and optimization and already has a mature technical system. The new model has been comprehensively upgraded based on the previous one, greatly improving the accuracy of audio understanding and computational efficiency. 🥳
The future plan is also very promising. 😆 Xiaomi is already further upgrading the computational efficiency of this model, with the goal of achieving offline deployment on terminal devices. This means that users can enjoy high-quality audio AI services without relying on cloud services, with better privacy protection and lower usage costs. It can also provide technical support for Xiaomi's audio AI applications in the IoT ecosystem. In addition, Xiaomi is also improving the sound editing function based on users' natural language prompts. In the future, complex audio processing tasks can be completed through simple text descriptions, greatly reducing the technical threshold of audio editing. 🤩
Xiaomi's choice to fully open-source MiDashengLM-7B is really meaningful. 👏 This can promote the technological progress of the entire audio AI field and provide good opportunities for researchers and developers to learn and improve. Open sourcing can accelerate the popularization and application of audio AI technology, enable more innovative applications to emerge, and promote the prosperous development of the industry ecosystem. 🎉
Guys, it seems that a new era of audio AI is coming. What do you think of this MiDashengLM-7B? 🧐 Come and let's chat in the comments section. 😜
#Xiaomi #MiDashengLM7B #Audio AI #Open Source Model #Multimodal Large Model #Audio Understanding #Technical Breakthrough #Inference Efficiency