The field of artificial intelligence has been shaken once again, and the effects are being felt across the global technology community.
A new star, DeepSeek, has emerged at remarkable speed and sent shock waves through the global AI field.
According to third-party statistics, DeepSeek's daily active users exceeded 20 million within just 20 days of its launch, a growth rate faster than that of ChatGPT when it first became popular.
According to public information, DeepSeek, whose full name is "Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.", was established in 2023 and focuses on developing advanced large language models (LLMs) and related technologies. Since its establishment, DeepSeek has grown rapidly and released a series of notable models, including the open-source code model DeepSeek Coder, the general-purpose model DeepSeek LLM, and the open-source mixture-of-experts (MoE) model DeepSeek-V2.
At the end of 2024, DeepSeek released its latest-generation large language model, DeepSeek-V3. It adopts an innovative MoE architecture with 671 billion total parameters, of which only 37 billion are activated for each token, and its training cost was only US$5.576 million. It ranks among the best models in coding, logical reasoning, and mathematical reasoning.
On January 20 this year, DeepSeek released its new reasoning model, DeepSeek-R1, which not only matched the performance of the official release of OpenAI o1 but also shook the industry with its full-stack open ecosystem strategy.
In just over a year, DeepSeek has grown from a startup into the focus of the global AI field, showing the world the innovative power and potential of Chinese AI. As its popularity has exploded, people can't help but wonder: why has it made such waves in an AI field already crowded with strong players?
Why is DeepSeek so popular?
DeepSeek stands out in the fiercely competitive AI field because of its distinctive technical advantages and breakthroughs, which together form a strong technical barrier.
01 Algorithm optimization: making AI "smarter"
At the algorithm level, DeepSeek takes a distinctive approach. Traditional AI model training often relies on massive computing power and data, following a brute-force model of "great effort produces miracles" that is costly and inefficient. DeepSeek breaks with this conventional thinking through architectural innovation.
DeepSeek uses a mixture-of-experts (MoE) architecture to improve computing efficiency and model accuracy while reducing the consumption of computing resources. An MoE model works like a think tank staffed by many expert consultants, each with a specific area of responsibility. When a user asks a question, the model routes the input to the most appropriate "experts" and leaves the rest idle, which significantly improves processing efficiency and accuracy and avoids unnecessary computation.
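The routing idea described above can be sketched in a few lines. This is a minimal toy illustration of top-k expert routing in general, not DeepSeek's implementation: the function names (`route_top_k`, `moe_forward`), the four scaling "experts", and the gate vectors are all assumptions made up for the example.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    # Pick the k highest-scoring experts and renormalize their weights to sum to 1.
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_weights, k=2):
    # Gate score per expert: dot product of the input with that expert's gate vector.
    gate_scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    chosen = route_top_k(gate_scores, k)
    out = [0.0] * len(x)
    for idx, weight in chosen:
        y = experts[idx](x)  # only the selected experts are ever evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in chosen]

def make_expert(scale):
    # Hypothetical "expert": simply scales its input.
    return lambda x: [scale * xi for xi in x]

experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

out, used = moe_forward([2.0, 1.0], experts, gate_weights, k=2)
print("experts used:", used)
print("output:", out)
```

In this toy run only two of the four experts do any work for the given input. With the figures quoted for DeepSeek-V3 (37 billion of 671 billion parameters active), roughly 5.5% of the parameters take part in processing each token.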
In addition, DeepSeek uses the Multi-head Latent Attention (MLA) mechanism. Compared with the traditional attention mechanism, MLA is said to capture the key information in text more precisely, improving the model's ability to understand and handle complex tasks while also greatly improving efficiency.
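The article does not go into MLA's internals. As publicly described for DeepSeek-V2, much of its efficiency gain comes from caching a compact latent vector per token instead of full per-head keys and values. The arithmetic below is a simplified sketch of that idea only: the head count, head dimension, and latent size are illustrative assumptions rather than DeepSeek's actual configuration, and the helper `kv_cache_floats_per_token` is hypothetical.

```python
def kv_cache_floats_per_token(num_heads, head_dim, latent_dim=None):
    # Number of floats cached per token, per layer.
    if latent_dim is None:
        # Standard multi-head attention: one key and one value vector per head.
        return 2 * num_heads * head_dim
    # Latent-style cache: one shared compressed vector, expanded when attention is computed.
    return latent_dim

# Illustrative sizes only (assumed for this example).
standard = kv_cache_floats_per_token(num_heads=32, head_dim=128)
latent = kv_cache_floats_per_token(num_heads=32, head_dim=128, latent_dim=512)

print("standard cache per token:", standard)
print("latent cache per token:", latent)
print("reduction factor:", standard / latent)
```

A smaller cache per token means longer contexts and larger batches fit in the same memory, which is one way an attention variant can improve efficiency without more hardware.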
02 Low cost: making AI more "people-friendly"
DeepSeek also performs well on cost. According to the available data, its model training cost was only about $5.6 million, roughly 1/20 that of comparable products. Usage is priced at $0.55 per million input tokens (versus $15 for OpenAI) and $2.19 per million output tokens (versus $60 for OpenAI). This low-cost advantage puts the development and application of AI technology within reach of more companies and developers.
DeepSeek's cost advantage comes from two sources. On the one hand, the algorithm optimizations described above allow efficient training with fewer computing resources. On the other hand, DeepSeek has finely optimized its data processing, minimizing unnecessary data storage and transmission and thereby lowering overall operating costs.
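To make the per-token prices quoted above concrete, here is a small worked calculation. The prices are the ones cited in this article; the workload (2 million input tokens and 500,000 output tokens) and the helper name `usage_cost` are assumptions chosen purely for illustration.

```python
# Prices in US dollars per million tokens, as quoted in the article.
PRICES_PER_MILLION = {
    "DeepSeek": {"input": 0.55, "output": 2.19},
    "OpenAI": {"input": 15.00, "output": 60.00},
}

def usage_cost(provider, input_tokens, output_tokens):
    # Total cost in dollars for a given number of input and output tokens.
    price = PRICES_PER_MILLION[provider]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical workload: 2 million input tokens, 500,000 output tokens.
workload = (2_000_000, 500_000)
deepseek_cost = usage_cost("DeepSeek", *workload)
openai_cost = usage_cost("OpenAI", *workload)

print(f"DeepSeek: ${deepseek_cost:.2f}")
print(f"OpenAI:   ${openai_cost:.2f}")
print(f"Ratio:    {openai_cost / deepseek_cost:.1f}x")
```

For this assumed workload the quoted prices work out to a bill of about $2.20 versus $60, a gap of roughly 27 times. The exact ratio depends on the mix of input and output tokens.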
03 Multimodal fusion: making AI more "all-round"
DeepSeek also performs well in multimodal capabilities. It has strong cross-modal learning capabilities and can integrate data from multiple modalities, such as text, images, and voice, to support richer interactions and applications. Its multimodal models are described as having strong cross-modal fusion and perception capabilities, combining world knowledge with in-context learning to reason efficiently and produce coordinated output across modalities. This allows DeepSeek to be applied in a wider range of scenarios, such as content creation, intelligent customer service, and education, where multimodal interaction gives users more comprehensive and vivid information and experiences.
It is also worth mentioning that DeepSeek has adopted a fully open-source approach, allowing developers to freely use, modify, and optimize its code. This open strategy not only lowers the barrier to use but also promotes collaboration and innovation in the global AI developer community.
Will DeepSeek ignite the AI hardware market?
With its distinctive technical path, lower cost, strong model performance, and open-source strategy, DeepSeek has attracted the attention of the global technology community. It has had a profound impact on the competitive landscape of the global AI market and brought a "catfish effect" to the AI industry.
Since DeepSeek became popular, major technology companies have moved quickly. Overseas giants such as Microsoft and NVIDIA, as well as domestic vendors such as Alibaba Cloud, Huawei Cloud, Tencent Cloud, Baidu Cloud, and 360 Digital Security, have announced support for DeepSeek's large models in order to capture the traffic they bring.
In terms of the market ecosystem, DeepSeek's technological innovation will further promote the application of AI technology across industries. A Guotai Junan research report stated that the launch of DeepSeek-R1 reflects the speed of technological progress under the open-source paradigm, as well as the possibility of a significant reduction in AI training and inference costs, and that the widespread deployment of AI is expected to accelerate.
Over the past two years, fueled by fierce competition among large AI models, the field of artificial intelligence has enjoyed unprecedented prosperity, and "AI+hardware" has risen rapidly to become one of the industry's most closely watched markets. Looking back at 2024, the integration of AI technology with terminal devices reached unprecedented depth and breadth. Mobile phones, laptops, wearable devices (AI glasses, AI rings, AI headphones, etc.), and even toys, learning machines, and companion robots have all seen major upgrades in functionality, driving a new surge in the product strength of terminal devices.
Although the "AI+hardware" market is developing well, it also faces many challenges. At the technical level, the accuracy and stability of AI algorithms still need to improve. At the market level, product homogeneity is serious, and many AI smart hardware products lack innovation in design and function. In terms of price, some high-end AI hardware is currently too expensive, which limits market adoption.
The emergence of DeepSeek is expected to bring new opportunities for AI smart hardware. First, it can play an important role in reducing hardware costs. With DeepSeek's efficient algorithms, large AI models can run more efficiently on hardware, reducing dependence on hardware resources and lowering energy consumption and maintenance costs. This not only lets consumers enjoy the convenience of AI smart hardware at a lower price but also gives AI smart hardware manufacturers greater profit margins.
In addition, the emergence of DeepSeek has planted a "seed of innovation" in the AI hardware field, inspiring hardware manufacturers to explore new technical solutions and application scenarios. Building on DeepSeek's multimodal capabilities, hardware can achieve low power consumption, fast inference, and deep interaction when running complex AI tasks such as real-time image recognition and natural language processing. Taking AI glasses as an example, DeepSeek technology is expected to greatly improve their interactive experience: the glasses can recognize the user's voice commands more accurately, respond quickly, and provide accurate information, so that the user feels like they have a personal smart assistant.
DeepSeek has injected new vitality into the market with its technical advantages and low-cost strategy. In the future, more and more new entrants are likely to see the potential of the AI hardware market, join the competition, and launch AI hardware products based on DeepSeek technology. The innovative product concepts and business models they bring will also make market competition more diverse.
This article is from Ulink Media, Shenzhen, China, the organizer of IOTE EXPO (IoT Expo in China).