A Glimpse of Chinese Internet Giants' 50 AI Models and Applications

The Chinese technology landscape has seen significant advancements in AI development, with major companies releasing a multitude of models and applications in various domains such as language, image, audio, and video generation.

(TMTPOST) - A global wave of artificial intelligence (AI) in recent years, fueled by the popularity of OpenAI’s ChatGPT, has prompted major technology companies in China, such as Alibaba, Baidu, ByteDance, Tencent, Huawei, Xiaohongshu, Meitu, iFlytek, and Qihoo 360, to develop their own large models. Some homegrown innovations are as follows:

Alibaba

AtomoVideo - Touted as China's "Sora" for Video Generation

AtomoVideo, introduced by Alibaba in March 2024, is a high-fidelity image-to-video generation framework. Positioned as a counterpart to OpenAI's Sora, AtomoVideo uses multi-granularity image injection to keep generated videos faithful to a given input image. Its architecture is flexible and extends to video frame prediction, which allows it to handle long-sequence video prediction tasks.

EMO - AI Image-Audio-Video Model

EMO, another Alibaba creation, generates expressive portrait videos directly from an image and an audio clip. Its output stays synchronized with the supplied image and audio, so users can create dynamic videos with fluent facial expressions. The model targets content creation needs such as speeches, e-commerce livestreams, and video production.

Qwen-VL-Max - Multi-Modal Large Model Comparable to GPT-4

Introduced in January 2024, Qwen-VL-Max by Alibaba is an open-source multi-modal visual model. With capabilities comparable to GPT-4V and Gemini Ultra, this model excels in accurate image description, information reasoning, and extended creative tasks based on images.

Tongyi Qianwen, Tongyi Wanxiang, and Tongyi Tingwu

Alibaba's Tongyi Qianwen, an AI language model, serves as a smart Q&A assistant. The 2.0 version, released on October 31, 2023, shows significant improvements in complex instruction understanding, literary creation, general mathematics, knowledge retention, and resistance to hallucination.

Tongyi Wanxiang assists in image creation for artistic purposes, employing a combination of generative models to provide highly controllable and diverse image generation effects.

Tongyi Tingwu, an AI assistant, utilizes AI models for both language and audio-visual tasks, enhancing information production, organization, mining, and insight in the general audio-visual content domain.

Baidu

UniVG - Unified Modality Video Generation System

UniVG, unveiled by Baidu in January 2024, is a unified modality video generation system. Its distinguishing feature is that it applies different generation methods to high-freedom and low-freedom tasks in order to balance the two. UniVG generates smooth videos from a single image or text prompt, with more stable and coherent frames than early AI video generation tools.

ERNIE Bot, Wenxin Yige, and Wenxin Qianfan

Baidu's Wenxin large model series, initiated in 2019, is a family of natural language processing models based on ERNIE. The 4.0 version, released in October 2023, was a comprehensive upgrade of fundamental capabilities, which Baidu positions as on par with GPT-4.

ERNIE Bot, akin to Alibaba's Tongyi Qianwen, serves as a generative AI product for various Q&A interactions.

Wenxin Yige, an AI art creation platform, generates diverse AI creative images to aid in creative design.

Wenxin Qianfan serves as Baidu's enterprise-level large model production platform, providing services and tools for large model development and application.

ByteDance

SDXL-Lightning - ByteDance's Version of DALL·E for Text-to-Image Generation

SDXL-Lightning, developed by ByteDance, is an open-source text-to-image generation model, swiftly producing high-resolution images based on textual prompts. Its notable improvement lies in the accelerated generation speed, achieving text-to-image generation at 1024px resolution in minimal steps.

Meitu

MiracleVision - Meitu's AI Vision Large Model

Meitu's AI vision large model, MiracleVision, launched its closed beta in June 2023. With strong visual expression and creative capabilities, it powers several well-known Meitu products. As of version 4.0, MiracleVision underpins Meitu's product ecosystem and extends to industries including e-commerce, advertising, gaming, animation, and film.

iFlytek

Xinghuo Yuyin - iFlytek's AI Speech Model

Xinghuo Yuyin, iFlytek's AI speech model, unifies recognition, translation, and multi-language classification tasks. The model improves speech recognition accuracy and offers highly human-like speech synthesis that closely mimics natural speech patterns.

Tencent

M2UGen - Multi-Modal Music Generation Model

M2UGen, Tencent's multi-modal music generation model, combines music understanding and generation capabilities, aiding users in artistic music creation. It supports music generation from text, images, videos, and audio inputs, offering users the ability to edit the generated music easily.

AnimateZero - Tencent's AI Video Generation Model

AnimateZero, released by Tencent's AI team, is an AI video generation model enhancing the precision of video appearance and motion through improved pre-trained video diffusion models. Users can generate videos by inputting text and images, creating dynamic and detailed content.

Xiaohongshu (Red, a social e-commerce app)

Hongshu Zhiyu - Xiaohongshu Text Generation Tool

Hongshu Zhiyu, introduced by Xiaohongshu, is an AI tool for automatically generating Xiaohongshu-style captions based on image content. The tool incorporates a text generation feature, a database of 15 million Xiaohongshu-style captions validated by users, and customizable options for tailoring captions based on individual preferences.

Qihoo 360

AIJi - 360's AI Conversational Agent

AIJi, launched by Qihoo 360, is an AI conversational agent designed for user interaction. It boasts comprehensive language understanding and generation capabilities, supporting natural and diverse conversations with users.

The Chinese technology landscape has seen significant advancements in AI development, with major companies releasing a multitude of models and applications in various domains such as language, image, audio, and video generation. These developments not only showcase the competitiveness of Chinese tech giants but also contribute to the global AI landscape, fostering innovation and pushing the boundaries of what AI can achieve.
