当前位置: 首页 > news >正文

【视频生成大模型】 视频生成大模型 THUDM/CogVideoX-2b

【视频生成大模型】 视频生成大模型 THUDM/CogVideoX-2b

  • CogVideoX-2b 模型介绍
    • 发布时间
    • 模型测试生成的demo视频
    • 生成视频限制
  • 运行环境安装
  • 运行模型
  • 下载
  • 开源协议
  • 参考

CogVideoX-2b 模型介绍

CogVideoX是 清影 同源的开源版本视频生成模型。

基础信息:

在这里插入图片描述

发布时间

2024年8月份

模型测试生成的demo视频

https://github.com/THUDM/CogVideo/raw/main/resources/videos/1.mp4

视频生成1

https://github.com/THUDM/CogVideo/raw/main/resources/videos/2.mp4

视频生成2

生成视频限制

  • 提示词语言 English*
  • 提示词长度上限 226 Tokens
  • 视频长度 6 秒
  • 帧率 8 帧 / 秒
  • 视频分辨率 720 * 480,不支持其他分辨率(含微调)

运行环境安装

# diffusers>=0.30.1
# transformers>=0.44.0
# accelerate>=0.33.0 (suggest install from source)
# imageio-ffmpeg>=0.5.1
pip install --upgrade transformers accelerate diffusers imageio-ffmpeg 

运行模型

import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_videoprompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance."pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b",torch_dtype=torch.float16
)pipe.enable_model_cpu_offload()
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
video = pipe(prompt=prompt,num_videos_per_prompt=1,num_inference_steps=50,num_frames=49,guidance_scale=6,generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]export_to_video(video, "output.mp4", fps=8)
  • Quantized Inference

PytorchAO 和 Optimum-quanto 可以用于对文本编码器、Transformer 和 VAE 模块进行量化,从而降低 CogVideoX 的内存需求。这使得在免费的 T4 Colab 或较小 VRAM 的 GPU 上运行该模型成为可能!值得注意的是,TorchAO 量化与 torch.compile 完全兼容,这可以显著加快推理速度。

# To get started, PytorchAO needs to be installed from the GitHub source and PyTorch Nightly.
# Source and nightly installation is only required until next release.import torch
from diffusers import AutoencoderKLCogVideoX, CogVideoXTransformer3DModel, CogVideoXPipeline
from diffusers.utils import export_to_video
from transformers import T5EncoderModel
from torchao.quantization import quantize_, int8_weight_only, int8_dynamic_activation_int8_weightquantization = int8_weight_onlytext_encoder = T5EncoderModel.from_pretrained("THUDM/CogVideoX-2b", subfolder="text_encoder", torch_dtype=torch.bfloat16)
quantize_(text_encoder, quantization())transformer = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX-5b", subfolder="transformer", torch_dtype=torch.bfloat16)
quantize_(transformer, quantization())vae = AutoencoderKLCogVideoX.from_pretrained("THUDM/CogVideoX-2b", subfolder="vae", torch_dtype=torch.bfloat16)
quantize_(vae, quantization())# Create pipeline and run inference
pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b",text_encoder=text_encoder,transformer=transformer,vae=vae,torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()# prompt 只能输入英文
prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance."video = pipe(prompt=prompt,num_videos_per_prompt=1,num_inference_steps=50,num_frames=49,guidance_scale=6,generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]export_to_video(video, "output.mp4", fps=8)

下载

model_id: THUDM/CogVideoX-2b
下载地址:https://hf-mirror.com/THUDM/CogVideoX-2b 不需要翻墙

开源协议

License: apache-2.0

参考

  • https://hf-mirror.com/THUDM/CogVideoX-2b
  • https://github.com/THUDM/CogVideo

http://www.mrgr.cn/news/56128.html

相关文章:

  • tracert和ping的区别
  • MySQL8.0.28解压版安装windows
  • C++类和对象之对象模型和this指针
  • Ansible自动化工具
  • WebGoat SQL Injection (intro) 源码分析
  • leetcode22.括号生成
  • Xcode真机运行正常,打包报错
  • 32匿名函数
  • uni-app组件使用(uv-ui)
  • 无线电测向运动80m(3.5-3.6MHz)信号源制作
  • c++ libtorch tensor 矩阵分块
  • 壹肆柒·2025郑州台球展会,3月12-14日盛大举办
  • Flink PostgreSQL CDC源码解读:深入理解数据流同步
  • 跨站调用(CORS)一例
  • 基础数据结构——队列(链表实现,数组实现)
  • 期权懂|看涨期权合约的买方具有什么义务?
  • L0G1000 Linux 基础知识
  • 【计网】理解TCP全连接队列与tcpdump抓包
  • 免费赠书啦,免费赠送多本《数据资产管理核心技术与应用》一书
  • webAPI中的offset、client、scroll
  • 【vue之道】
  • python中如何获取对象信息
  • 详解Java之Spring MVC篇一
  • SSM网上书店管理系统—计算机毕业设计源码41539
  • 如何将 Docker 镜像的 tar 文件迁移到另一台服务器并运行容器
  • 焊接原因引起的RJ45网口连接器LED灯不亮原因分析及处理措施