video-multimodal-generate
Multi-modal hybrid video generation that combines audio, video, and image inputs in any mix, guided by a text prompt. Suitable for music-to-video generation, borrowing camera motion and rhythm from a reference video, and multi-source blended generation. At least one media input (image, video, or audio) must be provided; text-only generation is not supported.
meitu video-multimodal-generate
Usage Examples
# Music-driven video (required parameters + audio)
meitu video-multimodal-generate \
--reference_audio_list ./music.mp3 \
--prompt "Visuals that follow the rhythm of the music" \
--json
# Image + video driven
meitu video-multimodal-generate \
--image_list ./style.jpg \
--reference_video_list ./camera.mp4 \
--prompt "Camera motion reference + reference image style" \
--json
# Multiple sources with full parameters and result download
meitu video-multimodal-generate \
--image_list ./ref1.jpg \
--reference_video_list ./ref.mp4 \
--reference_audio_list ./bgm.mp3 \
--prompt "Multi-source blended generation" \
--video_duration 8 \
--ratio 16:9 \
--sound on \
--json \
--download-dir ./output
Parameters
| Parameter | Required | Description |
|---|---|---|
| --image_list | No | Type: string[]; optional reference images (up to 9) |
| --reference_video_list | No | Type: string[]; optional reference videos (up to 3; total duration max 15 seconds) |
| --reference_audio_list | No | Type: string[]; optional audio drive (up to 3; total duration max 15 seconds) |
| --prompt | Yes | Type: string; generation description (at least one media type must also be provided; total assets max 12) |
| --video_duration | No | Type: number; default: -1 (auto); generated video duration |
| --ratio | No | Type: string; aspect ratio |
| --sound | No | Type: string; options: on / off; whether to include audio |
| --resolution | No | Type: string; output resolution |
| --download-dir | No | Type: string; downloads result files to the specified local directory |
| --output | No | Type: string[]; specifies output file paths, mapped in order to data.result.urls |
| --json | No | Outputs results in JSON format for script or agent parsing |
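
The asset limits in the table interact: the per-type caps (9 images, 3 videos, 3 audio files) add up to 15, which exceeds the overall cap of 12. A script may want to check the counts before submitting a job. The following is a minimal sketch of such a pre-flight check; `check_assets` is a hypothetical helper, not part of the `meitu` CLI, and it only checks counts. The 15-second total duration limits are not verified here because that would need a media inspection tool.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not part of the meitu CLI).
# Usage: check_assets <image_count> <video_count> <audio_count>
# Returns 0 when the counts satisfy the limits from the parameter table,
# otherwise prints a message to stderr and returns 1.
check_assets() {
  local images=$1
  local videos=$2
  local audios=$3
  local total=$((images + videos + audios))

  # At least one media input is required (text-only is not supported).
  if [ "$total" -eq 0 ]; then
    echo "error: at least one image, video, or audio input is required" >&2
    return 1
  fi
  # Per-type caps.
  if [ "$images" -gt 9 ]; then
    echo "error: at most 9 reference images are allowed" >&2
    return 1
  fi
  if [ "$videos" -gt 3 ]; then
    echo "error: at most 3 reference videos are allowed" >&2
    return 1
  fi
  if [ "$audios" -gt 3 ]; then
    echo "error: at most 3 reference audio files are allowed" >&2
    return 1
  fi
  # Overall cap across all media types.
  if [ "$total" -gt 12 ]; then
    echo "error: at most 12 assets in total are allowed" >&2
    return 1
  fi
  return 0
}
```

A script could call `check_assets 1 1 1` and run `meitu video-multimodal-generate` only when it succeeds. Note that `check_assets 9 3 3` fails even though each per-type count is within its own cap, because the total of 15 exceeds 12.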