Multimodal Generation
Any-to-any multimodal generation
Hugging Face Transformers task for any-to-any multimodal generation; it supports combinations of text, image, audio, and video inputs and can generate outputs across those modalities.

Recent stories
0 linked stories
No linked stories yet.