Abstract: Text encoders in diffusion models have rapidly evolved, transitioning from CLIP to T5-XXL. Although this evolution has significantly enhanced the models’ ability to understand complex ...