Comment on 77% Of Employees Report AI Has Increased Workloads And Hampered Productivity, Study Finds

WalnutLum@lemmy.ml 6 months ago

None of the models I’ve used for TTS/RVC or rotoscoping have produced professional results.

Hackworth@lemmy.world 6 months ago
PEBKAC, I’m afraid.

- WalnutLum@lemmy.ml 6 months ago
  Coqui for TTS, RVC UI for matching the TTS to the actor’s intonation, and DWPose -> ControlNet applied to SDXL for rotoscoping.
  
  - Hackworth@lemmy.world 6 months ago
    Full open source, nice! I respect the effort that went into that implementation. I pretty much exclusively use ElevenLabs for TTS/RVC: turn up the style, turn down the stability, generate a few takes, and pick the best. I do find that longer generations tend to lose the thread, so it’s better to batch smaller script segments.
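For anyone wanting to script the "batch smaller script segments" part, here is a minimal sketch of one way to do it. It is not tied to any TTS service: `batch_script` and `max_chars` are hypothetical names, and the 300-character default is an arbitrary assumption rather than a recommended limit. It only groups whole sentences into short chunks that could then be sent to whatever TTS tool is in use, one at a time.

```python
import re


def batch_script(script: str, max_chars: int = 300) -> list[str]:
    """Group whole sentences into segments of at most max_chars characters.

    Hypothetical helper: a single sentence longer than max_chars is kept
    whole rather than cut mid-sentence.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]

    segments: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    for i, segment in enumerate(batch_script("One. Two! Three? Four.", max_chars=10)):
        print(i, segment)
```

Splitting on sentence boundaries rather than at a fixed character count keeps each generation a complete thought, which is presumably what helps the delivery stay consistent across takes.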
    
    Unless I misunderstand ya, your ControlNet setup is for what would be rigging and animation rather than roto. I do agree that while I enjoy the outputs of pretty much all the automated animators, they’re not ready for prime time yet. That said, I’m about to dive into KREA’s new keyframing feature to see if it’s any better for that use case.
    
    WalnutLum@lemmy.ml 6 months ago
    I was never able to get appreciably better results from ElevenLabs than from a (lightly) trained RVC model :/ The long-script problem is something pretty much any text-to-something model suffers from. The longer the context, the lower the cohesion.
    
    I do rotoscoping with SDXL i2i and ControlNet posing together. Without the posing, I found the output tends to smear. Do you just do image2image?
    