hnlurker22 1 day ago

Maybe I can try implementing it. Do you know any open-source models/frameworks out there for replacing background music? Can audio be logically represented as layers like that (foreground/background)?

oyyagci 1 day ago

Please go ahead! I'd love to see where it goes and would be willing to help out. I've already opened an issue to track "adding support for user-provided audio", as we discussed. See: https://github.com/omeryusufyagci/fast-music-remover/issues/...

As for your question: it depends on the approach. Fast Music Remover currently uses DeepFilterNet, a deep learning model that filters the audio directly and doesn't identify audio components as logical layers, which is why it's rather fast. For that kind of requirement you'd typically want a model like `demucs` (https://github.com/facebookresearch/demucs), which can identify individual audio components. That comes at a significant performance cost, though.
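
To make the "layers" idea a bit more concrete: once some separation model has given you an isolated speech track, replacing the music is conceptually just mixing that track with the new audio. Here is a rough Python sketch of that last step only. It assumes both signals are plain lists of float samples in [-1, 1] at the same sample rate; `remix` and `music_gain` are hypothetical names for illustration, not part of FMR, DeepFilterNet, or demucs.

```python
from typing import List


def remix(speech: List[float], new_music: List[float], music_gain: float = 0.5) -> List[float]:
    """Overlay replacement music on an isolated speech track (illustrative only)."""
    # With no replacement music, return the speech unchanged.
    if not new_music:
        return list(speech)

    # Loop the music if it is shorter than the speech; truncate if it is longer.
    looped = [new_music[i % len(new_music)] for i in range(len(speech))]

    # Sum the two layers sample by sample, attenuating the music.
    mixed = [s + music_gain * m for s, m in zip(speech, looped)]

    # Clamp to the valid sample range to avoid wrap-around distortion.
    return [max(-1.0, min(1.0, x)) for x in mixed]
```

A real implementation would also need resampling, loudness matching, and a proper limiter instead of hard clamping, so treat this as the idea rather than a design.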

However, my vision for the core of FMR is to support multiple ML models and pick the best fit for the user's needs, so that users don't have to worry about details like which model to choose. So this is definitely something I'd be interested in following!