MusicGen on Your Homelab: Stop Paying for Background Music

You know that moment when you’re recording a video, streaming gameplay, or building a project demo and you realize you need background music? Welcome to the subscription trap. Epidemic Sound wants $15/month. Artlist wants more. YouTube Audio Library has the personality of a dental office waiting room.

I just stopped caring about all of that. I’m running MusicGen on my homelab now, and I can generate as much royalty-free music as I want on demand: no limits, no monthly bill, no corporate intermediary.

Here’s the thing: MusicGen is Meta’s AI model that generates full music tracks from text descriptions. You type “upbeat lo-fi hip hop beat with vinyl crackle” and it makes it. You can even hum a melody and it’ll arrange a full song around it. It’s not replacing a composer, but for background music, intro jingles, and ambient soundscapes? It’s genuinely better than paying for stock music libraries.

What Makes MusicGen Actually Good

Unlike most AI music tools that sound like AI music, MusicGen outputs feel intentional. They’re not perfect, but they’re usable immediately — no “please buy our premium model” friction.

The real killer feature is melody conditioning, which uses the separate musicgen-melody checkpoint. You record yourself humming for 5 seconds, upload it, and MusicGen builds a full arrangement around your melody. I’ve used it to create personal jingles for podcast intros and it’s embarrassingly good.

You get multiple model sizes: Small (300M parameters), Medium (1.5B), and Large (3.3B). Small runs on a potato. Large sounds closer to professional production. I use Medium as the sweet spot: decent quality without melting my GPU.

The money talk: Epidemic Sound costs $15/month × 12 = $180/year minimum. MusicGen costs you nothing after setup. If you generate 100 tracks this year (which is realistic if you make content), that’s $1.80 per track with Epidemic. MusicGen is free.
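
If you want to plug in your own numbers, the per-track math above is a one-liner. This is just a sketch of the arithmetic; the function name and the 20-track scenario are mine, not anything official.

```python
def cost_per_track(monthly_fee, tracks_per_year, months=12):
    """Effective subscription cost per track over one year."""
    return monthly_fee * months / tracks_per_year


# The scenario from the paragraph above: $15/month, 100 tracks a year.
print(f"${cost_per_track(15, 100):.2f} per track")  # $1.80 per track

# A lighter year (hypothetical): only 20 tracks makes each one pricier.
print(f"${cost_per_track(15, 20):.2f} per track")   # $9.00 per track
```

The fewer tracks you actually use, the worse the subscription looks per track.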

The Install (It’s Stupidly Easy)

You have two paths: Docker or bare metal with Python. Docker is the reproducible option and won’t pollute your system, so I’ll start there.

Here’s a Docker Compose service for a GPU-backed web UI on port 7860:

version: '3.8'
services:
  musicgen:
    image: ghcr.io/oobabooga/text-generation-webui:latest
    container_name: musicgen
    restart: unless-stopped
    ports:
      - "7860:7860"
    volumes:
      - ./models:/app/models
      - ./outputs:/app/outputs
    environment:
      - GRADIO_SERVER_NAME=0.0.0.0
      - GRADIO_SERVER_PORT=7860
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Wait: that image is a general-purpose text-generation WebUI, and it doesn’t ship MusicGen. For MusicGen specifically, you want the official audiocraft repo. Here’s the cleaner approach:

git clone https://github.com/facebookresearch/audiocraft.git
cd audiocraft
pip install -e .
pip install gradio

Then grab this minimal Gradio wrapper (create a file called musicgen_ui.py):

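import gradio as gr
from audiocraft.models import MusicGen

# Load the pretrained checkpoint once at startup (downloaded on first run)
model = MusicGen.get_pretrained('facebook/musicgen-medium')

def generate(description, top_k=250, top_p=0.0, temperature=1.0, cfg_coef=3.0):
    # Sampling settings go through set_generation_params, not a raw dict
    model.set_generation_params(duration=30, top_k=top_k, top_p=top_p,
                                temperature=temperature, cfg_coef=cfg_coef)
    wav = model.generate([description])  # tensor shaped [batch, channels, samples]
    # Gradio wants (sample_rate, array shaped [samples, channels])
    return (model.sample_rate, wav[0].cpu().numpy().T)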

with gr.Blocks() as demo:
    gr.Markdown("# MusicGen")
    text_input = gr.Textbox(label="Describe the music", placeholder="upbeat electronic dance music")
    audio_output = gr.Audio(label="Generated Music", type="numpy")
    submit = gr.Button("Generate")
    submit.click(generate, inputs=text_input, outputs=audio_output)

demo.launch(server_name="0.0.0.0", server_port=7860)

Run it with python musicgen_ui.py. Hit http://localhost:7860 and you’re done. The first launch downloads the model weights (a multi-gigabyte download for Medium), then you’re cooking.

GPU strongly recommended. I’m running this on an RTX 3060 and a 30-second track takes about 8 seconds. CPU mode works but expect 2-3 minutes per track. Not fun.

Real-World Use Cases (And Why You Actually Need This)

I’ve been running this for 3 months. Here’s what I actually use it for:

  • YouTube intros: “80s synthwave vaporwave intro with reverb” generates a 10-second clip in 5 seconds. Beats $5 stock music.
  • Stream backgrounds: “ambient spacey pad with slow evolution” for 8-hour livestream loops. No copyright claims. Ever.
  • Project demos: Need incidental music for a homelab showcase? “minimal tech ambience” solves it instantly.
  • Podcast bumpers: Generate 3-second jingles for chapter breaks. Personalized, free, yours.
  • Game dev temp audio: Placeholder music while your actual composer works. Better than silence.
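
The use cases above boil down to a prompt plus a duration, so they can be kept as presets and rendered in one go. Here is a rough sketch, assuming audiocraft is installed as shown earlier. The preset names, the output folder, and the "bumper" prompt are made up for illustration; the other prompts are the ones from the list.

```python
# Hypothetical presets: name -> (prompt, duration in seconds).
PRESETS = {
    "intro": ("80s synthwave vaporwave intro with reverb", 10),
    "stream": ("ambient spacey pad with slow evolution", 30),
    "demo": ("minimal tech ambience", 30),
    "bumper": ("short bright podcast jingle", 3),  # invented prompt
}


def render_presets(out_dir="outputs", model_name="facebook/musicgen-medium"):
    """Generate one WAV file per preset and return the written paths."""
    # Imported lazily so the presets can be inspected without the GPU stack.
    from pathlib import Path
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    model = MusicGen.get_pretrained(model_name)
    paths = []
    for name, (prompt, seconds) in PRESETS.items():
        model.set_generation_params(duration=seconds)
        wav = model.generate([prompt])  # tensor shaped [batch, channels, samples]
        # audio_write appends the .wav extension itself.
        paths.append(audio_write(str(Path(out_dir) / name), wav[0].cpu(),
                                 model.sample_rate, strategy="loudness"))
    return paths
```

Calling render_presets() on the GPU box should leave one file per preset in the outputs folder, loudness-normalized by audio_write.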

The melody conditioning is the party trick though. I hummed a 5-second guitar riff, uploaded it, and got back a full 30-second arrangement with drums and bass. Would’ve cost $50 on Fiverr.
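
For reference, melody conditioning in audiocraft looks roughly like this. It is a sketch, not my exact script: it assumes the facebook/musicgen-melody checkpoint and a recorded clip on disk, and the function name and output file stem are placeholders.

```python
def arrange_melody(melody_path, description, seconds=30, out_stem="arrangement"):
    """Build a full arrangement around a recorded melody and save it as WAV."""
    # Imported lazily: these pull in torch and the model code.
    import torchaudio
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    # Melody conditioning needs the melody checkpoint, not small/medium/large.
    model = MusicGen.get_pretrained("facebook/musicgen-melody")
    model.set_generation_params(duration=seconds)

    melody, sample_rate = torchaudio.load(melody_path)  # [channels, samples]
    # generate_with_chroma expects a batch: one melody per description.
    wav = model.generate_with_chroma([description], melody[None], sample_rate)

    # audio_write appends the .wav extension and returns the written path.
    return audio_write(out_stem, wav[0].cpu(), model.sample_rate,
                       strategy="loudness")
```

Something like arrange_melody("riff.wav", "full band arrangement with drums and bass") would be the call for the guitar riff example, with riff.wav standing in for whatever you recorded.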

Integrating Into Your Homelab Workflow

If you’re running Home Assistant, you can trigger music generation via automations. Recording a video? Webhook call to MusicGen, grab the output file, drop it into your editing timeline.

Throw it behind Traefik with basic auth if you want to expose it safely:

labels:
  - "traefik.enable=true"
  - "traefik.http.routers.musicgen.rule=Host(`musicgen.yourdomain.com`)"
  - "traefik.http.routers.musicgen.entrypoints=websecure"
  - "traefik.http.routers.musicgen.tls.certresolver=letsencrypt"
  - "traefik.http.middlewares.musicgen-auth.basicauth.users=admin:$$apr1$$..."
  - "traefik.http.routers.musicgen.middlewares=musicgen-auth"

Now you’ve got a private music generation API accessible from anywhere. Integrate it with your media server, automation engine, whatever. Gradio exposes the generate function as an API endpoint, so the flow is dead simple: send a description, get back audio.
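
Here is a hedged sketch of what a client for that flow could look like, using the gradio_client package against the local UI. Treat the details as assumptions: the "/generate" endpoint name is what Gradio typically derives from the function name, the result is assumed to come back as a path to a temporary audio file, and the timeline folder and helper names are invented. It also talks to localhost directly, so it does not handle the Traefik basic auth.

```python
import re
import shutil
from pathlib import Path


def slug(description):
    """Turn a prompt into a safe file name, e.g. for an editing timeline."""
    return re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")


def fetch_track(description, out_dir="timeline", base_url="http://localhost:7860"):
    """Ask the running MusicGen UI for a track and copy it into out_dir."""
    from gradio_client import Client  # pip install gradio_client

    client = Client(base_url)
    # Assumption: the endpoint is named after the generate() function and
    # the call returns the path of a downloaded temporary audio file.
    result_path = client.predict(description, api_name="/generate")

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    target = Path(out_dir) / f"{slug(description)}{Path(result_path).suffix}"
    shutil.copy(result_path, target)
    return target
```

A Home Assistant automation or a cron job can call fetch_track("minimal tech ambience") and pick the file up from the timeline folder.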

The Honest Limitations

MusicGen isn’t a replacement for hiring a musician. The models occasionally hallucinate weird artifacts. Sometimes descriptions don’t translate how you expect. Long descriptions don’t always help — short, specific prompts work better.

And yeah, you need GPU compute. This isn’t a Raspberry Pi job. But if you already have a homelab with compute, the marginal cost of running this is basically free.

One licensing caveat: the audiocraft code is MIT-licensed, but Meta released the MusicGen model weights under CC-BY-NC 4.0, which is a non-commercial license. Read those terms before you put generated tracks into monetized content. The upside over subscription services still holds: you’re not renting access, you’re running the tool.

Why This Matters

Every month you’re paying Epidemic Sound, you’re paying for the privilege of not owning anything. MusicGen is the opposite. You download it once, run it forever, own every output.

If you make any kind of content — YouTube, Twitch, podcasts, game dev — you should have this running. The time savings alone (no more searching stock libraries) pays for the homelab electricity in a month.

Set it up this weekend, generate your first track, and realize how absurd it is that we’ve been paying monthly subscriptions for something this simple.
