HappyHorse 1.0 Open Source Guide: How to Install, Run & Fine-Tune the #1 AI Video Model
HappyHorse 1.0 is the first #1-ranked AI video model to go fully open source with commercial rights. This guide walks you through installation, configuration, fine-tuning for your brand, and deployment, whether self-hosted, cloud-based, or via managed platforms.
What's Included in the Open Source Release
When you get HappyHorse 1.0 open-source, you're getting a production-ready AI video generation system with all the components needed to build commercial video applications.
Base Model Weights (15B Parameters)
Full model with 15 billion parameters. The core AI trained on 2M+ video-text pairs.
Distilled Model (8-Step)
Optimized for speed with 8 inference steps instead of 50. 10x faster but slightly lower quality.
Super-Resolution Module
Upscales generated videos from 256p to 4K. Essential for professional output.
Inference Code
Optimized PyTorch code for generation, with batch processing and memory optimization.
Python SDK
Simple API for text-to-video, image-to-video, and batch generation workflows.
REST API Server
FastAPI server for running HappyHorse as a service. Deploy locally or to cloud.
Commercial License
Full commercial rights for all generated videos. No attribution required.
Technical Documentation
Detailed guides for installation, fine-tuning, deployment, and troubleshooting.
Hardware Requirements
Minimum Setup
- NVIDIA A100 (40GB) or H100 (40GB minimum)
- 256GB system RAM
- 500GB SSD storage for models
- CUDA 12.1+, cuDNN 9.0+
- 1080p output: ~38 seconds per video
Recommended Setup
- NVIDIA H100 (80GB) or 2x A100 (80GB total)
- 512GB system RAM
- 1TB NVMe SSD
- CUDA 12.1+, cuDNN 9.0+
- 1080p output: ~15 seconds per video
- FP8 quantization support
FP8 Quantization Tip
Use FP8 quantization (torch.float8_e4m3fn) to reduce memory by 50% with minimal quality loss. This allows running on A100 40GB instead of requiring H100 80GB.
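As a rough sanity check on that claim, you can estimate the weight footprint at each precision (a back-of-the-envelope sketch assuming the 15B parameter count; activations, the KV cache, and framework overhead are ignored):

```python
# Back-of-the-envelope weight memory for a 15B-parameter model.
PARAMS = 15_000_000_000

def weight_gb(bytes_per_param: int) -> float:
    """Model weight footprint in GB for a given per-parameter precision."""
    return PARAMS * bytes_per_param / 1e9

fp16_gb = weight_gb(2)  # float16: 2 bytes per parameter
fp8_gb = weight_gb(1)   # float8_e4m3fn: 1 byte per parameter

print(f"FP16 weights: {fp16_gb:.0f} GB")  # 30 GB
print(f"FP8 weights:  {fp8_gb:.0f} GB")   # 15 GB
```

The 30GB-to-15GB drop is exactly the 50% reduction quoted above, which is why the FP8 model fits comfortably on a 40GB A100.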
Step-by-Step Installation Guide
Prerequisites
- NVIDIA GPU with minimum 40GB VRAM (A100, H100, or RTX 6000 Ada)
- CUDA 12.1+ and cuDNN 9.0+ installed
- Python 3.10 or 3.11
- git and pip package manager
- At least 500GB free disk space
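The software side of these prerequisites is easy to check before you start. This pre-flight script is our own sketch, not part of the official installer; the thresholds come from the list above:

```python
import shutil
import sys

def python_ok(version_info=sys.version_info) -> bool:
    """HappyHorse supports Python 3.10 or 3.11."""
    return (version_info[0], version_info[1]) in {(3, 10), (3, 11)}

def tools_ok() -> bool:
    """git and pip must be on PATH."""
    return all(shutil.which(tool) for tool in ("git", "pip"))

def disk_ok(path: str = ".", needed_gb: int = 500) -> bool:
    """At least 500GB free disk space for model weights."""
    return shutil.disk_usage(path).free / 1e9 >= needed_gb

if __name__ == "__main__":
    print("Python:", python_ok(), "| tools:", tools_ok(), "| disk:", disk_ok())
```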
1. Clone the Repository
Get the official HappyHorse code from GitHub.
```shell
git clone https://github.com/happyhorse-ai/happyhorse-1.0.git && cd happyhorse-1.0
```
2. Create Virtual Environment
Isolate dependencies in a Python virtual environment.
```shell
python3.10 -m venv venv && source venv/bin/activate
```
3. Install PyTorch with CUDA Support
Install PyTorch built for your CUDA version.
```shell
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
4. Install HappyHorse Dependencies
Install the required libraries and the HappyHorse package.
```shell
pip install -r requirements.txt && pip install -e .
```
5. Download Model Weights
Download the 15B base model and distilled model from Hugging Face.
```shell
python -m happyhorse.download_models --model-size all
```
- Base model: ~30GB (15B parameters)
- Distilled model: ~15GB (8-step inference)
- Super-resolution module: ~2GB
- Models are cached in ~/.cache/huggingface/hub
6. Verify Installation
Test that everything works with a simple inference.
```shell
python -c "from happyhorse import HappyHorseModel; print('Installation successful!')"
```
Basic Usage: Python Example
```python
import torch
from PIL import Image

from happyhorse import HappyHorseModel

# Load the model
model = HappyHorseModel.from_pretrained(
    "happy-horse/happyhorse-1.0",
    device="cuda",
    dtype=torch.float8_e4m3fn  # FP8 quantization
)

# Generate video from text
prompt = "A woman in a blue dress holding our skincare product, smiling at the camera"
video, audio = model.generate(
    prompt=prompt,
    duration_seconds=5,
    fps=24,
    aspect_ratio="16:9",
    height=1080
)

# Save output
video.save("output.mp4")
audio.save("output.wav")

# Generate video with image conditioning
image = Image.open("product_image.jpg")
video_from_image, audio = model.generate(
    image=image,
    prompt="Show the product features, zoom in on the packaging",
    duration_seconds=8,
    fps=24
)

# Batch generation for multiple scripts
scripts = [
    "Woman in gym holding protein powder",
    "Man at home desk with laptop",
    "Group of friends laughing with phone"
]
for i, script in enumerate(scripts):
    video, audio = model.generate(prompt=script, duration_seconds=5)
    video.save(f"video_{i}.mp4")
```
Key Features Deep Dive
Text-to-Video Generation
Generate videos directly from text prompts. Perfect for quick iterations and A/B testing.
- Prompt length: 10-500 characters
- Duration: 2-30 seconds
- FPS: 12-60 (default 24)
- Resolution: 256p to 4K (with super-resolution)
- Aspect ratios: 9:16, 16:9, 1:1, 4:5 supported
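These limits are easy to enforce before spending GPU time on a request. The validator below is our own sketch; the parameter names mirror the SDK's `generate` call, but the checks themselves are not part of the library:

```python
VALID_ASPECT_RATIOS = {"9:16", "16:9", "1:1", "4:5"}

def validate_request(prompt: str, duration_seconds: int = 5,
                     fps: int = 24, aspect_ratio: str = "16:9") -> list[str]:
    """Return a list of constraint violations (empty if the request is valid)."""
    errors = []
    if not 10 <= len(prompt) <= 500:
        errors.append("prompt must be 10-500 characters")
    if not 2 <= duration_seconds <= 30:
        errors.append("duration must be 2-30 seconds")
    if not 12 <= fps <= 60:
        errors.append("fps must be 12-60")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        errors.append(f"aspect_ratio must be one of {sorted(VALID_ASPECT_RATIOS)}")
    return errors
```

Running `validate_request("too short")` flags the nine-character prompt, while a well-formed request returns an empty list.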
Image-to-Video Generation
Condition generation on a product image or reference photo. Creates dynamic videos from static images.
- Input: PNG/JPG images (any resolution)
- Output: 5-30 second videos
- Maintains composition while adding motion
- Great for product showcases and unboxing content
Audio-Video Synchronization
Auto-generate or sync with existing audio. Lip-sync happens automatically with speech detection.
- Automatic lip-sync for 175+ languages
- Supports uploaded audio files or text-to-speech
- Detects speech and synchronizes mouth movements
- No manual timing required
Batch Processing
Generate multiple videos efficiently in a single call. Perfect for scaling campaigns.
- Process 50+ videos in parallel
- Automatic queue management
- GPU memory optimization
- Progress tracking and resumable batches
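The SDK manages its queue internally, but if you are scripting large jobs yourself, a small chunking helper keeps submission sizes predictable (our own sketch; `batch_size` here is not an SDK parameter):

```python
from typing import Iterator

def chunked(prompts: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield successive batches of prompts for sequential submission."""
    for i in range(0, len(prompts), batch_size):
        yield prompts[i:i + batch_size]

# Example: 7 prompts submitted in batches of 3 -> batch sizes [3, 3, 1]
batches = list(chunked([f"scene {n}" for n in range(7)], 3))
print([len(b) for b in batches])  # [3, 3, 1]
```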
Fine-Tuning with LoRA
Customize the model with your brand style without full retraining.
- LoRA rank: 8-128 (64 recommended)
- Training time: 2-8 hours on H100
- Memory efficient: fits on a single 40GB GPU
- Preserves base model quality
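The memory efficiency follows from the parameter count: a rank-r adapter on a d_out x d_in weight matrix trains only r x (d_in + d_out) values instead of d_out x d_in. A rough illustration (the 4096x4096 layer size is hypothetical, not HappyHorse's actual architecture):

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one LoRA-adapted matrix: A (rank x d_in) plus B (d_out x rank)."""
    return rank * d_in + d_out * rank

full = 4096 * 4096                     # full fine-tune of one 4096x4096 matrix
lora = lora_params(4096, 4096, 64)     # rank-64 adapter on the same matrix
print(f"full={full}, lora={lora}")     # the adapter is ~3% of the full matrix
```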
Fine-Tuning Guide: Brand Customization
While HappyHorse is excellent out-of-the-box, fine-tuning allows you to specialize it for your brand's specific style, products, and visual language. This takes 2-8 hours of GPU time and significantly improves output consistency.
When to Fine-Tune Your Model
- You have a distinctive brand style (color palette, lighting, composition)
- You need consistent product demonstrations or unboxing videos
- You're generating 50+ videos per month for the same brand
- You want to match specific spokesperson aesthetics or brand ambassadors
- You need multilingual content in your brand's visual style
LoRA Fine-Tuning Code Example
```python
from happyhorse import LoRATrainer

# Prepare training data
train_dataset = {
    "images": ["brand_img_1.jpg", "brand_img_2.jpg"],
    "captions": [
        "Woman holding blue cosmetic bottle in bright lighting",
        "Product closeup showcasing glass packaging"
    ]
}

# Initialize LoRA trainer
trainer = LoRATrainer(
    model="happy-horse/happyhorse-1.0",
    lora_rank=64,
    learning_rate=1e-4,
    num_epochs=10,
    batch_size=4
)

# Train with your brand data
trainer.train(
    images=train_dataset["images"],
    captions=train_dataset["captions"],
    output_dir="./lora_checkpoints"
)

# Use the fine-tuned model
# (assumes `model` was loaded as in the Basic Usage example above)
model.load_lora("./lora_checkpoints/final")
video, audio = model.generate(
    prompt="Woman in office with our branded product",
    duration_seconds=5
)
video.save("branded_output.mp4")
```
Training Data Requirements
- Minimum Data: 10-20 high-quality images with detailed captions
- Recommended Data: 50-100 images spanning different product angles, lighting, and contexts
- Image Format: PNG or JPG, any resolution (auto-resized to 768x768)
- Captions: Detailed 20-50 word descriptions of each image (what you see, action, style)
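The 20-50 word caption guideline is worth checking programmatically before a multi-hour training run. A minimal lint pass, assuming a simple whitespace word count (our own sketch, not a trainer feature):

```python
def caption_issues(caption: str, min_words: int = 20, max_words: int = 50) -> str | None:
    """Flag captions outside the recommended 20-50 word range; None means OK."""
    n = len(caption.split())
    if n < min_words:
        return f"too short ({n} words)"
    if n > max_words:
        return f"too long ({n} words)"
    return None

# Lint an entire caption list before training
captions = ["Product closeup showcasing glass packaging"]
for i, c in enumerate(captions):
    issue = caption_issues(c)
    if issue:
        print(f"caption {i}: {issue}")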
Compute Requirements for Fine-Tuning
LoRA fine-tuning requires an A100 40GB or an H100 with at least 10GB of memory available for the training pass. Training on 100 images takes 4-6 hours on an H100 or 8-10 hours on an A100 40GB. You can use smaller GPUs by reducing the batch size from 4 to 1, which adds 2-3 hours.
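Those timings fold into a rough planning estimator. The scaling assumptions, linear in image count and a flat penalty for batch size 1, are ours; the base figures are the midpoints of the ranges quoted above:

```python
def estimated_hours(num_images: int, gpu: str = "H100", batch_size: int = 4) -> float:
    """Rough LoRA training-time estimate.

    Assumes ~5h per 100 images on H100 and ~9h on A100 40GB (midpoints of
    the quoted ranges), plus ~2.5h extra when batch size drops from 4 to 1.
    """
    base_per_100 = {"H100": 5.0, "A100": 9.0}[gpu]
    hours = base_per_100 * num_images / 100
    if batch_size == 1:
        hours += 2.5
    return hours

print(estimated_hours(100, "H100"))             # 5.0
print(estimated_hours(100, "A100", batch_size=1))  # 11.5
```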
Deployment Options
Local Deployment
Run on your own GPU machine. Best for development and testing.
AWS Deployment
Launch on EC2 with g4dn or p3 instances. Use ECS for containerization.
Google Cloud (GCP)
Deploy on Compute Engine or use Vertex AI. A100 GPUs available on-demand.
Microsoft Azure
Use N-series VMs with H100 or A100. Integrated with Azure ML for scaling.
Paperspace / Lambda Labs
GPU cloud platforms pre-optimized for ML. Simple setup, pay-per-hour.
Docker Containerization
```dockerfile
# Dockerfile
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y \
    python3.10 python3-pip git \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip3 install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python3", "-m", "happyhorse.server", "--host", "0.0.0.0", "--port", "8000"]
```
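Once the container is up, clients reach the FastAPI server over plain HTTP on port 8000. The endpoint path and payload field names below are assumptions for illustration only; check the API reference for the real contract:

```python
import json
import urllib.request

def build_payload(prompt: str, duration_seconds: int = 5, fps: int = 24) -> dict:
    """Assemble a generation request body (field names are hypothetical)."""
    return {"prompt": prompt, "duration_seconds": duration_seconds, "fps": fps}

def submit(payload: dict, host: str = "http://localhost:8000") -> bytes:
    """POST the request to the server; the /generate path is an assumption."""
    req = urllib.request.Request(
        f"{host}/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

if __name__ == "__main__":
    print(build_payload("Woman in gym holding protein powder"))
```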
```text
# requirements.txt
torch==2.1.0
torchvision==0.16.0
happyhorse==1.0.0
fastapi==0.104.1
uvicorn==0.24.0
python-multipart==0.0.6
pillow==10.1.0
```
Comparison: Self-Hosted vs API vs UGCFast
| Aspect | Self-Hosted | HappyHorse API | UGCFast Platform |
|---|---|---|---|
| Setup Complexity | High (GPU, CUDA, dependencies) | Low (API key only) | None (web interface) |
| GPU Cost | $3,000-8,000 upfront | $0 upfront | Included in subscription |
| Cost per Video | $0.50-2.00 (electricity only) | $1-5 per video | $0.30-1.50 (volume-dependent) |
| Monthly for 100 Videos | $50-200 (electricity) | $100-500 | $30-150 |
| Latency | 2-40 seconds | 5-60 seconds | Instant (queued) |
| Batch Processing | Unlimited | Limited by rate limits | Built-in, 300+ concurrent |
| Fine-Tuning | Fully supported | Limited or unavailable | Managed fine-tuning |
| Maintenance | You handle updates, backups | Vendor handles | Fully managed |
| Best For | High-volume production, custom workflows | Low-volume, no infrastructure | Growing brands, managed simplicity |
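Using midpoints from the table, you can estimate the monthly volume at which self-hosting undercuts the API. This is a toy model: the per-video figures come from the table, but the $5,500 GPU price and 24-month amortization window are our assumptions:

```python
def monthly_cost_self_hosted(videos: int, gpu_price: float = 5500.0,
                             amortize_months: int = 24,
                             per_video: float = 1.25) -> float:
    """GPU amortized over `amortize_months`, plus per-video electricity."""
    return gpu_price / amortize_months + per_video * videos

def monthly_cost_api(videos: int, per_video: float = 3.0) -> float:
    """Pay-per-video API pricing, midpoint of the $1-5 range."""
    return per_video * videos

# Smallest monthly volume where self-hosting beats the API
breakeven = next(v for v in range(1, 10_000)
                 if monthly_cost_self_hosted(v) < monthly_cost_api(v))
print(breakeven)  # 131
```

Under these assumptions, self-hosting starts paying off at roughly 131 videos per month, which lines up with the table's "high-volume production" recommendation.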
Community & Resources
GitHub Repository
Official source code, issues, and discussions. File bugs and contribute.
Hugging Face Model Hub
Pre-trained weights, model cards, and community discussions.
Technical Report
Full paper with architecture details, training methodology, and benchmarks.
Discord Community
Real-time support, tips from other users, and announcements.
Documentation Site
API reference, troubleshooting guides, and best practices.
Jupyter Notebooks
Interactive examples for common workflows and use cases.
Ready to Generate AI Videos?
Whether you choose self-hosted HappyHorse or prefer a managed platform, start creating professional video content today.
Get Started Free. No commitment. Cancel anytime. Starting at $29/month after trial.