Overview
Strain Music Videos are fully produced AI music videos — one for every cannabis strain in the High IQ database. Each video stars Professor High, our cartoon cannabis scientist, as the main character living out the strain’s story: how it changes his day, his vibe, his world. The visuals match the song’s genre — G-Funk gets lowriders and palm trees, Dream Pop gets ethereal forests, Stoner Metal gets dark stages and smoke. Every video is generated automatically from the strain’s existing Lyria 3 AI song and Professor High’s mascot image. The result is a genre-appropriate 3-minute music video that captures the character of each strain in a way that text and photos alone cannot.

Strain Music Videos are built on top of the Strain Music Pipeline. A strain must have an AI-generated song before a video can be created.
How It Works
The pipeline has 6 stages that transform a strain’s existing song and mascot image into a complete music video:

Parse Lyrics
The strain’s timestamped lyrics are parsed into a structured timeline of sections (intro, verse, chorus, bridge, outro) with precise start and end times. Each section is split into individual clips of 10–20 seconds — matching how real music videos are edited, with multiple camera angles per verse.
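The 10–20 second split rule can be sketched as a pure function. This is an illustrative sketch, not the pipeline's actual parser: the `Clip` shape and `splitSection` name are assumptions, and the rule shown (fewest clips such that none exceeds 20 seconds) is one reasonable reading of the description above.

```typescript
// Hypothetical sketch of the Stage 1 split rule: divide one lyric section
// into clips of roughly 10–20 seconds each. Names are illustrative.
interface Clip {
  start: number; // seconds into the song
  end: number;
}

function splitSection(start: number, end: number): Clip[] {
  const duration = end - start;
  // Fewest clips such that each clip is at most 20 s long. Very short
  // sections (e.g. a 6 s intro) stay as a single clip.
  const count = Math.max(1, Math.ceil(duration / 20));
  const clipLength = duration / count;
  const clips: Clip[] = [];
  for (let i = 0; i < count; i++) {
    clips.push({ start: start + i * clipLength, end: start + (i + 1) * clipLength });
  }
  return clips;
}
```

A 45-second verse splits into three 15-second clips, giving the editor multiple camera angles within the same section.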
Scene Director
An AI scene director (Gemini Flash Lite) reads the lyrics, music description, strain metadata, and genre to write a shot list. Each clip gets a scene description, camera direction, visual mood, and a shot type — either a performance shot (Professor High rapping or singing to camera) or a story shot (Professor High living the strain experience).
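The shot list described above might look like the following TypeScript shape. Field names here are assumptions based on this description, not the pipeline's real schema:

```typescript
// Illustrative shape for one shot-list entry produced by the scene director.
type ShotType = "performance" | "story";

interface Shot {
  clipIndex: number;
  shotType: ShotType; // performance: Professor High sings to camera; story: he lives the strain experience
  scene: string;      // scene description
  camera: string;     // camera direction, e.g. "slow dolly in"
  mood: string;       // visual mood keywords
}

// Performance shots are the ones Stage 5 will later lip-sync.
function performanceShots(shots: Shot[]): Shot[] {
  return shots.filter((s) => s.shotType === "performance");
}
```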
Scene Images
For each clip in the shot list, Gemini generates a 16:9 scene image (2048×1152) placing Professor High in the described setting. The Professor High reference image is included in every generation call to maintain character consistency. The first clip always starts from the strain’s existing mascot image to anchor the video in that strain’s visual identity.
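The per-clip image inputs reduce to a small rule: every call carries the reference image, and clip 0 additionally carries the strain's mascot image. A minimal sketch, with an illustrative function name:

```typescript
// Sketch of which images accompany each generation call. The reference
// path comes from this document; the function itself is illustrative.
const REFERENCE_IMAGE = "mascot-images/reference/professor-high.png";

function imageInputs(clipIndex: number, mascotUrl: string): string[] {
  const inputs = [REFERENCE_IMAGE];            // character consistency in every call
  if (clipIndex === 0) inputs.push(mascotUrl); // anchor the opening frame in the strain's identity
  return inputs;
}
```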
Video Clips
Each scene image is animated into a video clip using fal.ai LTX-2.3 image-to-video. The clip’s prompt includes the scene description and camera direction (tracking shot, slow zoom, pan, etc.). Clips are generated in parallel with a concurrency limit of 5, so a 13-clip video runs at most 5 generations at once.
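The concurrency-limited fan-out is a standard worker-pool pattern. A minimal sketch of the idea (not the pipeline's actual implementation):

```typescript
// Minimal concurrency-limited map: run `fn` over `items` with at most
// `limit` promises in flight at once, preserving result order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker() {
    // JS is single-threaded, so `next++` before the await is race-free.
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```

With `limit = 5`, a 13-clip shot list keeps five generations running until the queue drains.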
Lip Sync
Performance shots — where Professor High is rapping or singing — get lip sync applied. The exact audio segment for that clip is extracted from the full song track and synced to the video clip using WaveSpeedAI. Story shots pass through untouched.
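Extracting the exact audio segment for a clip is a straightforward ffmpeg seek-and-trim. The command shape below is an assumption about how this step could be done, not the pipeline's verbatim invocation:

```typescript
// Sketch of ffmpeg arguments to cut a clip's audio segment out of the
// full song track before lip sync. Illustrative, not the real task code.
function extractAudioArgs(song: string, start: number, end: number, out: string): string[] {
  return [
    "-ss", start.toFixed(3),        // seek to the clip's start time
    "-t", (end - start).toFixed(3), // clip duration
    "-i", song,
    "-vn",                          // drop any video stream, keep audio only
    out,
  ];
}
```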
Compose and Upload
All clips are stitched together in order using ffmpeg with short crossfade transitions. The original Lyria 3 song is overlaid as the audio track, a thumbnail is extracted from the first chorus, and the final MP4 is uploaded to Supabase Storage. The strain’s record is updated with the video URL.
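Crossfading N clips with ffmpeg is typically done with a chain of `xfade` filters, where each transition's `offset` is the running total of prior clip durations minus the accumulated fade overlap. A sketch of building such a filtergraph, assuming a simple fade transition (the real compose step may differ):

```typescript
// Build an ffmpeg xfade filtergraph chaining N video inputs with short
// crossfades. `durations` are the clip lengths in seconds.
function crossfadeGraph(durations: number[], fade = 0.5): string {
  const parts: string[] = [];
  let offset = 0;
  let prev = "[0:v]";
  for (let i = 1; i < durations.length; i++) {
    // Each xfade starts `fade` seconds before the combined stream ends.
    offset += durations[i - 1] - fade;
    const out = i === durations.length - 1 ? "[v]" : `[x${i}]`;
    parts.push(
      `${prev}[${i}:v]xfade=transition=fade:duration=${fade}:offset=${offset.toFixed(2)}${out}`,
    );
    prev = out;
  }
  return parts.join(";");
}
```

For three 10-second clips this yields transitions at offsets 9.50 and 19.00, so the composed video is slightly shorter than the sum of its clips.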
Pipeline Architecture
Location
packages/trigger/src/tasks/strain-music/.
Stage Summary
| Stage | Task | Model / Tool | Purpose |
|---|---|---|---|
| 1 — Parse Lyrics | strain-video-parse-lyrics | Pure function | Parse audio_lyrics into timed clip timeline |
| 2 — Scene Director | strain-video-scene-director | Gemini 3.1 Flash Lite | Write shot list matching genre and lyrics |
| 3 — Scene Images | strain-video-scene-images | Gemini 3.1 Flash | Generate 16:9 scene images with Professor High |
| 4 — Video Clips | strain-video-clips | fal.ai LTX-2.3 | Animate scene images into video clips |
| 5 — Lip Sync | strain-video-lip-sync | WaveSpeedAI LipSync | Apply lip sync to performance shots |
| 6 — Compose | strain-video-compose | ffmpeg | Stitch clips, overlay audio, upload |
Provider-Agnostic Design
Both video generation and lip sync use swappable provider interfaces. The pipeline does not depend on any specific model — providers can be changed without modifying the pipeline stages. Launch providers:

- Video: fal.ai LTX-2.3 image-to-video
- Lip Sync: WaveSpeedAI LTX 2.3 LipSync (or LatentSync on Replicate)
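In TypeScript terms, the swappable design amounts to the stages depending on interfaces rather than concrete clients. The interface and method names below are illustrative assumptions, not the codebase's real definitions:

```typescript
// Illustrative provider interfaces for the swappable design described above.
interface VideoProvider {
  name: string;
  // Animate a scene image into a clip; returns the clip URL.
  generateClip(imageUrl: string, prompt: string, durationSec: number): Promise<string>;
}

interface LipSyncProvider {
  name: string;
  // Sync a clip to an audio segment; returns the lip-synced clip URL.
  sync(videoUrl: string, audioUrl: string): Promise<string>;
}

// A stage written against the interface works with any implementation:
async function animateAll(provider: VideoProvider, scenes: { image: string; prompt: string }[]) {
  return Promise.all(scenes.map((s) => provider.generateClip(s.image, s.prompt, 15)));
}
```

Swapping fal.ai for a self-hosted LTX-2.3 instance, or WaveSpeedAI for LatentSync on Replicate, then means passing a different implementation rather than editing the stages.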
Character Consistency
Professor High’s visual identity is maintained across all 6 stages by including the Professor High reference image (mascot-images/reference/professor-high.png) in every image and video generation call — the same pattern the music pipeline uses for multimodal context. The strain’s mascot image serves as the opening frame so each video begins in the strain’s established visual world.
Running the Pipeline
Single Strain (Dashboard)
Go to the Trigger.dev dashboard and run the strain-video-pipeline task with the strain slug as input.
Batch Processing
Run strain-video-batch to process multiple strains. Pass specific slugs or a batch size:
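The payload shapes below are hypothetical illustrations of the two input styles; the actual field names and slugs should be checked against the task's schema in packages/trigger/src/tasks/strain-music/:

```typescript
// Hypothetical strain-video-batch payloads; field names are assumptions.
const bySlugs = { slugs: ["blue-dream", "og-kush"] }; // process specific strains
const bySize = { batchSize: 10 };                     // process the next 10 strains without videos
```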
Environment Variables
| Variable | Required | Purpose |
|---|---|---|
| FAL_KEY | Yes | fal.ai API key for video generation |
| WAVESPEED_API_KEY | Yes | WaveSpeedAI key for lip sync |
| REPLICATE_API_TOKEN | Optional | Replicate key (alternative lip sync provider) |
Mock Providers for Testing
Set these environment variables to use mock providers that skip external API calls and return placeholder files.

Preview Mode
Preview mode stores every intermediate artifact for each strain so you can review and tune the output before committing to production:

| Path | Content |
|---|---|
| preview/{slug}/scenes.json | Full scene breakdown from Stages 1–2 — shot list, timing, shot types |
| preview/{slug}/images/scene_{index}.png | Generated scene image for each clip |
| preview/{slug}/clips/clip_{index}.mp4 | Individual video clip before lip sync |
| preview/{slug}/clips/clip_{index}_lipsync.mp4 | Lip-synced version of performance clips |
Database and Storage
Supabase Storage
Bucket: strain-videos
| Path Pattern | Content |
|---|---|
| {slug}_v{version}.mp4 | Final composed video |
| thumbnails/{slug}_v{version}.jpg | Thumbnail / poster frame |
| temp/{slug}/clip_{index}.mp4 | Temporary clips (cleaned up after compose) |
| preview/{slug}/... | Preview mode intermediates |
Database Columns (strains_v2)
| Column | Type | Description |
|---|---|---|
| video_url | text | Final video URL in Supabase Storage |
| video_duration_seconds | numeric | Duration in seconds (~180s, matching audio) |
| video_version | integer | Increments on each regeneration |
| video_generated_at | timestamptz | Timestamp of last generation |
| video_thumbnail_url | text | Poster frame URL for video embeds |
| video_scenes_json | jsonb | Full scene breakdown for debugging and regeneration |
API
The strain complete endpoint includes videoUrl and videoThumbnailUrl when a video exists:
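Only the two video fields are documented here; the rest of the response shape is assumed. A hedged sketch of how a client might model and check them:

```typescript
// Partial, illustrative view of the strain complete response; only the
// two video fields are documented, everything else is omitted.
interface StrainCompleteVideoFields {
  videoUrl?: string;          // present once a video has been generated
  videoThumbnailUrl?: string; // poster frame for embeds
}

function hasVideo(strain: StrainCompleteVideoFields): boolean {
  return typeof strain.videoUrl === "string" && strain.videoUrl.length > 0;
}
```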
Cost Model
Each video costs approximately $5.00 to generate using hosted APIs. Self-hosting LTX-2.3 is a planned future option to reduce costs significantly.

| Stage | Tool | Cost per Video |
|---|---|---|
| Parse Lyrics | Pure function | $0.00 |
| Scene Director | Gemini 3.1 Flash Lite | ~$0.01 |
| Scene Images | Gemini 3.1 Flash (13 images) | ~$0.13 |
| Video Clips | fal.ai LTX-2.3 (13 clips) | ~$1.50–3.00 |
| Lip Sync | WaveSpeedAI (~6 clips) | ~$1.80 |
| Compose | ffmpeg (Trigger.dev compute) | ~$0.02 |
| Total | | ~$3.50–5.00 |
Clip count varies by song structure — a 3-minute song typically produces 12–14 clips depending on section lengths and how many splits the parser applies. The cost range above assumes 13 clips with ~6 performance shots.
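The table's arithmetic can be parameterized by clip count. The per-unit rates below are derived from the table's approximate figures (e.g. ~$1.80 for ~6 lip-synced shots implies ~$0.30 per shot) and should be treated as rough estimates, not billing data:

```typescript
// Rough per-video cost estimate from the table's approximate rates.
function estimateCost(clipCount: number, performanceCount: number): number {
  const director = 0.01;                   // Gemini Flash Lite shot list
  const images = clipCount * 0.01;         // ~$0.01 per scene image
  const clips = clipCount * (2.25 / 13);   // midpoint of ~$1.50–3.00 for 13 clips
  const lipSync = performanceCount * 0.30; // ~$1.80 for ~6 performance shots
  const compose = 0.02;                    // ffmpeg compute
  return director + images + clips + lipSync + compose;
}
```

At 13 clips with 6 performance shots this lands around $4.20, inside the ~$3.50–5.00 range above.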
Genre-to-Visual Style
The Scene Director maps each strain’s audio genre to a visual world. Examples:

| Genre | Visual Style |
|---|---|
| G-Funk | Lowriders, palm trees, 90s West Coast, warm amber light |
| Dream Pop | Ethereal forests, soft bokeh, pastels, floating elements |
| Stoner Metal | Dark stage, smoke machines, dramatic lighting |
| Trip-Hop | Urban nightscapes, neon, noir aesthetic |
| Neo-Soul | Warm interiors, golden hour, intimate vibes |
| Cloud Rap | Surreal landscapes, clouds, soft purple haze |
| Indie Pop | Bright colors, urban streets, playful energy |
Future Phases
Website Hero Integration
Replace the static mascot image on strain detail pages with the autoplaying music video as the hero visual.
YouTube Upload
Automatic upload to a dedicated strain music video YouTube channel after each video is generated.
Social Clips
Auto-generate 9:16 vertical crop edits from chorus sections for Instagram Reels and TikTok distribution.
Mobile Integration
Add the video to the strain detail carousel in the High IQ mobile app using expo-video.
Auto-Trigger
Fire video generation automatically as a fire-and-forget follow-on after the music pipeline completes.
Self-Hosted LTX-2.3
Run LTX-2.3 on dedicated GPU infrastructure to reduce per-video cost and improve generation speed.
Related Features
- Strain Music — The AI song pipeline that generates the audio each video is built on
- Strain Discovery — Browse and search the full strain database
- Label Scanner — Scan a product label to identify a strain and access its music video
