Overview

Strain Music Videos are fully produced AI music videos — one for every cannabis strain in the High IQ database. Each video stars Professor High, our cartoon cannabis scientist, as the main character living out the strain’s story: how it changes his day, his vibe, his world. The visuals match the song’s genre — G-Funk gets lowriders and palm trees, Dream Pop gets ethereal forests, Stoner Metal gets dark stages and smoke. Every video is generated automatically from the strain’s existing Lyria 3 AI song and Professor High’s mascot image. The result is a genre-appropriate 3-minute music video that captures the character of each strain in a way that text and photos alone cannot.
Strain Music Videos are built on top of the Strain Music Pipeline. A strain must have an AI-generated song before a video can be created.

How It Works

The pipeline has 6 stages that transform a strain’s existing song and mascot image into a complete music video:
1. Parse Lyrics

The strain’s timestamped lyrics are parsed into a structured timeline of sections (intro, verse, chorus, bridge, outro) with precise start and end times. Each section is split into individual clips of 10–20 seconds — matching how real music videos are edited, with multiple camera angles per verse.
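The 10–20 second splitting rule can be sketched as a small pure function. This is an illustrative reconstruction, not the pipeline's actual code; the name `splitSectionIntoClips` is hypothetical:

```typescript
interface ClipWindow {
  start: number; // seconds
  end: number;
}

// Sections at or under 20 s stay whole; longer sections are divided into the
// fewest equal parts that keep every clip at or under 20 s.
function splitSectionIntoClips(start: number, end: number): ClipWindow[] {
  const duration = end - start;
  if (duration <= 20) return [{ start, end }];
  const count = Math.ceil(duration / 20);
  const clipLen = duration / count;
  return Array.from({ length: count }, (_, i) => ({
    start: start + i * clipLen,
    end: i === count - 1 ? end : start + (i + 1) * clipLen,
  }));
}
```

A 45-second verse, for example, becomes three 15-second clips, matching the multiple-angles-per-verse editing style described above.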
2. Scene Director

An AI scene director (Gemini Flash Lite) reads the lyrics, music description, strain metadata, and genre to write a shot list. Each clip gets a scene description, camera direction, visual mood, and a shot type — either a performance shot (Professor High rapping or singing to camera) or a story shot (Professor High living the strain experience).
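The shot list can be pictured as a typed structure like the following. Field names are illustrative assumptions, not the pipeline's actual schema:

```typescript
type ShotType = "performance" | "story";

// Hypothetical shape of one shot-list entry produced by the scene director.
interface Shot {
  clipIndex: number;
  shotType: ShotType;
  scene: string;  // what Professor High is doing and where
  camera: string; // e.g. "slow dolly-in", "handheld tracking shot"
  mood: string;   // visual mood keywords for the image model
}

const exampleShot: Shot = {
  clipIndex: 3,
  shotType: "performance",
  scene: "Professor High raps to camera from the hood of a lowrider at sunset",
  camera: "slow dolly-in",
  mood: "warm amber light, 90s West Coast, haze",
};
```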
3. Scene Images

For each clip in the shot list, Gemini generates a 16:9 scene image (2048×1152) placing Professor High in the described setting. The Professor High reference image is included in every generation call to maintain character consistency. The first clip always starts from the strain’s existing mascot image to anchor the video in that strain’s visual identity.
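A minimal sketch of how a scene prompt might be assembled from a shot-list entry. The wording and field names are assumptions; the key detail from the text is that the Professor High reference image accompanies every generation call:

```typescript
interface SceneSpec {
  scene: string;
  camera: string;
  mood: string;
}

// Hypothetical prompt builder; the actual Gemini call would attach the
// mascot-images/reference/professor-high.png reference image alongside it.
function buildScenePrompt(spec: SceneSpec): string {
  return [
    "16:9 cinematic still, 2048x1152.",
    `Scene: ${spec.scene}.`,
    `Camera: ${spec.camera}.`,
    `Mood: ${spec.mood}.`,
    "Keep the character identical to the attached Professor High reference image.",
  ].join(" ");
}
```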
4. Video Clips

Each scene image is animated into a video clip using fal.ai LTX-2.3 image-to-video. The clip’s prompt includes the scene description and camera direction (tracking shot, slow zoom, pan, etc.). Clips are generated in parallel — a 13-clip video runs up to 5 at a time.
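The "up to 5 at a time" behavior is a standard bounded concurrency pool. A minimal self-contained sketch, assuming the stage manages its own parallelism rather than delegating to a queue:

```typescript
// Run `worker` over all items, keeping at most `limit` promises in flight.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  // Spawn min(limit, items.length) runners that pull from the shared cursor.
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, () => run()),
  );
  return results;
}
```

For a 13-clip video the stage would call something like `mapWithLimit(shots, 5, generateClip)`, where `generateClip` wraps the fal.ai image-to-video request.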
5. Lip Sync

Performance shots — where Professor High is rapping or singing — get lip sync applied. The exact audio segment for that clip is extracted from the full song track and synced to the video clip using WaveSpeedAI. Story shots pass through untouched.
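Extracting the exact audio segment is typically an ffmpeg seek-plus-duration cut. An illustrative argument builder (the helper name is hypothetical; `-ss` before `-i` performs fast input seeking):

```typescript
// Build ffmpeg arguments that cut [start, end) out of the full song track.
function audioSegmentArgs(
  songPath: string,
  start: number,
  end: number,
  outPath: string,
): string[] {
  return [
    "-ss", start.toFixed(3),        // seek to clip start
    "-t", (end - start).toFixed(3), // segment duration
    "-i", songPath,
    "-c:a", "aac",
    outPath,
  ];
}
```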
6. Compose and Upload

All clips are stitched together in order using ffmpeg with short crossfade transitions. The original Lyria 3 song is overlaid as the audio track, a thumbnail is extracted from the first chorus, and the final MP4 is uploaded to Supabase Storage. The strain’s record is updated with the video URL.
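The crossfade stitching maps naturally onto ffmpeg's `xfade` filter, where each transition's `offset` is the accumulated output length minus the fade. A sketch of the filter-graph construction (helper name and fade default are assumptions):

```typescript
// Chain n clips with xfade crossfades. durations[i] is clip i's length in
// seconds; each transition overlaps the clips by `fade` seconds.
function xfadeFilter(durations: number[], fade = 0.5): string {
  const parts: string[] = [];
  let prev = "[0:v]";
  let offset = durations[0] - fade; // where the first transition starts
  for (let i = 1; i < durations.length; i++) {
    const out = i === durations.length - 1 ? "[v]" : `[x${i}]`;
    parts.push(
      `${prev}[${i}:v]xfade=transition=fade:duration=${fade}:offset=${offset.toFixed(2)}${out}`,
    );
    prev = out;
    offset += durations[i] - fade; // accumulated length shrinks by one fade per join
  }
  return parts.join(";");
}
```

The resulting string is passed to ffmpeg as a `-filter_complex` argument; the song is then mapped in as the audio track in the same invocation.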

Pipeline Architecture

Location

packages/trigger/src/tasks/strain-video/
├── 01-parse-lyrics.ts
├── 02-scene-director.ts
├── 03-scene-images.ts
├── 04-video-clips.ts
├── 05-lip-sync.ts
├── 06-compose-upload.ts
├── orchestrator.ts
└── batch.ts
Same structure as the Strain Music Pipeline in packages/trigger/src/tasks/strain-music/.

Stage Summary

| Stage | Task | Model / Tool | Purpose |
| --- | --- | --- | --- |
| 1 — Parse Lyrics | strain-video-parse-lyrics | Pure function | Parse audio_lyrics into timed clip timeline |
| 2 — Scene Director | strain-video-scene-director | Gemini 3.1 Flash Lite | Write shot list matching genre and lyrics |
| 3 — Scene Images | strain-video-scene-images | Gemini 3.1 Flash | Generate 16:9 scene images with Professor High |
| 4 — Video Clips | strain-video-clips | fal.ai LTX-2.3 | Animate scene images into video clips |
| 5 — Lip Sync | strain-video-lip-sync | WaveSpeedAI LipSync | Apply lip sync to performance shots |
| 6 — Compose | strain-video-compose | ffmpeg | Stitch clips, overlay audio, upload |

Provider-Agnostic Design

Both video generation and lip sync use swappable provider interfaces. The pipeline does not depend on any specific model — providers can be changed without modifying the pipeline stages. Launch providers:
  • Video: fal.ai LTX-2.3 image-to-video
  • Lip Sync: WaveSpeedAI LTX 2.3 LipSync (or LatentSync on Replicate)
Future provider options: Veo 3.1, Kling 3.0, Runway Gen-4, self-hosted LTX-2.3
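A provider-agnostic design like this usually reduces to a small interface plus interchangeable implementations. An illustrative contract (type and member names are assumptions, not the repo's actual types):

```typescript
// Any image-to-video backend the pipeline can use must satisfy this contract.
interface VideoProvider {
  name: string;
  imageToVideo(imageUrl: string, prompt: string): Promise<string>; // clip URL
}

// A mock implementation that skips external calls entirely.
const mockVideoProvider: VideoProvider = {
  name: "mock",
  async imageToVideo() {
    return "https://example.com/placeholder.mp4";
  },
};
```

Stage 4 would receive whichever implementation the environment selects, so swapping fal.ai for Veo or a self-hosted model touches only the provider module.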

Character Consistency

Professor High’s visual identity is maintained across all 6 stages by including the Professor High reference image (mascot-images/reference/professor-high.png) in every image and video generation call — the same pattern the music pipeline uses for multimodal context. The strain’s mascot image serves as the opening frame so each video begins in the strain’s established visual world.

Running the Pipeline

Single Strain (Dashboard)

Go to the Trigger.dev dashboard and run the strain-video-pipeline task with the strain slug as input.
{ "slug": "blue-dream" }
Use Preview Mode during testing to store all intermediates for review:
{ "slug": "blue-dream", "preview": true }

Batch Processing

Run strain-video-batch to process multiple strains. Pass specific slugs or a batch size:
{ "slugs": ["blue-dream", "og-kush", "wedding-cake"] }
{ "batchSize": 10 }
When no slugs are provided, the batch task draws from strains that have audio but no video, ordered by popularity rank.
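The selection rule can be expressed as a pure function over strain rows. This in-memory sketch mirrors the described behavior; column names are illustrative, and the real task queries strains_v2 in Supabase:

```typescript
interface StrainRow {
  slug: string;
  audioUrl: string | null;
  videoUrl: string | null;
  popularityRank: number; // lower = more popular
}

// Strains with audio but no video, most popular first, capped at batchSize.
function selectBatch(rows: StrainRow[], batchSize: number): string[] {
  return rows
    .filter((r) => r.audioUrl !== null && r.videoUrl === null)
    .sort((a, b) => a.popularityRank - b.popularityRank)
    .slice(0, batchSize)
    .map((r) => r.slug);
}
```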

Environment Variables

| Variable | Required | Purpose |
| --- | --- | --- |
| FAL_KEY | Yes | fal.ai API key for video generation |
| WAVESPEED_API_KEY | Yes | WaveSpeedAI key for lip sync |
| REPLICATE_API_TOKEN | Optional | Replicate key (alternative lip sync provider) |

Mock Providers for Testing

Set these environment variables to use mock providers that skip external API calls and return placeholder files:
VIDEO_PROVIDER=mock
LIP_SYNC_PROVIDER=mock
Mock mode runs the full pipeline logic and stores all intermediates in Supabase — useful for testing the compose stage or debugging scene direction without incurring API costs.

Preview Mode

Preview mode stores every intermediate artifact for each strain so you can review and tune the output before committing to production:
| Path | Content |
| --- | --- |
| preview/{slug}/scenes.json | Full scene breakdown from Stages 1–2 — shot list, timing, shot types |
| preview/{slug}/images/scene_{index}.png | Generated scene image for each clip |
| preview/{slug}/clips/clip_{index}.mp4 | Individual video clip before lip sync |
| preview/{slug}/clips/clip_{index}_lipsync.mp4 | Lip-synced version of performance clips |
Preview artifacts are accessible via the API:
GET /api/v1/strains/slug/{slug}/video/preview

Database and Storage

Supabase Storage

Bucket: strain-videos
| Path Pattern | Content |
| --- | --- |
| {slug}_v{version}.mp4 | Final composed video |
| thumbnails/{slug}_v{version}.jpg | Thumbnail / poster frame |
| temp/{slug}/clip_{index}.mp4 | Temporary clips (cleaned up after compose) |
| preview/{slug}/... | Preview mode intermediates |

Database Columns (strains_v2)

| Column | Type | Description |
| --- | --- | --- |
| video_url | text | Final video URL in Supabase Storage |
| video_duration_seconds | numeric | Duration in seconds (~180s, matching audio) |
| video_version | integer | Increments on each regeneration |
| video_generated_at | timestamptz | Timestamp of last generation |
| video_thumbnail_url | text | Poster frame URL for video embeds |
| video_scenes_json | jsonb | Full scene breakdown for debugging and regeneration |

API

The strain complete endpoint includes videoUrl and videoThumbnailUrl when a video exists:
GET /api/v1/strains/slug/{slug}/complete
All strains with both audio and video are available from the music endpoint:
GET /api/v1/strains/music

Cost Model

Each video costs approximately $3.50–$5.00 to generate using hosted APIs. Self-hosting LTX-2.3 is a planned future option to reduce costs significantly.

| Stage | Tool | Cost per Video |
| --- | --- | --- |
| Parse Lyrics | Pure function | $0.00 |
| Scene Director | Gemini 3.1 Flash Lite | ~$0.01 |
| Scene Images | Gemini 3.1 Flash (13 images) | ~$0.13 |
| Video Clips | fal.ai LTX-2.3 (13 clips) | ~$1.50–3.00 |
| Lip Sync | WaveSpeedAI (~6 clips) | ~$1.80 |
| Compose | ffmpeg (Trigger.dev compute) | ~$0.02 |
| Total | | ~$3.50–5.00 |
Clip count varies by song structure — a 3-minute song typically produces 12–14 clips depending on section lengths and how many splits the parser applies. The cost range above assumes 13 clips with ~6 performance shots.
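The total can be sanity-checked with a back-of-envelope calculator using per-unit rates implied by the table ($0.01 per image, $0.115–0.23 per video clip, $0.30 per lip-synced clip); these rates are derived here for illustration, not authoritative pricing:

```typescript
// Returns an estimated [low, high] cost range in USD for one video.
function estimateVideoCost(
  clips: number,
  performanceClips: number,
): [number, number] {
  const fixed = 0.01 + 0.02;               // scene director + compose
  const images = clips * 0.01;             // ~$0.01 per scene image
  const lipSync = performanceClips * 0.3;  // ~$0.30 per lip-synced clip
  const videoLow = clips * 0.115;          // low end per video clip
  const videoHigh = clips * 0.23;          // high end per video clip
  return [
    fixed + images + lipSync + videoLow,
    fixed + images + lipSync + videoHigh,
  ];
}
```

With the table's assumptions (13 clips, 6 performance shots) this yields roughly $3.46–$4.95, matching the quoted ~$3.50–5.00 range.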

Genre-to-Visual Style

The Scene Director maps each strain’s audio genre to a visual world. Examples:
| Genre | Visual Style |
| --- | --- |
| G-Funk | Lowriders, palm trees, 90s West Coast, warm amber light |
| Dream Pop | Ethereal forests, soft bokeh, pastels, floating elements |
| Stoner Metal | Dark stage, smoke machines, dramatic lighting |
| Trip-Hop | Urban nightscapes, neon, noir aesthetic |
| Neo-Soul | Warm interiors, golden hour, intimate vibes |
| Cloud Rap | Surreal landscapes, clouds, soft purple haze |
| Indie Pop | Bright colors, urban streets, playful energy |

Future Phases

Website Hero Integration

Replace the static mascot image on strain detail pages with the autoplaying music video as the hero visual.

YouTube Upload

Automatic upload to a dedicated strain music video YouTube channel after each video is generated.

Social Clips

Auto-generate 9:16 vertical crop edits from chorus sections for Instagram Reels and TikTok distribution.

Mobile Integration

Add the video to the strain detail carousel in the High IQ mobile app using expo-video.

Auto-Trigger

Fire video generation automatically as a fire-and-forget follow-on after the music pipeline completes.

Self-Hosted LTX-2.3

Run LTX-2.3 on dedicated GPU infrastructure to reduce per-video cost and improve generation speed.
Related

  • Strain Music — The AI song pipeline that generates the audio each video is built on
  • Strain Discovery — Browse and search the full strain database
  • Label Scanner — Scan a product label to identify a strain and access its music video