Pavlo Puzikov
10 · AI / ML · Passion project

Bonya Studio

In development

Owner - architecture, orchestration, pipeline interaction.

AI image, video, and audio production suite. Orchestrates ComfyUI, Wan2GP, LTX-Video, Piper, Whisper, Ollama, and FFmpeg behind one Next.js dashboard. Pipelines for cinematic, explainer, clip factory, social reel, property tour, and brand story formats.

Bonya Studio cover — brutalist concrete observatory at twilight with a matte-black telescope, electric-blue horizon glow through the arched window, and the Bonya wordmark in editorial sans-serif with 'STRATEGY IS GEOMETRY.' in mono yellow beneath.

What it took

The skills behind this project.

Every project below leans on a primary discipline and a handful of secondary ones.

Skills demonstrated

  • Orchestrates ComfyUI, LTX-Video, Whisper, Ollama, Piper, and FFmpeg behind one dashboard for image, video, and audio.

  • Local-first orchestration with optional remote GPU routing; queue, retry, and observability per pipeline.

  • Next.js pipeline interaction surface with per-format previews and parameter brushing.

  • Format taxonomy covering cinematic, explainer, clip factory, social reel, property tour, and brand story.
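The "queue, retry" item above could be backed by a small retry policy. The following is a minimal sketch, assuming exponential backoff with a cap. The names `RetryPolicy`, `RetryDecision`, and `planRetry` are hypothetical and the numbers are illustrative; none of this is taken from Bonya Studio's code.

```typescript
// Hypothetical sketch: a pure retry policy a per-pipeline queue could consult.
// All names and values here are illustrative assumptions.

interface RetryPolicy {
  maxAttempts: number; // total tries allowed, including the first
  baseDelayMs: number; // wait before the first retry
  maxDelayMs: number;  // upper bound on any single wait
}

type RetryDecision =
  | { action: "retry"; attempt: number; delayMs: number }
  | { action: "fail"; attempt: number; reason: string };

// Decide what happens after attempt number `attempt` (1-based) has failed.
function planRetry(policy: RetryPolicy, attempt: number, reason: string): RetryDecision {
  if (attempt >= policy.maxAttempts) {
    return { action: "fail", attempt, reason };
  }
  // Exponential backoff: base, 2x base, 4x base, ... capped at maxDelayMs.
  const delayMs = Math.min(policy.baseDelayMs * 2 ** (attempt - 1), policy.maxDelayMs);
  return { action: "retry", attempt: attempt + 1, delayMs };
}

const policy: RetryPolicy = { maxAttempts: 4, baseDelayMs: 1000, maxDelayMs: 5000 };
const decisions = [1, 2, 3, 4].map((n) => planRetry(policy, n, "engine timeout"));
console.log(decisions);
```

Keeping the decision a pure function makes it easy to log each outcome, which is one way the observability part of that item could be served.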

Context

Why it exists.

Reel Estate needs reels at production cost. The BARNES marketing team needs property tours at production cost. Threadwork needs audio reading deliverables at production cost. The pattern repeats across half the projects here — and renting frontier media-gen APIs at AED 5+ per output makes the unit economics nonsensical for any pay-per-volume product.

Bonya Studio is the answer: a fully open-source media-generation stack orchestrated behind one Next.js dashboard. ComfyUI for image, LTX-Video and Wan2GP for video, Whisper plus Piper for audio, Ollama for the narration scripts, FFmpeg for assembly. Local-first by default; remote GPU only when a job exceeds the local card's memory.
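The stage-to-engine split described above can be written down as a typed lookup. This is a minimal sketch: the engine assignments come from the paragraph, while the names `Stage`, `ENGINES`, and `enginesFor` are hypothetical.

```typescript
// Engine assignments follow the paragraph above; the type and function
// names are illustrative assumptions, not the project's real identifiers.
type Stage = "image" | "video" | "audio" | "script" | "assembly";

const ENGINES: Record<Stage, readonly string[]> = {
  image: ["ComfyUI"],
  video: ["LTX-Video", "Wan2GP"],
  audio: ["Whisper", "Piper"],
  script: ["Ollama"],
  assembly: ["FFmpeg"],
};

// List the engines a job needs, deduplicated, in first-seen order.
function enginesFor(stages: readonly Stage[]): string[] {
  const seen: string[] = [];
  for (const stage of stages) {
    for (const engine of ENGINES[stage]) {
      if (seen.indexOf(engine) === -1) seen.push(engine);
    }
  }
  return seen;
}

// A narrated still-image job: script, voice, image, then assembly.
console.log(enginesFor(["script", "audio", "image", "assembly"]));
// prints [ 'Ollama', 'Whisper', 'Piper', 'ComfyUI', 'FFmpeg' ]
```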

Stack: Next.js · ComfyUI · LTX-Video · Whisper · Ollama · FFmpeg

Process

The decisions that shaped it.

  1. OSS-only, by policy

    Picked LongCat-Video, FLUX, SD3.5, LTX-Video, VEnhancer, Real-ESRGAN, BiRefNet, SAM 2, LatentSync, F5-TTS, MusicGen, and Lyra as the model stack, with no hosted API in the path. That means every job's marginal cost is electricity, every weight is auditable, and a vendor's model deprecation cannot break a paying customer's pipeline.

    AI & Machine Learning
  2. Pipelines as named formats, not parameter dialogs

    Cinematic, explainer, clip factory, social reel, property tour, brand story: six pipelines, each opinionated about which engine handles which stage. The dashboard exposes the format name; the user does not touch ComfyUI directly unless they want to. This lowers the skill required of the operator, so the marketing lead can run a job without the engineer.

    Creative Direction
  3. Local-first orchestration with optional remote GPU

    Jobs queue on the local card by default; an env-toggled adapter routes overflow (or specific stages) to a rented GPU only when needed. That means the studio runs on a workstation in the off hours at no cost beyond electricity, and scales up only when a customer's job exceeds local memory.

    Backend Engineering
  4. One dashboard, every pipeline

    The Next.js surface lets the operator queue, inspect, retry, and download outputs across all six pipelines. Each pipeline gets a per-format preview and parameter brush. Visible status — queued, generating, post-processed, ready — collapses what would otherwise be six separate tools.

    Frontend Engineering
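The visible status sequence from the dashboard decision (queued, generating, post-processed, ready) can be modelled as a small forward-only lifecycle. This is a minimal sketch under assumptions: the `Job` shape, the format identifiers, and the `advance` and `retry` helpers are hypothetical, and resetting a retried job to "queued" is a guess at the behaviour rather than something the project confirms.

```typescript
// Format and status names come from the text; the Job shape and helper
// functions are illustrative assumptions.
type Format =
  | "cinematic" | "explainer" | "clip-factory"
  | "social-reel" | "property-tour" | "brand-story";

type Status = "queued" | "generating" | "post-processed" | "ready";

const ORDER: readonly Status[] = ["queued", "generating", "post-processed", "ready"];

interface Job {
  id: string;
  format: Format;
  status: Status;
}

// Move a job one step forward; a job that is already "ready" stays put.
function advance(job: Job): Job {
  const i = ORDER.indexOf(job.status);
  if (i === ORDER.length - 1) return job;
  return { ...job, status: ORDER[i + 1] };
}

// Assumption: an operator-triggered retry sends the job back to the queue.
function retry(job: Job): Job {
  return { ...job, status: "queued" };
}

let job: Job = { id: "job-1", format: "property-tour", status: "queued" };
const seen: Status[] = [job.status];
while (job.status !== "ready") {
  job = advance(job);
  seen.push(job.status);
}
console.log(seen.join(" -> "));
// prints "queued -> generating -> post-processed -> ready"
```

Because every format shares the same four states, a single status column in the dashboard can cover all six pipelines.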

Outcome

What shipped.

Local-first by default with optional remote GPU routing.
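One way to express that routing rule is a pure function over an env-derived config. This is a sketch under assumptions: the variable names `BONYA_REMOTE_GPU` and `BONYA_LOCAL_VRAM_GB`, the `forceRemote` flag, and the memory estimate field are all hypothetical and not taken from the project.

```typescript
// Hypothetical sketch of local-first routing. The env variable names,
// fields, and function names are illustrative assumptions.
interface RoutingConfig {
  remoteEnabled: boolean; // the env toggle for the remote adapter
  localVramGb: number;    // memory available on the local card
}

interface StageRequest {
  stage: string;
  estimatedVramGb: number;
  forceRemote?: boolean;  // pin a specific stage to the remote GPU
}

type Target = "local" | "remote";

// Build the config from an env-like object so the logic stays testable.
function configFromEnv(env: Record<string, string | undefined>): RoutingConfig {
  const vram = Number(env.BONYA_LOCAL_VRAM_GB ?? "0");
  return {
    remoteEnabled: env.BONYA_REMOTE_GPU === "1",
    localVramGb: Number.isFinite(vram) ? vram : 0,
  };
}

function routeStage(cfg: RoutingConfig, req: StageRequest): Target {
  // With the toggle off, everything stays local. A real system might
  // reject oversized jobs here instead; this sketch does not.
  if (!cfg.remoteEnabled) return "local";
  if (req.forceRemote) return "remote";
  // Overflow rule: go remote only when the job exceeds local memory.
  return req.estimatedVramGb > cfg.localVramGb ? "remote" : "local";
}

const cfg = configFromEnv({ BONYA_REMOTE_GPU: "1", BONYA_LOCAL_VRAM_GB: "24" });
console.log(routeStage(cfg, { stage: "image", estimatedVramGb: 12 })); // prints "local"
console.log(routeStage(cfg, { stage: "video", estimatedVramGb: 40 })); // prints "remote"
```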

In development as the production backbone for Reel Estate and the BARNES marketing reel runs. The OSS-only constraint is the strategic moat: every other pay-per-listing product in the market rents a hosted API, while Bonya Studio runs at electricity cost, which makes the wedge pricing on Reel Estate sustainable.