
Hawaii Snack Squad — Blind Box Character Collection
A node-based AI canvas changes the game for cohesive multi-character production — chaining prompts through a shared style guide is the difference between a collection and a pile of unrelated images.
- Role: Creative Director
- Timeline: 2 days
- Year: 2026
Executive Summary
Hawaii Snack Squad is a fully realized 6-character blind box collectible series, art-directed and produced end-to-end by a single creator using a chained AI pipeline in Melius with ElevenLabs. By first locking a stylized visual guide and then referencing it across all downstream generations, the project delivers a cohesive character lineup, individual animated hero shots, a group dance sequence, and a narrated promo video—demonstrating how node-based AI workflows can achieve production-ready IP development with studio-level consistency and motion from one contributor.
Context
Hawaii Snack Squad — Concept Overview
Hawaii Snack Squad is a 6-character blind box collectible series where each figure embodies an iconic Hawaiian snack or cultural food, unified by a shared world, visual language, and brand story. The collection is designed not just as static figures, but as a fully pitch-ready IP with cinematic moments and a cohesive lineup reveal.
---
Collection Pillars
- Individual Character Appeal
- →Each character is a snack-first personality: bold silhouettes, clear food references, and instantly readable traits.
- →Designs emphasize expressive faces, chunky toy-like proportions, and collectible-friendly details (swappable accessories, variant colorways).
- Collection Cohesion
- →Unified chibi-meets-vinyl style: rounded forms, soft shading, saturated but slightly pastel palette inspired by Hawaii’s skies, ocean, and flora.
- →Shared design language: similar eye shapes, consistent limb proportions, and a common emblem (a small hibiscus + wave badge) appearing somewhere on each character.
- →All characters live in the same world: a stylized, snack-sized version of Hawaii where food is culture, and each snack is a local hero.
- Cinematic Delivery
- →Each character receives a hero moment: a short, cinematic shot that highlights their personality, motion, and environment.
- →A lineup choreography sequence showcases all 6 together, emphasizing set completion and collectibility.
- →A narrated video introduces the Hawaii Snack Squad as a brand-ready universe, suitable for product launch decks, crowdfunding campaigns, or licensing pitches.
Constraints
- →Visual consistency across 6 characters: Ensured all characters shared a unified illustration style, color language, and line treatment so they clearly belonged to the same brand world, minimizing style drift across separate generation sessions.
- →No 3D pipeline: Designed all assets as 2D illustrations and motion treatments, avoiding any dependence on 3D modeling, rigging, or rendering tools, and tailoring the workflow to flat art only.
- →Single contributor: Structured the pipeline so one person could handle concepting, character design, illustration, motion direction, and final video delivery end-to-end, with tight version control and reusable templates.
- →Cinematic video from static images: Used image-to-video generation focused on expressive camera moves, parallax, and character motion cues derived from still illustrations, prioritizing controllable, non-glitchy movement over raw spectacle.
- →Coherent group lineup: Built the 6-character group shot by chaining references from the individually approved designs (poses, proportions, palettes) to maintain scale, perspective, and style coherence in a single believable scene.
- →Voiceover fit: Selected and tuned an ElevenLabs narrator voice that matched the Hawaii Snack Squad’s playful, warm, family-friendly tone, aligning pacing and emphasis with the visual rhythm of the edit.
Strategy
3.1 Phase 1 — Visual Style Guide (The Style Lock)
The foundational decision was to generate the style guide before generating any characters. Using Melius's node-based canvas, I built a style guide generation chain — establishing the illustration style, color palette, line weight, character proportions, and overall visual world of Hawaii Snack Squad as a single reference asset before any characters were produced.
This style guide became the fixed reference node that fed every downstream generation — ensuring consistent art direction across all 6 characters without relying on session memory or prompt repetition alone.
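The style lock can be pictured as a small dependency graph. The sketch below is illustrative only: it is plain Python rather than Melius's API, and the `Node` class, its fields, and the prompt strings are assumptions made for this example. It shows a single style guide node being passed by reference into every character node, so any change upstream is picked up by every downstream prompt.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical canvas node: a prompt plus the upstream nodes it references."""
    name: str
    prompt: str
    inputs: list["Node"] = field(default_factory=list)

    def resolved_prompt(self) -> str:
        # Upstream prompts come first so the shared style always leads.
        upstream = [node.resolved_prompt() for node in self.inputs]
        return " | ".join(upstream + [self.prompt])


# One style guide node, created before any character exists.
style_guide = Node("style_guide", "chibi-meets-vinyl, rounded forms, soft shading")

# Every character node points at the same upstream object.
characters = [
    Node(name, f"character based on {name}", inputs=[style_guide])
    for name in ["Spam Musubi", "Shave Ice", "Malasada"]
]

for node in characters:
    print(node.resolved_prompt())
```

Because the characters hold a reference to the style guide rather than a copy of its text, there is only one place to edit the shared look, which is the property the fixed reference node provides on the canvas.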
3.2 Phase 2 — Character Generation (Node-Based Reference Chaining)
With the style guide locked, I built individual character generation chains in Melius — each one referencing the style guide output alongside a character-specific prompt describing the Hawaiian snack concept, personality, color variation, and expression.
Melius's node canvas was critical here: rather than running 6 separate, unconnected generation sessions, each character pipeline was visually connected to the same upstream style guide node. This allowed iteration on individual characters (adjusting prompt, style weight, or seed) while keeping the shared visual language stable.
The 6 characters produced for the collection:
- Spam Musubi
- Shave Ice
- Malasada
- Loco Moco
- Manapua
- Li Hing Gummy Bear
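As a rough illustration of iterating on one character while the shared style stays fixed, the sketch below models each character as an immutable settings record. Everything here is hypothetical: `CharacterJob`, `STYLE_GUIDE_REF`, the 0 to 1 style weight scale, and the prompt text are assumptions for the example, not Melius parameters.

```python
from dataclasses import dataclass, replace

STYLE_GUIDE_REF = "style_guide_v1"  # hypothetical ID of the locked style guide output


@dataclass(frozen=True)
class CharacterJob:
    """Hypothetical per-character settings; field names are illustrative."""
    name: str
    prompt: str
    style_weight: float = 0.8  # assumed 0..1 scale
    seed: int = 0
    style_ref: str = STYLE_GUIDE_REF


NAMES = ["Spam Musubi", "Shave Ice", "Malasada",
         "Loco Moco", "Manapua", "Li Hing Gummy Bear"]

jobs = {name: CharacterJob(name, f"{name} as a cheerful snack hero") for name in NAMES}

# Iterate on one character only: new seed and a stronger style weight.
jobs["Malasada"] = replace(jobs["Malasada"], seed=42, style_weight=0.9)

# The shared reference is untouched for every job.
assert all(job.style_ref == STYLE_GUIDE_REF for job in jobs.values())
```

Using frozen records with `replace` means each iteration produces a new version of one character's settings and cannot accidentally alter the others.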
3.3 Phase 3 — Image-to-Video: Individual Hero Shots
Each of the 6 character images was pushed through Melius's image-to-video generation to produce a cinematic hero shot — short, looping motion clips that gave each character a moment of life. Prompts directed the camera motion (subtle push-in, character idle animation, ambient environment motion) while keeping the 2D illustrated style intact.
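One way to keep camera motion, character motion, and ambient motion from blurring together is to write each as its own labeled clause. The sketch below is a hypothetical prompt layout, not a Melius prompt format: the `MotionPrompt` class, its field names, and the 4-second default duration are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionPrompt:
    """Hypothetical prompt layout; the field names are illustrative."""
    camera: str          # what the camera does
    character: str       # what the character does
    environment: str     # ambient motion around the character
    duration_s: int = 4  # assumed short clip length, in seconds

    def render(self) -> str:
        # Each concern gets its own labeled sentence.
        return (
            f"Camera: {self.camera}. "
            f"Character: {self.character}. "
            f"Environment: {self.environment}. "
            "Keep the 2D illustrated style."
        )


hero = MotionPrompt(
    camera="subtle push-in",
    character="gentle idle animation",
    environment="soft ambient motion",
)
print(hero.render())
```

Keeping the three concerns in separate fields also makes it easy to vary one of them between passes while holding the other two constant.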
3.4 Phase 4 — Group Lineup Shot + Dance Sequence
With all 6 characters complete, I built a lineup composition chain in Melius that referenced all 6 individual character images simultaneously as inputs to generate a cohesive group shot. The node canvas made this multi-reference composition possible without manual Photoshop compositing.
The lineup image was then pushed through image-to-video generation with a dance sequence prompt — producing a synchronized group choreography clip that became the collection's hero video moment.
3.5 Phase 5 — Voiceover (ElevenLabs)
Using ElevenLabs voice generation, I selected and tuned a narrator voice that matched the playful, warm, and slightly theatrical tone of the Hawaii Snack Squad world. The voiceover script introduced the collection and each character, synced to the video package.
Challenges
This project surfaced several production frictions when working with a new multi-modal AI tool (Melius) under real creative constraints:
- Style Consistency Across 6 Characters
- →Core challenge: avoiding visual drift (style, proportions, line quality) across characters generated in separate sessions.
- →Mitigation: a persistent upstream style guide node in the Melius canvas, plus careful, per-character prompt calibration to keep every output on-model.
- Image Iteration Workflow
- →Problem: first-pass generations were often imperfect.
- →Solution: regenerate using updated text prompts while keeping the same style guide, and rely on Melius’s per-image iteration history to move backward/forward through versions.
- Group Lineup Composition (6 Characters)
- →Challenge: combining six separately generated characters into a single, believable group shot without losing individual identity.
- →Technique: multi-reference chaining, iteratively refining prompts to balance proportion, spacing, and visual weight across the lineup.
- Image-to-Video Quality & Control
- →Limitation: image-to-video had compelling visuals but weak fine-grained motion control, often producing random warping.
- →Strategy: separate camera behavior from character behavior in prompts, use shorter clip durations, and run multiple passes to cherry-pick the most intentional-looking hero shots.
- Dance Sequence Choreography (Group Video)
- →Ambition: a synchronized dance sequence with all 6 characters in one composition.
- →Approach: detailed prompt engineering around synchronization, rhythm, and motion type, while accepting that results would be directionally correct rather than frame-perfect choreography.
- Voiceover Tone Calibration (ElevenLabs)
- →Goal: a narrator that matched the Hawaii Snack Squad’s warm, playful, slightly absurdist tone instead of a generic ad voice.
- →Process: iterate on both voice profile selection and script pacing, testing multiple combinations until the voice felt native to the world.
- New Tool Bugs & Workflow Resilience
- →Risk: Melius is a new tool; at one point all nodes were deleted with no historical snapshots available.
- →Workaround: maintain multiple canvases and periodically duplicate all active nodes into a separate snapshot canvas to preserve progress.
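The snapshot workaround was done by hand, by duplicating nodes into a second canvas. As a loose analogy only, the sketch below shows the same idea in plain Python: keep independent, timestamped copies so that losing the working state does not lose the work. The canvas dictionary, its keys, and both helper functions are hypothetical.

```python
import copy
from datetime import datetime, timezone


def take_snapshot(canvas: dict, snapshots: list) -> None:
    """Append an independent, timestamped copy of the canvas."""
    snapshots.append({
        "taken_at": datetime.now(timezone.utc).isoformat(),
        "nodes": copy.deepcopy(canvas),
    })


def restore_latest(snapshots: list) -> dict:
    """Return a fresh copy of the most recent snapshot's nodes."""
    if not snapshots:
        raise ValueError("no snapshots available")
    return copy.deepcopy(snapshots[-1]["nodes"])


# Hypothetical canvas contents.
canvas = {"style_guide": {"prompt": "locked style"}, "malasada": {"seed": 42}}
snapshots: list = []

take_snapshot(canvas, snapshots)
canvas.clear()                      # simulate the node-loss incident
canvas = restore_latest(snapshots)  # progress is recovered from the copy
print(sorted(canvas))
```

The deep copy matters: a snapshot that merely points at the live canvas would be emptied along with it.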
Overall, the work balanced rapid experimentation with deliberate systems for consistency, recoverability, and tone control across images, video, and voice.
Outcomes & Impact
Hawaii Snack Squad Delivery Summary
Deliverables Completed
- →Visual style guide for the Hawaii Snack Squad collection
- →Unified 2D illustration style
- →Color palette, line weight, shading rules, and expressions
- →Style reference chaining documented for reuse in Melius
- →6 character illustrations
- →6 individual cinematic hero shot videos (image-to-video)
- →1 group lineup composition image
- →1 lineup dance sequence video (image-to-video)
- →7 generated audio clips (ElevenLabs voiceover)
- →Documented Melius node canvas workflow
The most significant outcome is the demonstration of a node-based AI canvas as a production tool for coherent collection design. The Melius workflow (style guide → character generation → image-to-video → group composition → dance sequence → voiceover) is a repeatable template for any blind box, character line, or collectible product concept.
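To reuse the workflow as a template, it can help to write the stages down as explicit dependencies. The sketch below uses Python's standard `graphlib` module to derive a valid run order. The stage names and the exact dependencies are assumptions for illustration; in particular, making the voiceover depend on the finished characters (because the script introduces each one) is a guess at how the stages relate.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on (assumed dependencies).
PIPELINE = {
    "style_guide": set(),
    "characters": {"style_guide"},
    "hero_shots": {"characters"},
    "lineup_image": {"characters"},
    "dance_video": {"lineup_image"},
    "voiceover": {"characters"},
    "promo_video": {"hero_shots", "dance_video", "voiceover"},
}

# static_order() yields every stage after all of its dependencies.
order = list(TopologicalSorter(PIPELINE).static_order())
print(order)
```

Several orders are valid here, since hero shots, the lineup image, and the voiceover only need the characters to exist. The fixed points are that the style guide comes first and the promo video comes last.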
This pipeline collapses what would traditionally require a team (concept artists, illustrators, animators, voice directors) into a single-contributor workflow, while producing output at a quality level suitable for pitch decks, product launches, and social media campaigns.
Melius significantly reduces the time needed to produce a production-ready promo video: this project went from style guide to narrated promo within its 2-day timeline.
Lessons Learned
- →Locking the visual language before generating any characters prevented drift and made every downstream generation decision easier.
- →The persistent visual connection between the style guide node and character generation nodes is a fundamentally better architecture than disconnected prompt sessions for collection work.
- →Even simple camera push-ins and ambient motion transform static illustrations into compelling product showcase material.
- →Referencing all 6 character images simultaneously for the lineup shot produced a more cohesive group scene than attempting to describe the characters through prompts alone. Individual characters can still drift, so it is important to minimize the number of characters included in the shot.
- →Voice generation allowed rapid iteration on the narrator persona, finding the right voice for the brand without casting or recording sessions.