Building an Agent-Native System for Architectural Renderings

Pure Blue Fish is preparing to build a closed-system fish farming facility in the US. The company approached me to help create renderings of the architectural plan in a specific design style.

There’s a lot of context here beyond the sketch itself: who the company is and what its messaging is, the technology, the design characteristics, and the building plan itself, with its precise features and dimensions.

Preliminary Work with Claude

I planned to use Nano Banana Pro to create the renderings, but first I did preliminary work with Claude:

Research on the company to understand the story and technology
Analysis of the sketches to understand dimensions
Design characterization in a technological, environmental, and blue style as requested by the company
A summary document with the dimensions and design style

I transferred the final document to Claude Code and asked it to use the image creation skill to create renderings. It understood and created examples for me, but clearly there’s room for improvement and the company will want to make changes.

The Problem

The company’s employees aren’t technical, and I wanted them to be able to create renderings easily on their own.

I built a simple system with Claude that gives Gemini Pro the full context: company information, design characterization, facility dimensions, and sketches.

I made several attempts, but the results weren’t good enough. I also realized that sending the full context with every request overloads the model and hurts accuracy.

The Solution: Agent-Native System

I talked it through with Claude and we planned an Agent-Native system for creating the renderings, one where the AI decides which information is relevant to each request:

A Gemini-based agent receives the context: company information, design characterization, and building dimensions.
The user describes the image they want to the agent: direction, angle, specific details.
The agent draws on its context, asks about any details missing from the request, and writes the appropriate prompt without overloading the model with irrelevant information.
The user can change the context through the UI (new dimensions, new design style) without touching code.
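To make the flow above concrete, here is a minimal sketch in Python. It is not the code of the actual system: the section names, placeholder texts, and keyword lists are all made up for illustration, and simple keyword matching stands in for the judgment the Gemini agent makes on its own. It only shows the shape of the idea: ask when details are missing, otherwise build a prompt from the relevant context sections only.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical context store (section name -> text). Placeholder texts only.
CONTEXT = {
    "company": "Placeholder: who the company is and its messaging.",
    "design": "Placeholder: technological, environmental, blue design style.",
    "dimensions": "Placeholder: facility features and dimensions from the plan.",
}

# Hypothetical keyword lists standing in for the agent's own relevance judgment.
SECTION_KEYWORDS = {
    "company": ("logo", "brand", "signage"),
    "design": ("style", "color", "material", "facade"),
    "dimensions": ("exterior", "aerial", "building", "roof", "entrance"),
}

# Details the agent asks about when the request does not mention them (assumed).
REQUIRED_DETAILS = {
    "angle": ("aerial", "eye level", "eye-level", "from the", "angle"),
    "time of day": ("day", "night", "sunset", "sunrise", "dusk"),
}


@dataclass
class AgentReply:
    prompt: Optional[str] = None                         # set when the request is complete
    questions: list[str] = field(default_factory=list)   # set when details are missing


def select_sections(request: str) -> list[str]:
    """Return the names of the context sections this request seems to need."""
    text = request.lower()
    return [name for name, words in SECTION_KEYWORDS.items()
            if any(word in text for word in words)]


def missing_details(request: str) -> list[str]:
    """Return the required details the request does not mention."""
    text = request.lower()
    return [detail for detail, words in REQUIRED_DETAILS.items()
            if not any(word in text for word in words)]


def handle_request(request: str, context: dict[str, str] = CONTEXT) -> AgentReply:
    """Ask follow-up questions, or build a prompt from the relevant sections only."""
    missing = missing_details(request)
    if missing:
        return AgentReply(questions=[f"Which {detail} do you want?" for detail in missing])
    parts = [context[name] for name in select_sections(request)]
    return AgentReply(prompt="\n".join([f"Render: {request}", *parts]))
```

Calling handle_request("Aerial view of the building exterior at sunset") returns a prompt that carries only the dimensions section, while a vague request such as "A nice picture of the building" comes back with questions instead of a prompt.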

The Advantage

Instead of sending everything with every request (all the dimensions, characteristics, and elements), the Gemini agent writes a prompt for each image that contains only the information that image needs.

The user can also ask the agent to include special requests in the prompt, for example to create a series of images from a particular angle.

Additional Touches

Claude also used its frontend design skill to adapt the design to the company, and it helped me write usage instructions and build an inspiration page with angles and examples.

Expansion: Video Creation with Veo 3.1

I added video creation with Google’s Veo 3.1 to the rendering system. The integration goes through the API: I asked Claude Code to learn how to do it from the official documentation, in plan mode of course.
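One pattern worth knowing for this kind of integration: video generation usually runs as a long-running job, so the code starts the job and then polls until the clip is ready. The sketch below assumes the Veo API works this way and deliberately leaves the real API calls out. The start_job, refresh, and is_done callables are hypothetical placeholders for whatever the official documentation specifies, and the default intervals are arbitrary.

```python
import time
from typing import Any, Callable


def wait_for_video(
    start_job: Callable[[], Any],        # hypothetical: submits the video request
    refresh: Callable[[Any], Any],       # hypothetical: fetches the job's latest state
    is_done: Callable[[Any], bool],      # hypothetical: True once the video is ready
    poll_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Start a long-running job and poll it until it finishes or times out."""
    job = start_job()
    waited = 0.0
    while not is_done(job):
        if waited >= timeout_seconds:
            raise TimeoutError(f"video job not finished after {timeout_seconds} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
        job = refresh(job)
    return job
```

Passing the API calls in as functions keeps the waiting logic independent of any particular SDK, and it makes the loop easy to try out with fake jobs before spending money on real generations.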

Now the system includes:

Embedded and updatable context and sketches
Prompt assistant
Image creation
Video creation from generated images

All in one interface tailored to specific needs.
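The "updatable context" part can be as simple as keeping the context in a data file that the interface reads and writes, so changing dimensions or design style never means changing code. A minimal sketch, assuming a JSON file with one entry per context section. The file layout and function names are my own illustration, not the actual system’s.

```python
import json
from pathlib import Path


def load_context(path: Path) -> dict:
    """Read the context sections from a JSON file; empty if the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def update_context(path: Path, section: str, text: str) -> dict:
    """Replace one section (e.g. new dimensions) and save the file back."""
    context = load_context(path)
    context[section] = text
    path.write_text(json.dumps(context, indent=2, ensure_ascii=False), encoding="utf-8")
    return context
```

A UI form for "new dimensions" would then only need to call update_context with the section name and the edited text.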

In the end Claude also built itself a skill, so it can build similar things easily or create videos directly.

My Approach

I always ask not “how will AI do this for me?” but “how will AI help me build the right solution?”

You never know where it will end up :)