You will build a content generation pipeline where you provide a topic and get back a complete blog post with a matching feature image — the LLM writes the article, then SDXL generates the header image from a concept the article itself supplies, with both models served from the same GPU. The end result: a marketing team submits “GPU hosting for AI startups” and receives a 600-word article plus a 1024×576 feature image within about 30 seconds. Here is the full pipeline on dedicated GPU infrastructure.
Pipeline Architecture
| Stage | Tool | Output | VRAM |
|---|---|---|---|
| 1a. Article generation | LLaMA 3.1 70B (Q4) | Structured blog post | ~38GB |
| 1b. Image prompt | LLaMA 3.1 8B | SDXL-optimised prompt | ~6GB |
| 2. Image generation | SDXL 1.0 | Feature image PNG | ~7GB |
| 3. Assembly | Python | Combined output | CPU |
With sequential execution, a 24GB GPU handles this using the 8B model for both text and image prompts. For parallel execution, use a 48GB+ GPU or run text and image on separate servers.
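The VRAM figures in the table drive that deployment decision. A toy budget check makes the arithmetic explicit (the per-model figures come from the table above; the function itself is illustrative):

```python
# Approximate VRAM per model, in GB, from the table above.
STAGE_VRAM_GB = {"llama_70b_q4": 38, "llama_8b": 6, "sdxl": 7}

def fits_in_parallel(models: list[str], gpu_gb: int) -> bool:
    """True if all listed models can stay resident on one GPU at once."""
    return sum(STAGE_VRAM_GB[m] for m in models) <= gpu_gb
```

The 8B model plus SDXL (6 + 7 = 13GB) fits comfortably on a 24GB card, while the 70B variant plus SDXL (38 + 7 = 45GB) needs a 48GB+ card or separate servers — matching the guidance above.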
Article Generation
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

def generate_article(topic: str, tone: str = "professional", length: int = 600) -> dict:
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{
            "role": "system",
            "content": f"You are a content writer. Write a {tone} blog post "
                       f"of approximately {length} words. Include an engaging title, "
                       f"introduction, 3-4 sections with H2 headings, and a conclusion. "
                       f"Return as JSON: {{\"title\": \"\", \"meta_description\": \"\", "
                       f"\"content\": \"(HTML with h2 tags)\", \"image_concept\": \"\"}}"
        }, {"role": "user", "content": f"Write about: {topic}"}],
        max_tokens=2000, temperature=0.7
    )
    return parse_json(response.choices[0].message.content)
The model returns an image_concept field describing what the feature image should depict — this feeds the image generation stage. Deploy via vLLM for efficient serving.
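The `parse_json` helper used above is left to the reader. A minimal sketch, assuming the model sometimes wraps its reply in a markdown code fence, is to extract the outermost JSON object before decoding:

```python
import json
import re

def parse_json(raw: str) -> dict:
    """Decode the model's reply, tolerating ```json fences around the payload."""
    # Grab everything from the first "{" to the last "}" so surrounding
    # fences or commentary are ignored.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))
```

Production code would also validate that the expected keys (`title`, `meta_description`, `content`, `image_concept`) are present and retry the generation if not.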
Image Prompt Engineering
def create_sdxl_prompt(image_concept: str) -> str:
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{
            "role": "system",
            "content": "Convert this image concept into an SDXL prompt. "
                       "Include style keywords: professional, modern, high quality, "
                       "4k, clean composition. Keep under 75 tokens."
        }, {"role": "user", "content": image_concept}],
        max_tokens=100, temperature=0.5
    )
    return response.choices[0].message.content
LLMs produce better SDXL prompts than raw topic descriptions because they add the stylistic keywords that SDXL responds to.
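The "under 75 tokens" instruction matters because SDXL's CLIP text encoders truncate prompts at 77 tokens, silently dropping anything beyond the limit. The model may still overshoot, so a crude word-count guard helps (assumption: one word roughly approximates one token, which undercounts for CLIP's subword tokenizer, hence the headroom):

```python
def clamp_prompt(prompt: str, max_words: int = 60) -> str:
    """Trim a prompt to roughly fit SDXL's 77-token CLIP limit.

    Word count is a crude proxy for token count; 60 words leaves
    headroom for subword splitting.
    """
    words = prompt.split()
    return " ".join(words[:max_words])
```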
Image Generation
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
).to("cuda")

def generate_feature_image(prompt: str, output_path: str) -> str:
    image = pipe(
        prompt=prompt,
        negative_prompt="text, watermark, blurry, low quality",
        width=1024, height=576,  # 16:9 for blog headers
        num_inference_steps=30
    ).images[0]
    image.save(output_path)
    return output_path
Orchestrated Pipeline
import re

from fastapi import FastAPI

app = FastAPI()

def slugify(text: str) -> str:
    """Turn a topic into a filesystem-safe filename stem."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

@app.post("/generate-content")
def generate_content(request: dict):
    # A plain `def` lets FastAPI run these blocking model calls in a
    # threadpool instead of stalling the event loop.
    topic = request["topic"]
    # Stage 1: Generate article (includes image concept)
    article = generate_article(topic, request.get("tone", "professional"))
    # Stage 2: Generate matching image
    sdxl_prompt = create_sdxl_prompt(article["image_concept"])
    image_path = generate_feature_image(sdxl_prompt, f"/output/{slugify(topic)}.png")
    return {
        "title": article["title"],
        "meta_description": article["meta_description"],
        "content": article["content"],
        "image_url": image_path,
        "image_prompt": sdxl_prompt
    }
Quality and Brand Control
For production content pipelines:

- Add brand voice guidelines to the system prompt with examples of approved writing style.
- Maintain a negative prompt library for SDXL to enforce brand-consistent imagery.
- Implement a review queue where editors approve content before publishing.
- Integrate with ComfyUI for advanced image workflows with LoRA style adapters.
- Use RAG retrieval from existing published content to maintain consistency.

Store content in a CMS-ready format, and have teams review AI-generated content before publication. See model options for larger models that produce higher-quality text, private hosting for brand-sensitive content, and more tutorials for related workflows. Check content use cases for industry examples.
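The editor review queue can be sketched as a minimal in-memory store (class and status names are assumptions; production would back this with a database and tie it into the `/generate-content` endpoint):

```python
from dataclasses import dataclass, field
from itertools import count

@dataclass
class ReviewQueue:
    """Holds generated drafts until an editor approves them for publishing."""
    _ids: count = field(default_factory=count)
    drafts: dict = field(default_factory=dict)

    def submit(self, article: dict) -> int:
        # New drafts start as pending and are invisible to the publish step.
        draft_id = next(self._ids)
        self.drafts[draft_id] = {"article": article, "status": "pending"}
        return draft_id

    def approve(self, draft_id: int) -> dict:
        self.drafts[draft_id]["status"] = "approved"
        return self.drafts[draft_id]["article"]

    def pending(self) -> list[int]:
        return [i for i, d in self.drafts.items() if d["status"] == "pending"]
```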