What You’ll Connect
After this guide, your WordPress site will have AI features powered by your own GPU server — no API costs, no rate limits. A lightweight custom plugin calls your vLLM or Ollama endpoint on dedicated GPU hardware, enabling AI-assisted content drafting, automatic meta description generation, and an embedded chat widget for visitors.
The integration works by adding a REST API route within WordPress that proxies requests to your self-hosted LLM. Admin users interact with the AI via a Gutenberg sidebar panel, while front-end visitors can use an optional chatbot powered by the same GPU backend.
Custom Plugin            -->  wp_remote_post()      -->  GPU Server (vLLM)
(Gutenberg panel              REST route                 /v1/chat/completions
 or chatbot widget)           /wp-json/ai/               LLM inference on dedicated GPU
        |                                                        |
Editor panel  <--  Plugin returns  <--  JSON parsed  <--  Model completion
shows result       AI content           by plugin         returned

Prerequisites
- A GigaGPU server with a running LLM behind an OpenAI-compatible API (self-host guide)
- A WordPress site (self-hosted, version 5.0+ with Gutenberg editor)
- HTTPS access to your GPU endpoint (Nginx reverse proxy guide)
- Basic PHP development knowledge for writing a simple plugin
- API key for your inference server (security guide)
Integration Steps
Create a new plugin directory in wp-content/plugins/gpu-ai-assistant/ with a main PHP file. Register a REST API endpoint under your plugin’s namespace (e.g., /wp-json/gpu-ai/v1/complete) that accepts POST requests with a prompt and system instruction.
The endpoint handler uses wp_remote_post() to forward the prompt to your GPU server’s OpenAI-compatible API. Parse the JSON response and return the completion text to the calling JavaScript. Restrict the endpoint to authenticated users with the edit_posts capability.
On the editor side, add a Gutenberg sidebar panel: a small JavaScript file enqueued for the block editor that fetches from your REST endpoint. This gives editors an “AI Assist” button in the post editor that generates drafts, rewrites paragraphs, or creates meta descriptions based on the current post content.
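A minimal sketch of that sidebar script, assuming the /gpu-ai/v1/complete route from this guide; the plugin slug, helper name, and prompt text are illustrative. It relies on the wp.* editor globals, so enqueue it with the wp-plugins, wp-edit-post, wp-components, wp-element, wp-data, and wp-api-fetch dependencies:

```javascript
// Pure helper: shape the JSON body the REST endpoint expects.
function buildCompletePayload(prompt, system) {
  return {
    prompt: String(prompt).trim(),
    system: system || 'You are a content assistant.',
  };
}

// Register the sidebar only when running inside the block editor.
if (typeof wp !== 'undefined' && wp.plugins) {
  const { registerPlugin } = wp.plugins;
  const { PluginSidebar } = wp.editPost;
  const { Button } = wp.components;
  const el = wp.element.createElement;

  registerPlugin('gpu-ai-assistant', {
    render: () =>
      el(PluginSidebar, { name: 'gpu-ai', title: 'AI Assist' },
        el(Button, {
          variant: 'primary',
          onClick: () => {
            // Send the current post content to the plugin's REST route.
            const content = wp.data.select('core/editor')
              .getEditedPostAttribute('content');
            wp.apiFetch({
              path: '/gpu-ai/v1/complete',
              method: 'POST',
              data: buildCompletePayload(
                'Write a meta description for this post:\n' + content,
                'You are an SEO assistant.'
              ),
            }).then((res) => console.log(res.content));
          },
        }, 'Generate meta description')
      ),
  });
}
```

wp.apiFetch handles the REST nonce automatically for logged-in users, so no extra authentication code is needed in the editor.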
Code Example
This PHP plugin registers a REST route and calls your OpenAI-compatible inference server (vLLM or Ollama) on GigaGPU:
<?php
/**
 * Plugin Name: GPU AI Assistant
 * Description: Connect WordPress to a self-hosted LLM on GPU.
 */

defined('ABSPATH') || exit;

add_action('rest_api_init', function () {
    register_rest_route('gpu-ai/v1', '/complete', [
        'methods'             => 'POST',
        'callback'            => 'gpu_ai_complete',
        'permission_callback' => function () {
            // Only editors and above may call the endpoint.
            return current_user_can('edit_posts');
        },
    ]);
});

function gpu_ai_complete(WP_REST_Request $request) {
    if (!defined('GPU_AI_API_KEY')) {
        return new WP_REST_Response(['error' => 'GPU_AI_API_KEY is not defined in wp-config.php'], 500);
    }

    $prompt = sanitize_textarea_field($request->get_param('prompt'));
    $system = sanitize_textarea_field($request->get_param('system') ?: 'You are a content assistant.');

    if ('' === $prompt) {
        return new WP_REST_Response(['error' => 'Missing prompt'], 400);
    }

    $response = wp_remote_post('https://your-gpu-server.gigagpu.com/v1/chat/completions', [
        'timeout' => 60, // LLM inference can be slow; the default 5 s would time out
        'headers' => [
            'Authorization' => 'Bearer ' . GPU_AI_API_KEY,
            'Content-Type'  => 'application/json',
        ],
        'body' => wp_json_encode([
            'model'    => 'meta-llama/Llama-3-70b-chat-hf',
            'messages' => [
                ['role' => 'system', 'content' => $system],
                ['role' => 'user',   'content' => $prompt],
            ],
            'max_tokens' => 800,
        ]),
    ]);

    if (is_wp_error($response)) {
        return new WP_REST_Response(['error' => 'GPU server unreachable'], 502);
    }

    $body = json_decode(wp_remote_retrieve_body($response), true);

    if (empty($body['choices'][0]['message']['content'])) {
        return new WP_REST_Response(['error' => 'Unexpected response from inference server'], 502);
    }

    return new WP_REST_Response(['content' => $body['choices'][0]['message']['content']]);
}
Testing Your Integration
Define GPU_AI_API_KEY in your wp-config.php to keep the key out of plugin code. Activate the plugin, then test the endpoint with a curl command or REST API client: POST /wp-json/gpu-ai/v1/complete with a JSON body containing a prompt. The response should include your model’s completion.
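One way to issue that test request, using a WordPress application password (Users → Profile) to satisfy the edit_posts check; example.com, the username, and the password are placeholders:

```shell
SITE_URL="${SITE_URL:-https://example.com}"   # your WordPress site
PAYLOAD='{"prompt":"Write a two-sentence summary of WordPress.","system":"You are a content assistant."}'

# Only send the request when credentials are provided.
if [ -n "${WP_APP_PASSWORD:-}" ]; then
  curl -sS -X POST "$SITE_URL/wp-json/gpu-ai/v1/complete" \
    -u "admin:$WP_APP_PASSWORD" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
else
  echo "Set WP_APP_PASSWORD to send: POST $SITE_URL/wp-json/gpu-ai/v1/complete"
fi
```

A successful response is a JSON object with a content key holding the model's completion.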
Test from the WordPress editor by adding a JavaScript fetch call to the endpoint. Verify that only logged-in users with the correct capability can access the route: unauthenticated requests should return a 401, and logged-in users without edit_posts a 403.
Production Tips
Cache AI responses using WordPress transients for repeated queries. If editors frequently generate meta descriptions for similar content, a cache layer reduces GPU load without affecting output quality.
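A sketch of that transient cache layer, assuming the gpu_ai_* naming from the plugin above; the helper names and one-hour TTL are illustrative:

```php
<?php
// One cache key per unique prompt/system pair; md5 keeps the key
// under the option-name length limit for transients.
function gpu_ai_cache_key(string $prompt, string $system): string {
    return 'gpu_ai_' . md5($system . "\n" . $prompt);
}

// Wrap any completion callable with a transient cache.
function gpu_ai_cached_complete(string $prompt, string $system, callable $complete) {
    $key = gpu_ai_cache_key($prompt, $system);
    $hit = get_transient($key);
    if (false !== $hit) {
        return $hit; // serve repeated queries without touching the GPU
    }
    $result = $complete($prompt, $system);
    if (is_string($result)) {
        set_transient($key, $result, HOUR_IN_SECONDS);
    }
    return $result;
}
```

Because transients expire automatically, stale completions age out without any manual invalidation.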
For a front-end chatbot, create a separate REST endpoint that accepts visitor messages (with rate limiting) and returns AI responses. Pair this with a lightweight JavaScript widget injected via wp_enqueue_script. This turns your WordPress site into an AI chatbot platform without any third-party chat service.
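One way that public endpoint could look, reusing gpu_ai_complete() from the plugin above with a transient counter per visitor IP; the /chat route name and the 10-per-minute limit are assumptions to tune for your traffic:

```php
<?php
// Guard lets this sketch parse outside WordPress; in the plugin it always runs.
if (function_exists('add_action')) {
    add_action('rest_api_init', function () {
        register_rest_route('gpu-ai/v1', '/chat', [
            'methods'             => 'POST',
            'callback'            => 'gpu_ai_public_chat',
            // Public route: the rate limit below is the only gate.
            'permission_callback' => '__return_true',
        ]);
    });
}

// One transient counter per visitor IP.
function gpu_ai_rate_key(string $ip): string {
    return 'gpu_ai_rate_' . md5($ip);
}

function gpu_ai_public_chat(WP_REST_Request $request) {
    $key   = gpu_ai_rate_key($_SERVER['REMOTE_ADDR'] ?? '');
    $count = (int) get_transient($key);
    if ($count >= 10) { // at most 10 messages per IP per minute
        return new WP_REST_Response(['error' => 'Too many requests'], 429);
    }
    set_transient($key, $count + 1, MINUTE_IN_SECONDS);
    // Delegate to the same completion handler the editor uses.
    return gpu_ai_complete($request);
}
```

If your site sits behind a proxy or CDN, REMOTE_ADDR may be the proxy's address, so adjust the IP lookup accordingly before relying on this limit.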
The plugin pattern extends to WooCommerce stores, where AI can generate product descriptions, answer customer queries, or automate review responses. For publishers running open-source models on their own GPU, this eliminates recurring AI API costs entirely. Explore more tutorials or get started with GigaGPU to power your WordPress AI features.