Tutorials

Connect WordPress to Self-Hosted AI on GPU

Add AI content generation to your WordPress site using a self-hosted LLM on GPU. This guide covers building a custom plugin that calls your private inference endpoint for drafting posts, generating meta descriptions, and powering AI chatbots on your site.

What You’ll Connect

After this guide, your WordPress site will have AI features powered by your own GPU server — no API costs, no rate limits. A lightweight custom plugin calls your vLLM or Ollama endpoint on dedicated GPU hardware, enabling AI-assisted content drafting, automatic meta description generation, and an embedded chat widget for visitors.

The integration works by adding a REST API route within WordPress that proxies requests to your self-hosted LLM. Admin users interact with the AI via a Gutenberg sidebar panel, while front-end visitors can use an optional chatbot powered by the same GPU backend.

Custom Plugin              wp_remote_post()             GPU Server (vLLM)
(Gutenberg panel     -->   REST route             -->   /v1/chat/completions
 or chatbot widget)        /wp-json/gpu-ai/v1/          LLM inference on
                                                        dedicated GPU
       |                         |                            |
Editor panel         <--   Plugin returns         <--   Model completion
shows result               AI content,                  returned
                           JSON parsed by plugin
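The body the plugin forwards follows the OpenAI-compatible chat completions format that vLLM and Ollama both expose. A minimal sketch of building that payload — the helper name and default model are illustrative, not part of any WordPress or vLLM API:

```php
<?php
// Sketch: build the request body for an OpenAI-compatible chat completions
// endpoint. The model name and defaults are examples; use whatever your
// GPU server is actually serving.
function gpu_ai_build_payload(
    string $prompt,
    string $system,
    string $model = 'meta-llama/Llama-3-70b-chat-hf',
    int $max_tokens = 800
): array {
    return [
        'model'      => $model,
        'messages'   => [
            ['role' => 'system', 'content' => $system],
            ['role' => 'user',   'content' => $prompt],
        ],
        'max_tokens' => $max_tokens,
    ];
}

// The plugin JSON-encodes this array and sends it with wp_remote_post().
$payload = gpu_ai_build_payload('Draft an intro paragraph.', 'You are a content assistant.');
echo json_encode($payload, JSON_PRETTY_PRINT), "\n";
```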

Prerequisites

A WordPress site where you can install plugins, and a dedicated GPU server running vLLM or Ollama with an OpenAI-compatible chat completions endpoint reachable over HTTPS. If you have secured the inference server with an API key (recommended), keep that key handy for the steps below.

Integration Steps

Create a new plugin directory in wp-content/plugins/gpu-ai-assistant/ with a main PHP file. Register a REST API endpoint under your plugin’s namespace (e.g., /wp-json/gpu-ai/v1/complete) that accepts POST requests with a prompt and system instruction.

The endpoint handler uses wp_remote_post() to forward the prompt to your GPU server’s OpenAI-compatible API. Parse the JSON response and return the completion text to the calling JavaScript. Restrict the endpoint to authenticated users with the edit_posts capability.
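Parsing the response defensively matters here: a proxy error page or truncated body from the GPU server should produce a clean error, not a PHP notice. A small sketch of that parsing step — `gpu_ai_extract_completion()` is a hypothetical helper, not a WordPress function:

```php
<?php
// Sketch: safely extract the completion text from an OpenAI-style response.
// Returns null on anything unexpected so the caller can send a 502.
function gpu_ai_extract_completion(string $raw_json): ?string {
    $body = json_decode($raw_json, true);
    if (!is_array($body)) {
        // Not JSON at all, e.g. an HTML error page from a reverse proxy.
        return null;
    }
    // The null-coalescing chain guards every level of the structure.
    return $body['choices'][0]['message']['content'] ?? null;
}

$ok  = '{"choices":[{"message":{"role":"assistant","content":"Hello!"}}]}';
$bad = '<html>502 Bad Gateway</html>';

var_dump(gpu_ai_extract_completion($ok));   // string(6) "Hello!"
var_dump(gpu_ai_extract_completion($bad));  // NULL
```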

In the post editor, add a Gutenberg sidebar panel using a JavaScript file that fetches from your REST endpoint. This gives editors an "AI Assist" button that generates drafts, rewrites paragraphs, or creates meta descriptions based on the current post content.

Code Example

This PHP plugin registers a REST route and calls your OpenAI-compatible inference server (vLLM or Ollama) on GigaGPU:

<?php
/**
 * Plugin Name: GPU AI Assistant
 * Description: Connect WordPress to a self-hosted LLM on GPU.
 */

defined('ABSPATH') || exit;

add_action('rest_api_init', function () {
    register_rest_route('gpu-ai/v1', '/complete', [
        'methods'  => 'POST',
        'callback' => 'gpu_ai_complete',
        'permission_callback' => function () {
            return current_user_can('edit_posts');
        },
    ]);
});

function gpu_ai_complete(WP_REST_Request $request) {
    $prompt = sanitize_textarea_field($request->get_param('prompt'));
    $system = sanitize_textarea_field($request->get_param('system') ?: 'You are a content assistant.');

    $response = wp_remote_post('https://your-gpu-server.gigagpu.com/v1/chat/completions', [
        'timeout' => 60,
        'headers' => [
            'Authorization' => 'Bearer ' . GPU_AI_API_KEY,
            'Content-Type'  => 'application/json',
        ],
        'body' => wp_json_encode([
            'model'      => 'meta-llama/Llama-3-70b-chat-hf',
            'messages'   => [
                ['role' => 'system', 'content' => $system],
                ['role' => 'user', 'content' => $prompt],
            ],
            'max_tokens' => 800,
        ]),
    ]);

    if (is_wp_error($response)) {
        return new WP_REST_Response(['error' => 'GPU server unreachable'], 502);
    }

    $body    = json_decode(wp_remote_retrieve_body($response), true);
    $content = $body['choices'][0]['message']['content'] ?? null;

    if ($content === null) {
        // Non-200 upstream status or unexpected body shape.
        return new WP_REST_Response(['error' => 'Unexpected response from GPU server'], 502);
    }

    return new WP_REST_Response(['content' => $content]);
}

Testing Your Integration

Define GPU_AI_API_KEY in your wp-config.php to keep the key out of plugin code. Activate the plugin, then test the endpoint with a curl command or REST API client: POST /wp-json/gpu-ai/v1/complete with a JSON body containing a prompt. The response should include your model’s completion.

Test from the WordPress editor by adding a JavaScript fetch call to the endpoint. Verify that only logged-in users with the correct capability can access the route — unauthenticated requests should be rejected with a 401, and logged-in users without edit_posts with a 403.

Production Tips

Cache AI responses using WordPress transients for repeated queries. If editors frequently generate meta descriptions for similar content, a cache layer reduces GPU load without affecting output quality.
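The caching logic can be sketched outside WordPress like this — a plain array stands in for the transient store so the code runs anywhere, and the WP equivalents (`get_transient()`/`set_transient()`) are noted in comments. The function names are illustrative:

```php
<?php
// Sketch: cache layer for repeated prompts. In WordPress, swap the $store
// array for get_transient()/set_transient() calls.
function gpu_ai_cache_key(string $prompt, string $system): string {
    // Hash so arbitrary-length prompts yield a fixed-size, safe key.
    return 'gpu_ai_' . md5($system . "\0" . $prompt);
}

function gpu_ai_cached_complete(string $prompt, string $system, callable $complete, array &$store): string {
    $key = gpu_ai_cache_key($prompt, $system);
    if (!isset($store[$key])) {                  // WP: get_transient($key)
        $store[$key] = $complete($prompt, $system);
        // WP: set_transient($key, $store[$key], HOUR_IN_SECONDS);
    }
    return $store[$key];
}

$calls = 0;
$store = [];
$model = function ($p, $s) use (&$calls) { $calls++; return "draft for: $p"; };

gpu_ai_cached_complete('Summarise this post', 'You are a content assistant.', $model, $store);
gpu_ai_cached_complete('Summarise this post', 'You are a content assistant.', $model, $store);
echo $calls, "\n"; // the GPU backend was only hit once
```

Keying on a hash of system prompt plus user prompt means identical requests hit the cache while any wording change goes back to the GPU.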

For a front-end chatbot, create a separate REST endpoint that accepts visitor messages (with rate limiting) and returns AI responses. Pair this with a lightweight JavaScript widget injected via wp_enqueue_script. This turns your WordPress site into an AI chatbot platform without any third-party chat service.
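One simple rate-limiting scheme for that public endpoint is a fixed window per visitor IP. A hedged sketch — in WordPress the counter would live in a transient that expires with the window, and again a plain array stands in so the logic runs outside WP:

```php
<?php
// Sketch: fixed-window rate limiter for a public chatbot endpoint.
// In WP, replace $store with a transient keyed by $bucket, expiring
// after $window seconds.
function gpu_ai_rate_limit_ok(string $ip, array &$store, int $limit = 10, int $window = 60, ?int $now = null): bool {
    $now    = $now ?? time();
    $bucket = $ip . ':' . intdiv($now, $window);   // one counter per IP per window
    $store[$bucket] = ($store[$bucket] ?? 0) + 1;
    return $store[$bucket] <= $limit;
}

$store = [];
for ($i = 0; $i < 10; $i++) {
    // First ten messages in the window are allowed.
    assert(gpu_ai_rate_limit_ok('203.0.113.7', $store, 10, 60, 1000));
}
// The eleventh message inside the same 60-second window is rejected.
var_dump(gpu_ai_rate_limit_ok('203.0.113.7', $store, 10, 60, 1000)); // bool(false)
```

The endpoint handler would call this before forwarding anything to the GPU, returning a 429 when it comes back false.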

The plugin pattern extends to WooCommerce stores, where AI can generate product descriptions, answer customer queries, or automate review responses. For publishers running open-source models on their own GPU, this eliminates recurring AI API costs entirely. Explore more tutorials or get started with GigaGPU to power your WordPress AI features.

Need a Dedicated GPU Server?

Deploy from RTX 3050 to RTX 5090. Full root access, NVMe storage, 1Gbps — UK datacenter.

Browse GPU Servers

admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
