vLLM

Generate Embeddings with vLLM Pooling API

This example shows how to use vLLM's Pooling API with PhpMistral to generate embeddings from text inputs.

The Pooling API is similar to the Embeddings API but can produce more complex outputs (including nested lists), depending on the pooling model you choose.

Tip

Use pooling models when you need flexible embeddings, like for semantic search or clustering.

Example Code

use Partitech\PhpMistral\Clients\Vllm\VllmClient;
use Partitech\PhpMistral\Exceptions\MistralClientException;

$apiKey          = getenv('VLLM_API_KEY');           // Your vLLM token
$url             = getenv('VLLM_API_EMBEDDING_URL'); // Example: "http://localhost:40002"
$embeddingModel  = getenv('VLLM_API_EMBEDDING_MODEL'); // Example: "BAAI/bge-m3"

$client = new VllmClient(apiKey: $apiKey, url: $url);

$inputs = [
    "What is the best French cheese?"
];

try {
    $response = $client->pooling($inputs, $embeddingModel);
} catch (MistralClientException $e) {
    echo $e->getMessage();
    exit(1);
}

print_r($response);

Example Output

Array
(
    [id] => pool-xxxxxxxxxxxxxxxxxxxx
    [object] => list
    [created] => 1745839100
    [model] => BAAI/bge-m3
    [data] => Array
        (
            [0] => Array
                (
                    [index] => 0
                    [object] => pooling
                    [data] => Array
                        (
                            [0] => -0.0181
                            [1] => 0.0119
                            ...
                        )
                )
        )
)

Key Points

Inputs: A list of strings.
Model: Any supported pooling model (e.g., BAAI/bge-m3).
Output: Embeddings (vectors) representing the input text.

Note

The PhpMistral library handles sending the request and returning the raw response from vLLM. The embeddings are returned as a list of floats (or nested lists depending on the model).

For more details on pooling models, check the vLLM pooling models documentation.

Prompt FLow

¶Generate Embeddings with vLLM Pooling API

¶Example Code

¶Example Output

¶Key Points

Generate Embeddings with vLLM Pooling API

Example Code

Example Output

Key Points