Mistral PHP Client

vLLM

Completion

The completion method is designed for simple text generation tasks. Unlike the chat method, which uses a structured conversation format with roles (e.g., user, assistant), completion takes a plain text prompt and generates a continuation.

Note

This method is ideal for tasks like text summarization, code generation, or completing sentences without any conversational context.

Without streaming

In non-streaming mode, the response is returned as a complete block once the generation is finished.

$client = new VllmClient(apiKey: $apiKey, url:  $url);

$params = [
    'model' => $model,
    'temperature' => 0.7,
    'top_p' => 1,
    'max_tokens' => 1000,
    'seed' => 15,
];

try {
    $chatResponse = $client->completion(
        prompt: 'The ingredients that make up dijon mayonnaise are ',
        params: $params
    );
    print_r($chatResponse->getMessage());
    print_r($chatResponse->getUsage());
} catch (\Throwable $e) {
    echo $e->getMessage();
    exit(1);
}

Example output

The ingredients that make up dijon mayonnaise are egg yolks, Dijon mustard, lemon juice or vinegar, and oil (such as vegetable or olive oil). Seasonings like salt and pepper can be added to taste.

Array
(
    [prompt_tokens] => 12
    [completion_tokens] => 28
    [total_tokens] => 40
)

Tip

Use the $chatResponse->getUsage() method to monitor token consumption and manage quotas efficiently.

With streaming

In streaming mode, the generated text is sent in chunks as it is produced. This is particularly useful for large outputs or for providing real-time feedback to users.

$client = new VllmClient(apiKey: $apiKey, url:  $url);

try {
    foreach ($client->completion(prompt: 'Explain step by step how to make dijon mayonnaise ', params: $params, stream: true) as $chunk) {
        echo $chunk->getChunk();
    }
} catch (\Throwable $e) {
    echo $e->getMessage();
    exit(1);
}

Example output (streamed chunks)

Explain step by step how to make dijon mayonnaise:
1. In a bowl, whisk together the egg yolks and Dijon mustard until smooth.
2. Slowly add oil in a thin stream while continuously whisking.
3. Once the mixture thickens, add lemon juice or vinegar.
4. Season with salt and pepper to taste.
5. Continue whisking until fully emulsified.

Important

With streaming enabled, each chunk represents a partial output. You are responsible for assembling or displaying these chunks as needed.

Note

The streaming mode reduces latency for long completions, as data is sent progressively rather than waiting for the entire output.

Warning

Make sure your $url and $apiKey point to a valid vLLM server endpoint. Incorrect configurations may result in connection errors.

Prompt FLow

¶Completion

¶Without streaming

¶Example output

¶With streaming

¶Example output (streamed chunks)

Completion

Without streaming

Example output

With streaming

Example output (streamed chunks)