Llama.cpp
Props
Retrieve server properties and default generation settings from the Llama.cpp backend. This endpoint provides detailed configuration information about:
- Model path
- Chat formatting templates
- Default sampling parameters (temperature, top_k, penalties, etc.)
- Special tokens (BOS/EOS) and active samplers
- Build info
Tip
Use the Props API to dynamically adapt your application's behavior based on the server's default settings or model configuration.
Code
use Partitech\PhpMistral\Clients\LlamaCpp\LlamaCppClient;

// Server URL and API key are read from the environment.
$llamacppUrl    = getenv('LLAMACPP_URL');
$llamacppApiKey = getenv('LLAMACPP_API_KEY');

$client = new LlamaCppClient(apiKey: $llamacppApiKey, url: $llamacppUrl);

try {
    $response = $client->props();
    print_r($response); // Full properties output
} catch (\Throwable $e) {
    echo $e->getMessage();
    exit(1);
}
Result
Array
(
    [default_generation_settings] => Array
        (
            [n_ctx] => 512
            [params] => Array
                (
                    [temperature] => 0.80000001192093
                    [top_k] => 40
                    [top_p] => 0.94999998807907
                    [repeat_penalty] => 1
                    ...
                )
            [chat_format] => Content-only
            [samplers] => Array
                (
                    [0] => penalties
                    [1] => dry
                    [2] => top_k
                    [3] => typ_p
                    [4] => top_p
                    [5] => min_p
                    [6] => xtc
                    [7] => temperature
                )
        )
    [total_slots] => 1
    [model_path] => /models/llama-3.2-3b-instruct-q8_0.gguf
    [chat_template] => {% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>
'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{{ '<|start_header_id|>assistant<|end_header_id|>
' }}
    [bos_token] => <|begin_of_text|>
    [eos_token] => <|eot_id|>
    [build_info] => b5002-2c3f8b85
)
Key Fields
| Field | Description |
|---|---|
| `default_generation_settings` | Contains the server's default generation parameters (e.g., temperature, top_k, penalties). |
| `n_ctx` | Maximum context window size (in tokens). |
| `params` | Default inference parameters used when none are explicitly provided in a request. |
| `samplers` | List of active sampling strategies applied during generation. |
| `chat_format` | Defines how chat messages are structured for the model (e.g., Content-only, ChatML). |
| `model_path` | Filesystem path to the loaded model. |
| `chat_template` | Template used for formatting multi-turn chat conversations (e.g., system/user/assistant roles). |
| `bos_token` | Beginning-of-sequence token used in the chat template. |
| `eos_token` | End-of-sequence token used in the chat template. |
| `build_info` | Build version of the Llama.cpp server binary. |
Note
The `chat_template` defines how chat messages are formatted for the model. This is critical for multi-turn conversations and ensures consistency with the model's training format.
Use Cases
- Dynamic parameter adaptation: Retrieve and align your application's inference parameters with the server defaults (see the sketch after this list).
- Debugging and introspection: Understand how the server is configured, including the model, sampling strategies, and formatting templates.
- Compatibility checks: Ensure the model and generation settings meet your application's requirements.
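A hedged sketch of the dynamic parameter adaptation use case. It relies only on the `props()` call shown above; whether every default key is accepted as-is by your generation request depends on the backend, so treat the merged array as a starting point rather than a guaranteed request payload:

use Partitech\PhpMistral\Clients\LlamaCpp\LlamaCppClient;

$client = new LlamaCppClient(apiKey: getenv('LLAMACPP_API_KEY'), url: getenv('LLAMACPP_URL'));

// Start from the server's own default generation parameters...
$defaults = $client->props()['default_generation_settings']['params'] ?? [];

// ...and override only what the application actually needs to change.
// The temperature value here is a hypothetical application-specific choice.
$overrides = ['temperature' => 0.2];
$params    = array_merge($defaults, $overrides);

print_r($params); // Pass the relevant entries along with your generation request.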
Caution
Overriding inference parameters without taking the server defaults into account may lead to unexpected behavior, especially if the model was trained with specific configurations in mind.