LLaMa.CPP

Fill In the Middle (FIM) API

The Fill In the Middle (FIM) API generates the text that belongs between a given prefix and suffix. Instead of continuing a prompt from its end, the model fills the gap between the two. This is particularly useful for tasks such as:

  • Code completion (inserting a missing block)
  • Document editing (filling in missing sections)
  • Sentence completion
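
To make the prefix/middle/suffix relationship concrete, here is a minimal sketch of how a generated middle fits back between the two known parts. The helper name spliceMiddle is hypothetical and not part of the library, and the $middle value is a stand-in for what a model might return.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): rebuilds the full text
 * from the known prefix, the generated middle, and the known suffix.
 */
function spliceMiddle(string $prefix, string $middle, string $suffix): string
{
    return $prefix . $middle . $suffix;
}

$prefix = "function add(int \$a, int \$b): int\n{\n";
$suffix = "\n}\n";
// Stand-in for the model output; a real FIM call would supply this part.
$middle = "    return \$a + \$b;";

echo spliceMiddle($prefix, $middle, $suffix);
```

The FIM request only ever sees the prefix and the suffix; the caller is responsible for putting the returned middle back in place.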

FIM without streaming

In non-streaming mode, the FIM response is returned once the entire middle section is generated.

PHP Code

use Partitech\PhpMistral\Clients\LlamaCpp\LlamaCppClient;
use Partitech\PhpMistral\Message;

$llamacppUrl = getenv('LLAMACPP_URL');
$llamacppApiKey = getenv('LLAMACPP_API_KEY');

$client = new LlamaCppClient(apiKey: $llamacppApiKey, url: $llamacppUrl);

// Define the prefix and suffix
$prompt  = "Write response in php:\n";
$prompt .= "/** Calculate date + n days. Returns \DateTime object */";
$suffix  = "return \$datePlusNdays;\n}";  // Double quotes so \n is a real newline; \$ keeps the dollar sign literal

try {
    $result = $client->fim(
        params:[
            'input_prefix' => $prompt,
            'input_suffix' => $suffix,
            'temperature' => 0.7,
            'top_p' => 1,
            'max_tokens' => 200,
            'min_tokens' => 0,
            'stop' => 'string',  // Placeholder: replace with a real stop sequence or omit this entry
            'random_seed' => 0   // Fixed seed for reproducible sampling
        ]
    );

    print_r($result->getMessage());  // The generated middle section
    
} catch (\Throwable $e) {
    echo $e->getMessage();
    exit(1);
}

Example Output

function getDatePlusNDays($date, $n) {\n    $datePlusNdays = new \DateTime($date);\n    $datePlusNdays->add(new \DateInterval('P' . $n . 'D'));\n    return $datePlusNdays;\n}

FIM with streaming

In streaming mode, the middle content is returned incrementally, allowing you to process or display the result as it is generated.

PHP Code

use Partitech\PhpMistral\Clients\LlamaCpp\LlamaCppClient;
use Partitech\PhpMistral\Message;

$llamacppUrl = getenv('LLAMACPP_URL');
$llamacppApiKey = getenv('LLAMACPP_API_KEY');

$client = new LlamaCppClient(apiKey: $llamacppApiKey, url: $llamacppUrl);

// Define the prefix and suffix
$prompt  = "Write response in php:\n";
$prompt .= "/** Calculate date + n days. Returns \DateTime object */";
$suffix  = "return \$datePlusNdays;\n}";  // Double quotes so \n is a real newline; \$ keeps the dollar sign literal

try {
    $result = $client->fim(
        params:[
            'input_prefix' => $prompt,
            'input_suffix' => $suffix,
            'temperature' => 0.1,  // Lower temperature for more deterministic output
            'top_p' => 1,
            'max_tokens' => 200,
            'min_tokens' => 0,
            'stop' => 'string',  // Placeholder: replace with a real stop sequence or omit this entry
            'random_seed' => 0
        ],
        stream: true  // Enable streaming mode
    );

    /** @var Message $chunk */
    foreach ($result as $chunk) {
        echo $chunk->getChunk();  // Output each chunk as it arrives
    }

} catch (\Throwable $e) {
    echo $e->getMessage();
    exit(1);
}

Example Output

function calculateDatePlusNDays($startDate, $nDays) {\n    $date = new \DateTime($startDate);\n    $date->add(new \DateInterval('P' . $nDays . 'D'));\n    $datePlusNdays = $date;\n    return $datePlusNdays;\n}
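
When the complete text is also needed after streaming, the chunks can be accumulated while they are displayed. The sketch below is self-contained: collectChunks and fakeStream are hypothetical names, and the generator only simulates a streamed response with plain strings.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: optionally echoes each text chunk as it arrives
 * and returns the concatenated result. Accepts any iterable of strings.
 */
function collectChunks(iterable $chunks, bool $echo = true): string
{
    $full = '';
    foreach ($chunks as $chunk) {
        if ($echo) {
            echo $chunk;
        }
        $full .= $chunk;
    }
    return $full;
}

// Stand-in generator that simulates a streamed response.
function fakeStream(): Generator
{
    yield "function foo() {\n";
    yield "    return 42;\n";
    yield "}\n";
}

$text = collectChunks(fakeStream());
```

With the client shown earlier, the same pattern would apply by appending each $chunk->getChunk() value inside the foreach loop, assuming that method returns a string fragment.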

Parameters

| Parameter    | Type   | Description |
|--------------|--------|-------------|
| input_prefix | string | The text before the missing middle section. |
| input_suffix | string | The text after the missing middle section. |
| temperature  | float  | Controls randomness (lower = more deterministic). |
| top_p        | float  | Nucleus sampling; restricts sampling to the smallest set of tokens whose cumulative probability reaches top_p. |
| max_tokens   | int    | Maximum number of tokens to generate for the middle section. |
| min_tokens   | int    | Minimum number of tokens to generate (0 by default). |
| stop         | string | Optional stop sequence (generation halts if matched). |
| random_seed  | int    | Seed for random generation; a fixed value makes outputs reproducible. |
| stream       | bool   | Enables streaming mode (chunks of output). Passed as a named argument to fim(), not inside params. |
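
As an illustration of how these parameters fit together, the following sketch builds a params array and runs a few sanity checks before a request is sent. buildFimParams is a hypothetical helper, not a library function. The defaults mirror the examples above rather than any library default, and the range checks are conservative assumptions chosen for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: merges caller overrides into example defaults and
 * rejects obviously inconsistent values before a request is made.
 */
function buildFimParams(string $prefix, string $suffix, array $overrides = []): array
{
    $params = array_merge([
        'temperature' => 0.7,
        'top_p'       => 1,
        'max_tokens'  => 200,
        'min_tokens'  => 0,
        'random_seed' => 0,
    ], $overrides, [
        // Prefix and suffix always come from the explicit arguments.
        'input_prefix' => $prefix,
        'input_suffix' => $suffix,
    ]);

    if ($params['temperature'] < 0) {
        throw new InvalidArgumentException('temperature must be >= 0');
    }
    if ($params['top_p'] <= 0 || $params['top_p'] > 1) {
        throw new InvalidArgumentException('top_p must be in (0, 1]');
    }
    if ($params['min_tokens'] > $params['max_tokens']) {
        throw new InvalidArgumentException('min_tokens exceeds max_tokens');
    }

    return $params;
}

$params = buildFimParams("function f() {\n", "\n}", ['temperature' => 0.1]);
```

The resulting array could then be passed as the params argument of fim(), with stream given separately as a named argument.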

Use Cases

  • Code completion: Insert missing code between known start and end sections.
  • Document editing: Fill in gaps within structured documents.
  • Interactive tools: Provide suggestions for the middle part of sentences or paragraphs.
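
For the code completion case, an editor integration typically has the whole file and a cursor position, and needs to derive the prefix and suffix from them. The sketch below shows one way to do that. splitAtCursor is a hypothetical helper, and the offset is counted in bytes, which is an assumption that suits ASCII source code.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: splits a source text at a byte offset (the
 * "cursor") into the prefix and suffix that a FIM request needs.
 */
function splitAtCursor(string $source, int $offset): array
{
    if ($offset < 0 || $offset > strlen($source)) {
        throw new OutOfRangeException('offset is outside the source text');
    }

    return [
        'input_prefix' => substr($source, 0, $offset),
        'input_suffix' => substr($source, $offset),
    ];
}

$source = "function add(\$a, \$b) {\n    \n}\n";
// Place the cursor at the end of the indented empty line.
$cursor = strpos($source, "    \n") + 4;
$parts  = splitAtCursor($source, $cursor);
```

The returned keys match the input_prefix and input_suffix parameters, so the array can be merged into the request parameters directly.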