When you’d care

Most application developers consume Syni through a language SDK — the Flutter SDK, and (later) Swift / Kotlin equivalents — and never see the C ABI. This page is for the people who do:
  • SDK authors adding a new language binding over the same engine.
  • Native integrators plugging the runtime into a non-Flutter / non-mobile surface (a CLI tool, a server worker, a desktop app).
  • Debuggers chasing symbol-level issues across the FFI boundary.

ABI rules

  • The C ABI is the stable boundary. Rust internals are not.
  • Symbols, struct shapes, error codes, and lifetimes documented here are the contract — they MUST NOT change without a major version bump on the runtime package.
  • Every handle has a _destroy counterpart. Leaks are the caller’s fault.
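  • All strings are UTF-8 and NUL-terminated. Strings passed into the API stay owned by the caller; strings returned by the runtime (such as out_full in section 3) follow the ownership rule documented on that function.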

Install

Runtime artifacts are distributed via dist.synheart.ai and installed by the Synheart CLI:
synheart install runtime syni        # just the runtime
synheart install syni                # runtime + spec (entitled subset)
synheart install runtime syni --version v0.1.0  # pin a version
The CLI drops a vendor-ready tree under your project:
synheart/vendor/syni-runtime/
├── android/
│   └── jniLibs/
│       ├── arm64-v8a/libsyni_ffi.so
│       ├── armeabi-v7a/libsyni_ffi.so
│       └── x86_64/libsyni_ffi.so
├── SyniRuntime.xcframework/         # iOS device + sim slices
├── device/libsyni_ffi.a              # iOS device archive
├── sim/libsyni_ffi.a                 # iOS sim lipo'd archive
└── headers/
    └── syni_ffi.h                    # the canonical C header
The same artifact layout is the source of truth for all language bindings.

1. Handle and lifecycle

The engine is opaque: callers hold a single SyniRuntimeHandle* from create until destroy.
SyniRuntimeHandle* syni_runtime_create(const SyniRuntimeConfig* config);
SyniStatus         syni_runtime_load(SyniRuntimeHandle* h);
void               syni_runtime_destroy(SyniRuntimeHandle* h);
Lifecycle:
create  →  load  →  (generate*)  →  destroy
  • create validates config and allocates resources but does not mmap the model file.
  • load mmaps the model and materialises the tokenizer and grammar. It is the expensive call, typically hundreds of milliseconds on a cold start.
  • destroy is safe to call in any state. Any use of the handle after destroy is undefined behaviour.
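The lifecycle above can be sketched as one helper that always reaches destroy. This is a minimal sketch under stated assumptions, not the shipped header: the SyniBackend constant names are guesses, create is assumed to report failure by returning NULL, run_once is a hypothetical name, and the three syni_runtime_* bodies are stand-ins so the block compiles and runs without libsyni_ffi. In a real integration, include syni_ffi.h and link the library instead.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Declarations copied from this page; in a real build they come from
 * syni_ffi.h. The SyniBackend constant names are assumptions. */
typedef enum { SYNI_BACKEND_CANDLE = 0, SYNI_BACKEND_LLAMA = 1 } SyniBackend;
typedef enum { SYNI_STATUS_OK = 0, SYNI_STATUS_LOAD_FAILED = 3 } SyniStatus;
typedef struct {
    const char* model_path;
    const char* tokenizer_path;
    SyniBackend backend;
    uint32_t    n_ctx;
    uint32_t    n_threads;
    bool        metal;
    bool        cuda;
} SyniRuntimeConfig;
typedef struct SyniRuntimeHandle SyniRuntimeHandle;

/* Stand-in bodies so the sketch links without libsyni_ffi.
 * Delete these and link the real library in an actual integration. */
struct SyniRuntimeHandle { int loaded; };
SyniRuntimeHandle* syni_runtime_create(const SyniRuntimeConfig* config) {
    if (config == NULL || config->model_path == NULL) return NULL;
    return calloc(1, sizeof(SyniRuntimeHandle));
}
SyniStatus syni_runtime_load(SyniRuntimeHandle* h) {
    h->loaded = 1;
    return SYNI_STATUS_OK;
}
void syni_runtime_destroy(SyniRuntimeHandle* h) { free(h); }

/* create -> load -> (generate*) -> destroy, with destroy on every path
 * that obtained a handle. Returns 0 on success, -1 if create failed,
 * otherwise the SyniStatus reported by load. */
int run_once(const SyniRuntimeConfig* config) {
    SyniRuntimeHandle* h = syni_runtime_create(config);
    if (h == NULL) {  /* assumption: create signals failure with NULL */
        fprintf(stderr, "syni_runtime_create failed\n");
        return -1;
    }
    SyniStatus st = syni_runtime_load(h);  /* the expensive call */
    if (st == SYNI_STATUS_OK) {
        /* ... one or more syni_generate calls would go here ... */
    }
    syni_runtime_destroy(h);  /* safe in any state; h is dead after this */
    return (int)st;
}
```

Keeping destroy on a single exit path makes it harder to leak the handle when load fails.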

2. Configuration

typedef struct {
    const char* model_path;            // absolute path to .gguf
    const char* tokenizer_path;        // absolute path to tokenizer.json
    SyniBackend backend;               // CANDLE | LLAMA
    uint32_t    n_ctx;                 // context window
    uint32_t    n_threads;             // 0 = auto
    bool        metal;                 // enable Metal on Apple platforms
    bool        cuda;                  // enable CUDA where available
} SyniRuntimeConfig;
Per-platform notes:
  • Apple: set metal = true for GPU acceleration. Falls back to CPU automatically if Metal is unavailable.
  • Android: metal / cuda are no-ops; CPU only.
  • Backend: CANDLE is the default (pure Rust, cross-compiles cleanly). LLAMA requires runtime artifacts built with the llama.cpp backend enabled — not the default published variant today.
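The per-platform notes can be folded into a small config builder. The sketch below is illustrative only: default_config is a hypothetical helper, the SyniBackend constant names are assumed (this page only says "CANDLE | LLAMA"), and the n_ctx value is a placeholder rather than a documented default.

```c
#include <stdbool.h>
#include <stdint.h>

/* Copied from this page; the SyniBackend constant names are assumptions. */
typedef enum { SYNI_BACKEND_CANDLE = 0, SYNI_BACKEND_LLAMA = 1 } SyniBackend;
typedef struct {
    const char* model_path;
    const char* tokenizer_path;
    SyniBackend backend;
    uint32_t    n_ctx;
    uint32_t    n_threads;
    bool        metal;
    bool        cuda;
} SyniRuntimeConfig;

/* Hypothetical helper: fills a config with conservative values.
 * The path strings are borrowed, not copied, so the caller keeps
 * owning them and must keep them alive while the config is in use. */
SyniRuntimeConfig default_config(const char* model_path,
                                 const char* tokenizer_path) {
    SyniRuntimeConfig cfg = {
        .model_path     = model_path,
        .tokenizer_path = tokenizer_path,
        .backend        = SYNI_BACKEND_CANDLE, /* default published backend */
        .n_ctx          = 2048,  /* placeholder, not a documented default */
        .n_threads      = 0,     /* 0 = auto */
        .metal          = false,
        .cuda           = false,
    };
#if defined(__APPLE__)
    cfg.metal = true;  /* falls back to CPU if Metal is unavailable */
#endif
    return cfg;
}
```

On Android the metal and cuda fields are no-ops, so leaving them false there loses nothing.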

3. Generation

Generation is request/response, optionally streamed via a callback.
typedef struct {
    const char* prompt;                // ready-made prompt
    const char* grammar_gbnf;          // optional GBNF for structured JSON
    uint32_t    max_tokens;
    float       temperature;
    float       top_p;
    int32_t     seed;                  // -1 = random
} SyniGenerateRequest;

typedef int (*SyniTokenCallback)(const char* token, void* user_data);

SyniStatus syni_generate(
    SyniRuntimeHandle* h,
    const SyniGenerateRequest* req,
    SyniTokenCallback on_token,         // NULL = no streaming
    void* user_data,
    char** out_full,                    // allocated by the runtime; caller frees with syni_string_free
    uint32_t* out_full_len
);

void syni_string_free(char* s);
The streaming callback returns 0 to continue, non-zero to cancel. On cancel, syni_generate returns SYNI_STATUS_CANCELLED and out_full contains the partial response so far.
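A callback that follows the continue/cancel contract might look like the sketch below. StreamState and on_token are hypothetical caller-side names, and the code assumes the token pointer is only valid for the duration of the callback, which is why it copies the bytes rather than keeping the pointer.

```c
#include <stddef.h>
#include <string.h>

/* Callback type as declared on this page. */
typedef int (*SyniTokenCallback)(const char* token, void* user_data);

/* Hypothetical caller-side state, passed through user_data. */
typedef struct {
    char   buf[256];     /* accumulated text, always NUL-terminated */
    size_t len;          /* bytes used in buf, excluding the NUL */
    int    tokens_seen;
    int    max_tokens;   /* cancel once this many tokens have arrived */
} StreamState;

/* Returns 0 to continue and non-zero to cancel, per the contract above. */
int on_token(const char* token, void* user_data) {
    StreamState* s = (StreamState*)user_data;
    size_t n = strlen(token);
    if (s->len + n >= sizeof s->buf) {
        return 1;                          /* no room left: cancel */
    }
    memcpy(s->buf + s->len, token, n + 1); /* copy token plus its NUL */
    s->len += n;
    s->tokens_seen++;
    return s->tokens_seen >= s->max_tokens ? 1 : 0;
}
```

A zero-initialised StreamState with max_tokens set is enough to start; pass its address as user_data and on_token as the callback. When the callback cancels, syni_generate is documented to return SYNI_STATUS_CANCELLED with the partial response in out_full.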

4. Grammar-constrained decoding

When grammar_gbnf is non-NULL, the runtime constrains decoding to outputs valid against the GBNF. Bind grammars to schemas via the Syni Spec registry:
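// read_file_to_string is not declared on this page; assume a caller-supplied helper.
const char* gbnf = read_file_to_string("synheart/vendor/syni-spec/grammars/coach.response.v1.gbnf");
SyniGenerateRequest req = {
    .prompt        = "…",
    .grammar_gbnf  = gbnf,
    .max_tokens    = 512,
    .temperature   = 0.7f,
    // Fields not named here (top_p, seed) are zero-initialised by C.
};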
GBNF parse failures surface as SYNI_STATUS_INVALID_GRAMMAR when the request is validated, before any tokens are generated.
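The snippet above calls read_file_to_string, which this page does not define. One possible implementation, offered as a sketch rather than anything shipped with the runtime, reads the whole file into a malloc'd, NUL-terminated buffer that the caller releases with free:

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the whole file at `path` into a NUL-terminated buffer.
 * Returns NULL if the file cannot be opened, sized, or fully read.
 * The caller owns the result and releases it with free(). */
char* read_file_to_string(const char* path) {
    FILE* f = fopen(path, "rb");
    if (f == NULL) {
        return NULL;
    }
    char* buf = NULL;
    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0 &&
        (size = ftell(f)) >= 0 &&
        fseek(f, 0, SEEK_SET) == 0) {
        buf = malloc((size_t)size + 1);
        if (buf != NULL) {
            size_t got = fread(buf, 1, (size_t)size, f);
            if (got != (size_t)size) {
                free(buf);          /* short read: treat as failure */
                buf = NULL;
            } else {
                buf[got] = '\0';
            }
        }
    }
    fclose(f);
    return buf;
}
```

Because the grammar string is caller-owned, it is released with free, not syni_string_free. This sketch says nothing about how long the runtime needs the string; keeping it alive until syni_generate returns is the conservative choice.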

5. Status codes

typedef enum {
    SYNI_STATUS_OK              = 0,
    SYNI_STATUS_INVALID_CONFIG  = 1,
    SYNI_STATUS_MODEL_NOT_FOUND = 2,
    SYNI_STATUS_LOAD_FAILED     = 3,
    SYNI_STATUS_INVALID_GRAMMAR = 4,
    SYNI_STATUS_OOM             = 5,
    SYNI_STATUS_CANCELLED       = 6,
    SYNI_STATUS_INTERNAL        = 7,
} SyniStatus;
Status semantics:
  • OK — success.
  • INVALID_CONFIG / MODEL_NOT_FOUND — caller bug; surface verbatim.
  • LOAD_FAILED — likely corrupted model file or unsupported architecture.
  • INVALID_GRAMMAR — GBNF didn’t parse against the schema.
  • OOM — model too large for device; suggest a smaller variant.
  • CANCELLED — caller-initiated; not an error.
  • INTERNAL — bug in the runtime; capture logs and file an issue.
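Bindings usually need a printable name for each code and a way to tell real failures from a caller-initiated cancel. A minimal sketch, with status_name and status_is_error as hypothetical helper names (the enum is copied from this page):

```c
#include <stdbool.h>

typedef enum {
    SYNI_STATUS_OK              = 0,
    SYNI_STATUS_INVALID_CONFIG  = 1,
    SYNI_STATUS_MODEL_NOT_FOUND = 2,
    SYNI_STATUS_LOAD_FAILED     = 3,
    SYNI_STATUS_INVALID_GRAMMAR = 4,
    SYNI_STATUS_OOM             = 5,
    SYNI_STATUS_CANCELLED       = 6,
    SYNI_STATUS_INTERNAL        = 7,
} SyniStatus;

/* Hypothetical helper: short name for logs and error messages. */
const char* status_name(SyniStatus st) {
    switch (st) {
        case SYNI_STATUS_OK:              return "OK";
        case SYNI_STATUS_INVALID_CONFIG:  return "INVALID_CONFIG";
        case SYNI_STATUS_MODEL_NOT_FOUND: return "MODEL_NOT_FOUND";
        case SYNI_STATUS_LOAD_FAILED:     return "LOAD_FAILED";
        case SYNI_STATUS_INVALID_GRAMMAR: return "INVALID_GRAMMAR";
        case SYNI_STATUS_OOM:             return "OOM";
        case SYNI_STATUS_CANCELLED:       return "CANCELLED";
        case SYNI_STATUS_INTERNAL:        return "INTERNAL";
    }
    return "UNKNOWN";  /* a value this binding does not know about */
}

/* Hypothetical helper: CANCELLED is caller-initiated, so it is not
 * reported as an error. */
bool status_is_error(SyniStatus st) {
    return st != SYNI_STATUS_OK && st != SYNI_STATUS_CANCELLED;
}
```

Returning "UNKNOWN" for unrecognised values keeps an older binding usable if a later runtime adds codes.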

6. Memory management

  • Every *_create has a matching *_destroy. Caller owns the handle.
  • Returned C strings (out_full) are heap-allocated by the runtime and must be freed with syni_string_free.
  • The runtime is thread-safe per handle: concurrent calls on the same SyniRuntimeHandle are serialised internally; concurrent calls on different handles run in parallel.
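One way to keep the out_full rule manageable in a binding is to copy the runtime's string into caller-owned memory straight away and release the original with syni_string_free. The sketch below is illustrative: take_runtime_string is a hypothetical helper, it assumes out_full_len is the byte length excluding the NUL terminator, and syni_string_free is a stand-in body (with a call counter) so the block runs without libsyni_ffi.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in so the sketch links without libsyni_ffi; the real function
 * comes from the library. The counter only exists to show that the
 * runtime string is released exactly once. */
static int g_runtime_frees = 0;
void syni_string_free(char* s) {
    free(s);
    g_runtime_frees++;
}

/* Hypothetical helper: copies a runtime-owned string into a malloc'd
 * buffer and frees the runtime's copy. Returns NULL if runtime_str is
 * NULL or the allocation fails. The caller releases the result with
 * free(). Assumes len excludes the NUL terminator. */
char* take_runtime_string(char* runtime_str, uint32_t len) {
    if (runtime_str == NULL) {
        return NULL;
    }
    char* copy = malloc((size_t)len + 1);
    if (copy != NULL) {
        memcpy(copy, runtime_str, len);
        copy[len] = '\0';
    }
    /* Always go through syni_string_free: the runtime's allocator may
     * not be the same one behind free(). */
    syni_string_free(runtime_str);
    return copy;
}
```

After this call, everything downstream deals with a single allocator, at the cost of one extra copy per response.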

7. Backends

Backend             | Compile-time feature | Default in published artifacts
Candle (pure Rust)  | (default)            | ✅
llama.cpp           | --features llama     | ❌ (CI builds candle only)
The llama.cpp backend is supported for self-built runtimes — set LLAMA=1 in the runtime repo’s Makefile. The default candle backend cross-compiles cleanly to every supported target without a C++ toolchain, so that’s what the CDN ships.

Symbols this page does not document

This page covers the stable surface only. The runtime exports a number of internal symbols (_dbg_*, _unstable_*) for diagnostics — these may change without notice between patch releases.

Related

  • Syni overview — how the runtime fits with the Flutter SDK and the spec
  • Syni Flutter SDK — the Dart wrapper over this ABI
  • Syni Spec — persona / grammar / safety contracts the runtime consumes
  • C header: synheart/vendor/syni-runtime/headers/syni_ffi.h after install