Tags: swordow/llama-cpp-python
fix: llama_cpp embed() method bug fixes

1. n_tokens() parentheses in batch overflow offset: self._batch.n_tokens was a bound method reference, not an int; added () to call it correctly.
2. The normalize parameter is now passed to llama_batch_decode as n_norm (n_norm=2 for L2 normalization, n_norm=0 for no normalization) instead of a hardcoded 2.
3. NONE pooling pos offset: pos += size * n_embd (was pos += size), since the pointer is float-indexed and each token occupies n_embd floats.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
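The third fix above can be illustrated with a small sketch (hypothetical names, not the library's actual code): the decode buffer is a flat float array, so token t of a sequence occupies floats [pos + t * n_embd, pos + (t + 1) * n_embd), and the per-sequence offset must advance by size * n_embd floats, not by size tokens.

```python
def collect_token_embeddings(buf, sizes, n_embd):
    """buf: flat list of floats; sizes: number of tokens per sequence.

    Returns one list of n_embd-float embeddings per sequence.
    """
    out = []
    pos = 0
    for size in sizes:
        seq = [buf[pos + t * n_embd : pos + (t + 1) * n_embd]
               for t in range(size)]
        out.append(seq)
        pos += size * n_embd  # the fix: advance in floats, not in tokens
    return out

# Two sequences (2 tokens and 1 token), n_embd = 3
buf = [float(i) for i in range(9)]
embs = collect_token_embeddings(buf, [2, 1], 3)
# embs[1][0] == [6.0, 7.0, 8.0]; with the old `pos += size` bug the
# second sequence would have been read from the wrong offset (pos = 2).
```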
mod:

1. Exports llama-embedding as a DLL, using llama_batch_decode from llama-embedding.
2. create_embedding adds a normalize parameter for embedding normalization.
3. Llama supports apply_chat_format.
4. logits_all does not need to be tied to the pooling type; it is set True for all tokens in embed(), consistent with the logic in llama-embedding.
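The normalize parameter mentioned above maps to an n_norm value (n_norm=2 for L2, n_norm=0 for none, per the fix commit). A minimal sketch of that normalization, with an assumed helper name not taken from the library:

```python
def normalize_embedding(vec, n_norm=2):
    """n_norm=2 -> L2-normalize; n_norm=0 -> return the vector unchanged."""
    if n_norm == 0:
        return list(vec)
    # General p-norm; for n_norm=2 this is sqrt(sum(x^2)).
    norm = sum(abs(x) ** n_norm for x in vec) ** (1.0 / n_norm)
    return [x / norm for x in vec] if norm > 0 else list(vec)

v = normalize_embedding([3.0, 4.0])  # L2 norm is 5.0 -> [0.6, 0.8]
raw = normalize_embedding([3.0, 4.0], n_norm=0)  # unchanged
```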
fix(ci): update macos runner image to non-deprecated version