hf_text-generation-inference

Commit Graph

Author	SHA1	Message	Date
drbh	3011639ff7	Revert "Unroll notify error into generate response" (#2605) Revert "Unroll notify error into generate response (#2597)" This reverts commit `d22b0c1fbe`.	2024-10-03 17:56:40 -04:00
drbh	d22b0c1fbe	Unroll notify error into generate response (#2597) * feat: unroll notify_error if no tool is chosen * fix: expect simple message when no tool is selected * fix: improve test to avoid notify_error * fix: improve docs and indicate change in expected response * fix: adjust linting in test file	2024-10-02 11:34:57 -04:00
drbh	93a7042d7e	feat: support phi3.5 moe (#2479) * feat: support phi3.5 moe model loading * fix: prefer llama base model and improve rotary logic * feat: return reasonable generation and add integration test * fix: run lint and update docs * fix: rerun lint for openapi docs * fix: prefer do_sample false unless temp is set by user, and update chat tests * fix: small typo adjustments * fix: consolidate long rope paths * fix: revert greedy by default and test changes * Vendor configuration so that we don't have to `trust_remote_code` * Use SparseMoELayer * Add support for dense MoE * Some type annotations * Add the usual model tests * Ruff. --------- Co-authored-by: Daniël de Kok <me@danieldk.eu> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2024-09-30 11:15:09 +02:00
drbh	cfa73b5c99	Pr 2451 ci branch (#2454) * fix[router]: Fix tools not passed in chat template Signed-off-by: GitHub <noreply@github.com> * feat: improve default tool serialization and lints * feat: refactor tool logic to include notify_error in prompt and adjust typing * fix: adjust non tool template apply * fix: simplify tool grammar logic and improve schema * feat: avoid skip tool test and avoid empty tool prompts * fix: increase test client timeout for grammar compilation tests --------- Signed-off-by: GitHub <noreply@github.com> Co-authored-by: Simone Rossi <simone.rossi.93@gmail.com>	2024-08-26 20:19:38 -04:00
drbh	bab02ff2bc	feat: add ruff and resolve issue (#2262) * feat: add ruff and resolve issue * fix: update client exports and adjust after rebase * fix: adjust syntax to avoid circular import * fix: adjust client ruff settings * fix: lint and refactor import check and avoid model enum as global names * fix: improve fbgemm_gpu check and lints * fix: update lints * fix: prefer comparing model enum over str * fix: adjust lints and ignore specific rules * fix: avoid unneeded quantize check	2024-07-26 10:29:09 -04:00
drbh	7276d43495	feat: improve tools to include name and add tests (#1693) This PR makes tool calling aware of the name of the function selected. Fixes: https://github.com/huggingface/text-generation-inference/issues/1657 Thank you @puppetm4st3r for the helpful snippets, large parts of this PR are simply refactors of the code shared 🙏 **opening draft PR because small tweaks are needed before merging	2024-04-16 09:02:46 -04:00
drbh	de6cb15fa5	fix: improve tool type, bump pydantic and outlines (#1650) This PR resolves a couple - [X] adjusts the tool response to align with openai's tools response type - [X] bumps pydantic to `2.6.4` in all apps (resolves dependency issue when running tests) - [X] bump `outlines` version and fix import for new name	2024-03-21 12:45:56 -04:00
drbh	7dbaf9e901	fix: correctly index into mask when applying grammar (#1618) This PR fixes how the grammar mask is indexed when generating text and adds a new test to ensure the grammars work with non flash models	2024-03-01 18:22:01 +01:00
drbh	9b6db5f793	Support tools (#1587) This work in progress PR begins to add support for tools. Tools relies on grammar support and still has some unsolved challenges. Opening the PR for visibility and feedback	2024-02-28 11:10:27 +01:00

9 Commits