Commit Graph

12 Commits

Author SHA1 Message Date
OlivierDehaene 50b495f3d8
feat: add more latency metrics in forward (#1346) 2023-12-14 15:59:38 +01:00
OlivierDehaene 895c5f1562
feat(server): only compute prefill logprobs when asked (#406)
Close #288
2023-06-02 17:12:30 +02:00
OlivierDehaene 62f91f78ac
feat(server): support vectorized warpers in flash causal lm (#317)
Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
2023-05-26 12:30:27 +02:00
OlivierDehaene 9987960062
feat(router): make router input validation optional (#164) 2023-04-09 20:22:27 +02:00
OlivierDehaene b49dbf2d88
fix(server): use server tokenizer as gt (#128) 2023-03-16 12:12:26 +01:00
OlivierDehaene 3fef90d50f
feat(clients): Python client (#103) 2023-03-07 18:52:22 +01:00
OlivierDehaene b1482d9048
breaking(router): modify /generate API to only return generated text (#50)
@njhill, @yk FYI

generated_text was concatenated to the user prompt for legacy reason. We
want to remove this behaviour as we don't think it is useful and even
detrimonial to usability.

We also remove the unused Vec.
2023-02-02 15:02:04 +01:00
OlivierDehaene 017a2a8c2f
feat: Add token streaming using ServerSideEvents support (#41) 2023-01-31 17:04:00 +01:00
OlivierDehaene 4f9ac67cfa
Revert "feat: Add token streaming using ServerSideEvents support" (#40)
Reverts huggingface/text-generation-inference#36
2023-01-31 14:21:51 +01:00
OlivierDehaene 7fbfbb0dc5
feat: Add token streaming using ServerSideEvents support (#36)
Add token streaming using ServerSideEvents (SSE).

The signature of the SSE events is: 

```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
2023-01-31 11:49:43 +01:00
OlivierDehaene 1f570d181f
fix(server): Fix position ids (#28) 2023-01-20 15:35:22 +01:00
OlivierDehaene 15511edc01
feat(server): Support SantaCoder (#26) 2023-01-20 12:24:39 +01:00