Commit Graph

23 Commits

Author SHA1 Message Date
OlivierDehaene f000068944
feat(server): clear cache on error (#143) 2023-03-28 11:29:35 +02:00
OlivierDehaene ab5fd8cf93
v0.4.1 (#140) 2023-03-26 16:37:51 +02:00
OlivierDehaene 411d6247f4
v0.4.0 (#119) 2023-03-09 16:07:01 +01:00
OlivierDehaene 1c19b0934e
v0.3.2 (#97) 2023-03-03 18:42:20 +01:00
OlivierDehaene 4b1c9720c0
v0.3.1 (#84) 2023-02-24 13:27:41 +01:00
OlivierDehaene c720555adc
v0.3.0 (#72) 2023-02-16 17:28:29 +01:00
OlivierDehaene 9af454142a
feat: add distributed tracing (#62) 2023-02-13 13:02:45 +01:00
OlivierDehaene 2fe5e1b30e
V0.2.1 (#58) 2023-02-07 15:40:25 +01:00
OlivierDehaene 20c3c5940c
feat(router): refactor API and add openAPI schemas (#53) 2023-02-03 12:43:37 +01:00
OlivierDehaene 017a2a8c2f
feat: Add token streaming using ServerSideEvents support (#41) 2023-01-31 17:04:00 +01:00
OlivierDehaene 4f9ac67cfa
Revert "feat: Add token streaming using ServerSideEvents support" (#40)
Reverts huggingface/text-generation-inference#36
2023-01-31 14:21:51 +01:00
OlivierDehaene 7fbfbb0dc5
feat: Add token streaming using ServerSideEvents support (#36)
Add token streaming using ServerSideEvents (SSE).

The signature of the SSE events is: 

```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
2023-01-31 11:49:43 +01:00
OlivierDehaene cd298bc5e5
feat: Support sampling seeding (#37)
Co-authored-by: Yannic Kilcher <yk@users.noreply.github.com>
2023-01-30 15:36:16 +01:00
OlivierDehaene 32a253063d
feat: Return logprobs (#8) 2022-12-15 17:03:56 +01:00
OlivierDehaene 718096f695
feat: Support stop sequences (#7) 2022-12-12 18:25:22 +01:00
OlivierDehaene 3cf6368c77 feat(server): Support all AutoModelForCausalLM on a best effort basis 2022-10-28 19:24:00 +02:00
OlivierDehaene 09674e6df9 feat(server): Support bitsandbytes 2022-10-27 14:25:29 +02:00
OlivierDehaene beb552127a feat(client): Simplify sharded logic 2022-10-22 23:40:05 +02:00
Olivier Dehaene f16f2f5ae1 v0.1.0 2022-10-20 19:14:44 +02:00
Olivier Dehaene 5e5d8766a2 feat: Improve error handling 2022-10-17 14:59:00 +02:00
Olivier Dehaene 39df4d9975 Use axum 2022-10-11 18:14:39 +02:00
Olivier Dehaene 4c693e6524 Refactored gRPC interface
Added validation logic
2022-10-11 16:50:54 +02:00
Olivier Dehaene 295831a481 Init 2022-10-08 12:30:12 +02:00