OlivierDehaene
|
218c9adaa5
|
feat: decrease IPC proto size (#367)
Closes #307 #308
|
2023-05-24 19:19:57 +02:00 |
Nicolas Patry
|
db2b4e0754
|
feat(router): new healthcheck that skips the queue (#244)
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
|
2023-04-26 20:23:54 +02:00 |
OlivierDehaene
|
ebc74d5666
|
feat(router): use number of tokens in batch as input for dynamic batching (#226)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2023-04-24 17:59:00 +02:00 |
OlivierDehaene
|
343437c7b5
|
feat(router): add device and dtype info (#215)
|
2023-04-21 15:36:29 +02:00 |
OlivierDehaene
|
9987960062
|
feat(router): make router input validation optional (#164)
|
2023-04-09 20:22:27 +02:00 |
OlivierDehaene
|
610bb1f978
|
feat(benchmark): tui based benchmarking tool (#149)
|
2023-03-30 15:26:27 +02:00 |
OlivierDehaene
|
f000068944
|
feat(server): clear cache on error (#143)
|
2023-03-28 11:29:35 +02:00 |
OlivierDehaene
|
b49dbf2d88
|
fix(server): use server tokenizer as gt (#128)
|
2023-03-16 12:12:26 +01:00 |
OlivierDehaene
|
1a2d68250a
|
feat: support typical sampling (#114)
closes #112
|
2023-03-09 11:33:57 +01:00 |
OlivierDehaene
|
9b8ea6a6c7
|
feat(server): add logits watermark (#90)
|
2023-03-02 12:30:41 +01:00 |
OlivierDehaene
|
0ac184ce77
|
feat(server): add special token bool (#85)
|
2023-02-24 15:55:57 +01:00 |
OlivierDehaene
|
20c3c5940c
|
feat(router): refactor API and add openAPI schemas (#53)
|
2023-02-03 12:43:37 +01:00 |
OlivierDehaene
|
313194f6d7
|
feat(server): support repetition penalty (#47)
|
2023-02-01 15:58:42 +01:00 |
OlivierDehaene
|
017a2a8c2f
|
feat: Add token streaming using ServerSideEvents support (#41)
|
2023-01-31 17:04:00 +01:00 |
OlivierDehaene
|
54fec93193
|
fix(server): fix seeding with multiple shards (#44)
|
2023-01-31 16:01:15 +01:00 |
OlivierDehaene
|
4f9ac67cfa
|
Revert "feat: Add token streaming using ServerSideEvents support" (#40)
Reverts huggingface/text-generation-inference#36
|
2023-01-31 14:21:51 +01:00 |
OlivierDehaene
|
7fbfbb0dc5
|
feat: Add token streaming using ServerSideEvents support (#36)
Add token streaming using ServerSideEvents (SSE).
The signature of the SSE events is:
```rust
struct Details {
finish_reason: String,
generated_tokens: u32,
seed: Option<u64>,
}
struct StreamResponse {
token: Token,
generated_text: Option<String>,
details: Option<Details>,
}
struct ErrorResponse {
error: String,
}
```
|
2023-01-31 11:49:43 +01:00 |
OlivierDehaene
|
cd298bc5e5
|
feat: Support sampling seeding (#37)
Co-authored-by: Yannic Kilcher <yk@users.noreply.github.com>
|
2023-01-30 15:36:16 +01:00 |
OlivierDehaene
|
32a253063d
|
feat: Return logprobs (#8)
|
2022-12-15 17:03:56 +01:00 |
OlivierDehaene
|
718096f695
|
feat: Support stop sequences (#7)
|
2022-12-12 18:25:22 +01:00 |
OlivierDehaene
|
427d7cc444
|
feat(server): Support AutoModelForSeq2SeqLM
|
2022-11-04 18:03:04 +01:00 |
OlivierDehaene
|
c5665f5c8b
|
feat(server): Support generic AutoModelForCausalLM
|
2022-11-04 14:22:47 +01:00 |
Olivier Dehaene
|
f16f2f5ae1
|
v0.1.0
|
2022-10-20 19:14:44 +02:00 |
Olivier Dehaene
|
4c693e6524
|
Refactored gRPC interface
Added validation logic
|
2022-10-11 16:50:54 +02:00 |
Olivier Dehaene
|
295831a481
|
Init
|
2022-10-08 12:30:12 +02:00 |