OlivierDehaene
|
7b870e1e18
|
feat(router): use background task to manage request queue (#52)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2023-02-02 14:59:27 +01:00 |
OlivierDehaene
|
775115e3a5
|
feat(server): allow the server to use a local weight cache (#49)
|
2023-02-01 16:22:10 +01:00 |
OlivierDehaene
|
f830706b21
|
feat(server): Support GPT-Neox (#39)
|
2023-01-31 18:53:56 +01:00 |
OlivierDehaene
|
017a2a8c2f
|
feat: Add token streaming using ServerSideEvents support (#41)
|
2023-01-31 17:04:00 +01:00 |
OlivierDehaene
|
4f9ac67cfa
|
Revert "feat: Add token streaming using ServerSideEvents support" (#40)
Reverts huggingface/text-generation-inference#36
|
2023-01-31 14:21:51 +01:00 |
OlivierDehaene
|
7fbfbb0dc5
|
feat: Add token streaming using ServerSideEvents support (#36)
Add token streaming using ServerSideEvents (SSE).
The signature of the SSE events is:
```rust
struct Details {
finish_reason: String,
generated_tokens: u32,
seed: Option<u64>,
}
struct StreamResponse {
token: Token,
generated_text: Option<String>,
details: Option<Details>,
}
struct ErrorResponse {
error: String,
}
```
|
2023-01-31 11:49:43 +01:00 |
OlivierDehaene
|
15511edc01
|
feat(server): Support SantaCoder (#26)
|
2023-01-20 12:24:39 +01:00 |
Nick Hill
|
e6d3eb5d5d
|
fix(server): Minor refactorization using new_zeros (#24)
- Fix some type hints, in particular base tokenizer class
- Make use of `tensor.new_zero/empty` methods
- Simplify env var string parsing in launcher
|
2023-01-17 09:10:22 +01:00 |
OlivierDehaene
|
fcc2c5fcbf
|
feat(launcher): Log server stdout (#19)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2023-01-05 12:01:23 +01:00 |
OlivierDehaene
|
611e21cb13
|
fix(server): Fix stop sequences (#11)
|
2022-12-16 16:03:39 +01:00 |
OlivierDehaene
|
3e2e6240b8
|
feat(launcher): Add integration tests (#9)
|
2022-12-16 11:29:36 +01:00 |
OlivierDehaene
|
4236e41b0d
|
feat(server): Improved doc
|
2022-11-07 12:53:56 +01:00 |
OlivierDehaene
|
cea6051eff
|
feat(launcher): Pass CUDA_VISIBLE_DEVICES to the shard
|
2022-11-04 18:31:08 +01:00 |
OlivierDehaene
|
b3b7ea0d74
|
feat: Use json formatter by default in docker image
|
2022-11-02 17:29:56 +01:00 |
OlivierDehaene
|
3cf6368c77
|
feat(server): Support all AutoModelForCausalLM on a best effort basis
|
2022-10-28 19:24:00 +02:00 |
OlivierDehaene
|
09674e6df9
|
feat(server): Support bitsandbytes
|
2022-10-27 14:25:29 +02:00 |
Nicolas Patry
|
c8ce9b2515
|
feat(server): Use safetensors
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
|
2022-10-22 20:00:15 +02:00 |
OlivierDehaene
|
c837893370
|
feat(router): Add max_waiting_tokens
|
2022-10-21 16:40:05 +02:00 |
Olivier Dehaene
|
f16f2f5ae1
|
v0.1.0
|
2022-10-20 19:14:44 +02:00 |