OlivierDehaene
|
a3b7db932f
|
fix(python-client): relax dependencies (#129)
|
2023-03-16 12:57:07 +01:00 |
OlivierDehaene
|
b49dbf2d88
|
fix(server): use server tokenizer as gt (#128)
|
2023-03-16 12:12:26 +01:00 |
OlivierDehaene
|
8ad60b752f
|
fix(server): add position ids to neox (#126)
|
2023-03-15 13:12:49 +01:00 |
OlivierDehaene
|
cbd36aa4d1
|
fix(server): revert gpt-neox optims (#123)
|
2023-03-13 22:57:08 +01:00 |
OlivierDehaene
|
6860ce9c67
|
feat: add OpenAssistant/oasst-sft-1-pythia-12b to the list of supported models (#122)
|
2023-03-13 20:42:10 +01:00 |
OlivierDehaene
|
411d6247f4
|
v0.4.0 (#119)
|
2023-03-09 16:07:01 +01:00 |
OlivierDehaene
|
d8dc8f1b0c
|
feat(python-client): add new parameters (#118)
|
2023-03-09 16:05:33 +01:00 |
OlivierDehaene
|
55bd4fed7d
|
feat(router): add best_of parameter (#117)
|
2023-03-09 15:30:54 +01:00 |
OlivierDehaene
|
e8bfe199ba
|
feat(router): support left truncation (#115)
closes #111
|
2023-03-09 13:10:30 +01:00 |
OlivierDehaene
|
c0795de2f2
|
fix(server): do not warp prefill logits (#116)
|
2023-03-09 13:00:10 +01:00 |
OlivierDehaene
|
1a2d68250a
|
feat: support typical sampling (#114)
closes #112
|
2023-03-09 11:33:57 +01:00 |
OlivierDehaene
|
941cd42e0c
|
fix(server): fix index out of range for watermarking (#110)
|
2023-03-08 18:29:08 +01:00 |
OlivierDehaene
|
2c5df5d2af
|
fix(python-client): stream not set on the sync client (#109)
|
2023-03-08 16:48:16 +01:00 |
OlivierDehaene
|
5fd2dcb513
|
feat(launcher): default num_shard to CUDA_VISIBLE_DEVICES if possible (#108)
|
2023-03-08 13:53:41 +01:00 |
OlivierDehaene
|
0ac38d336a
|
feat(launcher): allow parsing num_shard from CUDA_VISIBLE_DEVICES (#107)
|
2023-03-08 11:06:59 +01:00 |
OlivierDehaene
|
b1485e18c5
|
fix(server): fix galactica batch (#106)
closes #105
|
2023-03-07 20:05:21 +01:00 |
OlivierDehaene
|
3fef90d50f
|
feat(clients): Python client (#103)
|
2023-03-07 18:52:22 +01:00 |
OlivierDehaene
|
0e9ed1a8c2
|
feat: add supported models (#102)
|
2023-03-07 12:55:05 +01:00 |
OlivierDehaene
|
cd5961b5da
|
feat: allow local models (#101)
closes #99
|
2023-03-06 14:39:36 +01:00 |
OlivierDehaene
|
9b205d33cc
|
fix(server): fix generate_stream by forcing tokens to be decoded correctly (#100)
|
2023-03-06 13:22:58 +01:00 |
OlivierDehaene
|
1c19b0934e
|
v0.3.2 (#97)
|
2023-03-03 18:42:20 +01:00 |
OlivierDehaene
|
0b6807caa4
|
feat(server): fix transformers commit (#96)
|
2023-03-03 17:56:27 +01:00 |
OlivierDehaene
|
240c4187fd
|
fix(launcher): add router parameters to launcher (#95)
|
2023-03-03 16:01:25 +01:00 |
OlivierDehaene
|
e3ded361b2
|
feat(ci): improve CI speed (#94)
|
2023-03-03 15:07:27 +01:00 |
OlivierDehaene
|
2d39f199ae
|
feat(server): update to hf_transfer==0.1.2 (#93)
|
2023-03-03 11:26:27 +01:00 |
OlivierDehaene
|
9b8ea6a6c7
|
feat(server): add logits watermark (#90)
|
2023-03-02 12:30:41 +01:00 |
OlivierDehaene
|
f874c47831
|
feat(router): add api-inference headers (#91)
|
2023-03-02 11:41:51 +01:00 |
OlivierDehaene
|
4e685d907e
|
feat(router): ask hf.co for pipelinetag to decide on compat_return_full_text (#89)
|
2023-02-28 10:19:32 +01:00 |
OlivierDehaene
|
21340f24ba
|
feat(router): add legacy route for api-inference support (#88)
|
2023-02-27 14:56:58 +01:00 |
OlivierDehaene
|
65e2f1624e
|
fix(server): fix token_is_special (#87)
|
2023-02-24 17:20:00 +01:00 |
OlivierDehaene
|
3b03c4ea18
|
fix(docs): fix openapi schema (#86)
|
2023-02-24 15:59:49 +01:00 |
OlivierDehaene
|
0ac184ce77
|
feat(server): add special token bool (#85)
|
2023-02-24 15:55:57 +01:00 |
OlivierDehaene
|
4b1c9720c0
|
v0.3.1 (#84)
|
2023-02-24 13:27:41 +01:00 |
OlivierDehaene
|
44ce098c10
|
feat(server): pre-allocate max attention mask (#75)
|
2023-02-24 12:49:21 +01:00 |
OlivierDehaene
|
78063c0569
|
fix(server): remove position_ids from galactica forward (#82)
closes #80
|
2023-02-20 19:28:57 +01:00 |
OlivierDehaene
|
17bc841b1b
|
feat(server): enable hf-transfer (#76)
|
2023-02-18 14:04:11 +01:00 |
OlivierDehaene
|
6796d38c6d
|
feat(router): add cors allow origin options (#73)
|
2023-02-17 18:22:00 +01:00 |
OlivierDehaene
|
c720555adc
|
v0.3.0 (#72)
|
2023-02-16 17:28:29 +01:00 |
OlivierDehaene
|
439fcaf810
|
feat(router): add prometheus metrics scrape endpoint (#71)
|
2023-02-16 17:18:53 +01:00 |
OlivierDehaene
|
7b3d460d21
|
fix(launcher): copy current env vars to subprocesses (#70)
closes #69
|
2023-02-16 11:20:23 +01:00 |
OlivierDehaene
|
5437d49beb
|
feat(router): add max_total_tokens and empty_input validation (#68)
closes #65
|
2023-02-15 21:56:59 +01:00 |
OlivierDehaene
|
68455353f5
|
feat(launcher): add disable_custom_kernels arg (#67)
|
2023-02-15 16:23:45 +01:00 |
OlivierDehaene
|
c5a4a1faf3
|
feat(server): improve download logging (#66)
|
2023-02-15 16:11:32 +01:00 |
OlivierDehaene
|
0fbc691946
|
feat: add safetensors conversion (#63)
|
2023-02-14 13:02:16 +01:00 |
OlivierDehaene
|
9af454142a
|
feat: add distributed tracing (#62)
|
2023-02-13 13:02:45 +01:00 |
Yannic Kilcher
|
e520d5b349
|
fixed SSE naming (#61)
https://en.wikipedia.org/wiki/Server-sent_events
|
2023-02-08 22:30:11 +01:00 |
OlivierDehaene
|
1ad3250b89
|
fix(docker): increase shm size (#60)
|
2023-02-08 17:53:33 +01:00 |
OlivierDehaene
|
c503a639b1
|
feat(server): support t5 (#59)
|
2023-02-07 18:25:17 +01:00 |
OlivierDehaene
|
2fe5e1b30e
|
V0.2.1 (#58)
|
2023-02-07 15:40:25 +01:00 |
OlivierDehaene
|
4acc42a605
|
fix(server): better handling of inference mode (#57)
|
2023-02-07 15:38:22 +01:00 |