Default Branch

88702d8763 · Fixing CI. (#1748) · Updated 2024-04-15 10:47:36 -06:00

Branches

81fa53f37b · Fix tests. · Updated 2024-02-08 08:25:34 -07:00 · 84 behind · 4 ahead

e10530d4f3 · update to peft 0.8.2 · Updated 2024-02-06 12:41:15 -07:00 · 84 behind · 1 ahead

4c6c39e491 · Upgrading axum=0.7 · Updated 2024-02-02 04:15:32 -07:00 · 86 behind · 1 ahead

c10c4a023a · Github magic? · Updated 2024-02-01 09:13:37 -07:00 · 86 behind · 7 ahead

1e03b61b5c · Revert "Modify default for max_new_tokens in python client (#1336)" · Updated 2024-02-01 07:36:10 -07:00 · 88 behind · 0 ahead · Included

b71557d956 · add request id in logs · Updated 2024-02-01 05:47:04 -07:00 · 90 behind · 1 ahead

94d243b3d7 · Freshen up the README. · Updated 2024-02-01 02:23:37 -07:00 · 90 behind · 0 ahead · Included

a3c45da0a4 · Improvments within mamba. · Updated 2024-01-31 03:28:58 -07:00 · 115 behind · 7 ahead

871e5e7338 · fix rotary dim · Updated 2024-01-29 20:53:08 -07:00 · 97 behind · 2 ahead

fc86dba781 · fix: ensure latest requirements exported · Updated 2024-01-26 09:26:34 -07:00 · 100 behind · 2 ahead

45978034c9 · Pre-emptive on sealion. · Updated 2024-01-26 02:15:31 -07:00 · 108 behind · 1 ahead

31b23f98ff · feat: boilerplate phi2 model integration · Updated 2024-01-10 07:42:26 -07:00 · 122 behind · 1 ahead

65db02f192 · fix: use TORCH_NCCL_AVOID_RECORD_STREAMS=1 · Updated 2024-01-09 09:59:16 -07:00 · 122 behind · 1 ahead

5b340a5ffd · Dump work. · Updated 2023-11-30 15:05:51 -07:00 · 155 behind · 8 ahead

8a7771a33c · Use cuda devel instead · Updated 2023-10-11 02:40:06 -06:00 · 182 behind · 2 ahead

56de96abe9 · missing arg · Updated 2023-10-05 07:14:17 -06:00 · 185 behind · 3 ahead
dev

c35f39cf83 · Add AWQ quantization inference support (#1019) · Updated 2023-09-25 01:58:02 -06:00 · 217 behind · 1 ahead

33958e0989 · Start. · Updated 2023-09-11 12:25:49 -06:00 · 222 behind · 1 ahead

4bac76241d · Update server.rs · Updated 2023-08-21 02:10:57 -06:00 · 246 behind · 5 ahead

cf43528538 · remove stream since its a separate PR · Updated 2023-08-18 04:57:36 -06:00 · 248 behind · 6 ahead