Default Branch

780531ec77 · chore: prepare 2.4.1 release (#2773) · Updated 2024-11-22 10:26:15 -07:00

Branches

Counts show commits behind / ahead of the default branch; "Included" marks branches already merged into it.

81fa53f37b · Fix tests. · Updated 2024-02-08 08:25:34 -07:00 · 597 behind, 4 ahead
e10530d4f3 · update to peft 0.8.2 · Updated 2024-02-06 12:41:15 -07:00 · 597 behind, 1 ahead
c10c4a023a · Github magic? · Updated 2024-02-01 09:13:37 -07:00 · 599 behind, 7 ahead
1e03b61b5c · Revert "Modify default for max_new_tokens in python client (#1336)" · Updated 2024-02-01 07:36:10 -07:00 · 601 behind, 0 ahead · Included
b71557d956 · add request id in logs · Updated 2024-02-01 05:47:04 -07:00 · 603 behind, 1 ahead
94d243b3d7 · Freshen up the README. · Updated 2024-02-01 02:23:37 -07:00 · 603 behind, 0 ahead · Included
a3c45da0a4 · Improvments within mamba. · Updated 2024-01-31 03:28:58 -07:00 · 628 behind, 7 ahead
871e5e7338 · fix rotary dim · Updated 2024-01-29 20:53:08 -07:00 · 610 behind, 2 ahead
fc86dba781 · fix: ensure latest requirements exported · Updated 2024-01-26 09:26:34 -07:00 · 613 behind, 2 ahead
45978034c9 · Pre-emptive on sealion. · Updated 2024-01-26 02:15:31 -07:00 · 621 behind, 1 ahead
31b23f98ff · feat: boilerplate phi2 model integration · Updated 2024-01-10 07:42:26 -07:00 · 635 behind, 1 ahead
65db02f192 · fix: use TORCH_NCCL_AVOID_RECORD_STREAMS=1 · Updated 2024-01-09 09:59:16 -07:00 · 635 behind, 1 ahead
5b340a5ffd · Dump work. · Updated 2023-11-30 15:05:51 -07:00 · 668 behind, 8 ahead
8a7771a33c · Use cuda devel instead · Updated 2023-10-11 02:40:06 -06:00 · 695 behind, 2 ahead
56de96abe9 · missing arg · Updated 2023-10-05 07:14:17 -06:00 · 698 behind, 3 ahead · dev
c35f39cf83 · Add AWQ quantization inference support (#1019) · Updated 2023-09-25 01:58:02 -06:00 · 730 behind, 1 ahead
33958e0989 · Start. · Updated 2023-09-11 12:25:49 -06:00 · 735 behind, 1 ahead
4bac76241d · Update server.rs · Updated 2023-08-21 02:10:57 -06:00 · 759 behind, 5 ahead
cf43528538 · remove stream since its a separate PR · Updated 2023-08-18 04:57:36 -06:00 · 761 behind, 6 ahead
89a4e723d2 · Attempting to fix torch leak. · Updated 2023-08-12 01:06:49 -06:00 · 774 behind, 1 ahead