Default Branch

3c54488638 · nix: downgrade to outlines 0.1.3 (#2768) · Updated 2024-11-21 05:00:26 -07:00

Branches

56e3b65c46 · Add a README section about using Nix · Updated 2024-11-21 02:00:26 -07:00 · 1 behind, 1 ahead

5335bf973b · feat(backend): multistream inference on CPU · Updated 2024-11-20 16:03:05 -07:00 · 23 behind, 65 ahead

74a8a820ad · Use FP8 KV cache when specified by compressed-tensors · Updated 2024-11-20 07:25:50 -07:00 · 5 behind, 1 ahead

fa577c9be2 · fix: remove continue_final_message chat request param · Updated 2024-11-19 14:24:18 -07:00 · 24 behind, 4 ahead

d2581ed606 · feat(backend): remove all the logs from hardware.hpp · Updated 2024-11-18 16:19:22 -07:00 · 15 behind, 5 ahead

0fd2ab3e89 · fix: remove unused deps and imports · Updated 2024-11-18 14:48:09 -07:00 · 10 behind, 12 ahead

53b6f6e604 · Apply suggestions from code review · Updated 2024-11-18 04:28:07 -07:00 · 14 behind, 8 ahead

627862c11a · Updated the flops calculation (checked with fvcore). · Updated 2024-11-11 06:31:32 -07:00 · 29 behind, 10 ahead

6297f1769f · feat: add payload limit · Updated 2024-11-05 08:38:21 -07:00 · 25 behind, 1 ahead

a604bfe450 · fix: run pre commit lints · Updated 2024-11-01 10:11:57 -06:00 · 33 behind, 2 ahead

3bb78a8266 · misc(deps): update ompi from 4.1.6 to 4.1.7rc1 to avoid strange deadlock · Updated 2024-10-28 10:24:08 -06:00 · 61 behind, 80 ahead

7bc2c97bd9 · Check if allowed tokens is None (#2694) · Updated 2024-10-27 22:10:55 -06:00 · 39 behind, 3 ahead

0a655a0ab5 · v2.4.0 · Updated 2024-10-25 15:12:49 -06:00 · 43 behind, 1 ahead

e3db525917 · Fix integration mt0 (transformers update). · Updated 2024-10-24 03:54:11 -06:00 · 60 behind, 12 ahead

fe8d55dba9 · Clean both threads. · Updated 2024-10-21 06:49:07 -06:00 · 60 behind, 2 ahead

b3917ff695 · fix: add limit to internal stream function too · Updated 2024-10-15 09:14:04 -06:00 · 71 behind, 2 ahead

c9e0f36dbc · Machete WIP · Updated 2024-10-14 07:46:00 -06:00 · 82 behind, 1 ahead

99b1cf5948 · fix: rerun linter · Updated 2024-10-09 14:10:31 -06:00 · 83 behind, 6 ahead

130f9d16b5 · fix: rerun black lint · Updated 2024-10-09 12:44:41 -06:00 · 84 behind, 3 ahead

e618ce3ada · Fix: make `moe_kernels` imports conditional · Updated 2024-10-08 05:05:28 -06:00 · 86 behind, 1 ahead