Default Branch

6ee8d6dd3b · fix: set outlines version to 0.1.3 to avoid caching serialization issue (#2766) · Updated 2024-11-20 16:09:39 -07:00

Branches

5335bf973b · feat(backend): multistream inference on CPU · Updated 2024-11-20 16:03:05 -07:00

22
65

74a8a820ad · Use FP8 KV cache when specified by compressed-tensors · Updated 2024-11-20 07:25:50 -07:00

4
1

fa577c9be2 · fix: remove continue_final_message chat request param · Updated 2024-11-19 14:24:18 -07:00

23
4

d2581ed606 · feat(backend): remove all the logs from hardware.hpp · Updated 2024-11-18 16:19:22 -07:00

14
5

0fd2ab3e89 · fix: remove unused deps and imports · Updated 2024-11-18 14:48:09 -07:00

9
12

53b6f6e604 · Apply suggestions from code review · Updated 2024-11-18 04:28:07 -07:00

13
8

627862c11a · Updated the flops calculation (checked with fvcore). · Updated 2024-11-11 06:31:32 -07:00

28
10

6297f1769f · feat: add payload limit · Updated 2024-11-05 08:38:21 -07:00

24
1

a604bfe450 · fix: run pre commit lints · Updated 2024-11-01 10:11:57 -06:00

32
2

3bb78a8266 · misc(deps): update ompi from 4.1.6 to 4.1.7rc1 to avoid strange deadlock · Updated 2024-10-28 10:24:08 -06:00

60
80

7bc2c97bd9 · Check if allowed tokens is None (#2694) · Updated 2024-10-27 22:10:55 -06:00

38
3

0a655a0ab5 · v2.4.0 · Updated 2024-10-25 15:12:49 -06:00

42
1

e3db525917 · Fix integration mt0 (transformers update). · Updated 2024-10-24 03:54:11 -06:00

59
12

fe8d55dba9 · Clean both threads. · Updated 2024-10-21 06:49:07 -06:00

59
2

b3917ff695 · fix: add limit to internal stream function too · Updated 2024-10-15 09:14:04 -06:00

70
2

c9e0f36dbc · Machete WIP · Updated 2024-10-14 07:46:00 -06:00

81
1

99b1cf5948 · fix: rerun linter · Updated 2024-10-09 14:10:31 -06:00

82
6

130f9d16b5 · fix: rerun black lint · Updated 2024-10-09 12:44:41 -06:00

83
3

e618ce3ada · Fix: make `moe_kernels` imports conditional · Updated 2024-10-08 05:05:28 -06:00

85
1

74489227e0 · Add Google Cloud in `docs/source/references/api_reference.md` · Updated 2024-10-05 08:54:17 -06:00

89
2