Default Branch

6cb41a80a1 · Revert "Remove AWS credentials?" · Updated 2025-01-24 06:34:17 -07:00

Branches

cad4644537 · backend(trtllm): export env variable in run mb? · Updated 2025-01-25 00:00:07 -07:00

0 behind · 7 ahead

4c8bf7f5b8 · fix: add telemetry regular pings and fix unhandled errors avoid not sending telemetry stop events. · Updated 2025-01-24 10:10:12 -07:00

0 behind · 1 ahead

bafbd06744 · Update transformers_flash_causal_lm.py · Updated 2025-01-24 07:06:50 -07:00

0 behind · 2 ahead

b70f29d729 · Bypasse perm issue. · Updated 2025-01-24 04:12:47 -07:00

2 behind · 2 ahead

9157833662 · Update to attention-kernels 0.2.0 · Updated 2025-01-24 04:08:49 -07:00

2 behind · 1 ahead

852f83de6f · fix: enable all cuda graphs and bump snapshots · Updated 2025-01-23 08:32:46 -07:00

13 behind · 7 ahead

6d335ca7ce · Remove modifications in Lock. · Updated 2025-01-22 05:37:17 -07:00

12 behind · 2 ahead

cfd22726c9 · backend(vllm): initial commit · Updated 2025-01-21 15:37:56 -07:00

13 behind · 1 ahead

16162602c2 · Add fp8 support moe models · Updated 2025-01-20 06:55:54 -07:00

18 behind · 1 ahead

17192c9a0e · fix: remove test debug params · Updated 2025-01-17 09:19:02 -07:00

47 behind · 54 ahead

b4187d6022 · Add tgi_batch_current_size and tgi_batch_current_size as response header · Updated 2025-01-17 07:48:02 -07:00

22 behind · 1 ahead

bde5f9ad82 · nix: update to PyTorch 2.5.1 · Updated 2025-01-16 23:44:21 -07:00

26 behind · 1 ahead

48067e4a0d · fmt · Updated 2025-01-13 18:23:28 -07:00

40 behind · 3 ahead

c7b2e3f100 · chore: Enable blocking feature for reqwest · Updated 2025-01-09 03:07:49 -07:00

47 behind · 2 ahead

db6a9e1232 · add ats support · Updated 2025-01-07 17:23:16 -07:00

47 behind · 2 ahead

f89bdb72c8 · Fix runtime error when Qwen2-VL was prompted with multiple images · Updated 2024-12-16 14:15:43 -07:00

51 behind · 2 ahead

1fa9ca2f16 · add fix · Updated 2024-12-13 09:10:00 -07:00

55 behind · 1 ahead

182ffaf064 · misc: use return Ok(()) · Updated 2024-12-12 08:04:05 -07:00

119 behind · 92 ahead

1ca37d3353 · misc(ci): let's use the correct way to invoke sccache · Updated 2024-12-11 14:18:54 -07:00

62 behind · 15 ahead

bb9095aae3 · Updating lock. · Updated 2024-12-11 13:12:49 -07:00

60 behind · 2 ahead