Default Branch

23bc38b10d · fix: include add_special_tokens in kserve request (#2859) · Updated 2024-12-19 14:55:17 -07:00

Branches

462dcfea85 · misc(backend): attempt to run the tests? · Updated 2024-12-21 04:17:32 -07:00 · 29 behind, 89 ahead

575d97339c · fix: create new idefic3 file, simplify logic and adjust llama weight loading · Updated 2024-12-20 17:27:29 -07:00 · 3 behind, 11 ahead

08b39d1ae2 · Fix `docker run` in `README.md` · Updated 2024-12-20 03:20:51 -07:00 · 0 behind, 1 ahead

fa14d71ac8 · add fp8 kv cache for rocm · Updated 2024-12-18 07:55:53 -07:00 · 2 behind, 1 ahead

8936a0379d · Merge branch 'main' into flash_decoding_rocm · Updated 2024-12-18 05:44:51 -07:00 · 2 behind, 7 ahead

f8771d0a83 · nm changes · Updated 2024-12-18 05:15:28 -07:00 · 2 behind, 8 ahead

93a2413ba6 · fix: adjust trtllm looper for video chunk enum · Updated 2024-12-17 18:41:27 -07:00 · 3 behind, 51 ahead

f89bdb72c8 · Fix runtime error when Qwen2-VL was prompted with multiple images · Updated 2024-12-16 14:15:43 -07:00 · 4 behind, 2 ahead

0ac61165fe · misc(llamacpp): fix typo · Updated 2024-12-13 09:13:29 -07:00 · 7 behind, 2 ahead

1fa9ca2f16 · add fix · Updated 2024-12-13 09:10:00 -07:00 · 8 behind, 1 ahead

182ffaf064 · misc: use return Ok(()) · Updated 2024-12-12 08:04:05 -07:00 · 72 behind, 92 ahead

1ca37d3353 · misc(ci): let's use the correct way to invoke sccache · Updated 2024-12-11 14:18:54 -07:00 · 15 behind, 15 ahead

bb9095aae3 · Updating lock. · Updated 2024-12-11 13:12:49 -07:00 · 13 behind, 2 ahead

b653605e54 · feat(trtllm): fix logits retrieval · Updated 2024-12-10 15:28:13 -07:00 · 29 behind, 30 ahead

a3049f102e · fix: address image resize and rebase changes · Updated 2024-12-09 14:26:14 -07:00 · 17 behind, 3 ahead

8f326c9791 · Fixing lockfile. · Updated 2024-12-09 13:20:59 -07:00 · 18 behind, 2 ahead

600d7e6ece · Update server/text_generation_server/adapters/lora.py · Updated 2024-12-01 22:02:02 -07:00 · 36 behind, 2 ahead

63b8c59d9f · Add `poetry-plugin-export` and fix indentation · Updated 2024-11-29 06:16:01 -07:00 · 35 behind, 5 ahead

d2ed52f531 · v2.4.1 · Updated 2024-11-22 10:28:39 -07:00 · 41 behind, 1 ahead

53b6f6e604 · Apply suggestions from code review · Updated 2024-11-18 04:28:07 -07:00 · 63 behind, 8 ahead