Latest commit: feat: add ruff and resolve issue (#2262), 2024-07-26 10:29:09 -04:00

| File                | Last commit message                                                      | Date                       |
|---------------------|--------------------------------------------------------------------------|----------------------------|
| __init__.py         | feat(server): Add native support for PEFT Lora models (#762)             | 2023-08-03 17:22:45 +02:00 |
| adapter.py          | feat: prefill chunking (#2600)                                           | 2024-10-16 12:49:33 +02:00 |
| chunks.py           | server: use chunked inputs                                               | 2024-06-07 08:09:04 +02:00 |
| convert.py          | Force weights_only (before fully breaking pickle files anyway). (#1710)  | 2024-04-05 19:23:57 +02:00 |
| dist.py             | feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248)      | 2024-07-20 19:02:04 +02:00 |
| hub.py              | Micro cleanup. (#2555)                                                   | 2024-09-24 11:19:24 +02:00 |
| import_utils.py     | feat: enable pytorch xpu support for non-attention models (#2561)        | 2024-10-14 18:28:49 +02:00 |
| log.py              | feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248)      | 2024-07-20 19:02:04 +02:00 |
| logits_process.py   | Upgrading our deps. (#2750)                                              | 2024-11-15 14:03:27 +01:00 |
| peft.py             | feat: add ruff and resolve issue (#2262)                                 | 2024-07-26 10:29:09 -04:00 |
| prefill_chunking.py | feat: prefill chunking (#2600)                                           | 2024-10-16 12:49:33 +02:00 |
| quantization.py     | Add initial support for compressed-tensors checkpoints (#2732)           | 2024-11-10 13:54:07 +01:00 |
| segments.py         | feat: prefill chunking (#2600)                                           | 2024-10-16 12:49:33 +02:00 |
| sgmv.py             | fix: allocate tmp based on sgmv kernel if available (#2345)              | 2024-08-12 17:24:32 +02:00 |
| speculate.py        | chore: formatting                                                        | 2023-12-11 14:49:52 +01:00 |
| tokens.py           | feat: add ruff and resolve issue (#2262)                                 | 2024-07-26 10:29:09 -04:00 |
| watermark.py        | Fixing watermark. (#851)                                                 | 2023-08-16 07:17:26 +02:00 |
| weights.py          | Add support for FP8 KV cache scales (#2628)                              | 2024-10-24 16:36:18 +02:00 |