merges | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00
__init__.py | feat(server): Add native support for PEFT Lora models (#762) | 2023-08-03 17:22:45 +02:00
adapter.py | fix: refactor adapter weight loading and mapping (#2193) | 2024-07-24 15:32:14 -04:00
chunks.py | server: use chunked inputs | 2024-06-07 08:09:04 +02:00
convert.py | Force weights_only (before fully breaking pickle files anyway). (#1710) | 2024-04-05 19:23:57 +02:00
dist.py | feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248) | 2024-07-20 19:02:04 +02:00
hub.py | Enable multiple LoRa adapters (#2010) | 2024-06-25 14:46:27 -04:00
import_utils.py | Pr 2337 ci branch (#2379) | 2024-08-08 12:30:29 -04:00
log.py | feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248) | 2024-07-20 19:02:04 +02:00
logits_process.py | patch-error-on-invalid-grammar (#2282) | 2024-07-29 10:09:25 -04:00
peft.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00
quantization.py | Handle GPTQ-Marlin loading in `GPTQMarlinWeightLoader` (#2300) | 2024-07-31 13:08:41 +02:00
segments.py | Enable multiple LoRa adapters (#2010) | 2024-06-25 14:46:27 -04:00
sgmv.py | fix: allocate tmp based on sgmv kernel if available (#2345) | 2024-08-12 17:24:32 +02:00
speculate.py | chore: formatting | 2023-12-11 14:49:52 +01:00
tokens.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00
watermark.py | Fixing watermark. (#851) | 2023-08-16 07:17:26 +02:00
weights.py | fix(server): fix fp8 weight loading (#2268) | 2024-07-22 15:51:32 +00:00