awq/quantize | feat: format code (#1070) | 2023-09-27 12:22:09 +02:00 |
gptq | GPTQ support on ROCm (#1489) | 2024-01-26 16:27:44 +01:00 |
convert.py | fit for baichuan models (#981) | 2023-09-08 16:51:34 +02:00 |
dist.py | feat: add cuda memory fraction (#659) | 2023-07-24 11:43:58 +02:00 |
hub.py | Fix local load for peft (#1373) | 2023-12-21 17:29:23 +01:00 |
import_utils.py | Add RoCm support (#1243) | 2023-11-27 14:08:12 +01:00 |
layers.py | v1.4.0 (#1494) | 2024-01-26 19:04:57 +01:00 |
log.py | v1.3.4 | 2023-12-22 15:46:04 +01:00 |
medusa.py | chore: formatting | 2023-12-11 14:49:52 +01:00 |
paged_attention.py | chore: formatting | 2023-12-11 14:49:52 +01:00 |
peft.py | fix: fix local loading for .bin models (#1419) | 2024-01-09 15:21:00 +01:00 |
speculate.py | chore: formatting | 2023-12-11 14:49:52 +01:00 |
tokens.py | feat(server): add frequency penalty (#1541) | 2024-02-08 18:41:25 +01:00 |
watermark.py | Fixing watermark. (#851) | 2023-08-16 07:17:26 +02:00 |
weights.py | Fixing non divisible embeddings. (#1476) | 2024-01-24 13:08:41 +01:00 |