| Name | Latest commit | Date |
|---|---|---|
| gptq | fix(server): fix exllama buffers (#689) | 2023-07-24 14:25:43 +02:00 |
| __init__.py | feat(server): Rework model loading (#344) | 2023-06-08 14:51:52 +02:00 |
| convert.py | fix(server): blacklist local files (#609) | 2023-07-13 21:54:55 +02:00 |
| dist.py | feat: add cuda memory fraction (#659) | 2023-07-24 11:43:58 +02:00 |
| flash_attn.py | feat(server): flash attention v2 (#624) | 2023-07-18 16:21:18 +02:00 |
| layers.py | feat: add cuda memory fraction (#659) | 2023-07-24 11:43:58 +02:00 |
| weights.py | feat: add cuda memory fraction (#659) | 2023-07-24 11:43:58 +02:00 |