Merve Noyan
e9ae678699
Quantization docs (#911)
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-09-12 15:52:46 +02:00
Merve Noyan
1f69fb9ed4
Tensor Parallelism conceptual guide (#886)
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-09-12 12:11:20 +02:00
Merve Noyan
30a93a0dec
Paged Attention Conceptual Guide (#901)
2023-09-08 14:18:42 +02:00
Merve Noyan
af1ed38f39
Safetensors conceptual guide (#905)
IDK what else to add in this guide. I looked for relevant code in the TGI
codebase and saw that it's used in quantization as well (maybe I could
add that?)
2023-09-07 16:22:06 +02:00
Omar Sanseviero
a9fdfb2464
docs: Remove redundant content from stream guide (#884)
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
2023-09-06 18:42:42 +02:00
Merve Noyan
f260eb72f9
docs: Flash Attention Conceptual Guide (#892)
PR for conceptual guide on flash attention. I will add more info unless
I'm told otherwise.
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2023-09-06 15:36:49 +02:00
Julien Bouquillon
3ed4c0f33f
docs: typo in streaming.js (#971)
Looks like an error
2023-09-06 14:57:59 +02:00
Omar Sanseviero
bfa070611d
Add streaming guide (#858)
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
2023-08-18 13:27:08 +02:00