6 Commits

Author SHA1 Message Date
Daniël de Kok ce85efa968
Move to moe-kernels package and switch to common MoE layer (#2511)
* Move to moe-kernels package and switch to common MoE layer

This change introduces the new `moe-kernels` package:

- Add `moe-kernels` as a dependency.
- Introduce a `SparseMoELayer` module that can be used by MoE
  models.
- Port over Mixtral and Deepseek.

* Make `cargo check` pass

* Update runner
2024-09-17 18:08:58 +02:00
Daniël de Kok de2cdeca53
nix: add punica-kernels (#2477)
Enables LoRA support.
2024-09-02 11:31:36 +02:00
Daniël de Kok 4e821c003a
nix: build Torch against MKL and various other improvements (#2469)
Updates tgi-nix input:

- Move Torch closer to upstream by building against MKL.
- Remove compute capability 8.7 from Torch (Jetson).
- Sync nixpkgs compute capabilities with Torch (avoids
  compiling too many capabilities for MAGMA).
- Use nixpkgs configuration passed through by `tgi-nix`.
2024-08-29 16:25:25 +02:00
Daniël de Kok 358ceb67dd
nix: add awq-inference-engine as server dependency (#2442)
2024-08-21 22:20:03 +02:00
Nicolas Patry 310778e02a
Adding eetq to flake. (#2438)
2024-08-21 09:06:33 +02:00
Daniël de Kok f5f11b797e
nix: add pure server to flake, add both pure and impure devshells (#2430)
* nix: pure server and support both pure and impure devShells

* nix: remove unused poetry2nix input

It is not wired up and we now have a pure server.

* nix: add ipdb to impure devshell
2024-08-20 22:07:33 +02:00