fxmarty
dadfff621e
update
2024-06-11 11:25:14 +00:00
fxmarty
73b067d193
skip exl2 tests on rocm
2024-06-11 09:29:08 +00:00
fxmarty
b452620c04
fix gptq tests, LLMM1 matrix bound
2024-06-11 07:27:14 +00:00
fxmarty
d3c7f63416
Merge branch 'main' into amd-ci-fx
2024-06-10 15:10:04 +02:00
fxmarty
de6f2cd08d
disable marlin tests on rocm/xpu
2024-06-10 13:06:11 +00:00
Daniël de Kok
85dfc39222
Add Phi-3 medium support (#2039)
Add support for Phi-3-medium
The main difference between the medium and mini models is that medium
uses grouped query attention with a packed QKV matrix. This change adds
support for GQA with packed matrixes to `Weights.get_weights_col_packed`
and uses it for Phi-3. This also allows us to remove the custom
implementation of GQA from dbrx attention loading.
2024-06-10 09:22:29 +02:00
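To illustrate the packed-QKV point in the Phi-3 medium commit above: with grouped query attention the Q block is larger than the K and V blocks, so a packed matrix cannot be sharded as three equal thirds. The sketch below is not the code from that commit. `packed_shard_rows` is a hypothetical helper, and the head counts are illustrative assumptions. It only shows the row ranges a tensor-parallel rank would take when each block is sharded separately.

```python
def packed_shard_rows(block_sizes, world_size, rank):
    """Return the (start, stop) row range one rank takes from each packed block.

    Hypothetical helper for illustration; not part of any library API.
    """
    ranges = []
    offset = 0  # first row of the current block inside the packed matrix
    for size in block_sizes:
        if size % world_size != 0:
            raise ValueError(f"block of {size} rows not divisible by {world_size}")
        shard = size // world_size
        start = offset + rank * shard
        ranges.append((start, start + shard))
        offset += size
    return ranges


# Illustrative (assumed) shapes: more query heads than key/value heads.
num_heads, num_kv_heads, head_dim = 40, 10, 128
block_sizes = [
    num_heads * head_dim,     # Q rows
    num_kv_heads * head_dim,  # K rows
    num_kv_heads * head_dim,  # V rows
]

rows = packed_shard_rows(block_sizes, world_size=2, rank=1)
print(rows)
```

Each rank then concatenates its three slices, so it holds a matching subset of query heads and key/value heads.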
fxmarty
9b3674d903
ROCm and sliding windows fixes (#2033)
* update vllm commit & fix models using sliding window
* update
* update commit
* fix bug where tunableop is bound to cuda graph even when cuda graphs are disabled
* enable tunableop by default
* fix sliding window
* address review
* dead code
* precise comment
* is it flaky?
2024-06-10 15:09:50 +08:00
Nicolas Patry
41699e9bbf
.
2024-06-08 22:16:37 +02:00
Nicolas Patry
eec6c3241b
.
2024-06-08 21:55:27 +02:00
Nicolas Patry
0ced5fac2d
Fix.
2024-06-08 08:58:05 +02:00
Nicolas Patry
452d442ef2
We need tailscale.
2024-06-08 08:46:55 +02:00
Nicolas Patry
e62c51d140
Here we go again.
2024-06-08 08:41:40 +02:00
Nicolas Patry
8be9c197e5
Is this it ?
2024-06-08 07:54:00 +02:00
Nicolas Patry
d9f704a1b3
Are we done ?
2024-06-08 07:53:21 +02:00
Nicolas Patry
909e6569d1
.
2024-06-08 07:40:08 +02:00
Nicolas Patry
fa3e811672
No fromJSON.
2024-06-07 23:22:48 +02:00
Nicolas Patry
98d383062a
Extra spaces?
2024-06-07 23:15:58 +02:00
Nicolas Patry
66e59831f2
.
2024-06-07 23:00:27 +02:00
Nicolas Patry
741ab87fba
fromJSON
2024-06-07 22:58:28 +02:00
Nicolas Patry
fc4404d9d2
.
2024-06-07 22:45:57 +02:00
Nicolas Patry
65b2efc585
.
2024-06-07 22:38:06 +02:00
Nicolas Patry
eda299b84f
.
2024-06-07 20:18:57 +02:00
Nicolas Patry
e79c83d7ba
Attempt #727.
2024-06-07 20:11:17 +02:00
Nicolas Patry
c6fa9547a2
Test.
2024-06-07 19:58:56 +02:00
Nicolas Patry
a045ead6eb
.
2024-06-07 19:52:14 +02:00
Nicolas Patry
5e769ce1e0
?
2024-06-07 19:46:34 +02:00
Nicolas Patry
87df3d5603
?
2024-06-07 17:12:17 +02:00
Nicolas Patry
19f6327bd2
esac. Great idea dev of the past.
2024-06-07 16:14:24 +02:00
Nicolas Patry
2a314fa0dd
Bash in bash.
2024-06-07 16:09:38 +02:00
Nicolas Patry
b10ba9205c
...
2024-06-07 16:05:11 +02:00
Nicolas Patry
1f4248944c
Come on GH, dash, underscore, who cares at this point.
2024-06-07 16:03:05 +02:00
Nicolas Patry
cc7c2fd90e
runs on.
2024-06-07 16:01:59 +02:00
Nicolas Patry
1e759f9da6
Wat?
2024-06-07 16:00:40 +02:00
Nicolas Patry
078fb55109
Abbé Faria?
2024-06-07 15:58:23 +02:00
Nicolas Patry
8205962950
Ahah, I see an exit.
2024-06-07 15:56:52 +02:00
Nicolas Patry
043de74dcd
**Feigns death**
2024-06-07 15:52:35 +02:00
Nicolas Patry
81ddb9d173
Please let me out !
2024-06-07 15:49:31 +02:00
Nicolas Patry
aea77a8ab3
Banana.
2024-06-07 15:44:51 +02:00
Nicolas Patry
e6a4dbe7f5
I'm certainly not a monkey.
2024-06-07 15:43:58 +02:00
Nicolas Patry
a759e2e7c5
Not hitting myself against the wall.
2024-06-07 15:39:37 +02:00
Nicolas Patry
8712a367dc
Flying blind feels nice.
2024-06-07 15:36:13 +02:00
Nicolas Patry
6f3117512c
Give us sanitation tools already.
2024-06-07 15:25:43 +02:00
Nicolas Patry
54e3340663
gh..
2024-06-07 15:09:27 +02:00
Nicolas Patry
11c75f3a14
I hate this.
2024-06-07 15:07:51 +02:00
Nicolas Patry
3a8e9c221e
Rename for everyone.
2024-06-07 15:03:01 +02:00
Nicolas Patry
f29371e587
Naming.
2024-06-07 14:49:48 +02:00
Nicolas Patry
3ee92eb614
?
2024-06-07 14:15:45 +02:00
Nicolas Patry
3684439a0e
Trying new split of tasks.
2024-06-07 12:03:22 +02:00
Nicolas Patry
9101b2ae4f
Fix.
2024-06-07 10:05:51 +02:00
Nicolas Patry
c73355b99c
Merge branch 'main' into ci_amd2
2024-06-07 10:04:59 +02:00