Commit Graph
- feb7806ca4
  fix(readme): Typo
  OlivierDehaene
  2022-11-14 16:22:10 +0100
- 91f5f86280
  fix(router): Fix HTTP status codes
  OlivierDehaene
  2022-11-14 14:34:15 +0100
- 6c781025ae
  feat(rust): Update to 1.65
  OlivierDehaene
  2022-11-14 13:59:56 +0100
- dccd5c2b1a
  feat(server): Clarify CausalLMBatch concatenate method
  OlivierDehaene
  2022-11-09 18:24:07 +0100
- fa43fb71be
  fix(server): Fix Transformers fork version
  OlivierDehaene
  2022-11-08 17:42:38 +0100
- 4236e41b0d
  feat(server): Improved doc
  OlivierDehaene
  2022-11-07 12:53:56 +0100
- cea6051eff
  feat(launcher): Pass CUDA_VISIBLE_DEVICES to the shard
  OlivierDehaene
  2022-11-04 18:31:08 +0100
- 427d7cc444
  feat(server): Support AutoModelForSeq2SeqLM
  OlivierDehaene
  2022-11-04 18:03:04 +0100
- c5665f5c8b
  feat(server): Support generic AutoModelForCausalLM
  OlivierDehaene
  2022-11-04 14:22:47 +0100
- 755fc0e403
  fix(models): Revert buggy support for AutoModel
  OlivierDehaene
  2022-11-03 16:07:54 +0100
- b3b7ea0d74
  feat: Use json formatter by default in docker image
  OlivierDehaene
  2022-11-02 17:29:56 +0100
- 3cf6368c77
  feat(server): Support all AutoModelForCausalLM on a best effort basis
  OlivierDehaene
  2022-10-28 19:24:00 +0200
- 09674e6df9
  feat(server): Support bitsandbytes
  OlivierDehaene
  2022-10-27 14:25:29 +0200
- beb552127a
  feat(client): Simplify sharded logic
  OlivierDehaene
  2022-10-22 23:40:05 +0200
- c8ce9b2515
  2022-10-22 20:00:15 +0200
- 75adbb3441
  feat(weights): Support safetensors
  #1
  OlivierDehaene
  2022-10-22 19:46:05 +0200
- be8827fe41
  2022-10-22 10:44:52 +0200
- 3398211873
  2022-10-21 23:15:02 +0200
- 604b18bec2
  2022-10-21 20:47:57 +0200
- 457c9038ff
  2022-10-21 18:02:04 +0200
- c837893370
  feat(router): Add max_waiting_tokens
  OlivierDehaene
  2022-10-21 16:40:05 +0200
- 895a341d06
  fix(validation): Fix error messages
  OlivierDehaene
  2022-10-21 10:59:15 +0200
- f16f2f5ae1
  v0.1.0
  Olivier Dehaene
  2022-10-18 15:19:03 +0200
- 92c1ecd008
  feat: Add arguments to CLI
  Olivier Dehaene
  2022-10-17 18:27:33 +0200
- 5e5d8766a2
  feat: Improve error handling
  Olivier Dehaene
  2022-10-17 14:59:00 +0200
- 00e6ce44b1
  Update aml deployment
  Olivier Dehaene
  2022-10-17 10:39:59 +0200
- bcb53903b8
  feat: Add AML deployment
  Olivier Dehaene
  2022-10-15 20:21:50 +0200
- bf99afe916
  feat: Docker image
  Olivier Dehaene
  2022-10-14 15:56:21 +0200
- f11965c11d
  support deepspeed
  feat/support_deepspeed
  Olivier Dehaene
  2022-10-13 11:05:44 +0200
- 39df4d9975
  Use axum
  Olivier Dehaene
  2022-10-11 18:14:39 +0200
- e86ecbac63
  ValidationError was not correctly handled
  Olivier Dehaene
  2022-10-11 16:53:40 +0200
- 4c693e6524
  Refactored gRPC interface Added validation logic
  Olivier Dehaene
  2022-10-11 16:50:54 +0200
- fa9a088467
  Add load testing
  Olivier Dehaene
  2022-10-11 10:36:51 +0200
- 1d986983d5
  fix: cleanup
  Olivier Dehaene
  2022-10-08 12:34:25 +0200
- 295831a481
  Init
  Olivier Dehaene
  2022-10-08 12:30:12 +0200