Morgan Funtowicz
|
31d9254776
|
feat(backend): remove static from inner_fw visitor as it leads to invalid memory locations
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
7b0a56f40f
|
feat(backend): fix memory leaking on llama_sampler when the decode ends
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
86a2ae6ba2
|
chore: unused variables
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
2cdfed94d9
|
feat(backend): correctly link to shared fmt and spdlog instead of static
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
bd8f0f15e1
|
feat(backend): fix invalid reference to ctx instead of context in release build
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
3e82f14f57
|
feat(backend): somewhat generates the final infer response
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
b50dcddbb8
|
feat(backend): avoid dropping the boxed stream at the end of the callback
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
612f2f939f
|
feat(backend): bind incoming request to the server
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
d4aee42fd8
|
feat(backend): add logit parameter in the callback fn
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
f39edc72ff
|
feat(backend): add mapping for ignore_eos_token stopping criteria
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
3af2c6837c
|
misc(offline): match rework
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
d52b4c4978
|
feat(backend): full rework of the backend internal to safer c++
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
6a5f6b0755
|
misc(offline): update offline tester
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
b98c635781
|
feat(backend): entirely rewrite backend
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
611590440d
|
misc(offline): expose more parameters for generate
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
dbc5b7a0f7
|
misc(offline): link correctly
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
0c1dd0ed2b
|
feat(llamacpp): wip explosion
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
a316c53255
|
feat(llamacpp): expose number of threads for the backend when constructing the model
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
179309b364
|
misc(build): refactor build type detection in cmake
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
f0859c247f
|
misc(build): handle different lib destination folder lib/lib64
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
e4d803c94e
|
feat(backend): build and link through build.rs
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
355d8a55b4
|
feat(backend): wip Rust binding
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
f9c248657d
|
chore(backend): minor formatting
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
37faeb34b2
|
feat(backend): expose frequency and repetition penalties
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
d4b5be10f9
|
feat(backend): minor refactor
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
92bb113653
|
feat(backend): use llama_token as TokenId type
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
45d5a6a8c5
|
feat(backend): add some initial decoding steps
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
098c66920d
|
feat(backend): tell cmake to build llama-common and link to it
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
0911076320
|
feat(backend): correctly load llama.cpp model from llama api and not gpt2
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
05ad684676
|
feat(llamacpp): enable cuda
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
fa89d1e613
|
misc(cmake): wut
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
e4432d36b1
|
misc(cmake): add parameter to build specific cuda arch
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
52d57dca79
|
feat(llamacpp): initial end2end build
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
7d1f8a2bd6
|
feat(llamacpp): correctly handle CMAKE_BUILD_TYPE for spdlog macros
|
2024-11-14 08:42:01 +01:00 |
Morgan Funtowicz
|
aa1fcba59f
|
feat(llamacpp): initial commit
# Conflicts:
# Cargo.lock
|
2024-11-14 08:42:01 +01:00 |