871 Commits

Author SHA1 Message Date
Morgan Funtowicz e4fc0ebcbe update TensorRT install script to latest 2024-07-23 22:23:30 +00:00
Morgan Funtowicz 03935f6705 update TensorRT-LLM to latest version 2024-07-23 22:13:02 +00:00
Morgan Funtowicz ef1876346c refactor the compute capabilities detection along with num gpus 2024-07-23 22:12:42 +00:00
Morgan Funtowicz 3c39ab5ac8 fix typo 2024-07-23 08:11:36 +00:00
Morgan Funtowicz 4c657ca158 make docker linter happy with same capitalization rule 2024-07-23 07:42:31 +00:00
Morgan Funtowicz d9decb4c2c move to TensorRT-LLM v0.11.0 2024-07-23 07:35:00 +00:00
Morgan Funtowicz ff151b738b refactored docker image 2024-07-23 07:34:40 +00:00
Morgan Funtowicz 3db1be412c commenting out Python part for TensorRT installation 2024-07-23 07:27:34 +00:00
Morgan Funtowicz 805e584b92 update tgi entrypoint 2024-07-22 19:13:01 +00:00
Morgan Funtowicz d0a34a95f2 adding missing ld_library_path for cuda stubs in Dockerfile 2024-07-22 15:16:39 +00:00
Morgan Funtowicz 3fd2bb70c3 fix missing / before tgi lib path 2024-07-22 14:57:03 +00:00
Morgan Funtowicz a32ef3b875 correctly setup linking search path for runtime layer 2024-07-22 14:42:43 +00:00
Morgan Funtowicz fd06ca6e7e add missing pkgconfig folder for MPI in Dockerfile 2024-07-22 14:20:06 +00:00
Morgan Funtowicz 40330c73f0 align all the linker search dependency 2024-07-22 14:14:57 +00:00
Morgan Funtowicz 6a9e925ec1 fix bad copy/paste missing nvinfer linkage direction 2024-07-22 11:43:10 +00:00
Morgan Funtowicz 3597beefe2 leverage pkg-config to probe libraries paths and reuse new install structure from cmake 2024-07-22 11:39:11 +00:00
Morgan Funtowicz 2aac2ff2cd do the same name definition stuff for tensorrt_llm_executor_static 2024-07-22 11:32:54 +00:00
Morgan Funtowicz da079df4cd simplify prebuilt trtllm libraries name definition 2024-07-22 11:32:31 +00:00
Morgan Funtowicz 20bcaea54f add some more information in CMakeLists.txt to correctly find and install nvrtc wrapper 2024-07-22 09:33:38 +00:00
Morgan Funtowicz 84153702d2 add some more information in CMakeLists.txt to correctly install executorWorker 2024-07-22 08:43:10 +00:00
Morgan Funtowicz d5464d2f80 add initial Dockerfile for TRTLLM backend 2024-07-19 22:08:12 +00:00
Morgan Funtowicz 6300bab8b4 make sure executor_worker is provided 2024-07-19 11:57:10 +00:00
Morgan Funtowicz 97723d1458 add logging in case of decoding error 2024-07-18 22:19:25 +00:00
Morgan Funtowicz 9ea7f9e950 remove logging 2024-07-18 22:08:46 +00:00
Morgan Funtowicz e82dc30e8a expose information about potential error happening while decoding 2024-07-18 22:07:59 +00:00
Morgan Funtowicz a19d318947 define a shared struct to hold the result of a decoding step 2024-07-18 21:33:04 +00:00
Morgan Funtowicz a036574a86 add some more validation about grammar not supported 2024-07-18 20:57:29 +00:00
Morgan Funtowicz b643a436f3 forward tgi parameters rep/freq penalty 2024-07-18 20:56:58 +00:00
Morgan Funtowicz 95847c6587 expose the internal missing start/queue timestamp 2024-07-18 15:57:33 +00:00
Morgan Funtowicz fd021e5461 refactor Stream impl for Generation to factorise code 2024-07-18 14:21:43 +00:00
Morgan Funtowicz b56c43ec30 remove unneeded scope variable for now 2024-07-18 12:57:10 +00:00
Morgan Funtowicz 0212b1774a correctly forward back the log probabilities 2024-07-17 22:33:10 +00:00
Morgan Funtowicz bcb96feea6 update invalid doc in cpp file 2024-07-17 22:23:22 +00:00
Morgan Funtowicz 69674a3a2d add all the necessary plumbing to return the generated content 2024-07-17 22:12:49 +00:00
Morgan Funtowicz ce715c76f8 remove unnecessary log 2024-07-17 22:09:50 +00:00
Morgan Funtowicz e983ee5bb8 make sure the context is not dropped in the middle of the async decoding. 2024-07-17 21:56:50 +00:00
Morgan Funtowicz 9220340ff7 compute the number of maximum new tokens for each request independently 2024-07-17 13:55:29 +00:00
Morgan Funtowicz a01cd030d4 oops missing c++ backend definitions 2024-07-16 20:11:59 +00:00
Morgan Funtowicz 7784a21d48 impl RwLock scenario for TensorRtLllmBackend 2024-07-16 20:08:10 +00:00
Morgan Funtowicz 31d9f4d5dc expose shutdown function at ffi layer 2024-07-15 07:36:01 +00:00
Morgan Funtowicz b291be64a0 impl the rust backend which currently cannot move the actual computation in background thread 2024-07-12 19:26:32 +00:00
Morgan Funtowicz 518d9a9e0b make sure to track include/ffi.h to trigger rebuild from cargo 2024-07-12 19:26:04 +00:00
Morgan Funtowicz 344f33f398 end to end ffi flow working 2024-07-12 19:25:40 +00:00
Morgan Funtowicz b846ae2d9e use external fmt lib 2024-07-12 19:24:59 +00:00
Morgan Funtowicz 1972669f49 remove fmt import 2024-07-12 19:24:09 +00:00
Morgan Funtowicz 50e9fc89c8 working setup of the ffi layer 2024-07-11 21:24:32 +00:00
Morgan Funtowicz 5aede911f8 include guard to build example in cmakelists 2024-07-11 21:24:01 +00:00
Morgan Funtowicz ed14bd6818 use correct include for spdlog 2024-07-10 13:57:31 +00:00
Morgan Funtowicz 42748d5960 allow converting huggingface::tokenizers error to TensorRtLlmBackendError 2024-07-10 13:56:57 +00:00
Morgan Funtowicz 40fe2ec0ff add auth_token CLI argument to provide hf hub authentication token 2024-07-10 13:50:28 +00:00