hf_text-generation-inference/router/src
Nicolas Patry 0204946d26
Max token capacity metric (#2595)
* adding max_token_capacity_metric

* added tgi to name of metric

* Adding max capacity metric.

* Add description for the metrics

---------

Co-authored-by: Edwinhr716 <Edandres249@gmail.com>
2024-10-02 16:32:36 +02:00
infer Mllama flash version (#2585) 2024-10-02 11:22:13 +02:00
config.rs Mllama flash version (#2585) 2024-10-02 11:22:13 +02:00
kserve.rs fix: simplify kserve endpoint and fix imports (#2119) 2024-06-25 19:30:10 -04:00
lib.rs Cleanup Vertex + Chat (#2553) 2024-09-24 23:37:17 +02:00
logging.rs Rebase TRT-llm (#2331) 2024-07-31 10:33:10 +02:00
main.rs.back Rebase TRT-llm (#2331) 2024-07-31 10:33:10 +02:00
server.rs Max token capacity metric (#2595) 2024-10-02 16:32:36 +02:00
usage_stats.rs refactor usage stats (#2339) 2024-07-31 16:29:07 +02:00
validation.rs Mllama flash version (#2585) 2024-10-02 11:22:13 +02:00
vertex.rs Cleanup Vertex + Chat (#2553) 2024-09-24 23:37:17 +02:00