Commit Graph

21 Commits

Author SHA1 Message Date
OlivierDehaene 757223b352
feat: add SchedulerV3 (#1996)
- Refactor code to allow supporting multiple versions of the
generate.proto at the same time
- Add v3/generate.proto (ISO to generate.proto for now but allow for
future changes without impacting v2 backends)
- Add Schedule trait to abstract queuing and batching mechanisms that
will be different in the future
- Add SchedulerV2/V3 impl
2024-06-04 15:56:56 +02:00
Daniël de Kok df71aafdcc router: send the input as chunks to the backend
Before this change, the generation input was sent to the backend as a
single string, encoding images as Base64 and packing them in
Markdown-style links.

This change adds a new chunked input representation that separates text
chunks from images chunks. Image chunks contain binary data (for smaller
message sizes) and the image's MIME type.

The stringly-typed inputs are still sent to support backends that do not
support chunked inputs yet.
2024-06-03 17:02:41 +02:00
Nicolas Patry a049864270
Preping 1.1.0 (#1066)
# What does this PR do?

Upgrade all relevant versions and dependencies.

<!--
Congratulations! You've made it this far! You're not quite done yet
though.

Once merged, your PR is going to appear in the release notes with the
title you set, so make sure it's a great title that fully reflects the
extent of your awesome contribution.

Then, please replace this with a description of the change and which
issue is fixed (if applicable). Please also include relevant motivation
and context. List any dependencies (if any) that are required for this
change.

Once you're done, someone will review your PR shortly (see the section
"Who can review?" below to tag some potential reviewers). They may
suggest changes to make the code even better. If no one reviewed your PR
after a week has passed, don't hesitate to post a new comment
@-mentioning the same persons---sometimes notifications get lost.
-->

<!-- Remove if not applicable -->

Fixes # (issue)


## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Did you read the [contributor
guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),
      Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the
[forum](https://discuss.huggingface.co/)? Please add a link
      to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes?
Here are the
[documentation
guidelines](https://github.com/huggingface/transformers/tree/main/docs),
and
[here are tips on formatting
docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?


## Who can review?

Anyone in the community is free to review the PR once the tests have
passed. Feel free to tag
members/contributors who may be interested in your PR.

<!-- Your PR will be replied to more quickly if you can figure out the
right person to tag with @


@OlivierDehaene OR @Narsil

 -->
2023-09-27 10:40:18 +02:00
OlivierDehaene e28a809004
v0.9.0 (#525) 2023-07-01 19:25:41 +02:00
OlivierDehaene e250282213
feat(docker): add benchmarking tool to docker image (#298) 2023-05-09 13:19:31 +02:00
OlivierDehaene 6ded76a4ae
v0.6.0 (#222) 2023-04-21 21:00:57 +02:00
OlivierDehaene 6f0f1d70f6
v0.5.0 (#168) 2023-04-11 20:32:18 +02:00
OlivierDehaene fef1a1c381
v0.4.3 (#152) 2023-03-30 17:28:14 +02:00
OlivierDehaene 84722f3e33
v0.4.2 (#151) 2023-03-30 17:10:01 +02:00
OlivierDehaene ab5fd8cf93
v0.4.1 (#140) 2023-03-26 16:37:51 +02:00
OlivierDehaene 411d6247f4
v0.4.0 (#119) 2023-03-09 16:07:01 +01:00
OlivierDehaene 1c19b0934e
v0.3.2 (#97) 2023-03-03 18:42:20 +01:00
OlivierDehaene 4b1c9720c0
v0.3.1 (#84) 2023-02-24 13:27:41 +01:00
OlivierDehaene c720555adc
v0.3.0 (#72) 2023-02-16 17:28:29 +01:00
OlivierDehaene 9af454142a
feat: add distributed tracing (#62) 2023-02-13 13:02:45 +01:00
OlivierDehaene 2fe5e1b30e
V0.2.1 (#58) 2023-02-07 15:40:25 +01:00
OlivierDehaene 20c3c5940c
feat(router): refactor API and add openAPI schemas (#53) 2023-02-03 12:43:37 +01:00
OlivierDehaene 3cf6368c77 feat(server): Support all AutoModelForCausalLM on a best effort basis 2022-10-28 19:24:00 +02:00
Olivier Dehaene f16f2f5ae1 v0.1.0 2022-10-20 19:14:44 +02:00
Olivier Dehaene 39df4d9975 Use axum 2022-10-11 18:14:39 +02:00
Olivier Dehaene 295831a481 Init 2022-10-08 12:30:12 +02:00