# hf_text-generation-inference/.github/workflows/tests.yaml

name: Server Tests

on:
  pull_request:
    paths:
      - ".github/workflows/tests.yaml"
      - "server/**"
      - "proto/**"
      - "router/**"
      - "launcher/**"
      - "Cargo.lock"
      - "rust-toolchain.toml"

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  run_tests:
    runs-on:
      group: aws-highmemory-32-plus-priv
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        id: python
        with:
          # Quoted so YAML keeps it a string rather than parsing it as a float
          python-version: "3.11"
      - name: Install Rust
        uses: actions-rs/toolchain@v1
        with:
          # Released on: 25 July, 2024
          # https://releases.rs/docs/1.80.0/
          toolchain: 1.80.0
          override: true
          components: rustfmt, clippy
      - name: Install Protoc
        uses: arduino/setup-protoc@v1
      - name: Clean unused files
        run: |
          sudo rm -rf /usr/local/lib/android # frees about 10 GB; the Android SDK is not needed here
          sudo rm -rf /usr/share/dotnet # frees about 20 GB; .NET is not needed here
      - name: Install
        run: |
          sudo apt update
          sudo apt install python3.11-dev -y
          make install-cpu
      - name: Run server tests
        # Pass the secret through the step environment instead of
        # interpolating it into the shell script.
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          pip install pytest
          pytest -s -vv server/tests
      - name: Pre-commit checks
        run: |
          pip install pre-commit
          pre-commit install
          pre-commit run --all-files
      - name: Run Rust tests
        run: |
          cargo test
      - name: Run Rust tests with google feature
        run: |
          cargo test --features google