hf_text-generation-inference/.github/workflows/tests.yaml

name: Server Tests

on:
  pull_request:
    paths:
      - ".github/workflows/tests.yaml"
      - "server/**"
      - "proto/**"
      - "router/**"
      - "launcher/**"
      - "Cargo.lock"
      - "rust-toolchain.toml"

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  run_tests:
    runs-on: ubuntu-latest

    env:
      SCCACHE_GHA_ENABLED: "on"
      RUSTC_WRAPPER: /usr/local/bin/sccache
      SCCACHE: 0.3.3

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v1
        with:
          python-version: 3.9
      - name: Install Rust
        uses: actions-rs/toolchain@v1
        with:
          # Released on: 25 July, 2024
          # https://releases.rs/docs/1.80.0/
          toolchain: 1.80.0
          override: true
          components: rustfmt, clippy
      - name: Install Protoc
        uses: arduino/setup-protoc@v1
      - name: Clean unused files
        run: |
          sudo rm -rf /usr/local/lib/android # frees about 10 GB; the Android SDK is not needed here
          sudo rm -rf /usr/share/dotnet # frees about 20 GB; .NET is not needed here
      - name: Install sccache
        run: |
          curl -fsSL https://github.com/mozilla/sccache/releases/download/v$SCCACHE/sccache-v$SCCACHE-x86_64-unknown-linux-musl.tar.gz | tar -xzv --strip-components=1 -C /usr/local/bin sccache-v$SCCACHE-x86_64-unknown-linux-musl/sccache
          chmod +x /usr/local/bin/sccache
      - name: configure sccache
        uses: actions/github-script@v6
        with:
          script: |
            core.exportVariable('ACTIONS_CACHE_URL', process.env.ACTIONS_CACHE_URL || '');
            core.exportVariable('ACTIONS_RUNTIME_TOKEN', process.env.ACTIONS_RUNTIME_TOKEN || '');
            core.exportVariable('SCCACHE_GHA_CACHE_TO', 'sccache-${{runner.os}}-${{github.ref_name}}');
            core.exportVariable('SCCACHE_GHA_CACHE_FROM', 'sccache-${{runner.os}}-main,sccache-${{runner.os}}-');
      - name: cargo registry cache
        uses: actions/cache@v3
        with:
          key: cargo-${{ runner.os }}-${{ hashFiles('**/Cargo.toml') }}-${{ github.sha }}
          restore-keys: |
            cargo-${{ runner.os }}-${{ hashFiles('**/Cargo.toml') }}-
            cargo-${{ runner.os }}-
          path: |
            ~/.cargo/registry
            ~/.cargo/git
      - name: Install
        run: |
          make install-cpu
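      - name: Run server tests
        # Pass the secret through the step environment instead of
        # interpolating it into the shell script text.
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          pip install pytest
          pytest -s -vv server/tests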
      - name: Pre-commit checks
        run: |
          pip install pre-commit
          pre-commit install
          pre-commit run --all-files
      - name: Run Rust tests
        run: |
          cargo test
      - name: sccache stats
        run: |
          /usr/local/bin/sccache --show-stats