* Add pytest release marker
Annotate a test with `@pytest.mark.release` and it only gets run
with `pytest integration-tests --release`.
* Mark many models as `release` to speed up CI
Mostly straightforward, changes to existing code:
* Wrap quantizer parameters in a small wrapper to avoid passing
around untyped tuples and needing to repack them as a dict.
* Move scratch space computation to warmup, because we need the
maximum input sequence length to avoid allocating huge
scratch buffers that OOM.