drbh
|
cef0553d59
|
Outlines guided generation (#1539)
This WIP PR starts adding grammar support via Outlines. It currently
supports only very simple regex grammars and does not yet precompile
or cache grammar FSMs.
todo:
- [X] add simple outlines guidance to `NextTokenChooser`
- [X] update protos for grammar
- [X] update generation params API
- [X] constrain simple grammar
- [ ] support parsing more complex grammars into an FSM
- [ ] support all grammar types that Outlines supports
- [ ] explore optimizations to avoid recompiling grammars
guided request
```bash
curl -s 'http://localhost:3000/generate' \
    --header 'Content-Type: application/json' \
    --data-raw '{
        "inputs": "make an email for david: \n",
        "parameters": {
            "max_new_tokens": 6,
            "grammar": "[\\w-]+@([\\w-]+\\.)+[\\w-]+"
        }
    }' | jq
```
response
```json
{
    "generated_text": "david@example.com"
}
```
unguided request
```bash
curl -s 'http://localhost:3000/generate' \
    --header 'Content-Type: application/json' \
    --data '{
        "inputs": "make an email for david: \n",
        "parameters": {
            "max_new_tokens": 6
        }
    }' | jq
```
response
```json
{
    "generated_text": " email = 'david"
}
```
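The `grammar` value in the guided request is an ordinary regular expression, JSON-escaped. As a rough sanity check (not how the server enforces the constraint, which this PR does through an Outlines FSM inside `NextTokenChooser`), the two responses above can be compared against that regex with Python's standard `re` module. The helper name `satisfies_grammar` is hypothetical and used only for illustration.

```python
import json
import re

# The request body fragment exactly as sent over the wire (JSON-escaped).
raw_body = r'{"grammar": "[\\w-]+@([\\w-]+\\.)+[\\w-]+"}'

# After JSON decoding, the doubled backslashes collapse to single ones.
grammar = json.loads(raw_body)["grammar"]
pattern = re.compile(grammar)


def satisfies_grammar(text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the grammar regex."""
    return pattern.fullmatch(text) is not None


guided_output = "david@example.com"   # response to the guided request
unguided_output = " email = 'david"   # response to the unguided request

print(satisfies_grammar(guided_output))
print(satisfies_grammar(unguided_output))
```

Only the guided output is a full match for the pattern; the unguided output contains no `@` and so cannot match.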
|
2024-02-15 10:28:10 +01:00 |
Nicolas Patry
|
9ecfa16b12
|
Speculative (#1308)
|
2023-12-11 12:46:30 +01:00 |
OlivierDehaene
|
218c9adaa5
|
feat: decrease IPC proto size (#367)
Closes #307 #308
|
2023-05-24 19:19:57 +02:00 |
OlivierDehaene
|
68e9d6ab33
|
feat(server): shard token decode (#303)
|
2023-05-10 15:48:21 +02:00 |
Nicolas Patry
|
db2b4e0754
|
feat(router): new healthcheck that skips the queue (#244)
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
|
2023-04-26 20:23:54 +02:00 |
OlivierDehaene
|
343437c7b5
|
feat(router): add device and dtype info (#215)
|
2023-04-21 15:36:29 +02:00 |
OlivierDehaene
|
9af454142a
|
feat: add distributed tracing (#62)
|
2023-02-13 13:02:45 +01:00 |
OlivierDehaene
|
20c3c5940c
|
feat(router): refactor API and add openAPI schemas (#53)
|
2023-02-03 12:43:37 +01:00 |
OlivierDehaene
|
017a2a8c2f
|
feat: Add token streaming using ServerSideEvents support (#41)
|
2023-01-31 17:04:00 +01:00 |
OlivierDehaene
|
4f9ac67cfa
|
Revert "feat: Add token streaming using ServerSideEvents support" (#40)
Reverts huggingface/text-generation-inference#36
|
2023-01-31 14:21:51 +01:00 |
OlivierDehaene
|
7fbfbb0dc5
|
feat: Add token streaming using ServerSideEvents support (#36)
Add token streaming using Server-Sent Events (SSE).
The signature of the SSE events is:
```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
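A minimal client-side sketch of consuming one such event, assuming each event arrives as a standard SSE `data:` line whose payload is the JSON form of `StreamResponse` or `ErrorResponse`. The wire encoding, the function name `parse_sse_data`, and the sample payloads are assumptions for illustration. The fields of `Token` are not given above, so the token is kept as an opaque value.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Details:
    finish_reason: str
    generated_tokens: int
    seed: Optional[int]


@dataclass
class StreamResponse:
    token: Any  # Token's fields are not specified in the commit message
    generated_text: Optional[str]
    details: Optional[Details]


def parse_sse_data(line: str) -> StreamResponse:
    """Parse one SSE `data:` line (assumed JSON payload) into a StreamResponse."""
    if not line.startswith("data:"):
        raise ValueError("not an SSE data line")
    payload = json.loads(line[len("data:"):].strip())
    if "error" in payload:
        # Assumed to mirror ErrorResponse { error: String }.
        raise RuntimeError(payload["error"])
    raw_details = payload.get("details")
    details = None
    if raw_details is not None:
        details = Details(
            finish_reason=raw_details["finish_reason"],
            generated_tokens=raw_details["generated_tokens"],
            seed=raw_details.get("seed"),
        )
    return StreamResponse(
        token=payload["token"],
        generated_text=payload.get("generated_text"),
        details=details,
    )


# Hypothetical sample events; the token contents are placeholders.
intermediate = 'data:{"token": {"text": "a"}, "generated_text": null, "details": null}'
final = (
    'data:{"token": {"text": "b"}, "generated_text": "ab", '
    '"details": {"finish_reason": "length", "generated_tokens": 2, "seed": null}}'
)

first_event = parse_sse_data(intermediate)
last_event = parse_sse_data(final)
print(first_event.generated_text, last_event.generated_text)
```

Modelling `generated_text` and `details` as optional mirrors the `Option<...>` types in the Rust signature.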
|
2023-01-31 11:49:43 +01:00 |
OlivierDehaene
|
32a253063d
|
feat: Return logprobs (#8)
|
2022-12-15 17:03:56 +01:00 |
OlivierDehaene
|
718096f695
|
feat: Support stop sequences (#7)
|
2022-12-12 18:25:22 +01:00 |
Olivier Dehaene
|
f16f2f5ae1
|
v0.1.0
|
2022-10-20 19:14:44 +02:00 |
Olivier Dehaene
|
5e5d8766a2
|
feat: Improve error handling
|
2022-10-17 14:59:00 +02:00 |
Olivier Dehaene
|
4c693e6524
|
Refactored gRPC interface
Added validation logic
|
2022-10-11 16:50:54 +02:00 |
Olivier Dehaene
|
295831a481
|
Init
|
2022-10-08 12:30:12 +02:00 |