OAI Reverse Proxy
Reverse proxy server for various LLM APIs.
What is this?
This project allows you to run a reverse proxy server for various LLM APIs.
Features
- Support for multiple APIs
- Translation from OpenAI-formatted prompts to any other API, including streaming responses
- Multiple API keys with rotation and rate limit handling
- Basic user management
- Simple role-based permissions
- Per-model token quotas
- Temporary user accounts
- Prompt and completion logging
- Abuse detection and prevention
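To make "multiple API keys with rotation and rate limit handling" concrete, here is a minimal sketch of one way such a pool could work. It is not the project's actual implementation: the names `ApiKey`, `KeyPool`, `acquire`, and `markRateLimited` are hypothetical, and the least-recently-used policy and fixed cooldown are assumptions made for illustration.

```typescript
// Hypothetical types and names for illustration only; not taken from the project.
interface ApiKey {
  id: string;
  lastUsedAt: number; // ms timestamp of the last request sent with this key
  rateLimitedUntil: number; // ms timestamp until which the key is skipped
}

class KeyPool {
  private keys: ApiKey[];

  constructor(ids: string[]) {
    this.keys = ids.map((id) => ({ id, lastUsedAt: 0, rateLimitedUntil: 0 }));
  }

  // Pick the least recently used key that is not currently rate limited.
  acquire(now: number): ApiKey {
    const available = this.keys
      .filter((k) => k.rateLimitedUntil <= now)
      .sort((a, b) => a.lastUsedAt - b.lastUsedAt);
    if (available.length === 0) {
      throw new Error("All keys are rate limited");
    }
    const key = available[0];
    key.lastUsedAt = now;
    return key;
  }

  // Record an upstream rate-limit response so the key is skipped during the cooldown.
  markRateLimited(id: string, now: number, cooldownMs: number): void {
    const key = this.keys.find((k) => k.id === id);
    if (key) key.rateLimitedUntil = now + cooldownMs;
  }
}
```

In this sketch, a caller would `acquire` a key per request and call `markRateLimited` when the upstream API answers with a rate-limit error, so the next request rotates to a different key.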
Usage Instructions
If you'd like to run your own instance of this server, you'll need to deploy it somewhere and configure it with your API keys. A few easy options are provided below, though you can also deploy it to any other service you'd like if you know what you're doing and the service supports Node.js.
Self-hosting
See here for instructions on how to self-host the application on your own VPS or local machine.
Ensure you set the TRUSTED_PROXIES environment variable according to your deployment. Refer to .env.example and config.ts for more information.
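The sketch below illustrates why this setting matters. It assumes TRUSTED_PROXIES is a count of proxy hops in front of the server, in the style of Express's numeric "trust proxy" setting; this README does not confirm that, so check .env.example and config.ts for the real semantics. The function `resolveClientIp` is hypothetical and only mimics how a hop count selects the client address from the X-Forwarded-For header.

```typescript
// Hypothetical helper, assuming the setting is a number of trusted proxy hops.
function resolveClientIp(
  socketAddress: string,
  forwardedFor: string | undefined,
  trustedHops: number
): string {
  const forwarded = (forwardedFor ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  // Nearest hop first: the direct connection, then X-Forwarded-For read right to left.
  const chain = [socketAddress, ...forwarded.reverse()];
  // Skip the trusted hops; the next address in the chain is treated as the client.
  const index = Math.min(Math.max(trustedHops, 0), chain.length - 1);
  return chain[index];
}

// One real proxy (10.0.0.2) in front of the server; the client prepended a fake entry.
const header = "6.6.6.6, 203.0.113.7";
const correct = resolveClientIp("10.0.0.2", header, 1); // "203.0.113.7"
const tooHigh = resolveClientIp("10.0.0.2", header, 2); // "6.6.6.6" (spoofed value wins)
```

Under this assumption, a value that is too high lets clients spoof their address, while a value that is too low makes every request appear to come from the proxy itself, which would affect per-IP rate limiting.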
Huggingface (outdated, not advised)
See here for instructions on how to deploy to a Huggingface Space.
Render (outdated, not advised)
See here for instructions on how to deploy to Render.com.
Local Development
To run the proxy locally for development or testing, install Node.js >= 18.0.0 and follow the steps below.
- Clone the repo
- Install dependencies with npm install
- Create a .env file in the root of the project and add your API keys. See the .env.example file for an example.
- Start the server in development mode with npm run start:dev.
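Before running the steps above, it can help to confirm that the installed Node.js meets the stated minimum of 18.0.0. The following is a small sketch of such a check; `meetsMinimumNodeVersion` is a hypothetical helper, not part of the project, and it compares only the numeric major, minor, and patch parts.

```typescript
// Hypothetical helper: compare a Node.js version string against a minimum version.
function meetsMinimumNodeVersion(current: string, minimum: string = "18.0.0"): boolean {
  // Strip a leading "v" and parse each dotted part as a number (non-numeric parts become 0).
  const parse = (v: string): number[] =>
    v.replace(/^v/, "").split(".").map((part) => parseInt(part, 10) || 0);
  const a = parse(current);
  const b = parse(minimum);
  // Compare major, then minor, then patch; missing parts count as 0.
  for (let i = 0; i < 3; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true;
}
```

In a Node.js script you could pass `process.version` (a string such as "v18.17.0") as the first argument and exit with an error message when the function returns false.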
You can also use npm run start:dev:tsc to enable project-wide type checking at the cost of slower startup times. npm run type-check can be used to run type checking without starting the server.
Building
To build the project, run npm run build. This will compile the TypeScript code to JavaScript and output it to the build directory.
Note that if you are trying to build the server on a very memory-constrained (<= 1GB) VPS, you may need to run the build with NODE_OPTIONS=--max_old_space_size=2048 npm run build to avoid running out of memory during the build process, assuming you have swap enabled. The application itself should run fine on a 512MB VPS for most reasonable traffic levels.
Forking
If you are forking the repository on GitGud, you may wish to disable GitLab CI/CD, or you will be spammed with emails about failed builds due to not having any CI runners. You can do this by going to Settings > General > Visibility, project features, permissions and then disabling the "CI/CD" feature.