Reverse proxy server for various LLM APIs. Features translation between API formats, user management, anti-abuse, API key rotation, DALL-E support, and optional prompt/response logging.
OAI Reverse Proxy

Reverse proxy server for various LLM APIs.

What is this?

This project allows you to run a reverse proxy server for various LLM APIs.

Features

  • Support for multiple APIs
  • Translation from OpenAI-formatted prompts to any other API, including streaming responses
  • Multiple API keys with rotation and rate limit handling
  • Basic user management
    • Simple role-based permissions
    • Per-model token quotas
    • Temporary user accounts
  • Prompt and completion logging
  • Abuse detection and prevention
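As a rough illustration of the "multiple API keys with rotation and rate limit handling" feature, the sketch below picks the least recently used key that is not currently rate limited. It is not the project's actual implementation: the `ApiKey` shape, the `KeyPool` class, and its method names are hypothetical and chosen only for this example.

```typescript
// Hypothetical key record; the real project's key type is not shown in this README.
interface ApiKey {
  secret: string;
  lastUsedAt: number;       // ms timestamp of the last time this key was handed out
  rateLimitedUntil: number; // ms timestamp before which the key must not be used
}

// Hypothetical pool that rotates between keys and skips rate-limited ones.
class KeyPool {
  private readonly keys: ApiKey[];

  constructor(secrets: string[]) {
    if (secrets.length === 0) {
      throw new Error("at least one key is required");
    }
    this.keys = secrets.map((secret) => ({
      secret,
      lastUsedAt: 0,
      rateLimitedUntil: 0,
    }));
  }

  // Returns the least recently used key that is not rate limited at `now`.
  acquire(now: number = Date.now()): ApiKey {
    const available = this.keys.filter((k) => k.rateLimitedUntil <= now);
    if (available.length === 0) {
      throw new Error("all keys are rate limited");
    }
    const key = available.reduce((a, b) => (b.lastUsedAt < a.lastUsedAt ? b : a));
    key.lastUsedAt = now;
    return key;
  }

  // Call this when the upstream API reports a rate limit for `key`.
  markRateLimited(key: ApiKey, retryAfterMs: number, now: number = Date.now()): void {
    key.rateLimitedUntil = now + retryAfterMs;
  }
}
```

A caller would `acquire()` a key per request and call `markRateLimited()` when the upstream API signals a rate limit, so that subsequent requests rotate to the remaining keys until the cooldown expires.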

Usage Instructions

If you'd like to run your own instance of this server, you'll need to deploy it somewhere and configure it with your API keys. A few easy options are provided below, though you can also deploy it to any other service you'd like if you know what you're doing and the service supports Node.js.

Self-hosting

See here for instructions on how to self-host the application on your own VPS or local machine.

Ensure you set the TRUSTED_PROXIES environment variable according to your deployment. Refer to .env.example and config.ts for more information.
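To show why this setting matters, here is a minimal sketch of how a server behind proxies might derive the client IP from the `X-Forwarded-For` header. It assumes `TRUSTED_PROXIES` is a count of proxy hops in front of the server, which this README does not state; check .env.example and config.ts for the actual semantics. The function name `resolveClientIp` is hypothetical.

```typescript
// Hypothetical helper: picks the client IP given the socket peer address,
// the raw X-Forwarded-For header value, and an assumed number of trusted hops.
function resolveClientIp(
  socketAddress: string,
  forwardedFor: string | undefined,
  trustedProxies: number
): string {
  const forwarded = (forwardedFor ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  // Nearest hop first: the socket peer, then header entries from right to left.
  const hops = [socketAddress, ...forwarded.reverse()];

  // Skip the trusted hops; the next address is treated as the client.
  // If there are fewer hops than trusted proxies, fall back to the farthest one.
  const index = Math.min(Math.max(trustedProxies, 0), hops.length - 1);
  return hops[index];
}
```

Under this assumption, a value that is too low makes every user appear to come from the proxy's address, while a value that is too high lets clients spoof their IP through the header. Either mistake would undermine per-IP rate limiting and abuse detection.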

Alternatives

Fiz and Sekrit are working on some alternative ways to deploy this conveniently. While I'm not involved in this effort beyond providing technical advice regarding my code, I'll link to their work here for convenience: Sekrit's rentry

Huggingface (outdated, not advised)

See here for instructions on how to deploy to a Huggingface Space.

Render (outdated, not advised)

See here for instructions on how to deploy to Render.com.

Local Development

To run the proxy locally for development or testing, install Node.js >= 18.0.0 and follow the steps below.

  1. Clone the repo
  2. Install dependencies with npm install
  3. Create a .env file in the root of the project and add your API keys. See the .env.example file for an example.
  4. Start the server in development mode with npm run start:dev.

You can also use npm run start:dev:tsc to enable project-wide type checking at the cost of slower startup times. npm run type-check can be used to run type checking without starting the server.

Building

To build the project, run npm run build. This will compile the TypeScript code to JavaScript and output it to the build directory.

If you are building on a memory-constrained VPS (1 GB or less), the build may run out of memory. Running it as NODE_OPTIONS=--max_old_space_size=2048 npm run build can avoid this, provided swap is enabled. The application itself should run fine on a 512MB VPS for most reasonable traffic levels.

Forking

If you are forking the repository on GitGud, you may wish to disable GitLab CI/CD, or you will be spammed with emails about failed builds due to not having any CI runners. You can do this by going to Settings > General > Visibility, project features, permissions and then disabling the "CI/CD" feature.