Reverse proxy server for various LLM APIs. Features translation between API formats, user management, anti-abuse, API key rotation, DALL-E support, and optional prompt/response logging.

OAI Reverse Proxy

Reverse proxy server for various LLM APIs.

What is this?

This project allows you to run a reverse proxy server for various LLM APIs.

Features

  • Support for multiple APIs
  • Translation from OpenAI-formatted prompts to any other API, including streaming responses
  • Multiple API keys with rotation and rate limit handling
  • Basic user management
    • Simple role-based permissions
    • Per-model token quotas
    • Temporary user accounts
  • Prompt and completion logging
  • Abuse detection and prevention
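As a rough illustration of the "multiple API keys with rotation and rate limit handling" feature, here is a minimal sketch of a key pool. It is not the project's actual implementation: the `KeyPool` class, its method names, and the least-recently-used selection strategy are all assumptions made for this example.

```typescript
// Hypothetical key pool. Names and behavior are illustrative only and are
// not taken from the project's source.
interface PooledKey {
  key: string;
  lastUsedAt: number; // ms timestamp of the most recent use (0 = never used)
  rateLimitedUntil: number; // ms timestamp before which the key is skipped
}

class KeyPool {
  private readonly keys: PooledKey[];

  constructor(keys: string[]) {
    this.keys = keys.map((key) => ({ key, lastUsedAt: 0, rateLimitedUntil: 0 }));
  }

  // Pick the least recently used key that is not currently rate limited.
  next(now: number = Date.now()): string {
    const available = this.keys.filter((k) => k.rateLimitedUntil <= now);
    if (available.length === 0) {
      throw new Error("All keys are rate limited; retry later.");
    }
    const chosen = available.reduce((a, b) => (b.lastUsedAt < a.lastUsedAt ? b : a));
    chosen.lastUsedAt = now;
    return chosen.key;
  }

  // Call this when the upstream API answers with HTTP 429 for the given key.
  markRateLimited(key: string, cooldownMs: number, now: number = Date.now()): void {
    const entry = this.keys.find((k) => k.key === key);
    if (entry) entry.rateLimitedUntil = now + cooldownMs;
  }
}
```

A caller would request a key with `pool.next()` before each upstream request and call `pool.markRateLimited(key, cooldownMs)` when that request is rejected for rate limiting, so the next request falls through to a different key.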

Usage Instructions

If you'd like to run your own instance of this server, you'll need to deploy it somewhere and configure it with your API keys. A few easy options are provided below. You can also deploy it to any other service that supports Node.js, if you know what you're doing.

Self-hosting (locally or without Docker)

Follow the "Local Development" instructions below to set up prerequisites and start the server. Then you can use a service like ngrok or trycloudflare.com to securely expose your server to the internet, or you can use a more traditional reverse proxy/WAF like Cloudflare or Nginx.

Ensure you set the TRUSTED_PROXIES environment variable according to your deployment. Refer to .env.example and config.ts for more information.
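To show why this setting matters, here is a small sketch of how a server might interpret TRUSTED_PROXIES at startup. The accepted formats (a hop count or a comma-separated address list) and the `parseTrustedProxies` helper are assumptions for illustration; the authoritative format is whatever .env.example and config.ts describe.

```typescript
// Assumption: TRUSTED_PROXIES holds either a hop count ("1") or a
// comma-separated list of proxy addresses. The real format may differ;
// check .env.example and config.ts.
type TrustProxySetting = number | string[];

function parseTrustedProxies(raw: string | undefined): TrustProxySetting {
  const value = (raw ?? "").trim();
  if (value === "") return 0; // treat unset as "no proxy in front of the server"
  if (/^\d+$/.test(value)) return Number(value); // number of proxy hops to trust
  return value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

const setting = parseTrustedProxies(process.env.TRUSTED_PROXIES);
console.log("trust proxy setting:", setting);
```

In an Express application, a value of this shape can be passed to `app.set("trust proxy", setting)`. If the value is too low, the server sees the proxy's address instead of the client's; if it is too high, clients can spoof their address through forwarding headers, which undermines per-IP rate limiting.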

Self-hosting (with Docker)

If you have a Docker-capable VPS or server, use the Huggingface Dockerfile (./docker/huggingface/Dockerfile) to build an image and run it on your server.

Ensure you set the TRUSTED_PROXIES environment variable according to your deployment. Refer to .env.example and config.ts for more information.

Alternatives

Fiz and Sekrit are working on some alternative ways to deploy this conveniently. While I'm not directly involved in writing code or scripts for that project, I'm providing some advice and will include links to their work here when it's ready.

Huggingface (not advised)

See here for instructions on how to deploy to a Huggingface Space.

Render (not advised)

See here for instructions on how to deploy to Render.com.

Local Development

To run the proxy locally for development or testing, install Node.js >= 18.0.0 and follow the steps below.

  1. Clone the repo.
  2. Install dependencies with npm install.
  3. Create a .env file in the root of the project and add your API keys. See the .env.example file for an example.
  4. Start the server in development mode with npm run start:dev.
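Before starting the server, the prerequisites above can be checked with a short script. This is an optional sketch, not part of the project: the `meetsMinimumNode` helper is hypothetical, and the script only checks the Node.js major version and whether a .env file exists in the current directory.

```typescript
import { existsSync } from "node:fs";

// Returns true when a version string such as "18.17.1" meets the minimum
// major version. Hypothetical helper for this example.
function meetsMinimumNode(version: string, minMajor: number): boolean {
  const major = Number.parseInt(version.split(".")[0] ?? "", 10);
  return Number.isInteger(major) && major >= minMajor;
}

const problems: string[] = [];
if (!meetsMinimumNode(process.versions.node, 18)) {
  problems.push(`Node.js >= 18.0.0 is required, found ${process.versions.node}`);
}
if (!existsSync(".env")) {
  problems.push("No .env file found in the current directory; see .env.example");
}
console.log(problems.length === 0 ? "Environment looks ready." : problems.join("\n"));
```

Run it from the project root so that the .env check looks in the right place.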

You can also use npm run start:dev:tsc to enable project-wide type checking at the cost of slower startup times. npm run type-check can be used to run type checking without starting the server.