OpenLLM: An Open Platform for Large Language Models

OpenLLM is an open platform for operating large language models (LLMs) in production. It makes it easy to fine-tune, serve, deploy, and monitor any LLM.

1. Key features of OpenLLM

With OpenLLM, you can run inference with any open-source large language model, deploy to the cloud or on-premises, and build powerful AI applications.

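🚂 State-of-the-art LLMs: built-in support for a wide range of open-source LLMs and model runtimes, including Llama 2, StableLM, Falcon, Dolly, Flan-T5, ChatGLM, and StarCoder.
🔥 Flexible APIs: serve LLMs over a RESTful API or gRPC with a single command, and query them through the Web UI, the CLI, the Python/JavaScript clients, or any HTTP client.
⛓️ Freedom to build: first-class support for LangChain, BentoML, and Hugging Face lets you compose LLMs with other models and services to create your own AI applications.
🎯 Streamlined deployment: automatically generate Docker images for your LLM server, or deploy as a serverless endpoint via ☁️ BentoCloud.
🤖️ Bring your own LLM: fine-tune any LLM to suit your needs with LLM.tuning().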

2. Getting started with OpenLLM

To use OpenLLM, you need Python 3.8 (or later) and pip installed on your system. We strongly recommend using a virtual environment to prevent package conflicts.
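The two prerequisites above (a recent enough interpreter and an isolated environment) can be checked and set up from Python itself. The following is a minimal sketch using only the standard library; the function names are hypothetical and chosen for this example, not part of OpenLLM.

```python
import sys
import venv
from pathlib import Path

# Minimum interpreter version stated in the text above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info):
    """Return True if the given version meets the stated minimum (3.8)."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path.

    with_pip=True asks the standard venv module to bootstrap pip into the
    new environment so that packages can be installed there.
    """
    env_path = Path(env_dir)
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

For example, calling create_env(".venv") (the directory name is an arbitrary choice) and then activating that environment in your shell keeps the OpenLLM installation separate from other projects.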

You can install OpenLLM with pip:

pip install openllm

To verify that it is installed correctly, run:

$ openllm -h

Usage: openllm [OPTIONS] COMMAND [ARGS]...

   ██████╗ ██████╗ ███████╗███╗   ██╗██╗     ██╗     ███╗   ███╗
  ██╔═══██╗██╔══██╗██╔════╝████╗  ██║██║     ██║     ████╗ ████║
  ██║   ██║██████╔╝█████╗  ██╔██╗ ██║██║     ██║     ██╔████╔██║
  ██║   ██║██╔═══╝ ██╔══╝  ██║╚██╗██║██║     ██║     ██║╚██╔╝██║
  ╚██████╔╝██║     ███████╗██║ ╚████║███████╗███████╗██║ ╚═╝ ██║
   ╚═════╝ ╚═╝     ╚══════╝╚═╝  ╚═══╝╚══════╝╚══════╝╚═╝     ╚═╝

  An open platform for operating large language models in production.
  Fine-tune, serve, deploy, and monitor any LLMs with ease.

To start an LLM server, use openllm start. For example, to start an OPT server:

openllm start opt

Once the server is running, a Web UI is available at http://localhost:3000, where you can try out the endpoints and sample input prompts.
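Loading a model can take a while, so a script that starts the server and then queries it may want to wait until the address above responds. The helper below is a sketch of that idea using only the standard library; wait_for_server and its parameters are hypothetical names for this example, and it only checks that something answers HTTP at the URL, not that the model itself is ready.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:3000", timeout=60.0, interval=1.0,
                    request_timeout=5.0):
    """Poll url until it answers an HTTP request or timeout seconds elapse.

    Returns True as soon as any HTTP response arrives, False on timeout.
    """
    # Talk to the local server directly, ignoring any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=request_timeout):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status code.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the server is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would typically run wait_for_server() right after launching openllm start in another process and proceed only if it returns True.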

OpenLLM provides a built-in Python client for interacting with the model. In a separate terminal window or a Jupyter Notebook, create a client and start querying the model:

import openllm

# Connect to the server started with `openllm start`
client = openllm.client.HTTPClient('http://localhost:3000')
response = client.query('Explain to me the difference between "further" and "farther"')
print(response)

You can also query the model from the terminal with the openllm query command:

export OPENLLM_ENDPOINT=http://localhost:3000
openllm query 'Explain to me the difference between "further" and "farther"'

Visit http://localhost:3000/docs.json for OpenLLM's API specification.
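To see which endpoints a running server exposes, the specification can also be read programmatically. The sketch below assumes that docs.json is an OpenAPI-style JSON document with a top-level "paths" object; that structure is an assumption and is not confirmed by the text above. The function names are hypothetical.

```python
import json
import urllib.request

# HTTP method names that may appear as keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_routes(spec):
    """Return sorted (METHOD, path) pairs from an OpenAPI-style spec dict."""
    routes = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-method keys such as "parameters" or "summary".
            if key.lower() in HTTP_METHODS:
                routes.append((key.upper(), path))
    return sorted(routes)


def fetch_spec(url="http://localhost:3000/docs.json"):
    """Download and decode the JSON specification from a running server."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

With a server running, list_routes(fetch_spec()) would list the available routes; the actual paths depend on the OpenLLM version and are not shown here.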

OpenLLM supports many models and their variants. To serve a specific variant of a model, pass the --model-id argument, for example:

openllm start flan-t5 --model-id google/flan-t5-large

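Note: openllm also supports all variants of fine-tuned weights, custom model paths, and quantized weights for any supported model, as long as they can be loaded with the model architecture. For the model architectures, see the section on supported models.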

Use the openllm models command to list the models and variants that OpenLLM supports.
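When the server is launched from a script rather than by hand, the openllm start invocation shown earlier can be assembled as an argument list. This is a small sketch that uses only the model name and the --model-id option from the text; the helper names are hypothetical.

```python
import shlex


def build_start_command(model, model_id=None):
    """Build the argument list for `openllm start`, optionally with a variant."""
    cmd = ["openllm", "start", model]
    if model_id is not None:
        cmd += ["--model-id", model_id]
    return cmd


def format_command(cmd):
    """Render an argument list as a shell-safe string, e.g. for logging."""
    return shlex.join(cmd)
```

The resulting list can be passed to subprocess.Popen to start the server in the background. Building a list instead of a single string avoids shell quoting problems when a model ID or path contains unusual characters.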

3. Models supported by OpenLLM

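OpenLLM currently supports the following models. By default, OpenLLM does not include the dependencies needed to run every model. Install the extra model-specific dependencies as described below: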

3.1 chatglm

Model architecture: ChatGLMForConditionalGeneration

Model IDs:

thudm/chatglm-6b
thudm/chatglm-6b-int8
thudm/chatglm-6b-int4
thudm/chatglm2-6b
thudm/chatglm2-6b-int4

Installation:

pip install "openllm[chatglm]"

3.2 dolly-v2

Model architecture: GPTNeoXForCausalLM

Model IDs:

databricks/dolly-v2-3b
databricks/dolly-v2-7b
databricks/dolly-v2-12b

Installation:

pip install openllm

3.3 falcon

Model architecture: FalconForCausalLM

Model IDs:

tiiuae/falcon-7b
tiiuae/falcon-40b
tiiuae/falcon-7b-instruct
tiiuae/falcon-40b-instruct

Installation:

pip install "openllm[falcon]"

3.4 flan-t5

Model architecture: T5ForConditionalGeneration

Model IDs:

google/flan-t5-small
google/flan-t5-base
google/flan-t5-large
google/flan-t5-xl
google/flan-t5-xxl

Installation:

pip install "openllm[flan-t5]"

3.5 gpt-neox

Model architecture: GPTNeoXForCausalLM

Model IDs:

eleutherai/gpt-neox-20b

Installation:

pip install openllm

3.6 llama

Model architecture: LlamaForCausalLM

Model IDs:

meta-llama/Llama-2-70b-chat-hf
meta-llama/Llama-2-13b-chat-hf
meta-llama/Llama-2-7b-chat-hf
meta-llama/Llama-2-70b-hf
meta-llama/Llama-2-13b-hf
meta-llama/Llama-2-7b-hf
NousResearch/llama-2-70b-chat-hf
NousResearch/llama-2-13b-chat-hf
NousResearch/llama-2-7b-chat-hf
NousResearch/llama-2-70b-hf
NousResearch/llama-2-13b-hf
NousResearch/llama-2-7b-hf
openlm-research/open_llama_7b_v2
openlm-research/open_llama_3b_v2
openlm-research/open_llama_13b
huggyllama/llama-65b
huggyllama/llama-30b
huggyllama/llama-13b
huggyllama/llama-7b

Installation:

pip install "openllm[llama]"

3.7 mpt

Model architecture: MPTForCausalLM

Model IDs:

mosaicml/mpt-7b
mosaicml/mpt-7b-instruct
mosaicml/mpt-7b-chat
mosaicml/mpt-7b-storywriter
mosaicml/mpt-30b
mosaicml/mpt-30b-instruct
mosaicml/mpt-30b-chat

Installation:

pip install "openllm[mpt]"

3.8 opt

Model architecture: OPTForCausalLM

Model IDs:

facebook/opt-125m
facebook/opt-350m
facebook/opt-1.3b
facebook/opt-2.7b
facebook/opt-6.7b
facebook/opt-66b

Installation:

pip install "openllm[opt]"

3.9 stablelm

Model architecture: GPTNeoXForCausalLM

Model IDs:

stabilityai/stablelm-tuned-alpha-3b
stabilityai/stablelm-tuned-alpha-7b
stabilityai/stablelm-base-alpha-3b
stabilityai/stablelm-base-alpha-7b

Installation:

pip install openllm

3.10 starcoder

Model architecture: GPTBigCodeForCausalLM

Model IDs:

bigcode/starcoder
bigcode/starcoderbase

Installation:

pip install "openllm[starcoder]"

3.11 baichuan

Model architecture: BaiChuanForCausalLM

Model IDs:

baichuan-inc/baichuan-7b
baichuan-inc/baichuan-13b-base
baichuan-inc/baichuan-13b-chat
fireballoon/baichuan-vicuna-chinese-7b
fireballoon/baichuan-vicuna-7b
hiyouga/baichuan-7b-sft

Installation:

pip install "openllm[baichuan]"
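The installation commands in this section follow a simple pattern: some models need a pip extra named after the model, while others work with the base package. The lookup below is a sketch that mirrors only the models listed in this section; the helper name is hypothetical, and the sets would need updating for other OpenLLM versions.

```python
# Models whose install command in this section uses a pip extra.
MODELS_WITH_EXTRAS = {
    "chatglm", "falcon", "flan-t5", "llama",
    "mpt", "opt", "starcoder", "baichuan",
}

# Models that this section installs with the base package only.
MODELS_WITHOUT_EXTRAS = {"dolly-v2", "gpt-neox", "stablelm"}


def pip_requirement(model):
    """Return the pip requirement string for a model listed in this section."""
    name = model.lower()
    if name in MODELS_WITH_EXTRAS:
        return f"openllm[{name}]"
    if name in MODELS_WITHOUT_EXTRAS:
        return "openllm"
    raise ValueError(f"model not listed in this section: {model!r}")
```

For instance, pip_requirement("falcon") gives the string to pass to pip install, quoted in the shell so that the square brackets are not interpreted by it.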

Original article: Operating LLMs in production - OpenLLM

Translated and compiled by BimAnt. Please credit the source when reposting.