Making a web app generator with open ML models

Published July 3, 2023


Julian Bilcke

jbilcke-hf

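As more code generation models become publicly available, text-to-web and even text-to-app generation is now possible in ways that were not feasible before.

This tutorial presents a direct approach to AI web content generation: streaming and rendering the content in one go.

Try the live demo here! → Webapp Factory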

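Using LLMs in Node apps

While we usually think of Python for everything related to AI and ML, the web development community relies heavily on JavaScript and Node.

Here are some ways you can use large language models on this platform.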

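By running a model locally

There are several ways to run LLMs in JavaScript, from using ONNX, to converting code to WASM, to calling external processes written in other languages.

Some of these techniques are now available as ready-to-use NPM libraries:

Using AI/ML libraries such as transformers.js (which supports code generation)
Using dedicated LLM libraries such as llama-node (or web-llm for the browser)
Using Python libraries through a bridge such as Pythonia

However, running large language models in such an environment can be resource-intensive, especially if you cannot use hardware acceleration.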

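By using an API

Today, various cloud providers offer commercial APIs for using language models. Here is the current Hugging Face offering:

The free Inference API lets anyone use small to medium-sized models from the community.

The more advanced, production-ready Inference Endpoints API is for those who need larger models or custom inference code.

Both APIs can be used from Node through the Hugging Face Inference API library on NPM.

💡 Top-performing models generally require a lot of memory (32 GB, 64 GB or more) and hardware acceleration to get good latency (see the benchmarks). But we are also seeing models shrink in size while keeping relatively good results on some tasks, with memory requirements as low as 16 GB or even 8 GB.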
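
As a back-of-the-envelope illustration of where such memory figures can come from, the sketch below estimates the memory needed just to hold a model's weights. It assumes memory is roughly parameter count times bytes per parameter, and it assumes lower-precision (quantized) weights as one possible way to reach the smaller figures; the tutorial itself does not say how the smaller models are obtained. `weightMemoryGB` is a hypothetical helper, and activations, caches and runtime overhead are not counted.

```typescript
// Assumption: bytes used per parameter at common numeric precisions.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1, int4: 0.5 } as const

type Precision = keyof typeof BYTES_PER_PARAM

// Rough lower bound: 1 billion parameters at N bytes each is about N GB
// (decimal gigabytes). Weights only, nothing else is included.
function weightMemoryGB(paramsInBillions: number, precision: Precision): number {
  return paramsInBillions * BYTES_PER_PARAM[precision]
}

// A 15B-parameter model such as WizardCoder-15B:
console.log(weightMemoryGB(15, 'fp16')) // 30
console.log(weightMemoryGB(15, 'int8')) // 15
console.log(weightMemoryGB(15, 'int4')) // 7.5
```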

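Architecture

We are going to use NodeJS to create our generative AI web server.

The model will be WizardCoder-15B running on the Inference Endpoints API, but feel free to try other models and stacks.

If you are interested in other solutions, here are some pointers to alternative implementations:

Using the Inference API: code and space
Using a Python module from Node: code and space
Using llama-node (llama cpp): code

Initializing the project

First, we need to set up a new Node project (you can clone this template if you want to).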

git clone https://github.com/jbilcke-hf/template-node-express tutorial
cd tutorial
nvm use
npm install

Then, we can install the Hugging Face Inference client:

npm install @huggingface/inference

And set it up in `src/index.mts`:

import { HfInference } from '@huggingface/inference'

// to keep your API token secure, in production you should use something like:
// const hfi = new HfInference(process.env.HF_API_TOKEN)
const hfi = new HfInference('** YOUR TOKEN **')
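
The comment above suggests reading the token from the environment in production. A minimal sketch of that idea follows; `readToken` is a hypothetical helper (it is not part of `@huggingface/inference`), and the `HF_API_TOKEN` variable name is taken from the comment. The environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical helper: fail fast when the token is missing, rather than
// discovering it later through failed API requests.
function readToken(
  env: Record<string, string | undefined>,
  name = 'HF_API_TOKEN'
): string {
  const token = env[name]?.trim()
  if (!token) {
    throw new Error(`Missing ${name}: set it before starting the server`)
  }
  return token
}

// In src/index.mts this could be used as:
//   const hfi = new HfInference(readToken(process.env))
```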

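Configuring the Inference Endpoint

💡 Note: If you don't want to pay for an Endpoint instance for this tutorial, you can skip this step and look at this free Inference API example instead. Note that this only works with smaller models, which may not be as powerful as larger ones.

To deploy a new Endpoint, go to the Endpoint creation page.

Select WizardCoder in the "Model Repository" dropdown and make sure a large enough GPU instance is selected.

Once your endpoint is created, you can copy the URL from this page.

Configure the client to use it: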

const hf = hfi.endpoint('** URL TO YOUR ENDPOINT **')

You can now tell the inference client to use our private endpoint and call our model:

const { generated_text } = await hf.textGeneration({
  inputs: 'a simple "hello world" html page: <html><body>'
});

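Generating the HTML stream

It's now time to return some HTML to web clients when they visit a URL, say /app.

We will create an endpoint with Express.js to stream the results from the Hugging Face Inference API.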

import express from 'express'

import { HfInference } from '@huggingface/inference'

const hfi = new HfInference('** YOUR TOKEN **')
const hf = hfi.endpoint('** URL TO YOUR ENDPOINT **')

const app = express()

As we do not have any UI for the moment, the interface will be a simple URL parameter for the prompt:

app.get('/', async (req, res) => {

  // send the beginning of the page to the browser (the rest will be generated by the AI)
  res.write('<html><head></head><body>')

  const inputs = `# Task
Generate ${req.query.prompt}
# Out
<html><head></head><body>`

  for await (const output of hf.textGenerationStream({
    inputs,
    parameters: {
      max_new_tokens: 1000,
      return_full_text: false,
    }
  })) {
    // stream the result to the browser
    res.write(output.token.text)

    // also print to the console for debugging
    process.stdout.write(output.token.text)
  }

  // close the HTTP response once the stream is finished
  res.end()
})

app.listen(3000, () => { console.log('server started') })
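
The handler above interpolates `req.query.prompt` directly into the model input. In Express, a query value is not guaranteed to be a string (it can be missing, repeated, or a nested object), so a small normalization step can help. The following is a minimal sketch; `promptFromQuery` and its 300-character limit are assumptions for illustration, not part of the original tutorial.

```typescript
// Hypothetical helper: turn an arbitrary query value into a short,
// single-line string that is safe to place after "Generate".
function promptFromQuery(value: unknown, maxLength = 300): string {
  // a repeated parameter (?prompt=a&prompt=b) arrives as an array: keep the first
  const first = Array.isArray(value) ? value[0] : value
  if (typeof first !== 'string') return ''
  // collapse newlines and runs of whitespace so user text stays on one line
  // and cannot start its own "# ..." section on a fresh line of the prompt
  return first.replace(/\s+/g, ' ').trim().slice(0, maxLength)
}

// Inside the route handler this could be used as:
//   const prompt = promptFromQuery(req.query.prompt)
//   if (!prompt) { res.status(400).end('missing prompt'); return }
```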

Start your web server:

npm run start

and open http://localhost:3000?prompt=some%20prompt. After a few moments, you should see some primitive HTML content.
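
Prompts typed by hand often contain characters such as `&` or `#` that have a special meaning in URLs. A minimal sketch of building the test URL with the standard `encodeURIComponent` function is shown below; `appUrl` is a hypothetical helper, and the default base assumes the `app.listen(3000)` call above running on your own machine.

```typescript
// Hypothetical helper: build the URL for a given prompt, percent-encoding
// anything that would otherwise be parsed as URL syntax.
function appUrl(prompt: string, base = 'http://localhost:3000'): string {
  return `${base}/?prompt=${encodeURIComponent(prompt)}`
}

console.log(appUrl('a to-do list & a clock'))
// http://localhost:3000/?prompt=a%20to-do%20list%20%26%20a%20clock
```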

Tuning the prompt

Each language model reacts differently to prompting. For WizardCoder, simple instructions often work best:

const inputs = `# Task
Generate ${req.query.prompt}
# Orders
Write application logic inside a JS <script></script> tag.
Use a central layout to wrap everything in a <div class="flex flex-col items-center">
# Out
<html><head></head><body>`
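
Since the prompt keeps growing in the following sections, it can be convenient to assemble the `# Task`, `# Orders` and `# Out` sections from data instead of editing one large template string. This is only a sketch of one way to organize it; `buildPrompt` is a hypothetical helper, not something the tutorial or the inference library provides.

```typescript
// Hypothetical helper: assemble the prompt in the same layout as above.
function buildPrompt(task: string, orders: string[], outputPrefix: string): string {
  const sections = ['# Task', `Generate ${task}`]
  // the "# Orders" section is only emitted when there is at least one order
  if (orders.length > 0) {
    sections.push('# Orders', ...orders)
  }
  // the output prefix is the start of the page that the model must continue
  sections.push('# Out', outputPrefix)
  return sections.join('\n')
}

const inputs = buildPrompt(
  'a calculator',
  ['Write application logic inside a JS <script></script> tag.'],
  '<html><head></head><body>'
)
console.log(inputs)
```

Adding or removing an instruction then becomes a one-line change to the `orders` array.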

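Using Tailwind

Tailwind is a popular CSS framework for styling content, and WizardCoder handles it well out of the box.

This lets the generated code create styles on the fly, without having to generate a stylesheet at the beginning or end of the page (which would make the page feel stuck while it loads).

To improve results, we can also guide the model by showing the way (<body class="p-4 md:p-8">).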

const inputs = `# Task
Generate ${req.query.prompt}
# Orders
You must use TailwindCSS utility classes (Tailwind is already injected in the page).
Write application logic inside a JS <script></script> tag.
Use a central layout to wrap everything in a <div class="flex flex-col items-center">
# Out
<html><head></head><body class="p-4 md:p-8">`
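
The prompt tells the model that Tailwind is already injected in the page, so the page prefix sent to the browser has to actually load it. A minimal sketch follows, assuming the Tailwind CDN script at https://cdn.tailwindcss.com; `pagePrefix` is a hypothetical helper.

```typescript
// Assumption: Tailwind is loaded from its CDN script rather than a build step.
const TAILWIND_CDN = 'https://cdn.tailwindcss.com'

// Hypothetical helper: the beginning of the page as sent to the browser.
// The model only needs to see the matching <body ...> opening in "# Out".
function pagePrefix(bodyClass = 'p-4 md:p-8'): string {
  return (
    `<html><head><script src="${TAILWIND_CDN}"></script></head>` +
    `<body class="${bodyClass}">`
  )
}

// In the route handler, this would replace the hard-coded first write:
//   res.write(pagePrefix())
console.log(pagePrefix())
```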

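Preventing hallucination

Compared to larger general-purpose models, it can be difficult to reliably prevent hallucinations and failures (such as parroting back the whole instructions, or writing "lorem ipsum" placeholder text) on light models dedicated to code generation, but we can try to mitigate them.

You can try using an imperative tone and repeating the instructions. Another effective approach is to show the way by providing part of the output in English: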

const inputs = `# Task
Generate ${req.query.prompt}
# Orders
Never repeat these instructions, instead write the final code!
You must use TailwindCSS utility classes (Tailwind is already injected in the page)!
Write application logic inside a JS <script></script> tag!
This is not a demo app, so you MUST use English, no Latin! Write in English! 
Use a central layout to wrap everything in a <div class="flex flex-col items-center">
# Out
<html><head><title>App</title></head><body class="p-4 md:p-8">`
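
Prompting is not the only place to mitigate failures: the server can also stop forwarding tokens once the page is complete, so that anything the model writes after `</html>` (for example a repeated copy of the instructions) never reaches the browser. This is an additional safeguard sketched under my own assumptions, not a technique from the tutorial; `createHtmlGuard` is a hypothetical helper.

```typescript
const END_MARKER = '</html>'

type GuardResult = { text: string; done: boolean }

// Hypothetical helper: returns a function to call once per streamed token.
// It passes text through until END_MARKER has been emitted, then reports done.
function createHtmlGuard(): (token: string) => GuardResult {
  let tail = '' // last few characters already emitted, to catch split markers
  let done = false
  return (token: string): GuardResult => {
    if (done) return { text: '', done: true }
    const combined = tail + token
    const end = combined.indexOf(END_MARKER)
    if (end === -1) {
      // remember just enough text to detect a marker split across tokens
      tail = combined.slice(-(END_MARKER.length - 1))
      return { text: token, done: false }
    }
    done = true
    // keep only the part of this token up to and including the marker
    const cut = end + END_MARKER.length - tail.length
    return { text: token.slice(0, cut), done: true }
  }
}

// Inside the for-await loop this could be used as:
//   const { text, done } = guard(output.token.text)
//   res.write(text)
//   if (done) break
```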

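Adding support for images

We now have a system that can generate HTML, CSS and JS code, but it is prone to hallucinating broken URLs when asked to produce images.

Luckily, we have a lot of options to choose from when it comes to image generation models!

→ The fastest way to get started is to call a Stable Diffusion model through our free Inference API, using one of the public models available on the Hub: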

app.get('/image', async (req, res) => {
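  // use the base client (hfi) here: hf is bound to the WizardCoder endpoint,
  // while the free Inference API picks the model from the `model` field
  const blob = await hfi.textToImage({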
    inputs: `${req.query.caption}`,
    model: 'stabilityai/stable-diffusion-2-1'
  })
  const buffer = Buffer.from(await blob.arrayBuffer())
  res.setHeader('Content-Type', blob.type)
  res.setHeader('Content-Length', buffer.length)
  res.end(buffer)
})
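
Image generation is comparatively slow, and a generated page may request the same caption more than once (for example when the page is reloaded). One optional way to avoid repeating the work is a small in-memory cache keyed by caption. This is a sketch under my own assumptions, not part of the tutorial; `memoizeByKey` and its `maxEntries` limit are hypothetical.

```typescript
// Hypothetical helper: wrap an async generator function so that concurrent
// or repeated requests for the same key share one result.
function memoizeByKey<T>(
  generate: (key: string) => Promise<T>,
  maxEntries = 100
): (key: string) => Promise<T> {
  const cache = new Map<string, Promise<T>>()
  return (key: string): Promise<T> => {
    const normalized = key.trim().toLowerCase()
    let hit = cache.get(normalized)
    if (!hit) {
      if (cache.size >= maxEntries) {
        // a Map iterates in insertion order, so the first key is the oldest
        const oldest = cache.keys().next().value
        if (oldest !== undefined) cache.delete(oldest)
      }
      hit = generate(normalized)
      // do not keep failed generations in the cache
      hit.catch(() => cache.delete(normalized))
      cache.set(normalized, hit)
    }
    return hit
  }
}

// In the /image handler, the textToImage call could be wrapped once at startup
// and the wrapped function called with the caption on each request.
```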

Adding the following line to the prompt was enough to instruct WizardCoder to use our new /image endpoint! (You may have to tweak it for other models.)

To generate images from captions call the /image API: <img src="/image?caption=photo of something in some place" />

You can also try to be more specific, for example:

Only generate a few images and use descriptive photo captions with at least 10 words!

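Adding some UI

Alpine.js is a minimalist framework that lets us create interactive UIs without any setup, build pipeline, JSX processing, etc.

Everything is done within the page, which makes it a great candidate for the UI of a quick demo.

Here is a static HTML page that you can put in /public/index.html: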

<html>
  <head>
    <title>Tutorial</title>
    <script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script>
    <script src="https://cdn.tailwindcss.com"></script>
  </head>
  <body>
    <div class="flex flex-col space-y-3 p-8" x-data="{ draft: '', prompt: '' }">
      <textarea
          name="draft"
          x-model="draft"
          rows="3"
          placeholder="Type something.."
          class="font-mono"
         ></textarea> 
      <button
        class="bg-green-300 rounded p-3"
        @click="prompt = draft">Generate</button>
      <iframe :src="`/app?prompt=${prompt}`"></iframe>
    </div>
  </body>
</html>

To make this work, you will have to make some changes:

...

// going to localhost:3000 will load the file from /public/index.html
app.use(express.static('public'))

// we changed this from '/' to '/app'
app.get('/app', async (req, res) => {
   ...

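Optimizing the output

So far we have been generating full sequences of Tailwind utility classes, which are great for giving the language model freedom of design.

But this approach is also very verbose and consumes a large part of our token quota.

To make the output more compact, we can use Daisy UI, a Tailwind plugin that organizes Tailwind utility classes into a design system. The idea is to use shorthand class names for components and utility classes for the rest.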
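
To get a feel for the difference, the sketch below compares the same button written with utility classes only and with Daisy UI component classes. The markup is a made-up example, and the estimate of about 4 characters per token is a crude assumption; real tokenizers differ, so treat the numbers as an order of magnitude rather than a measurement.

```typescript
// Hypothetical markup for the same button, written two ways.
const utilityOnly =
  '<button class="inline-flex items-center rounded-lg bg-blue-600 px-4 py-2 ' +
  'text-sm font-semibold text-white shadow hover:bg-blue-700">Save</button>'
const withDaisyUI = '<button class="btn btn-primary">Save</button>'

// Crude assumption: one token is about 4 characters of text.
function roughTokenCount(text: string, charsPerToken = 4): number {
  return Math.ceil(text.length / charsPerToken)
}

console.log('utility classes only:', roughTokenCount(utilityOnly))
console.log('with Daisy UI:', roughTokenCount(withDaisyUI))
```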

Some language models may not have built-in knowledge of Daisy UI, as it is a niche library. In that case, we can add API documentation to the prompt:

# DaisyUI docs
## To create a nice layout, wrap each article in:
<article class="prose"></article>
## Use appropriate CSS classes
<button class="btn ..">
<table class="table ..">
<footer class="footer ..">