Secure code execution

If you are new to building agents, make sure to first read the intro to agents and the guided tour of smolagents.

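Code agents

Multiple research papers have shown that having the LLM write its actions (the tool calls) in code works much better than the current standard format for tool calling, which across the industry amounts to "writing actions as a JSON of tool names and arguments to use".

Why is code better? Because we crafted our code languages specifically to be great at expressing actions performed by a computer. If JSON snippets were a better way, this package would have been written in JSON snippets and the devil would be laughing at us.

Code is simply a better way to express actions on a computer. It has better: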

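Composability: could you nest JSON actions within each other, or define a set of JSON actions to reuse later, the same way you could just define a Python function?
Object management: how do you store the output of an action like generate_image in JSON?
Generality: code is built to express simply anything you can have a computer do.
Representation in LLM training corpora: why not take advantage of the fact that plenty of quality actions are already included in LLM training corpora?

This is illustrated in the figure below, taken from Executable Code Actions Elicit Better LLM Agents.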
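As a complementary illustration in code, the sketch below contrasts the two action formats. The tools get_stock and format_count, the toy inventory, and the JSON field names are all hypothetical and exist only for this example; they are not part of smolagents.

```python
import json

# Hypothetical tools, defined here only for illustration.
def get_stock(item: str) -> int:
    inventory = {"apple": 3, "pear": 5}  # made-up toy data
    return inventory[item]

def format_count(value: int) -> str:
    return f"{value} unit(s)"

# JSON-style actions: one flat call per step. The output of step 1 has to be
# routed into step 2 by the surrounding framework, not by the action itself.
json_actions = [
    json.dumps({"tool": "get_stock", "arguments": {"item": "apple"}}),
    json.dumps({"tool": "format_count", "arguments": {"value": "<output of step 1>"}}),
]

# Code action: calls nest, results live in variables, helpers are reusable.
def describe(item: str) -> str:
    return f"{item}: {format_count(get_stock(item))}"

summary = [describe(item) for item in ("apple", "pear")]
print(summary)
```

The code version composes two tool calls and reuses the helper across several inputs in a single action, which is what the composability and object management points above refer to.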

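This is why we put emphasis on proposing code agents, in this case Python agents, which meant putting more effort into building secure Python interpreters.

Local code execution?

By default, the CodeAgent runs LLM-generated code in your environment.

This is inherently risky: LLM-generated code could be harmful to your environment.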

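Malicious code execution can occur in several ways:

Plain LLM error: LLMs are still far from perfect and may unintentionally generate harmful commands while attempting to be helpful. While this risk is low, instances have been observed where an LLM attempted to execute potentially dangerous code.
Supply chain attack: running an untrusted or compromised LLM could expose a system to harmful code generation. While this risk is extremely low when using well-known models on secure inference infrastructure, it remains a theoretical possibility.
Prompt injection: an agent browsing the web could land on a malicious website that contains harmful instructions, thus injecting an attack into the agent's memory.
Exploitation of publicly accessible agents: agents exposed to the public can be misused by malicious actors to execute harmful code. Attackers may craft adversarial inputs to exploit the agent's execution capabilities, leading to unintended consequences.

Once malicious code is executed, whether accidentally or intentionally, it can damage the file system, exploit local or cloud-based resources, abuse API services, and even compromise network security.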

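One could argue that on the spectrum of agency, code agents give the LLM much higher agency on your system than other, less agentic setups: this goes hand in hand with higher risk.

So you need to be very mindful of security.

To improve safety, we propose a range of measures that offer increasing levels of security, at the cost of a higher setup effort.

We advise you to keep in mind that no solution will be 100% safe.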

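Our local Python executor

To add a first layer of security, code execution in smolagents is not performed by the vanilla Python interpreter. We have rebuilt a more secure LocalPythonExecutor from the ground up.

To be precise, this interpreter works by loading the Abstract Syntax Tree (AST) from your code and executing it operation by operation, making sure to always follow certain rules:

By default, imports are disallowed unless the user has explicitly added them to an authorization list.
Note that some seemingly innocuous packages like random can give access to potentially harmful submodules, as in random._os.
The total count of elementary operations processed is capped to prevent infinite loops and resource bloating.
Any operation that has not been explicitly defined in our custom interpreter will raise an error.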
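To make these rules more concrete, here is a toy sketch of an AST-walking evaluator. It is not the LocalPythonExecutor implementation: the ToyEvaluator class, its error type, and the max_operations default are assumptions made for illustration, and it only supports numeric constants and three arithmetic operators. It does show the same two ideas: an operation budget, and an error for anything not explicitly defined.

```python
import ast
import operator

# Only these binary operators are defined; everything else is rejected.
BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class ToyInterpreterError(Exception):
    """Raised when the toy evaluator refuses to evaluate a node."""

class ToyEvaluator:
    def __init__(self, max_operations: int = 100):
        self.max_operations = max_operations
        self.operations = 0

    def evaluate(self, source: str):
        # Parse the source into an AST instead of handing it to eval().
        tree = ast.parse(source, mode="eval")
        return self._eval(tree.body)

    def _eval(self, node):
        # Every visited node counts against the operation budget.
        self.operations += 1
        if self.operations > self.max_operations:
            raise ToyInterpreterError("Maximum number of operations exceeded")
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
            left = self._eval(node.left)
            right = self._eval(node.right)
            return BIN_OPS[type(node.op)](left, right)
        # Anything not explicitly handled above raises an error.
        raise ToyInterpreterError(f"{type(node).__name__} is not supported")

result = ToyEvaluator().evaluate("1 + 2 * 3")
print(result)

try:
    ToyEvaluator().evaluate("__import__('os').system('echo Bad command')")
except ToyInterpreterError as error:
    rejected = str(error)
    print("Rejected:", rejected)
```

Because the evaluator only knows the node types it handles explicitly, the function call in the second snippet is refused before anything runs.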

You can try these safeguards as follows:

from smolagents.local_python_executor import LocalPythonExecutor

# Set up custom executor, authorize package "numpy"
custom_executor = LocalPythonExecutor(["numpy"])

# Utility for pretty printing errors
def run_capture_exception(command: str):
    try:
        custom_executor(command)
    except Exception as e:
        print("ERROR:\n", e)

# Undefined commands just do not work
harmful_command="!echo Bad command"
run_capture_exception(harmful_command)
# >>> ERROR: invalid syntax (<unknown>, line 1)


# Imports like os will not be performed unless explicitly added to `additional_authorized_imports`
harmful_command="import os; exit_code = os.system('echo Bad command')"
run_capture_exception(harmful_command)
# >>> ERROR: Code execution failed at line 'import os' due to: InterpreterError: Import of os is not allowed. Authorized imports are: ['statistics', 'numpy', 'itertools', 'time', 'queue', 'collections', 'math', 'random', 're', 'datetime', 'stat', 'unicodedata']

# Even in authorized imports, potentially harmful packages will not be imported
harmful_command="import random; random._os.system('echo Bad command')"
run_capture_exception(harmful_command)
# >>> ERROR: Code execution failed at line 'random._os.system('echo Bad command')' due to: InterpreterError: Forbidden access to module: os

# Infinite loops are interrupted after N operations
harmful_command="""
while True:
    pass
"""
run_capture_exception(harmful_command)
# >>> ERROR: Code execution failed at line 'while True: pass' due to: InterpreterError: Maximum number of 1000000 iterations in While loop exceeded
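The random._os case above is easy to check in a plain Python session. The snippet below is a small sketch that assumes CPython, where the random module happens to keep a private reference to os; this is an implementation detail rather than a documented API, so the attribute is looked up defensively.

```python
import os
import random

# `_os` is a private implementation detail of CPython's `random` module,
# so look it up defensively instead of assuming it exists.
leaked = getattr(random, "_os", None)

if leaked is None:
    print("This interpreter's random module exposes no _os attribute")
else:
    print("random._os is the os module:", leaked is os)
```

When the attribute is present, an allowlist that only checks the name of the imported package would still hand out the full os module, which is why the executor also blocks access to such modules.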

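These safeguards make our interpreter safer. We have used it on a variety of use cases without ever observing any damage to the environment.

However, this solution is certainly not watertight, as no local Python sandbox can really be: one could imagine cases where LLMs fine-tuned for malicious actions could still hurt your environment.

For instance, if you have allowed an innocuous package like Pillow to process images, the LLM could generate thousands of image saves to bloat your hard drive.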

custom_executor = LocalPythonExecutor(["PIL"])

harmful_command="""
from PIL import Image

img = Image.new('RGB', (100, 100), color='blue')

i=0
while i < 10000:
    img.save(f'simple_image_{i}.png')
    i += 1
"""
# Let's not execute this but it would not error out, and it would bloat your system with images.

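Other examples of attacks can be found here.

Running these targeted malicious code snippets requires a supply chain attack, meaning the LLM you use has been compromised.

The likelihood of this happening is low when using well-known LLMs from trusted inference providers, but it is still non-zero.

The only way to run LLM-generated code securely is to isolate the execution from your local environment.

So if you want to exercise caution, you should use a remote execution sandbox.

Here are examples of how to do it.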

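Sandbox setup for secure code execution

When working with AI agents that execute code, security is paramount. This guide describes how to set up and use secure sandboxes for your agent applications, using either E2B cloud sandboxes or local Docker containers.

E2B setup

Installation

Create an E2B account at e2b.dev
Install the required packages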

pip install 'smolagents[e2b]'

Running your agent in E2B: quick start

We provide a simple way to use an E2B sandbox: simply add executor_type="e2b" to the agent initialization, as follows:

from smolagents import HfApiModel, CodeAgent

agent = CodeAgent(model=HfApiModel(), tools=[], executor_type="e2b")

agent.run("Can you give me the 100th Fibonacci number?")

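This solution sends the agent state to the server at the start of each agent.run(). The models are then called from the local environment, but the generated code is sent to the sandbox for execution, and only the output is returned.

This is illustrated in the figure below.

[Figure: sandboxed code execution]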
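The flow described above can be mimicked locally with stand-ins. The sketch below is purely illustrative: FakeRemoteSandbox, fake_model, and run are hypothetical names, nothing here talks to E2B, and the built-in exec is used only because the "generated" code is a fixed, trusted string. It shows the order of operations: state goes to the sandbox, the model is called on the local side, the code runs on the sandbox side, and only the output comes back.

```python
import contextlib
import io

class FakeRemoteSandbox:
    """Stand-in for a remote sandbox: keeps its own state, returns only stdout."""

    def __init__(self):
        self.state = {}

    def send_state(self, variables: dict) -> None:
        # Step 1: the agent state is shipped to the sandbox.
        self.state = dict(variables)

    def run_code(self, code: str) -> str:
        # Step 3: the code runs against the sandbox-side state.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.state)  # trusted fixed string in this sketch only
        return buffer.getvalue()

def fake_model(task: str) -> str:
    """Stand-in for the LLM call, which stays in the local environment."""
    return "print(sum(numbers))"

def run(task: str, variables: dict, sandbox: FakeRemoteSandbox) -> str:
    sandbox.send_state(variables)
    code = fake_model(task)          # Step 2: model called locally
    return sandbox.run_code(code)    # Step 4: only the output is returned

output = run("Add the numbers", {"numbers": [1, 2, 3]}, FakeRemoteSandbox())
print(output.strip())
```

Note that the model credentials never reach the sandbox in this flow, since fake_model runs on the caller's side.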

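However, any call to a managed agent would require model calls, and since we do not transfer secrets to the remote sandbox, those model calls would lack credentials. Hence this solution does not work (yet) with more complicated multi-agent setups.

Running your agent in E2B: multi-agents

To use multi-agents in an E2B sandbox, you need to run your agents completely from within E2B.

Here is how to do it: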

from e2b_code_interpreter import Sandbox
import os

# Create the sandbox
sandbox = Sandbox()

# Install required packages
sandbox.commands.run("pip install smolagents")

def run_code_raise_errors(sandbox, code: str, verbose: bool = False) -> str:
    execution = sandbox.run_code(
        code,
        envs={'HF_TOKEN': os.getenv('HF_TOKEN')}
    )
    if execution.error:
        execution_logs = "\n".join([str(log) for log in execution.logs.stdout])
        logs = execution_logs
        logs += execution.error.traceback
        raise ValueError(logs)
    return "\n".join([str(log) for log in execution.logs.stdout])

# Define your agent application
agent_code = """
import os
from smolagents import CodeAgent, HfApiModel

# Initialize the agents
agent = CodeAgent(
    model=HfApiModel(token=os.getenv("HF_TOKEN"), provider="together"),
    tools=[],
    name="coder_agent",
    description="This agent takes care of your difficult algorithmic problems using code."
)

manager_agent = CodeAgent(
    model=HfApiModel(token=os.getenv("HF_TOKEN"), provider="together"),
    tools=[],
    managed_agents=[agent],
)

# Run the agent
response = manager_agent.run("What's the 20th Fibonacci number?")
print(response)
"""

# Run the agent code in the sandbox
execution_logs = run_code_raise_errors(sandbox, agent_code)
print(execution_logs)

Docker setup

Installation

Install Docker on your system
Install the required packages

pip install 'smolagents[docker]'

Running your agent in Docker: quick start

Similar to the E2B sandbox above, to quickly get started with Docker, simply add executor_type="docker" to the agent initialization, as follows:

from smolagents import HfApiModel, CodeAgent

agent = CodeAgent(model=HfApiModel(), tools=[], executor_type="docker")

agent.run("Can you give me the 100th Fibonacci number?")

Advanced Docker usage

If you want to run multi-agent systems in Docker, you will need to set up a custom interpreter in a sandbox.

Here is how to set up the Dockerfile:

FROM python:3.10-bullseye

# Install build dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential \
        python3-dev && \
    pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir smolagents && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Run with limited privileges
USER nobody

# Default command
CMD ["python", "-c", "print('Container ready')"]

Create a sandbox manager to run code:

import docker
import os
from typing import Optional

class DockerSandbox:
    def __init__(self):
        self.client = docker.from_env()
        self.container = None

    def create_container(self):
        try:
            image, build_logs = self.client.images.build(
                path=".",
                tag="agent-sandbox",
                rm=True,
                forcerm=True,
                buildargs={},
                # decode=True
            )
        except docker.errors.BuildError as e:
            print("Build error logs:")
            for log in e.build_log:
                if 'stream' in log:
                    print(log['stream'].strip())
            raise

        # Create container with security constraints and proper logging
        self.container = self.client.containers.run(
            "agent-sandbox",
            command="tail -f /dev/null",  # Keep container running
            detach=True,
            tty=True,
            mem_limit="512m",
            cpu_quota=50000,
            pids_limit=100,
            security_opt=["no-new-privileges"],
            cap_drop=["ALL"],
            environment={
                "HF_TOKEN": os.getenv("HF_TOKEN")
            },
        )

    def run_code(self, code: str) -> Optional[str]:
        if not self.container:
            self.create_container()

        # Execute code in container
        exec_result = self.container.exec_run(
            cmd=["python", "-c", code],
            user="nobody"
        )

        # Collect all output
        return exec_result.output.decode() if exec_result.output else None


    def cleanup(self):
        if self.container:
            try:
                self.container.stop()
                self.container.remove()  # Also delete the stopped container
            except docker.errors.NotFound:
                # Container already removed, this is expected
                pass
            except Exception as e:
                print(f"Error during cleanup: {e}")
            finally:
                self.container = None  # Clear the reference

# Example usage:
sandbox = DockerSandbox()

try:
    # Define your agent code
    agent_code = """
import os
from smolagents import CodeAgent, HfApiModel

# Initialize the agent
agent = CodeAgent(
    model=HfApiModel(token=os.getenv("HF_TOKEN"), provider="together"),
    tools=[]
)

# Run the agent
response = agent.run("What's the 20th Fibonacci number?")
print(response)
"""

    # Run the code in the sandbox
    output = sandbox.run_code(agent_code)
    print(output)

finally:
    sandbox.cleanup()
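The try/finally pattern above can also be wrapped in a context manager so that cleanup cannot be forgotten. This is a minimal sketch: managed_sandbox and DummySandbox are hypothetical names, and the only assumption about the sandbox object is that it has a cleanup() method, like the DockerSandbox class above. The dummy class is there so the example runs without Docker.

```python
from contextlib import contextmanager

@contextmanager
def managed_sandbox(sandbox):
    """Yield the sandbox and always call its cleanup() afterwards."""
    try:
        yield sandbox
    finally:
        sandbox.cleanup()

class DummySandbox:
    """Stand-in used here so the sketch runs without Docker."""

    def __init__(self):
        self.cleaned_up = False

    def run_code(self, code: str) -> str:
        raise RuntimeError("simulated failure inside the sandbox")

    def cleanup(self):
        self.cleaned_up = True

dummy = DummySandbox()
try:
    with managed_sandbox(dummy) as sandbox:
        sandbox.run_code("print('hello')")
except RuntimeError as error:
    print("Caught:", error)

print("Cleaned up:", dummy.cleaned_up)
```

With the real class, the call site would become `with managed_sandbox(DockerSandbox()) as sandbox:` followed by `sandbox.run_code(agent_code)`.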

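Best practices for sandboxes

These key practices apply to both E2B and Docker sandboxes:

Resource management
- Set memory and CPU limits
- Implement execution timeouts
- Monitor resource usage
Security
- Run with minimal privileges
- Disable unnecessary network access
- Use environment variables for secrets
Environment
- Keep dependencies minimal
- Use pinned package versions
- If you use base images, update them regularly
Cleanup
- Always ensure proper cleanup of resources, especially for Docker containers, to avoid dangling containers eating up resources.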
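The execution timeout practice can be sketched with the standard library alone. In the example below, run_with_timeout is a hypothetical helper, and a child Python process is not a sandbox by itself; the snippet only illustrates how a runaway script can be cut off after a fixed delay. The same idea would apply to the command that launches code inside a container.

```python
import subprocess
import sys

def run_with_timeout(code: str, timeout_seconds: float = 5.0) -> str:
    """Run `code` in a child Python process, killing it after `timeout_seconds`."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return f"TIMEOUT after {timeout_seconds} seconds"
    return completed.stdout

fast = run_with_timeout("print(2 + 2)")
slow = run_with_timeout("while True:\n    pass", timeout_seconds=0.5)

print(fast.strip())
print(slow)
```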

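✨ By following these practices and implementing proper cleanup procedures, you can ensure your agent runs safely and efficiently in a sandboxed environment.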
