使用AMD NPU和iGPU加速的本地微型代理

在本节中，我们将向您展示如何使用AMD神经网络处理单元（NPU）和集成GPU（iGPU）加速我们的端到端微型代理应用程序。然后，我们将通过提供本地文件访问和创建用于本地处理敏感信息的助手来增强我们的端到端应用程序，以确保最大程度的隐私。

为此，我们将使用Lemonade Server，这是一个利用NPU和iGPU加速在本地运行模型的工具。

设置

设置Lemonade服务器

您可以在Windows和Linux上安装Lemonade Server。更多文档请参阅lemonade-server.ai。

Windows

Linux

微型代理和NPX设置

本课程的这一部分假设您已安装npx和微型代理。如果尚未安装，请参阅课程的微型代理部分。请务必使用huggingface_hub[mcp]==0.33.2。

使用AMD NPU和iGPU运行您的微型代理应用程序

要使用AMD NPU和iGPU运行您的微型代理应用程序，只需将我们在上一节中创建的MCP服务器指向Lemonade Server，如下所示

Windows

Linux

然后，您可以选择各种模型在本地机器上运行。例如，我们使用了Qwen3-8B-GGUF模型，该模型通过Vulkan加速在AMD GPU上高效运行。您可以通过访问https://:8000/#model-management查找支持的模型列表，甚至导入自己的模型。

创建一个助手来本地处理敏感信息

Lemonade Server Interface

现在，让我们通过启用对本地文件的访问并引入一个完全在设备上处理敏感信息的助手来增强我们的端到端应用程序。具体来说，这个助手将帮助我们评估候选简历并支持招聘过程中的决策制定——所有这些都可以在保证数据隐私和安全的前提下进行。

为此，我们将使用Desktop Commander MCP服务器，它允许您在本地机器上运行命令，并提供全面的文件系统访问、终端控制和代码编辑功能。

让我们用一个基本的微型代理来设置一个项目。

mkdir file-assistant
cd file-assistant

然后，我们将在`file-assistant`文件夹中创建一个新的`agent.json`文件。

Windows

Linux

最后，我们需要下载 `Jan Nano` 模型。您可以通过访问 https://:8000/#model-management，点击 `Add a Model` 并提供以下信息来完成此操作

Model Name: user.jan-nano
Checkpoint: Menlo/Jan-nano-gguf:jan-nano-4b-Q4_0.gguf
Recipe: llamacpp

Custom Model

全部完成！现在让我们试一试。

试用

recording

我们的目标是创建一个能够帮助我们本地处理敏感信息的助手。为此，我们首先将为助手创建一个职位描述文件。

在`file-assistant`文件夹中创建一个名为`job_description.md`的文件。

# Senior Food Technology Engineer

## About the Role
We're seeking a culinary innovator to transform cooking processes into precise algorithms and AI systems.

## What You'll Do
- Convert cooking instructions into measurable algorithms
- Develop AI-powered kitchen tools
- Create food quality assessment systems
- Build recipe-following AI models

## Requirements
- MS in Computer Science (food-related thesis preferred)
- Python and PyTorch expertise
- Proven experience combining food science with ML
- Strong communication skills using culinary metaphors

## Perks
- Access to experimental kitchen
- Continuous taste-testing opportunities
- Collaborative tech-foodie team environment

*Note: Must attend conferences and publish on algorithmic cooking optimization.*

现在，让我们在`file-assistant`文件夹中创建一个`candidates`文件夹，并为我们的助手添加一个示例简历文件以供使用。

mkdir candidates
touch candidates/john_resume.md

添加以下示例简历或包含您自己的简历。

# John Doe

**Contact Information**
- Email: email@example.com
- Phone: (+1) 123-456-7890
- Location: 1234 Abc Street, Example, EX 01234
- GitHub: github.com/example
- LinkedIn: linkedin.com/in/example
- Website: example.com

## Experience

**Machine Learning Engineer Intern** | Slow Feet Technology | Jul 2021 - Present
- Developed food-agnostic formulation for cross-ingredient meal cooking
- Created competitive cream of mushroom soup recipe, published in NeurIPS 2099
- Built specialized pan for meal cooking research

**Research Intern** | Paddling University | Aug 2020 - Present
- Designed efficient mapo tofu quality estimation method using thermometer
- Proposed fast stir frying algorithm for tofu cooking, published in CVPR 2077
- Outperformed SOTA methods with improved efficiency

**Research Assistant** | Huangdu Institute of Technology | Mar 2020 - Jun 2020
- Developed novel framework using spoon and chopsticks for eating mapo tofu
- Designed tofu filtering strategy inspired by beans grinding method
- Created evaluation criteria for eating plan novelty and diversity

**Research Intern** | Paddling University | Jul 2018 - Aug 2018
- Designed dual sandwiches using traditional burger ingredients
- Utilized structure duality to boost cooking speed for shared ingredients
- Outperformed baselines on QWE'15 and ASDF'14 datasets

## Education

**M.S. in Computer Science** | University of Charles River | Sep 2021 - Jan 2023
- Location: Boston, MA

**B.Eng. in Software Engineering** | Huangdu Institute of Technology | Sep 2016 - Jul 2020
- Location: Shanghai, China

## Skills

**Programming Languages:** Python, JavaScript/TypeScript, HTML/CSS, Java
**Tools and Frameworks:** Git, PyTorch, Keras, scikit-learn, Linux, Vue, React, Django, LaTeX
**Languages:** English (proficient), Indonesia (native)

## Awards and Honors

- **Gold**, International Collegiate Catching Fish Contest (ICCFC) | 2018
- **First Prize**, China National Scholarship for Outstanding Culinary Skills | 2017, 2018

## Publications

**Eating is All You Need** | NeurIPS 2099
- Authors: Haha Ha, San Zhang

**You Only Cook Once: Unified, Real-Time Mapo Tofu Recipe** | CVPR 2077 (Best Paper Honorable Mention)
- Authors: Haha Ha, San Zhang, Si Li, Wu Wang

然后我们可以用以下命令运行代理

tiny-agents run agent.json

您应该会看到以下输出

Agent loaded with 18 tools:
 • get_config
 • set_config_value
 • read_file
 • read_multiple_files
 • write_file
 • create_directory
 • list_directory
 • move_file
 • search_files
 • search_code
 • get_file_info
 • edit_block
 • execute_command
 • read_output
 • force_terminate
 • list_sessions
 • list_processes
 • kill_process
 »

现在让我们为助手提供一些信息以开始。

» Read the contents of C:\Users\your_username\file-assistant\job_description.md

您应该会看到类似以下的输出

<Tool iNtxGmOuXHqZVBWmKnfxsc61xsJbsoAM>read_file {"path":"C:\\Users\\your_username\\file-assistant\\job_description.md","length":23}

Tool iNtxGmOuXHqZVBWmKnfxsc61xsJbsoAM
[Reading 23 lines from start]

(...)

The job description for the Senior Food Technology Engineer position emphasizes the need for a candidate who can bridge the gap between food science and artificial intelligence (...). Candidates are also expected to attend conferences and publish research on algorithmic cooking optimization.

我们使用的是默认系统提示，这可能会导致助手多次调用某些工具。要创建一个更自信的助手，您可以在与`agent.json`相同的目录中提供一个自定义的`PROMPT.md`文件。

太棒了！现在让我们阅读候选人的简历。

» Inside the same folder you can find a candidates folder. Check for john_resume.md and let me know if he is a good fit for the job.

您应该会看到类似以下的输出

<Tool ll2oWo73YeGIft5VbOIpF9GNf0kevjEy>read_file {"path":"C:\\Users\\your_username\\file-assistant\\candidates\\john_resume.md"}

Tool ll2oWo73YeGIft5VbOIpF9GNf0kevjEy
[Reading 58 lines from start]

(...)
John Wayne is a **strong fit** for the Senior Food Technology Engineer role. His technical expertise in AI and machine learning, combined with his experience in food-related research and publications, makes him an excellent candidate. He also has the soft skills and cultural fit needed to thrive in a collaborative, innovative environment.

太棒了！现在我们可以继续邀请候选人参加面试。

» Create a file called "invitation.md" in the "file-assistant" folder and write a short invitation to John to come in for an interview.

您应该会看到类似以下内容被写入`invitation.md`文件

# Interview Invitation

Dear John,

We would like to invite you for an interview for the Senior Food Technology Engineer position. The interview will be held on [insert date and time] at [insert location or virtual meeting details].

Please confirm your availability and let us know if you need any additional information.

Best regards,
[Your Name]
[Your Contact Information]

太棒了！我们成功创建了一个可以帮助我们本地处理敏感信息的助手。

探索其他模型和加速选项

在上面的示例中，Jan-Nano模型利用Vulkan加速，在AMD GPU上高效进行本地LLM推理。您还可以通过访问https://:8000/#model-management或查看模型文档来尝试其他模型和加速选项。

对于需要简洁上下文并能受益于NPU + iGPU加速的Windows应用程序，您可以尝试Lemonade Server提供的混合模型——针对AMD Ryzen AI 300系列PC进行了优化。诸如`Llama-xLAM-2-8b-fc-r-Hybrid`等模型经过专门微调，以实现工具调用，并提供快速、响应灵敏的性能！

结论

在本单元中，我们展示了如何使用AMD NPU和iGPU加速我们的端到端微型代理应用程序。我们还展示了如何创建一个助手来本地处理敏感信息。

既然您已经了解了如何利用Lemonade Server进行本地模型加速和隐私保护应用程序，您可以在Lemonade GitHub存储库中探索更多示例和功能。该存储库包含额外的文档、示例实现，并由社区积极维护。

< > 在 GitHub 上更新

MCP 课程

使用AMD NPU和iGPU加速的本地微型代理

设置

设置Lemonade服务器

微型代理和NPX设置

使用AMD NPU和iGPU运行您的微型代理应用程序

创建一个助手来本地处理敏感信息

试用

探索其他模型和加速选项

结论