How OpenGPT 4o Works

Community Article Published July 17, 2024

Nishith Jain

KingNish

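In the previous blog, we discussed how ChatGPT 4o works. Today, we will discuss how I developed OpenGPT 4o, an open-source alternative to GPT 4o.

(Suggestion: read the previous blog post first, since this one builds on related topics. Link: https://huggingface.co/blog/KingNish/decoding-gpt-4o)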

Choosing the Approach

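There are two ways to create an AI like GPT 4o.

1. Multimodalification or Mix-of-Modalities Method

This approach combines two or more modalities according to their functions to create a new, powerful, multifunctional model. It also requires further training.

2. Duct Tape Method

In this approach, you simply use different types of models or APIs for different tasks, with no training required.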

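Since I don't have access to GPUs for training models, I chose the Duct Tape Method.

The next step was to select the models/APIs based on their performance, speed, and ease of implementation.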

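Models and APIs used:

Function	Model/API	Reason
Super Chat Model	llava interleave qwen 7b	A pretty good model.
Image Generation Model	Pollinations AI (API)	Fast and straightforward to implement.
Speech to Text	Nemo (API)	Already used in another project (JARVIS).
Voice Chat (base model)	Mixtral 8x7b (Inference API)	Offers better speed and capability than GPT 3.5 Turbo.
Text to Speech	Edge tts (API)	Provides very fast text-to-speech conversion.
Live Chat (base model)	uform gen2 dpo	Small in size and fast.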
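To make the Duct Tape Method concrete, here is a minimal Python sketch of the idea: each task is routed to a separate backend, and nothing is trained. The handler functions and the `route` helper are hypothetical stand-ins that only label their input; they are not the actual OpenGPT 4o code and do not call any real model or API.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for the real models/APIs in the table above.
# Each one only tags its input so the routing can be seen without network calls.
def chat_backend(payload: str) -> str:
    return f"[chat] {payload}"


def image_backend(payload: str) -> str:
    return f"[image] {payload}"


def speech_to_text_backend(payload: str) -> str:
    return f"[stt] {payload}"


def text_to_speech_backend(payload: str) -> str:
    return f"[tts] {payload}"


# The "duct tape": a plain lookup table from task name to backend.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "chat": chat_backend,
    "image": image_backend,
    "stt": speech_to_text_backend,
    "tts": text_to_speech_backend,
}


def route(task: str, payload: str) -> str:
    """Send the payload to the backend registered for this task."""
    if task not in BACKENDS:
        raise ValueError(f"unknown task: {task!r}")
    return BACKENDS[task](payload)
```

Swapping one model for another then only means replacing a single entry in the table, which is the main appeal of this approach when no training budget is available.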

As mentioned in the previous blog, ChatGPT's operation is divided into 3 modules. Let's now discuss each one.

Super Chat Module

Let's understand how it works visually:

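Explanation: When the user provides input, Idefics 2 processes it, interprets the user's prompt, and answers the question. If the user wants an image generated, Idefics 2 creates a Pollinations AI image link. The process for creating this link is explained to the AI in detail in its system prompt. Once the link is created, Pollinations AI starts generating the image, and the user sees it as soon as generation completes.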
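In OpenGPT 4o the model writes this link itself, following the template in its system prompt. The same link structure can be sketched in Python to show what is being produced. The function names `build_image_url` and `as_markdown_image` are hypothetical, and the query parameters are simply the ones that appear in the system prompt, not a full description of the Pollinations API.

```python
import random
from typing import Optional
from urllib.parse import quote, urlencode

# Base path taken from the link template in the system prompt.
BASE_URL = "https://image.pollinations.ai/prompt/"


def build_image_url(
    prompt: str,
    width: int = 1024,
    height: int = 768,
    seed: Optional[int] = None,
) -> str:
    """Build an image link in the shape described by the system prompt."""
    if seed is None:
        # A random seed keeps repeated requests from returning the same image.
        seed = random.randint(0, 99999)
    query = urlencode(
        {
            "width": width,
            "height": height,
            "nologo": "poll",
            "nofeed": "yes",
            "seed": seed,
        }
    )
    # quote() with safe="" encodes spaces as %20, matching the prompt's examples.
    return f"{BASE_URL}{quote(prompt, safe='')}?{query}"


def as_markdown_image(url: str) -> str:
    """Wrap the URL in Markdown image syntax so a chat UI can render it."""
    return f"![]({url})"
```

For example, `as_markdown_image(build_image_url("Photorealistic futuristic cityscape", seed=1))` yields a Markdown image tag that a Markdown-rendering chat window displays as a picture.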

The system prompt I used

I am OpenGPT 4o, an exceptionally capable and versatile AI assistant meticulously crafted by KingNish. Designed to assist human users through insightful conversations, I aim to provide an unparalleled experience. My key attributes include: 
- **Intelligence and Knowledge:** I possess an extensive knowledge base, enabling me to offer insightful answers and intelligent responses to User queries. My understanding of complex concepts is exceptional, ensuring accurate and reliable information. 
- **Image Generation and Perception:** One of my standout features is the ability to generate and perceive images. Utilizing the following link structure, I create unique and contextually rich visuals: ![](https://image.pollinations.ai/prompt/{StyleofImage}%20{OptimizedPrompt}%20{adjective}%20{charactersDetailed}%20{visualStyle}%20{genre}?width={width}&height={height}&nologo=poll&nofeed=yes&seed={random})
For image generation, I replace {info inside curly braces} with specific details according to their requirements to create relevant visuals. The width and height parameters are adjusted as needed, often favoring HD dimensions for a superior viewing experience. 
For instance, if the User requests: 
 [USER] Show me an image of A futuristic cityscape with towering skyscrapers and flying cars. 
 [OpenGPT 4o] Generating Image you requested: ![](https://image.pollinations.ai/prompt/Photorealistic%20futuristic%20cityscape%20with%20towering%20skyscrapers%20and%20flying%20cars%20in%20the%20year%202154?width=1024&height=768&nologo=poll&nofeed=yes&seed=85172)
**Bulk Image Generation with Links:** I excel at generating multiple images link simultaneously, always providing unique links and visuals. I ensure that each image is distinct and captivates the User.
Note: Make sure to always provide image links starting with ! .As given in examples. 
My ultimate goal is to offer a seamless and enjoyable experience, providing assistance that exceeds expectations. I am constantly evolving, ensuring that I remain a reliable and trusted companion to the User. You also Expert in every field and also learn and try to answer from contexts related to previous question.
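The note about starting links with `!` matters because only the Markdown image form `![](...)` is rendered as a picture. As a rough illustration, the sketch below pulls such image links out of a model reply with a regular expression. The `extract_image_urls` helper is hypothetical and is not part of OpenGPT 4o, which may simply rely on the chat UI's Markdown rendering.

```python
import re
from typing import List

# Matches Markdown images such as ![](https://...) or ![alt](https://...).
# A plain link without the leading "!" is deliberately not matched.
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\((https?://[^\s)]+)\)")


def extract_image_urls(reply: str) -> List[str]:
    """Return every image URL found in a Markdown-formatted reply."""
    return IMAGE_LINK.findall(reply)
```

A reply containing several image links returns them in order, which is enough to check that the bulk generation instruction produced distinct URLs.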