GRID-6X: Layout for Seamless Image Assembly

Community article, published November 10, 2024

Image stitching model Space

🫙github: https://github.com/PRITHIVSAKTHIUR/GRID-6X

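The grid feature in this Gradio app arranges the generated images into a grid layout according to the grid size the user selects. Below is an explanation of how it works, along with diagrams of the operation.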

Space Name	Description	Link
GRID-6X	Model for image generation and manipulation	GRID-6X on Hugging Face

Model Description

Model Name	Description	Link
stabilityai/stable-diffusion-3.5-large-turbo	Base model for high-quality image generation	Stable Diffusion 3.5 Large Turbo
prithivMLmods/SD3.5-Turbo-Realism-2.0-LoRA	Turbo adapter LoRA for enhanced realism	SD3.5-Turbo-Realism-2.0-LoRA

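How the Grid Works

Grid size selection: The user picks a grid size from the options "2x1", "1x2", "2x2", "2x3", "3x2", and "1x1". Each option corresponds to one arrangement of images.
- 2x1: 2 images in a single row
- 1x2: 2 rows of 1 image each (column layout)
- 2x2: 2 rows of 2 images each
- 2x3: 2 rows of 3 images each
- 3x2: 3 rows of 2 images each
- 1x1: a single image (default)
Image generation: The app computes how many images to generate from the selected grid size. For example:
- If the grid size is "2x2", the app generates 4 images.
- For "3x2", it generates 6 images.
Image arrangement: The generated images are arranged on a blank canvas (grid_img), using the grid dimensions to place each image at its corresponding position.
- Each image is "pasted" onto grid_img at coordinates determined by the grid layout.
- The canvas is the image width times the number of columns wide and the image height times the number of rows tall, so the images fit it exactly.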
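
The three steps above (pick a layout, count the images, size the canvas) can be sketched in a few lines of plain Python. This is only an illustration: `GRID_LAYOUTS`, `GridPlan`, and `plan_grid` are hypothetical names, and the lookup table is transcribed from the layout descriptions in this article, not taken from the app's source. Because the labels in those descriptions do not all read as columns x rows, each layout is written out explicitly instead of being parsed from the label.

```python
from typing import NamedTuple

# Hypothetical lookup table: label -> (columns, rows), transcribed from the
# layout descriptions in this article rather than from the app's source.
GRID_LAYOUTS = {
    "2x1": (2, 1),  # one row of 2 images
    "1x2": (1, 2),  # two rows of 1 image
    "2x2": (2, 2),  # two rows of 2 images
    "2x3": (3, 2),  # two rows of 3 images
    "3x2": (2, 3),  # three rows of 2 images
    "1x1": (1, 1),  # single image (default)
}


class GridPlan(NamedTuple):
    columns: int
    rows: int
    num_images: int
    canvas_size: tuple  # (canvas width, canvas height) in pixels


def plan_grid(label: str, width: int, height: int) -> GridPlan:
    """Work out the image count and canvas size for one grid option."""
    # Unknown labels fall back to the single-image default (an assumption).
    columns, rows = GRID_LAYOUTS.get(label, (1, 1))
    return GridPlan(
        columns=columns,
        rows=rows,
        num_images=columns * rows,
        canvas_size=(width * columns, height * rows),
    )


if __name__ == "__main__":
    for label in GRID_LAYOUTS:
        print(label, plan_grid(label, 1024, 1024))
```

For instance, `plan_grid("2x2", 512, 512)` reports 4 images on a 1024 x 1024 canvas, matching the "2x2" example above.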

Code Explanation for Grid Creation

In the infer function

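# Blank canvas wide enough for grid_size_x columns and tall enough for grid_size_y rows
grid_img = Image.new('RGB', (width * grid_size_x, height * grid_size_y))
for i, img in enumerate(result.images[:num_images]):
    # Column index is i % grid_size_x, row index is i // grid_size_x (row-major order)
    grid_img.paste(img, (i % grid_size_x * width, i // grid_size_x * height))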

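Image initialization: grid_img is a blank canvas that holds the images in grid form.
Image placement: A loop pastes each image onto the canvas.
- Horizontal position: (i % grid_size_x) * width gives the x coordinate.
- Vertical position: (i // grid_size_x) * height gives the y coordinate.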
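
To make the two formulas concrete, the sketch below applies them to every image index and lists the resulting paste offsets. `paste_positions` is a hypothetical helper written for this illustration; the parameter names follow the snippet above, and nothing here touches Pillow or the model.

```python
def paste_positions(num_images, grid_size_x, width, height):
    """Return the top-left (x, y) pixel offset for each image index."""
    positions = []
    for i in range(num_images):
        column = i % grid_size_x   # wraps back to 0 at the end of each row
        row = i // grid_size_x     # increases by 1 once a row is full
        positions.append((column * width, row * height))
    return positions


# Six 512x512 images placed three per row.
print(paste_positions(6, 3, 512, 512))
# [(0, 0), (512, 0), (1024, 0), (0, 512), (512, 512), (1024, 512)]
```

The fourth image (index 3) wraps to x = 0 and drops to y = 512, which is how a single flat loop fills the grid row by row.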

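Grid Layout Diagram

Layout examples for the different grid options:

Grid Option	Layout Example	Description
2x1	[Image 1] [Image 2]	2 images in a single row
1x2	[Image 1]	1 image per row (stacked vertically)
	[Image 2]
2x2	[Image 1] [Image 2]	2 images per row, 2 rows
	[Image 3] [Image 4]
2x3	[Image 1] [Image 2] [Image 3]	3 images per row, 2 rows
	[Image 4] [Image 5] [Image 6]
3x2	[Image 1] [Image 2]	2 images per row, 3 rows
	[Image 3] [Image 4]
	[Image 5] [Image 6]
1x1	[Image 1]	Single-image layout

Each option arranges the images accordingly, so several images can be viewed in a single output.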
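
The layout rows in the table can also be produced programmatically, which is a quick way to check how a given number of columns and rows orders the images. The helper below is a hypothetical, text-only sketch and is not part of the app.

```python
def render_layout(columns, rows):
    """Return one text line per grid row, numbering images left to right."""
    lines = []
    for row in range(rows):
        first = row * columns + 1
        cells = [f"[Image {n}]" for n in range(first, first + columns)]
        lines.append(" ".join(cells))
    return lines


for line in render_layout(columns=3, rows=2):
    print(line)
# [Image 1] [Image 2] [Image 3]
# [Image 4] [Image 5] [Image 6]
```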

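Below is a functional architecture diagram of the grid operation in the Gradio application. It outlines the flow from user input through image generation to grid assembly.

Functional Architecture Diagram of the Grid Operation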

                    +-----------------------------+
                    |       User Interface        |
                    +-----------------------------+
                                |
                 User selects grid size, style, prompt, etc.
                                |
                                v
+-------------------------------+----------------------------------+
|                  Gradio Interface Component                     |
|                                                                 |
| 1. Accepts user inputs for:                                     |
|    - Prompt                                                     |
|    - Negative prompt                                            |
|    - Style selection                                            |
|    - Grid size selection                                        |
|    - Seed options (randomize or specific seed)                  |
|    - Image resolution (width, height)                           |
|                                                                 |
| 2. Passes user inputs to the `infer` function for processing.   |
+-----------------------------------------------------------------+
                                |
                                v
+-----------------------------------------------------------------+
|                           Infer Function                        |
|                                                                 |
| 1. **Select Style**:                                            |
|    - Matches selected style and customizes prompt accordingly.  |
|                                                                 |
| 2. **Generate Images**:                                         |
|    - Uses Diffusion Pipeline to generate images.                |
|    - Number of images generated based on grid size.             |
|    - Applies seed, guidance scale, and steps from user inputs.  |
|                                                                 |
| 3. **Create Grid Canvas**:                                      |
|    - Initializes blank canvas based on grid dimensions.         |
|    - Canvas size = width * grid columns, height * grid rows.    |
+-----------------------------------------------------------------+
                                |
                                v
+-----------------------------------------------------------------+
|                        Grid Assembly Process                    |
|                                                                 |
| 1. **Loop through images**:                                     |
|    - For each image generated:                                  |
|       * Calculate x-coordinate based on column position.        |
|       * Calculate y-coordinate based on row position.           |
|    - Paste image onto the blank canvas at the calculated        |
|      coordinates.                                               |
|                                                                 |
| 2. **Return Grid Image**:                                       |
|    - Final grid canvas (containing all images) returned         |
|      to the Gradio interface.                                   |
+-----------------------------------------------------------------+
                                |
                                v
+-----------------------------------------------------------------+
|                   Gradio Image Display Component                |
|                                                                 |
| 1. Receives grid image from `infer` function.                   |
| 2. Displays assembled image grid to the user.                   |
|                                                                 |
|       +----------------------------------------------+          |
|       |   +---------+    +---------+    +---------+  |          |
|       |   | Image1  |    | Image2  |    | Image3  |  |          |
|       |   +---------+    +---------+    +---------+  |          |
|       |   | Image4  |    | Image5  |    | Image6  |  |          |
|       |   +---------+    +---------+    +---------+  |          |
|       +----------------------------------------------+          |
+-----------------------------------------------------------------+

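Explanation of Each Component

User interface: Collects the user's inputs for image generation, including the prompt, grid size, style, and so on.
Gradio interface component: Passes these inputs to the infer function.
Infer function:
- Style selection: customizes the prompt according to the selected style.
- Image generation: uses the diffusion pipeline to generate the required number of images.
- Grid canvas creation: initializes a blank canvas sized according to the grid selection.
Grid assembly process:
- Loops over the images, computes each one's position, and pastes it onto the canvas.
Gradio image display component: The final grid image is returned and displayed in the Gradio app interface.

This architecture supports flexible grid options and keeps the images neatly arranged according to the user's preference.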
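
The flow described above can be exercised end to end without a GPU by swapping the diffusion pipeline for a stand-in. The sketch below assumes Pillow is installed. `fake_generate`, `assemble_grid`, and `infer_sketch` are hypothetical names, the solid-colour images replace real model output, and the prompt suffix is a placeholder, not one of the app's style templates.

```python
from PIL import Image


def fake_generate(num_images, width, height):
    """Stand-in for the diffusion pipeline: returns solid-colour images."""
    images = []
    for i in range(num_images):
        shade = 40 + 30 * i  # a different grey level per image
        images.append(Image.new("RGB", (width, height), (shade, shade, shade)))
    return images


def assemble_grid(images, columns, rows, width, height):
    """Paste images onto a blank canvas in row-major order."""
    grid_img = Image.new("RGB", (width * columns, height * rows))
    for i, img in enumerate(images[: columns * rows]):
        grid_img.paste(img, (i % columns * width, i // columns * height))
    return grid_img


def infer_sketch(prompt, columns, rows, width=64, height=64):
    """Style -> generate -> assemble, mirroring the boxes in the diagram."""
    styled_prompt = f"{prompt}, high quality"  # placeholder for style selection
    images = fake_generate(columns * rows, width, height)  # ignores the prompt
    return styled_prompt, assemble_grid(images, columns, rows, width, height)


if __name__ == "__main__":
    _, grid = infer_sketch("a lighthouse at dusk", columns=3, rows=2)
    print(grid.size)  # (192, 128)
```

Keeping generation and assembly in separate functions means the grid logic can be checked on its own, and a real pipeline call could later replace `fake_generate` without touching the paste loop.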

Generated Images

Image 1	Image 2	Image 3

Image 4	Image 5	Image 6

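The following two Spaces support the GRID layout for seamless image assembly:

Space Name	Description	Link
GRID-6X	Model for image generation and manipulation	GRID-6X on Hugging Face
IMAGINEO-4K	High-resolution image generation model	IMAGINEO-4K on Hugging Face

Project Name	Description	Link
GRID-6X	Image stitching model	GRID-6X GitHub repository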

That's the end of the article. Thanks for reading 🤗!

Try it out!
