Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184

Frontiers of Information Technology & Electronic Engineering  2024, Vol. 25 Issue (1): 135-148   https://doi.org/10.1631/FITEE.2300303
Controllable image generation based on causal representation learning
Shanshan HUANG1, Yuanhao WANG1, Zhili GONG1, Jun LIAO1, Shu WANG2, Li LIU1
1. School of Big Data and Software Engineering, Chongqing University, Chongqing 401331, China
2. School of Materials and Energy, Southwest University, Chongqing 400715, China
Full text: PDF (12027 KB)
Abstract

Artificial intelligence generated content (AIGC) has emerged as an indispensable tool for producing large-scale content in various forms, such as images, thanks to the significant role that AI plays in imitation and production, and it is especially important in image generation and editing. However, the interpretability and controllability of image generation and editing remain challenging. Because existing AI methods ignore the causal relationships within images, they often struggle to produce images that are both flexible and controllable. To address this issue, we develop a novel method for causal controllable image generation (CCIG) that combines causal representation learning with bi-directional generative adversarial networks (GANs). This approach enables humans to control image attributes while considering the rationality and interpretability of the generated images, and it also allows the generation of counterfactual images. The key to CCIG is a causal structure learning module that learns the causal relationships between image attributes and is jointly optimized with the encoder, generator, and joint discriminator of the image generation module. In this way, we can learn causal representations in the latent space of images, perform causally controllable image editing, and use causal intervention operations to generate counterfactual images. We conduct extensive experiments on a real-world dataset, CelebA. The experimental results demonstrate the rationality and effectiveness of CCIG.
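
The causal intervention mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear structural causal model over three latent attributes, and the attribute names, edge weights, and the `propagate` helper are all hypothetical choices made for illustration. The sketch shows only how clamping one attribute (a do-operation) changes its descendants but leaves its causes untouched.

```python
from typing import Dict, List, Optional

# Hypothetical attribute graph (an assumption, not taken from the paper):
# each child attribute maps to its parents and their linear edge weights.
PARENTS: Dict[str, Dict[str, float]] = {
    "smiling": {},
    "mouth_open": {"smiling": 0.5},
    "narrow_eyes": {"smiling": 0.25},
}
# Parents must appear before their children.
TOPOLOGICAL_ORDER: List[str] = ["smiling", "mouth_open", "narrow_eyes"]


def propagate(noise: Dict[str, float],
              do: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Evaluate z_i = eps_i + sum_j w_ji * z_j in topological order.

    Attributes listed in `do` are clamped to the given value and their
    incoming edges are ignored, which is the do-operation.
    """
    do = do or {}
    z: Dict[str, float] = {}
    for name in TOPOLOGICAL_ORDER:
        if name in do:
            z[name] = do[name]
        else:
            z[name] = noise[name] + sum(
                weight * z[parent] for parent, weight in PARENTS[name].items()
            )
    return z


# The same exogenous noise is reused so that only the intervention differs.
noise = {"smiling": 0.0, "mouth_open": 0.25, "narrow_eyes": -0.5}
factual = propagate(noise)
cf_cause = propagate(noise, do={"smiling": 1.0})      # intervene on a cause
cf_effect = propagate(noise, do={"mouth_open": 1.0})  # intervene on an effect

print("factual:          ", factual)
print("do(smiling=1):    ", cf_cause)
print("do(mouth_open=1): ", cf_effect)
```

Intervening on the hypothetical cause `smiling` shifts both of its children, whereas intervening on the effect `mouth_open` leaves `smiling` and `narrow_eyes` at their factual values. In a full pipeline of the kind the abstract describes, the intervened latent vector would then be passed to a generator to render a counterfactual image; that step is outside the scope of this sketch.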

Key words: Image generation; Controllable image editing; Causal structure learning; Causal representation learning
Received: 2023-05-05      Published: 2024-02-07
Corresponding Author(s): Li LIU     E-mail: shanshanhuang@cqu.edu.cn; dcsliuli@cqu.edu.cn
Cite this article:
Shanshan HUANG, Yuanhao WANG, Zhili GONG, Jun LIAO, Shu WANG, Li LIU. Controllable image generation based on causal representation learning. Front. Inform. Technol. Electron. Eng, 2024, 25(1): 135-148.
Link to this article:
https://academic.hep.com.cn/fitee/CN/10.1631/FITEE.2300303
https://academic.hep.com.cn/fitee/CN/Y2024/V25/I1/135
[1] FITEE-0135-24010-SSH_suppl_1 Download
[2] FITEE-0135-24010-SSH_suppl_2 Download