Generative adversarial network

A generative adversarial network (GAN) is a class of machine learning systems in which two neural networks contest with each other in a zero-sum game framework. This technique can generate photographs that look at least superficially authentic to human observers,[1][2] having many realistic characteristics. It is a form of unsupervised learning.[3]


The generative network generates candidates while the discriminative network evaluates them.[1][4][5][6] The contest operates in terms of data distributions. Typically, the generative network learns to map from a latent space to a data distribution of interest, while the discriminative network distinguishes candidates produced by the generator from the true data distribution. The generative network's training objective is to increase the error rate of the discriminative network, i.e., to "fool" the discriminator by producing novel candidates that the discriminator judges to be not synthesized (that is, part of the true data distribution).[1][7]
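
The contest described above can be written as a minimax problem. The following is the value function as formulated by Goodfellow et al.;[1] the symbols are taken from that paper rather than from this article's text: D is the discriminator, G the generator, p_data the true data distribution, and p_z the prior over the latent space.

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here D(x) is the discriminator's estimated probability that x came from the true data. The discriminator raises V by classifying correctly, while the generator lowers V by producing samples G(z) that the discriminator scores as real.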

A known dataset serves as the initial training data for the discriminator. Training it involves presenting it with samples from the training dataset, until it achieves acceptable accuracy. The generator trains based on whether it succeeds in fooling the discriminator. Typically the generator is seeded with randomized input that is sampled from a predefined latent space[4] (e.g. a multivariate normal distribution). Thereafter, candidates synthesized by the generator are evaluated by the discriminator. Backpropagation is applied in both networks[5] so that the generator produces better images, while the discriminator becomes more skilled at flagging synthetic images.[8] The generator is typically a deconvolutional neural network, and the discriminator is a convolutional neural network.
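
The alternating procedure can be illustrated with a deliberately tiny example. The sketch below is not taken from any of the cited works: it assumes one-dimensional data drawn from N(4, 1), a linear generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), hand-derived gradients in place of a backpropagation library, and the "non-saturating" generator update (ascending log D(G(z))). All names and hyperparameters (`train_toy_gan`, `lr`, `batch`) are hypothetical.

```python
import math
import random


def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=32, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on 1-D data.

    Real data ~ N(4, 1) (an arbitrary choice for this illustration).
    Generator:      G(z) = a * z + b,  with latent z ~ N(0, 1)
    Discriminator:  D(x) = sigmoid(w * x + c)
    Returns the final ((a, b), (w, c)).
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.0, 0.0  # discriminator parameters

    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)).
        gw = gc = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            gw += (1.0 - d) * x
            gc += 1.0 - d
        for x in fake:
            d = sigmoid(w * x + c)
            gw -= d * x
            gc -= d
        w += lr * gw / batch
        c += lr * gc / batch

        # Generator step: gradient ascent on log D(G(z)), with the
        # gradient passed back through the discriminator into a and b.
        ga = gb = 0.0
        for z in noise:
            x = a * z + b
            d = sigmoid(w * x + c)
            dx = (1.0 - d) * w  # derivative of log D(x) w.r.t. x
            ga += dx * z
            gb += dx
        a += lr * ga / batch
        b += lr * gb / batch

    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print(f"generator: G(z) = {a:.3f} * z + {b:.3f}")
    print(f"discriminator: D(x) = sigmoid({w:.3f} * x + {c:.3f})")
```

Both players are updated from the same minibatch, one after the other. Practical GANs replace the two closed-form models with deep networks and the manual gradients with automatic differentiation, and their training dynamics are generally harder to stabilize than this toy suggests.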


Unsupervised adversarial networks were proposed by Jürgen Schmidhuber in 1990. He called the approach artificial curiosity. An agent contains two artificial neural networks, Net1 and Net2. Net1 generates an output action that produces new data. Net2 tries to predict the data. It learns to minimize its error. At the same time, however, the adversarial Net1 learns to generate outputs that maximize the error of Net2. Thus Net1 learns to generate difficult data from which Net2 can still learn something. In the original work, both Net1 and Net2 were recurrent neural networks, such that they could also learn to generate and perceive sequences of actions and data.[9][10] In the 1990s, this led to many papers on artificial curiosity and zero-sum games.[11]

In 1992, Schmidhuber published another unsupervised adversarial technique called predictability minimization. Net1 generates a code of incoming data. The code is a vector of numbers between 0 and 1. Net2 learns to predict each such number from the remaining numbers. It learns to minimize its error. As a consequence, Net2 learns the conditional expected value of each number, given the remaining numbers. At the same time, however, the adversarial Net1 learns to generate codes that maximize the error of Net2. In the ideal case, absent local minima, Net1 learns to encode redundant input patterns through codes with statistically independent components, while Net2 learns the probabilities of these codes, and therefore the probabilities of the encoded patterns.[12] In 1996, this zero-sum game was applied to images, and produced codes similar to those found in the mammalian brain.[13]
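
The two opposing objectives of predictability minimization can be stated compactly. The notation below is introduced here for illustration and assumes a squared-error loss, which is consistent with Net2 learning conditional expected values: y_i is the i-th code component produced by Net1, and P_i is Net2's prediction of y_i from the remaining components.

```latex
V = \sum_i \mathbb{E}\Bigl[\bigl(P_i(\{y_k\}_{k \neq i}) - y_i\bigr)^2\Bigr],
\qquad
\text{Net2: } \min V,
\qquad
\text{Net1: } \max V
```

When no component can be predicted from the others any better than by its unconditional mean, the code components are statistically independent, which is the ideal case described above.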

The basic idea of generative adversarial networks was published in a 2010 blog post by Olli Niemitalo.[14]

The adversarial principle is not limited to artificial neural networks. For example, in 2012, Yan Zhou et al. applied it to support vector machines.[15]

The idea to infer models in a competitive setting (model versus discriminator) was adopted by Li, Gauci and Gross in 2013.[16] Their method is used for behavioral inference. It is termed Turing Learning,[17] as the setting is akin to that of a Turing test. Turing Learning is a generalization of GANs.[18] Models other than neural networks can be considered. Moreover, the discriminators are allowed to influence the processes from which the datasets are obtained, making them active interrogators as in the Turing test.

The name "GAN" was introduced by Ian Goodfellow et al. in 2014.[1][2] Their paper popularized the concept and influenced subsequent work.

In the late 2010s, GANs began to produce highly realistic syntheses. In 2017 a GAN was used for image enhancement focusing on realistic textures rather than pixel-accuracy, producing a higher image quality at high magnification.[19] GANs were used to create the 2018 painting Edmond de Belamy, which sold for $432,500.[20] Faces generated by StyleGAN[21] in 2019 drew comparisons with deepfakes[22] and poker tells.[23][24]


GAN applications have increased rapidly.[25] GANs that produce photorealistic images can be used to visualize interior/industrial design, shoes, bags and clothing items or items for computer games' scenes.[citation needed] Such networks were reported to be used by Facebook.[26]

GANs can model patterns of motion in video,[27] reconstruct 3D models of objects from images[28] and improve astronomical images.[29] In February 2019, StyleGAN was reported to rank sixth among trending Python projects on GitHub.[30]

GANs can be used to age face photographs to show how an individual's appearance might change with age.[31]

In 2018, GANs reached the video game modding community as a method of upscaling low-resolution 2D textures in old video games: the textures are recreated at 4K or higher resolutions via image training and then downsampled to fit the game's native resolution (with results resembling the supersampling anti-aliasing method).[32] With proper training, GANs produce a clearer and sharper 2D texture image of markedly higher quality than the original, while retaining the original's level of detail, colors, etc. Known examples of extensive GAN usage include Final Fantasy VIII, Final Fantasy IX, Resident Evil REmake HD Remaster, and Max Payne. GAN 2D texture modding can be applied only to PC game releases.


Concerns have been raised about the potential use of GAN-based human image synthesis for sinister purposes, e.g., to produce fake and/or incriminating photographs and videos.[33]


  1. ^ a b c d Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). Generative Adversarial Networks (PDF). Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014). pp. 2672–2680.
  2. ^ a b Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). "Generative Adversarial Networks". arXiv:1406.2661 [cs.LG].
  3. ^ Salimans, Tim; Goodfellow, Ian; Zaremba, Wojciech; Cheung, Vicki; Radford, Alec; Chen, Xi (2016). "Improved Techniques for Training GANs". arXiv:1606.03498 [cs.LG].
  4. ^ a b Thaler, SL, US Patent 05659666, Device for the autonomous generation of useful information, 08/19/1997.
  5. ^ a b Thaler, SL, US Patent, 07454388, Device for the autonomous bootstrapping of useful information, 11/18/2008.
  6. ^ Thaler, SL, The Creativity Machine Paradigm, Encyclopedia of Creativity, Invention, Innovation, and Entrepreneurship, (ed.) E.G. Carayannis, Springer Science+Business Media, LLC, 2013.
  7. ^ Luc, Pauline; Couprie, Camille; Chintala, Soumith; Verbeek, Jakob (2016-11-25). "Semantic Segmentation using Adversarial Networks". NIPS Workshop on Adversarial Training, December 2016, Barcelona, Spain. arXiv:1611.08408. Bibcode:2016arXiv161108408L.
  8. ^ Andrej Karpathy, Pieter Abbeel, Greg Brockman, Peter Chen, Vicki Cheung, Rocky Duan, Ian Goodfellow, Durk Kingma, Jonathan Ho, Rein Houthooft, Tim Salimans, John Schulman, Ilya Sutskever, and Wojciech Zaremba, Generative Models, OpenAI, retrieved April 7, 2016.
  9. ^ Schmidhuber, Jürgen (1990). "Making the world differentiable: On using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments" (PDF). TR FKI-126-90. Tech. Univ. Munich.
  10. ^ Schmidhuber, Jürgen (1991). "A possibility for implementing curiosity and boredom in model-building neural controllers". Proc. SAB'1991. MIT Press/Bradford Books. pp. 222–227.
  11. ^ Jürgen Schmidhuber (2018). "Unsupervised Neural Networks Fight in a Minimax Game". Retrieved Feb 20, 2019.
  12. ^ Schmidhuber, Jürgen (November 1992). "Learning Factorial Codes by Predictability Minimization". Neural Computation. 4 (6): 863–879. doi:10.1162/neco.1992.4.6.863.
  13. ^ Schmidhuber, Jürgen; Eldracher, Martin; Foltin, Bernhard (1996). "Semilinear predictability minimization produces well-known feature detectors". Neural Computation. 8 (4): 773–786.
  14. ^ Niemitalo, Olli (February 24, 2010). "A method for training artificial neural networks to generate missing data within a variable context". Internet Archive (Wayback Machine). Retrieved February 22, 2019.
  15. ^ Zhou, Yan; Kantarcioglu, Murat; Thuraisingham, Bhavani; Xi, Bowei (August 12–16, 2012). "Adversarial Support Vector Machine Learning". Proceedings of KDD’12. Beijing, China: ACM.
  16. ^ Li, Wei; Gauci, Melvin; Gross, Roderich (July 6, 2013). "A Coevolutionary Approach to Learn Animal Behavior Through Controlled Interaction". Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation (GECCO 2013). Amsterdam, The Netherlands: ACM. pp. 223–230. doi:10.1145/2463372.2465801.
  17. ^ Li, Wei; Gauci, Melvin; Groß, Roderich (30 August 2016). "Turing learning: a metric-free approach to inferring behavior and its application to swarms". Swarm Intelligence. 10 (3): 211–243. doi:10.1007/s11721-016-0126-1.
  18. ^ Gross, Roderich; Gu, Yue; Li, Wei; Gauci, Melvin (December 6, 2017). "Generalizing GANs: A Turing Perspective". Proceedings of the Thirty-first Annual Conference on Neural Information Processing Systems (NIPS 2017). Long Beach, CA, USA. pp. 1–11.
  19. ^ Sajjadi, Mehdi S. M.; Schölkopf, Bernhard; Hirsch, Michael (2016-12-23). "EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis". arXiv:1612.07919 [cs.CV].
  20. ^ Cohn, Gabe (2018-10-25). "AI Art at Christie's Sells for $432,500". The New York Times.
  21. ^ "StyleGAN: Official TensorFlow Implementation. Contribute to NVlabs/stylegan development by creating an account on GitHub". March 2, 2019 – via GitHub.
  22. ^ Paez, Danny (2019-02-13). "This Person Does Not Exist Is the Best One-Off Website of 2019". Retrieved 2019-02-16.
  23. ^ Beschizza, Rob (2019-02-15). "This Person Does Not Exist". Boing-Boing. Retrieved 2019-02-16.
  24. ^ Horev, Rani (2018-12-26). "Style-based GANs – Generating and Tuning Realistic Artificial Faces". Lyrn.AI. Retrieved 2019-02-16.
  25. ^ Caesar, Holger (2019-03-01), A list of papers on Generative Adversarial (Neural) Networks: nightrome/really-awesome-gan, retrieved 2019-03-02
  26. ^ Greenemeier, Larry (June 20, 2016). "When Will Computers Have Common Sense? Ask Facebook". Scientific American. Retrieved July 31, 2016.
  27. ^ Vondrick, Carl; Pirsiavash, Hamed; Torralba, Antonio (2016). "Generating Videos with Scene Dynamics". arXiv:1609.02612. Bibcode:2016arXiv160902612V.
  28. ^ "3D Generative Adversarial Network".
  29. ^ Schawinski, Kevin; Zhang, Ce; Zhang, Hantian; Fowler, Lucas; Santhanam, Gokula Krishnan (2017-02-01). "Generative Adversarial Networks recover features in astrophysical images of galaxies beyond the deconvolution limit". Monthly Notices of the Royal Astronomical Society: Letters. 467 (1): L110–L114. arXiv:1702.00403. Bibcode:2017MNRAS.467L.110S. doi:10.1093/mnrasl/slx008.
  30. ^ Roboticist, Javad Amirian (February 14, 2019). "#StyleGan, is ranked 6th among trending python projects on #Github. #DeepLearning # Generative Networks".
  31. ^ Antipov, Grigory; Baccouche, Moez; Dugelay, Jean-Luc (2017). "Face Aging With Conditional Generative Adversarial Networks". arXiv:1702.01983 [cs.CV].
  32. ^ Tang, Xiaoou; Qiao, Yu; Loy, Chen Change; Dong, Chao; Liu, Yihao; Gu, Jinjin; Wu, Shixiang; Yu, Ke; Wang, Xintao (2018-09-01). "ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks".
  33. ^ msmash (2019-02-14). "'This Person Does Not Exist' Website Uses AI To Create Realistic Yet Horrifying Faces". Slashdot. Retrieved 2019-02-16.
