Generative Adversarial Networks (GAN)

Neural networks are used for recognizing pictures, understanding natural language with good accuracy, automate driving vehicles and a lot of other applications follow. But still, neural networks need human supervision to learn. Usually, a network needs labeled examples to learn effectively. While it’s also possible to learn from unlabeled data, this had typically not worked very well.

GANs was proposed by Ian Goodfellow (currently a staff research scientist at Google Brain). By applying game theory, he devised a way for a machine-learning system to effectively teach itself about how the world works. This ability could help make computers smarter by sidestepping the need to feed them painstakingly labeled training data. Generative Adversarial Networks (GANs) are neural networks that are trained in an adversarial manner to generate data mimicking some distribution.

To explain it in simpler terms: 

If you want to get better at something, say chess; what would you do? You would compete with an opponent better than you. Then you would analyze what you did wrong, what he/she did right, and think on what could you do to beat him/her in the next game.

You would repeat this step until you defeat the opponent. This concept can be incorporated to build better models. So simply, for getting a powerful hero (viz generator), we need a more powerful opponent (viz discriminator)!

To understand this deeply, first, you’ll have to understand what a generative and a discriminative model is.

In machine learning, the two main classes of models

  • Discriminative – A discriminative model is one that discriminates between two (or more) different classes of data – for example, a convolutional neural network that is trained to output 1 given an image of a human face and 0 otherwise.
  • Generative –  A generative model, on the other hand, doesn’t know anything about classes of data.  Instead, its purpose is to generate new data which fits the distribution of the training data – for example, a Gaussian Mixture Model is a generative model which, after trained on a set of points, is able to generate new random points which more-or-less fit the distribution of the training data (assuming a GMM is able to mimic the data well).

The generative network’s training objective is to increase the error rate of the discriminative network (i.e., “fool” the discriminator network by producing novel synthesized instances that appear to have come from the true data distribution). In practice, a known dataset serves as the initial training data for the discriminator. Training the discriminator involves presenting it with samples from the dataset until it reaches some level of accuracy. Typically the generator is seeded with a randomized input that is sampled from a predefined latent space (e.g. a multivariate normal distribution). Thereafter, samples synthesized by the generator are evaluated by the discriminator. Backpropagation is applied in both networks so that the generator produces better images, while the discriminator becomes more skilled at flagging synthetic images. The generator is typically a deconvolutional neural network, and the discriminator is a convolutional neural network.

Working of GANs explained:

So as we saw, there are two components in GANs

  1. Generator Neural Network
  2. Discriminator Neural Network

g1

The Generator Network takes a random input and tries to generate a sample of data. In the above image, we can see that generator G(z) takes an input z from p(z), where z is a sample from probability distribution p(z). It then generates a data which is then fed into a discriminator network D(x). The task of Discriminator Network is to take input either from the real data or from the generator and try to predict whether the input is real or generated. It takes an input x from pdata(x) where pdata(x) is our real data distribution. D(x) then solves a binary classification problem using sigmoid function giving output in the range 0 to 1.

Defining the notations used:

Pdata(x) -> the distribution of real data
X -> sample from pdata(x)
P(z) -> distribution of generator
Z -> sample from p(z)
G(z) -> Generator Network
D(x) -> Discriminator Network

Steps to train a GAN: 

Step 1: Define the problem. Do you want to generate fake images or fake texts? Here you should completely define the problem and collect the data for it.

Step 2: Define architecture of GAN. Define how your GAN should look like. Should both your generator and discriminator be multi layer perceptrons or convolutional neural networks? This step will depend on what problem you are trying to solve.

Step 3: Train Discriminator on real data for (n) epochs. Get the data you want to generate fake on and train the discriminator to correctly predict them as real. Here value (n) can be any natural number between 1 and infinity.

Step 4: Generate fake inputs for generator and train discriminator on fake data. Get generated data and let the discriminator correctly predict them as fake.

Step 5: Train generator with the output of discriminator. Now when the discriminator is trained, you can get its predictions and use it as an objective for training the generator. Train the generator to fool the discriminator.

Step 6: Repeat step 3 to step 5 for a few epochs.

Step 7: Check if the fake data manually if it seems legit. If it seems appropriate, stop training, else go to step 3. This is a bit of a manual task, as hand evaluation of the data is the best way to check the fakeness. When this step is over, you can evaluate whether the GAN is performing well enough.

The dueling-neural-network approach has vastly improved learning from unlabeled data. GANs can already perform some dazzling tricks. By internalizing the characteristics of a collection of photos, for example, a GAN can improve the resolution of a pixelated image. It can also dream up realistic fake photos, or apply a particular artistic style to an image. “You can think of generative models as giving artificial intelligence a form of imagination,” Goodfellow says.

 

 

(Sources: Wikipedia, MitTechReview, AnalyticsVidhya)

Advertisements

MIT uses AI to kill video buffering

We all hate it when our video is interrupted by buffering or its resolution has turned into a pixelated mess. A group of MIT researchers believe they’ve figured out a solution to those annoyances plaguing millions of people a day.

MIT discovered a way to improve video streaming by reducing buffering times and pixelation. A new AI developed at the university’s Computer Science and Artificial Intelligence Laboratory uses machine learning to pick different algorithms depending on network conditions. In doing so, the AI, called Pensieve, has been shown to deliver a higher-quality streaming experience with less buffering than existing systems.

Instead of having a video arrive at your computer in one complete piece, sites like YouTube and Netflix break it up into smaller pieces and sends them sequentially, relying on ABR algorithms to determine which resolution each piece will play at. This is an attempt to give users a more consistent viewing experience while also saving bandwidth, but it created problems. If the connection is too slow, YouTube may temporarily lower the resolution – pixelating the video- to keep it playing. And since the video is sent in pieces, skipping ahead is impossible.

There are two types of ABR: a rate-based one that measures how fast a network can transmit data and a buffer-based one tasked with maintaining a sufficient buffer at the head of the video.  The current algorithms only consider one of these factors, but MIT’s new algorithm Pensieve uses machine learning to choose the best system based on the network condition

In experiments that tested the AI using wifi and LTE, the team found that it could stream video at the same resolution with 10 to 30 percent less rebuffering than other approaches. Additionally, users rated the video play with the AI 10 to 25 percent higher in terms of ‘quality of experience.’  The researchers, however, only tested Pensieve on a month’s worth of downloaded video and believe performance would be even higher with a number of data streaming giants YouTube and Netflix have.

As a next project, the team will be working to test Pensieve on virtual-reality (VR) video.“The bitrates you need for 4K-quality VR can easily top hundreds of megabits per second, which today’s networks simply can’t support,” Alizadeh says. “We’re excited to see what systems like Pensieve can do for things like VR. This is really just the first step in seeing what we can do.”

Pensieve was funded, in part, by the National Science Foundation and an innovation fellowship from Qualcomm.

Open AI’s bots beat world’s top Dota 2 players

Open AI has created a bot that is taking down top Dota 2 players at The International Dota event held by Valve.

This year’s big Dota 2 tournament is currently in its second to last day with only four teams remaining in the competition. Between the event’s official games, Valve always includes some fun show matches, and this year it brought in Open AI‘s aforementioned Dota 2 b to face off against the game’s most famous player, Danil “Dendi” Ishutin, live on the main stage.

Open AI said the 1v1 bot has learned from playing against itself in a lifetimes worth of matches. In a promo video shown before the match, we see the bot besting some of the game’s top players, including Evil Genius’ cores Arteezy and Sumail.

In the live Shadow Fiend mirror match against Dendi, the bot executed an impossibly perfect creep block, perfectly balanced creep aggro, and even recognized and canceled Dendi’s healing items. The bot quickly bested Dendi in two matches, leading Dendi to say it’s “too strong!” Footage of those matches can be found on Dota 2 Rapier’s YouTube channel.

Open AI says their next goal is to get the bot ready for the infinitely more complex 5v5 matches, noting it might have something ready for next year.

How DeepMind’s AI taught itself to walk

DeepMind’s programmers have given the agent a set of virtual sensors (so it can tell whether it’s upright or not, for example) and then incentivize to move forward. The computer works the rest out for itself, using trial and error to come up with different ways of moving. True motor intelligence requires learning how to control and coordinate a flexible body to solve tasks in a range of complex environments. Existing attempts to control physically simulated humanoid bodies come from diverse fields, including computer animation and biomechanics.  A trend has been to use hand-crafted objectives, sometimes with motion capture data, to produce specific behaviors. However, this may require considerable engineering effort and can result in restricted behaviors or behaviors that may be difficult to repurpose for new tasks.

True motor intelligence requires learning how to control and coordinate a flexible body to solve tasks in a range of complex environments. Existing attempts to control physically simulated humanoid bodies come from diverse fields, including computer animation and biomechanics.  A trend has been to use hand-crafted objectives, sometimes with motion capture data, to produce specific behaviors. However, this may require considerable engineering effort and can result in restricted behaviors or behaviors that may be difficult to repurpose for new tasks.

 

DeepMind published 3 papers which are as follows:

Emergence of locomotion behaviors in rich environments:- 

For some AI problems, such as playing Atari or Go, the goal is easy to define – it’s winning. But describing a process such as a jog, a backflip or a jump is difficult because of accurately describing a complex behavior which is a common problem when teaching motor skills to an artificial system. DeepMind explored how sophisticated behaviors can emerge from scratch from the body interacting with the environment using only simple high-level objectives, such as moving forward without falling. DeepMind trained agents with a variety of simulated bodies to make progress across diverse terrains, which require jumping, turning and crouching. The results show that their agents developed these complex skills without receiving specific instructions, an approach that can be applied to train systems for multiple, distinct simulated bodies.

wall

Simulated planar Walker attempts to climb a wall

But how do you describe the process for performing a backflip? Or even just a jump? The difficulty of accurately describing a complex behavior is a common problem when teaching motor skills to an artificial system. In this work, we explore how sophisticated behaviors can emerge from scratch from the body interacting with the environment using only simple high-level objectives, such as moving forward without falling. Specifically, we trained agents with a variety of simulated bodies to make progress across diverse terrains, which require jumping, turning and crouching. The results show our agents develop these complex skills without receiving specific instructions, an approach that can be applied to train our systems for multiple, distinct simulated bodies. The GIFs show how this technique can lead to high-quality movements and perseverance.

Learning human behaviors from motion capture by adversarial imitation:-

walk.gif

A humanoid Walker produces human-like walking behavior

The emergent behavior described above can be very robust, but because the movements must emerge from scratch, they often do not look human-like.

DeepMind in their second paper demonstrated how to train a policy network that imitates motion capture data of human behaviors to pre-learn certain skills, such as walking, getting up from the ground, running, and turning.

Having produced behavior that looks human-like, they can tune and repurpose those behaviors to solve other tasks, like climbing stairs and navigating walled corridors.

 

Robust imitation of diverse behaviors:-

div.gif

The planar Walker on the left demonstrates a particular walking style and the agent in the right panel imitates that style using a single policy network.

The third paper proposes a neural network architecture, building on state-of-the-art generative models, that is capable of learning the relationships between different behaviors and imitating specific actions that it is shown. After training, their system can encode a single observed action and create a new novel movement based on that demonstration.After training, their system can encode a single observed action and create a new novel movement based on that demonstration.It can also switch between different kinds of behaviors despite never having seen transitions between them, for example switching between walking styles.

 

 

(via The Verge, DeepMind Blog)

 

 

MIT’s new prototype ‘3D’ Chip

Daily a plethora of data is being generated and the computing power to process this data into useful information is stalling. One of the fundamental problems being faced is the processor-memory bottleneck or the performance gap. Various methods such as caches and different software techniques have been used to eliminate this problem. But there’s another way which is, building the CPU directly into a 3D memory structure, connect them directly without any kind of motherboard traces, and compute from within the RAM itself.

3d.jpeg

 

A prototype chip built by researchers at Stanford and MIT can solve the problem by sandwiching the memory, processor and even sensors all into one unit. While current chips are made of silicon

.

rram.jpg

The researchers have developed a new 3D chip fabrication method that uses carbon nanotubes and resistive random-access memory (RRAM) cells together to create a combined nanoelectronic processor design that supports complex, 3D architecture – where traditional silicon-based chip fabrication works with 2D structures only. The team claims this makes for “the most complex nanoelectronic system ever made with emerging nanotechnologies,” creating a 3D computer architecture. Using carbon makes the whole thing possible since higher temperatures required to make a silicon CPU would damage the sensitive RRAM cells.

The 3D design is possible because these carbon nanotube circuits and RRAM memory components can be made using temperatures below 200 degrees Celsius, which is far, far less than the 1,000 degree temps needed to fabricate today’s 2D silicon transistors. Lower temperatures mean you can build an upper layer on top of another without damaging the one or ones below.

One expert cited by MIT said that this could be the answer to continuing the exponential scaling of computer power in keeping with Moore’s Law, as traditional chip methods start to run up against physical limits. It’s still in its initial phases and it would take many years till we see the actual implementation of these chips in real life.

Response of an artificial iris to light like the human eye

An artificial iris manufactured from intelligent, light-controlled polymer material can react to incoming light in the same ways as the human eye. The Iris was developed by the Smart Photonic Materials research group from the TUT, and it was recently published in the Advanced Materials journal.

iris.jpg

The human iris does its job of adjusting your pupil size to meter the amount of light hitting the retina behind without you having to actively think about it. And while a camera’s aperture is designed to work the same way as a biological iris, it’s anything but automatic. Even point-and-shoots rely on complicated control mechanisms to keep your shots from becoming overexposed. But a new “artificial iris” developed at the Tampere University of Technology in Finland can autonomously adjust itself based on how bright the scene is.

Scientists from the Smart Photonic Materials research group developed the iris using a light-sensitive liquid crystal elastomer. The team also employed photoalignment techniques, which accurately position the liquid crystal molecules in a predetermined direction within a tolerance of a few picometers. This is similar to the techniques used originally in LCD TVs to improve viewing angle and contrast but has since been adopted to smartphone screens. “The artificial iris looks a little bit like a contact lens,” TUT Associate Professor Arri Priimägi said. “Its center opens and closes according to the amount of light that hits it.”

The team hopes to eventually develop this technology into an implantable biomedical device. However, before that can happen, the TUT researchers need to first improve the iris’ sensitivity so that it can adapt to smaller changes in brightness. They also need to get it to work in an aqueous environment. However, this new iris is therefore still long ways away from being ready.

Optical Deep Learning: Learning with light

Artificial Neural Networks mimic the way the brain learns from an accumulation of examples. A research team at the Massachusetts Institute of Technology (MIT) has come up with a novel approach to deep learning that uses a nanophotonic processor, which they claim can vastly improve the performance and energy efficiency for processing artificial neural networks.

MIT-Optical-Neural_0.jpg

Optical computers have been conceived before, but are usually aimed at more general-purpose processing. In this case, the researchers have narrowed the application domain considerably. Not only have they restricted their approach to deep learning, they have further limited this initial work to inferencing of neural networks, rather than the more computationally demanding process of training.

The optical chip contains multiple waveguides that shoot beams of light at each other simultaneously, creating interference patterns that correspond to mathematical results. “The chip, once you tune it, can carry out matrix multiplication with, in principle, zero energy, instantly,” said Marin Soljacic, an electrical engineering professor at MIT, in a statement. The accuracy leaves something to be desired for making out vowels, but the nanophotonic chip is still work-in-progress.

The new approach uses multiple light beams(as mentioned above) directed in such a way that their waves interact with each other, producing interference patterns that convey the result of the intended operation. The resulting device is something the researchers call a programmable nanophotonic processor.The result is that the optical chips using this architecture could, in principle, carry out calculations performed in typical artificial intelligence algorithms much faster and using less than one-thousandth as much energy per operation as conventional electronic chips.
The team says it will still take a lot more effort and time to make this system useful; however, once the system is scaled up and fully functioning, it can find many user cases, such as data centers or security systems. The system could also be a boon for self-driving cars or drones, says Harris, or “whenever you need to do a lot of computation but you don’t have a lot of power or time.”

 

What is Blockchain?

Blockchains in Bitcoin.

 

A blockchain is a public ledger of all Bitcoin transactions that have ever been executed. It is constantly growing as ‘completed’ blocks are added to it with a new set of recordings. The blocks are added to the blockchain in a linear, chronological order. Each node (computer connected to the Bitcoin network using a client that performs the task of validating and relaying transactions) gets a copy of the blockchain, which gets downloaded automatically upon joining the Bitcoin network. The blockchain has complete information about the addresses and their balances right from the genesis block to the most recently completed block.

Blockchain

Blockchain formation.

The blockchain is seen as the main technological innovation of Bitcoin since it stands as proof of all the transactions on the network. A block is the ‘current’ part of a blockchain which records some or all of the recent transactions, and once completed goes into the blockchain as a permanent database. Each time a block gets completed, a new block is generated. There is a countless number of such blocks in the blockchain. So are the blocks randomly placed in a blockchain? No, they are linked to each other (like a chain) in proper linear, chronological order with every block containing a hash of the previous block.

To use conventional banking as an analogy, the blockchain is like a full history of banking transactions. Bitcoin transactions are entered chronologically in a blockchain just the way bank transactions are. Blocks, meanwhile, are like individual bank statements.

Based on the Bitcoin protocol, the blockchain database is shared by all nodes participating in a system. The full copy of the blockchain has records of every Bitcoin transaction ever executed. It can thus provide insight about facts like how much value belonged a particular address at any point in the past.

In the above representation, the main chain (black) consists of the longest series of blocks from the genesis block (green) to the current block. Orphan blocks (purple) exist outside of the main chain.

 

 

infographics0517-01

Working of Bitcoin and Blockchain

 

 A blockchain – originally block chain– is a distributed database that is used to maintain a continuously growing list of records, called blocks. Each block contains a timestamp and a link to a previous block. A blockchain is typically managed by a peer-to-peer network collectively adhering to a protocol for validating new blocks. By design, blockchains are inherently resistant to modification of the data. Once recorded, the data in any given block cannot be altered retroactively without the alteration of all subsequent blocks and the collusion of the network. Functionally, a blockchain can serve as “an open, distributed ledger that can record transactions between two parties efficiently and in a verifiable and permanent way. The ledger itself can also be programmed to trigger transactions automatically.”

Intel Core i9

Intel recently announced a new family of processors for enthusiasts, the Core X-series, and it’s anchored by the company’s first 18-core CPU, the i9-7980XE.

 

Intel+Core+i9+x+series.jpg

Priced at $1,999, the 7980XE is clearly not a chip you’ll see in an average desktop. Instead, it’s more of a statement from Intel. It beats out AMD’s 16-core Threadripper CPU, which was slated to be that company’s most powerful consumer processor for 2017. And it gives Intel yet another way to satisfy the demands of power-hungry users who might want to do things like play games in 4K while broadcasting them in HD over Twitch. And, as if its massive core count wasn’t enough, the i9-7980XE is also the first Intel consumer chip that packs in over a teraflop’s worth of computing power.

 

inteli9.jpg

 

If 18 cores are overkill for you, Intel also has other Core i9 Extreme Edition chips in 10-, 12-, 14- and 16-core variants. Perhaps the best news for hardware geeks: The 10 Core i9-7900X will retail for $999, a significant discount from last year’s version.

All of the i9 chips feature base clock speeds of 3.3GHz, reaching up to 4.3GHz dual-core speeds with Turbo Boost 2.0 and 4.5GHz with Turbo Boost 3.0 a new version of Turbo Boost which Intel has upgraded. The company points out that while the additional cores on the Core X models will improve multitasking performance, the addition of technologies like Turbo Boost Max 3.0 ensures that each core is also able to achieve improved performance. (Intel claims that the Core X series reaches 10 percent faster multithread performance over the previous generation and 15 percent faster single thread.)

 

 

(via Engadget, The Verge)

 

Google’s and Nvidia’s AI Chips

Google

Google will soon launch a cloud computing service that provides exclusive access to a new kind of artificial intelligence chip designed by its own engineers. CEO Sundar Pichai revealed the new chip and service this morning in Silicon Valley during his keynote at Google I/O, the company’s annual developer conference.

GoogleChip4.jpg

This new processor is a unique creation designed to both train and execute deep neural networks—machine learning systems behind the rapid evolution of everything from image and speech recognition to automated translation to robotics. Google says it will not sell the chip directly to others. Instead, through its new cloud service, set to arrive sometime before the end of the year, any business or developer can build and operate software via the internet that taps into hundreds and perhaps thousands of these processors, all packed into Google data centers.

According to Dean, Google’s new “TPU device,” which spans four chips, can handle 180 trillion floating point operations per second, or 180 teraflops, and the company uses a new form of computer networking to connect several of these chips together, creating a “TPU pod” that provides about 11,500 teraflops of computing power. In the past, Dean said, the company’s machine translation model took about a day to train on 32 state-of-the-art CPU boards. Now, it can train in about six hours using only a portion of a pod.

Nvidia

Nvidia has released a new state-of-the-art chip that pushes the limits of machine learning, the Tesla P100 GPU. It can perform deep learning neural network tasks 12 times faster than the company’s previous top-end system (The TitanX). The P100 was a huge commitment for Nvidia, costing over $2 billion in research and development, and it sports a whopping 150 billion transistors on a single chip, making the P100 the world’s largest chip, Nvidia claims. In addition to machine learning, the P100 will work for all sorts of high-performance computing tasks — Nvidia just wants you to know it’s really good at machine learning.

dgx.png

To top off the P100’s introduction, Nvidia has packed eight of them into a crazy-powerful $129,000 supercomputer called the DGX-1. This show-horse of a machine comes ready to run, with deep-learning software preinstalled. It’s shipping first to AI researchers at MIT, Stanford, UC Berkeley, and others in June. On stage, Huang called the DGX-1 “one beast of a machine.”

The competition between these upcoming AI chips and Nvidia all points to an emerging need for simply more processing power in deep learning computing. A few years ago, GPUs took off because they cut the training time for a deep learning network from months to days. Deep learning, which had been around since at least the 1950s, suddenly had real potential with GPU power behind it. But as more companies try to integrate deep learning into their products and services, they’re only going to need faster and faster chips.

 

(via Wired, Forbes, Nvidia, The Verge)