How DeepMind’s AI taught itself to walk

DeepMind’s programmers have given the agent a set of virtual sensors (so it can tell whether it’s upright or not, for example) and then incentivize to move forward. The computer works the rest out for itself, using trial and error to come up with different ways of moving. True motor intelligence requires learning how to control and coordinate a flexible body to solve tasks in a range of complex environments. Existing attempts to control physically simulated humanoid bodies come from diverse fields, including computer animation and biomechanics.  A trend has been to use hand-crafted objectives, sometimes with motion capture data, to produce specific behaviors. However, this may require considerable engineering effort and can result in restricted behaviors or behaviors that may be difficult to repurpose for new tasks.

True motor intelligence requires learning how to control and coordinate a flexible body to solve tasks in a range of complex environments. Existing attempts to control physically simulated humanoid bodies come from diverse fields, including computer animation and biomechanics.  A trend has been to use hand-crafted objectives, sometimes with motion capture data, to produce specific behaviors. However, this may require considerable engineering effort and can result in restricted behaviors or behaviors that may be difficult to repurpose for new tasks.

 

DeepMind published 3 papers which are as follows:

Emergence of locomotion behaviors in rich environments:- 

For some AI problems, such as playing Atari or Go, the goal is easy to define – it’s winning. But describing a process such as a jog, a backflip or a jump is difficult because of accurately describing a complex behavior which is a common problem when teaching motor skills to an artificial system. DeepMind explored how sophisticated behaviors can emerge from scratch from the body interacting with the environment using only simple high-level objectives, such as moving forward without falling. DeepMind trained agents with a variety of simulated bodies to make progress across diverse terrains, which require jumping, turning and crouching. The results show that their agents developed these complex skills without receiving specific instructions, an approach that can be applied to train systems for multiple, distinct simulated bodies.

wall

Simulated planar Walker attempts to climb a wall

But how do you describe the process for performing a backflip? Or even just a jump? The difficulty of accurately describing a complex behavior is a common problem when teaching motor skills to an artificial system. In this work, we explore how sophisticated behaviors can emerge from scratch from the body interacting with the environment using only simple high-level objectives, such as moving forward without falling. Specifically, we trained agents with a variety of simulated bodies to make progress across diverse terrains, which require jumping, turning and crouching. The results show our agents develop these complex skills without receiving specific instructions, an approach that can be applied to train our systems for multiple, distinct simulated bodies. The GIFs show how this technique can lead to high-quality movements and perseverance.

Learning human behaviors from motion capture by adversarial imitation:-

walk.gif

A humanoid Walker produces human-like walking behavior

The emergent behavior described above can be very robust, but because the movements must emerge from scratch, they often do not look human-like.

DeepMind in their second paper demonstrated how to train a policy network that imitates motion capture data of human behaviors to pre-learn certain skills, such as walking, getting up from the ground, running, and turning.

Having produced behavior that looks human-like, they can tune and repurpose those behaviors to solve other tasks, like climbing stairs and navigating walled corridors.

 

Robust imitation of diverse behaviors:-

div.gif

The planar Walker on the left demonstrates a particular walking style and the agent in the right panel imitates that style using a single policy network.

The third paper proposes a neural network architecture, building on state-of-the-art generative models, that is capable of learning the relationships between different behaviors and imitating specific actions that it is shown. After training, their system can encode a single observed action and create a new novel movement based on that demonstration.After training, their system can encode a single observed action and create a new novel movement based on that demonstration.It can also switch between different kinds of behaviors despite never having seen transitions between them, for example switching between walking styles.

 

 

(via The Verge, DeepMind Blog)

 

 

MIT’s new prototype ‘3D’ Chip

Daily a plethora of data is being generated and the computing power to process this data into useful information is stalling. One of the fundamental problems being faced is the processor-memory bottleneck or the performance gap. Various methods such as caches and different software techniques have been used to eliminate this problem. But there’s another way which is, building the CPU directly into a 3D memory structure, connect them directly without any kind of motherboard traces, and compute from within the RAM itself.

3d.jpeg

 

A prototype chip built by researchers at Stanford and MIT can solve the problem by sandwiching the memory, processor and even sensors all into one unit. While current chips are made of silicon

.

rram.jpg

The researchers have developed a new 3D chip fabrication method that uses carbon nanotubes and resistive random-access memory (RRAM) cells together to create a combined nanoelectronic processor design that supports complex, 3D architecture – where traditional silicon-based chip fabrication works with 2D structures only. The team claims this makes for “the most complex nanoelectronic system ever made with emerging nanotechnologies,” creating a 3D computer architecture. Using carbon makes the whole thing possible since higher temperatures required to make a silicon CPU would damage the sensitive RRAM cells.

The 3D design is possible because these carbon nanotube circuits and RRAM memory components can be made using temperatures below 200 degrees Celsius, which is far, far less than the 1,000 degree temps needed to fabricate today’s 2D silicon transistors. Lower temperatures mean you can build an upper layer on top of another without damaging the one or ones below.

One expert cited by MIT said that this could be the answer to continuing the exponential scaling of computer power in keeping with Moore’s Law, as traditional chip methods start to run up against physical limits. It’s still in its initial phases and it would take many years till we see the actual implementation of these chips in real life.