Baidu Research’s New AI Algorithm Mimics Voice With Very Few Samples

AI typically needs a plethora of data and a lot of time for something like voice cloning: it has to listen to hours of recordings. However, a new process could get that down to one minute. Baidu researchers have unveiled an upgraded version of Deep Voice, their text-to-speech synthesis system, that can now, once trained, clone any voice after listening to a few snippets of audio. This capability was enabled by learning shared and discriminative information from speakers. Baidu calls this ‘Voice Cloning’. Voice cloning is expected to have significant applications in personalizing human-machine interfaces.

Baidu

Here, Baidu focuses on two fundamental approaches (see the figure above):

  1. Speaker adaptation: This approach is based on fine-tuning a multi-speaker generative model with a few cloning samples, using backpropagation-based optimization. Adaptation can be applied to the whole model or only to the low-dimensional speaker embeddings. The latter needs far fewer parameters to represent each speaker, although it yields a longer cloning time and lower audio quality.
  2. Speaker encoding: This approach is based on training a separate model to directly infer a new speaker embedding from cloning audio, which is then used with a multi-speaker generative model. The speaker encoding model has time-and-frequency-domain processing blocks to retrieve speaker identity information from each audio sample, and attention blocks to combine them in an optimal way.
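
To make the embedding-only variant of speaker adaptation a little more concrete, here is a minimal toy sketch. It is not Deep Voice: the "generative model" is assumed to be a tiny frozen linear map (the hypothetical `FROZEN_WEIGHTS`), and the names `synthesize` and `adapt_embedding` are made up for illustration. The only thing it shows is the idea of keeping the model fixed and fitting a low-dimensional speaker embedding to a few samples by gradient descent.

```python
# Toy stand-in for a frozen multi-speaker model (an assumption for illustration):
# a fixed linear map from a 2-D speaker embedding to a 4-D feature vector.
FROZEN_WEIGHTS = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [1.0, -1.0],
]


def synthesize(embedding):
    """Produce features for a speaker embedding using the frozen weights."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in FROZEN_WEIGHTS]


def adapt_embedding(cloning_samples, steps=50, learning_rate=0.1):
    """Fit only the embedding to the cloning samples; the weights never change."""
    embedding = [0.0, 0.0]
    for _ in range(steps):
        gradient = [0.0, 0.0]
        predicted = synthesize(embedding)
        for target in cloning_samples:
            for row, p, t in zip(FROZEN_WEIGHTS, predicted, target):
                for i in range(len(embedding)):
                    # Gradient of the mean squared error with respect to the embedding.
                    gradient[i] += 2.0 * (p - t) * row[i] / len(cloning_samples)
        embedding = [e - learning_rate * g for e, g in zip(embedding, gradient)]
    return embedding


# A "new speaker" with an unknown embedding, observed through a few samples.
true_embedding = [0.5, -1.0]
samples = [synthesize(true_embedding) for _ in range(3)]
recovered = adapt_embedding(samples)
print(recovered)
```

In a real system the frozen model would be a deep network and the samples would be audio features, but the trade-off described above is the same: very few parameters to learn per speaker, at the cost of an optimization loop for every new voice.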

For detailed information and mathematical explanations, refer to the paper by Baidu Research.

However, this technology also has a possible downside: it could cause trouble for people who rely on biometric voice security.

(via MitTechReview, Wiki, BaiduResearch)

Intel Core i9

Intel recently announced a new family of processors for enthusiasts, the Core X-series, and it’s anchored by the company’s first 18-core CPU, the i9-7980XE.

 

Intel+Core+i9+x+series.jpg

Priced at $1,999, the 7980XE is clearly not a chip you’ll see in an average desktop. Instead, it’s more of a statement from Intel. It beats out AMD’s 16-core Threadripper CPU, which was slated to be that company’s most powerful consumer processor for 2017. And it gives Intel yet another way to satisfy the demands of power-hungry users who might want to do things like play games in 4K while broadcasting them in HD over Twitch. And, as if its massive core count wasn’t enough, the i9-7980XE is also the first Intel consumer chip that packs in over a teraflop’s worth of computing power.

 

inteli9.jpg

 

If 18 cores are overkill for you, Intel also has other Core i9 Extreme Edition chips in 10-, 12-, 14- and 16-core variants. Perhaps the best news for hardware geeks: the 10-core i9-7900X will retail for $999, a significant discount from last year’s version.

All of the i9 chips feature base clock speeds of 3.3GHz, reaching up to 4.3GHz dual-core speeds with Turbo Boost 2.0 and 4.5GHz with Turbo Boost Max 3.0, an upgraded version of Turbo Boost. The company points out that while the additional cores on the Core X models will improve multitasking performance, the addition of technologies like Turbo Boost Max 3.0 ensures that each core is also able to achieve improved performance. (Intel claims that the Core X series reaches 10 percent faster multithread performance over the previous generation and 15 percent faster single thread.)

 

 

(via Engadget, The Verge)

 

Google’s and Nvidia’s AI Chips

Google

Google will soon launch a cloud computing service that provides exclusive access to a new kind of artificial intelligence chip designed by its own engineers. CEO Sundar Pichai revealed the new chip and service this morning in Silicon Valley during his keynote at Google I/O, the company’s annual developer conference.

GoogleChip4.jpg

This new processor is a unique creation designed to both train and execute deep neural networks—machine learning systems behind the rapid evolution of everything from image and speech recognition to automated translation to robotics. Google says it will not sell the chip directly to others. Instead, through its new cloud service, set to arrive sometime before the end of the year, any business or developer can build and operate software via the internet that taps into hundreds and perhaps thousands of these processors, all packed into Google data centers.

According to Jeff Dean, who leads the Google Brain team, Google’s new “TPU device,” which spans four chips, can handle 180 trillion floating point operations per second, or 180 teraflops, and the company uses a new form of computer networking to connect many of these devices together, creating a “TPU pod” that provides about 11,500 teraflops of computing power. In the past, Dean said, the company’s machine translation model took about a day to train on 32 state-of-the-art CPU boards. Now, it can train in about six hours using only a portion of a pod.
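
As a quick sanity check on those figures, the short sketch below works out what they imply. The constants come straight from the paragraph above; the derived numbers (devices per pod, per-chip throughput, training speedup) are back-of-the-envelope estimates, not values quoted by Google, and "about a day" is assumed to mean 24 hours.

```python
DEVICE_TERAFLOPS = 180      # one "TPU device", as quoted above
CHIPS_PER_DEVICE = 4        # each device spans four chips
POD_TERAFLOPS = 11_500      # approximate figure for a "TPU pod"

# How many devices would a pod need to reach the quoted throughput?
devices_per_pod = round(POD_TERAFLOPS / DEVICE_TERAFLOPS)

# Implied throughput of a single chip.
teraflops_per_chip = DEVICE_TERAFLOPS / CHIPS_PER_DEVICE

# Training time: assumed 24 hours before, about six hours on part of a pod.
speedup = 24 / 6

print(f"devices per pod: about {devices_per_pod}")
print(f"chips per pod: about {devices_per_pod * CHIPS_PER_DEVICE}")
print(f"teraflops per chip: {teraflops_per_chip}")
print(f"training speedup: about {speedup:.0f}x")
```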

Nvidia

Nvidia has released a new state-of-the-art chip that pushes the limits of machine learning, the Tesla P100 GPU. It can perform deep learning neural network tasks 12 times faster than the company’s previous top-end system (the Titan X). The P100 was a huge commitment for Nvidia, costing over $2 billion in research and development, and Nvidia cites a whopping 150 billion transistors for it (a count that includes the stacked memory in the package; the GPU die itself has about 15 billion), making the P100 the world’s largest chip, Nvidia claims. In addition to machine learning, the P100 will work for all sorts of high-performance computing tasks; Nvidia just wants you to know it’s really good at machine learning.

dgx.png

To top off the P100’s introduction, Nvidia has packed eight of them into a crazy-powerful $129,000 supercomputer called the DGX-1. This show-horse of a machine comes ready to run, with deep-learning software preinstalled. It’s shipping first to AI researchers at MIT, Stanford, UC Berkeley, and others in June. On stage, Nvidia CEO Jen-Hsun Huang called the DGX-1 “one beast of a machine.”

The competition between these upcoming AI chips and Nvidia all points to an emerging need for simply more processing power in deep learning computing. A few years ago, GPUs took off because they cut the training time for a deep learning network from months to days. Deep learning, which had been around since at least the 1950s, suddenly had real potential with GPU power behind it. But as more companies try to integrate deep learning into their products and services, they’re only going to need faster and faster chips.

 

(via Wired, Forbes, Nvidia, The Verge)

 

Machine Learning Speeds Up

Cloudera and Intel are jointly speeding up machine learning with the help of Intel’s Math Kernel Library (MKL). Benchmarks demonstrate that the combined offering can process large data sets for machine learning in less time and with less hardware. This helps organizations accelerate their investments in next-generation predictive analytics.

Cloudera is the leader in Apache Spark development, training, and services. Apache Spark is advancing the art of machine learning on distributed systems with familiar tools that deliver at impressive scale. By joining forces, Cloudera and Intel are furthering a joint mission of excellence in big data management in the pursuit of better outcomes by making machine learning smarter and easier to implement.

intcloud.jpg

Predictive Maintenance

By combining Spark, Intel MKL libraries, and Intel’s optimized CPU architecture, machine learning workloads can scale quickly. As machine learning solutions get access to more data, they can provide better accuracy in delivering predictive maintenance, recommendation engines, proactive health care and monitoring, and risk and fraud detection.

“There’s a growing urgency to implement richer machine learning models to explore and solve the most pressing business problems and to impact society in a more meaningful way,” said Amr Awadallah, chief technical officer of Cloudera. “Already among our user base, machine learning is an increasingly common practice. In fact, in a recent adoption survey over 30% of respondents indicated they are leveraging Spark for machine learning.”

 

(via Technative.io)

The Poker Playing AI

Poker involves dealing with imperfect information, which makes the game very complex and more like many real-world situations. At the Rivers Casino in Pittsburgh this week, a computer program called Libratus (a Latin word meaning “balanced”) is competing in a contest that may finally prove that computers can do this better than any human card player. Libratus was created by Tuomas Sandholm, a professor in the computer science department at CMU, and his graduate student Noam Brown.

mitpoker_0.jpg

The AI is playing against some of the world’s best poker players. One of them, Dong Kim, is a high-stakes player who specializes in no-limit Texas Hold ‘Em. “It does a little bit of everything,” Kim says. It doesn’t always play the same type of hand in the same way. It may bluff with a bad hand, or not. It may bet high with a good hand, or not. That means Kim has trouble finding holes in its game. And if he does find a hole, it disappears the next day. Jason Les and Daniel McAulay, two of the other top poker players challenging the machine, describe its play in much the same way.

“The bot gets better and better every day. It’s like a tougher version of us,” said Jimmy Chou, one of the four pros battling Libratus. “The first couple of days, we had high hopes. But every time we find a weakness, it learns from us and the weakness disappears the next day.”

Libratus is playing thousands of games of heads-up, or two-player, no-limit Texas hold’em against several expert professional poker players. Now a little more than halfway through the 20-day contest, Libratus is up by almost $800,000 against its human opponents. So a victory, while far from guaranteed, may well be in the cards.

Regardless of the pure ability of the humans and the AI, it seems clear that the pros will be less effective as the tournament goes on. Ten hours of poker a day for 20 days straight against an emotionless computer is exhausting and demoralizing, even for pros like Doug Polk. And while the humans sleep at night, Libratus takes the supercomputer powering its in-game decision making and applies it to refining its overall strategy.

A win for Libratus would be a huge achievement in artificial intelligence. Poker requires reasoning and intelligence that has proven difficult for machines to imitate. It is fundamentally different from checkers, chess, or Go because an opponent’s hand remains hidden from view during play. In games of “imperfect information,” it is enormously complicated to figure out the ideal strategy given every possible approach your opponent may be taking. And no-limit Texas hold’em is especially challenging because an opponent could essentially bet any amount.

“Poker has been one of the hardest games for AI to crack,” says Andrew Ng, chief scientist at Baidu. “There is no single optimal move, but instead an AI player has to randomize its actions so as to make opponents uncertain when it is bluffing.”
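
The need to randomize can be illustrated on a much smaller game than poker. The sketch below uses regret matching, a standard textbook technique, in self-play on rock-paper-scissors; it is offered only as an illustration and is not something the sources attribute to Libratus. The function names and the small initial bias that breaks the symmetry are assumptions made for this example.

```python
# Payoff to a player choosing action a against action b.
# Actions: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def train(iterations=50000):
    """Run regret matching in self-play and return each player's average strategy."""
    # Player 0 starts with a small bias toward rock so play does not begin at equilibrium.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's current mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3)) for a in range(3)
            ]
            expected = sum(strategies[player][a] * action_values[a] for a in range(3))
            for a in range(3):
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = train()
print(average_strategies)
```

The strategy played on any single iteration keeps swinging between actions, but the average strategy settles close to playing each action a third of the time. Any fixed, predictable choice would be exploitable, which is the same reason a poker AI cannot always bluff, or never bluff, with a given hand.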

(Sources: MitTechReview, The Verge, Wired)

Hyundai’s Exo-Skeleton Suits

Hyundai is known for its reasonably priced cars, but alongside its work on electric and hybrid vehicles, the Korean automaker is also researching exoskeletons that could, in a way, give ordinary people superhuman abilities.

Hyundai has made a line of robotic suits to help paraplegic patients walk and to reduce back injuries in manual laborers. Workers piloting the device can lift objects weighing “hundreds of kilograms,” according to the company. Soldiers can also use it to pack up to 50 kilograms (110 pounds) over long distances. The drawback is they can be prohibitively expensive, but Hyundai thinks it can lower the cost of these exosuits that not only give us the ability to lift more, but can also help disabled people walk once again.

hyundai-exo08.jpg

 

exo2

The suit is a juiced up version of the H-LEX “wearable walking assistant” that Hyundai introduced last year. Unlike that lightweight version, which is worn like a suit, the fully mechanized exoskeleton “wears” you.

Hyundai says the project is part of its “Next Mobility” system “that will lead to the free movement of people and things.” In other words, the car manufacturer is angling the suits as transportation, where other companies, like Panasonic and Daewoo, see them strictly as worker aids. Like Hyundai, DARPA is building an exosuit for soldiers for its “Warrior Web” program. As companies like Ekso Bionics have shown, however, such robotic suits may have the highest potential as rehabilitation aids.

 

 

Sources: (Wired, Engadget)

Quantum computer memories of higher dimensions than a qubit

A quantum computer memory of higher dimensions has been created by scientists from the Institute of Physics and Technology of the Russian Academy of Sciences and MIPT by letting two electrons loose in a system of quantum dots. In their study published in Scientific Reports, the researchers demonstrate for the first time how quantum walks of several electrons can help implement quantum computation.

For more information: Quantum Computing

walking-electrons

Abstraction – Walking Electrons

“By studying the system with two electrons, we solved the problems faced in the general case of two identical interacting particles. This paves the way toward compact high-level quantum structures,” says Leonid Fedichkin, associate professor at MIPT’s Department of Theoretical Physics.

In a matter of hours, a quantum computer will be able to hack into the most popular cryptosystem used by web browsers. As far as more benevolent applications are concerned, a quantum computer would be capable of molecular modeling that accounts for all interactions between the particles involved. This, in turn, would enable the development of highly efficient solar cells and new drugs.

As it turns out, the unstable nature of the connection between qubits remains the major obstacle preventing the use of quantum walks of particles for quantum computation. Unlike their classical analogs, quantum structures are extremely sensitive to external noise. To prevent a system of several qubits from losing the information stored in it, liquid nitrogen (or helium) needs to be used for cooling. A research team led by Prof. Fedichkin demonstrated that a qubit could be physically implemented as a particle “taking a quantum walk” between two extremely small semiconductors known as quantum dots, which are connected by a “quantum tunnel.”

Quantum dots act like potential wells for an electron, so the position of the electron (which dot it occupies) can be used to encode the two basis states of a qubit, 0 or 1.

elec.jpg

The blue and purple dots in the diagrams are the states of the two connected qudits (qutrits and ququarts are shown in (a) and (b) respectively). Each cell in the square diagrams on the right side of each figure (a-d) represents the position of one electron (i = 0, 1, 2, … along the horizontal axis) versus the position of the other electron (j = 0, 1, 2, … along the vertical axis). The cells color-code the probability of finding the two electrons in the corresponding dots with numbers i and j when a measurement of the system is made. Warmer colors denote higher probabilities. Credit: MIPT

If an entangled state is created between several qubits, their individual states can no longer be described separately from one another, and any valid description must refer to the state of the whole system. This means that a system of three qubits has a total of eight basis states and is in a superposition of them: A|000⟩+B|001⟩+C|010⟩+D|100⟩+E|011⟩+F|101⟩+G|110⟩+H|111⟩. By influencing the system, one inevitably affects all of the eight coefficients, whereas influencing a system of regular bits only affects their individual states. By implication, n bits can store n variables, while n qubits can store 2^n variables. Qudits offer an even greater advantage since n four-level qudits (aka ququarts) can encode 4^n, or 2^n×2^n, variables. To put this into perspective, 10 ququarts store approximately 100,000 times more information than 10 bits. With greater values of n, the zeros in this number start to pile up very quickly.
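
The counting in the paragraph above is easy to reproduce. The sketch below is plain arithmetic, with a hypothetical helper name `basis_states`: n classical bits hold n variables, while n d-level quantum systems have d to the power n basis-state coefficients.

```python
def basis_states(levels: int, n: int) -> int:
    """Number of basis states (coefficients) for n quantum systems with `levels` levels each."""
    return levels ** n


n = 10
bit_variables = n                       # n classical bits store n variables
qubit_variables = basis_states(2, n)    # qubits: 2^n
ququart_variables = basis_states(4, n)  # ququarts: 4^n = 2^n * 2^n

print(f"{n} bits:     {bit_variables}")
print(f"{n} qubits:   {qubit_variables}")
print(f"{n} ququarts: {ququart_variables}")
print(f"ququarts vs bits: about {ququart_variables / bit_variables:,.0f}x")
```

For n = 10 the ratio between ququarts and bits comes out at roughly 105,000, which matches the "approximately 100,000 times" figure quoted above.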

In this study, Alexey Melnikov and Leonid Fedichkin obtain a system of two qudits implemented as two entangled electrons quantum-walking around the so-called cycle graph. The entanglement of the two electrons is caused by the mutual electrostatic repulsion experienced by like charges. More qudits can be created by connecting quantum dots in a pattern of winding paths and adding more wandering electrons. The quantum walks approach to quantum computation is convenient because it is based on a natural process.
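
To give a feel for what a quantum walk around a cycle graph looks like, here is a minimal sketch of a much simpler case than the one in the study: a single, non-interacting electron hopping between dots arranged in a ring. The unit hopping strength, the units in which ħ = 1, and the function name `walk_probabilities` are assumptions made for illustration; the two-electron system with electrostatic repulsion described above is not modeled here.

```python
import cmath
import math


def walk_probabilities(num_dots, time, start=0):
    """Probability of finding the electron at each dot of a ring after `time`.

    Assumes a continuous-time walk whose Hamiltonian is the adjacency matrix of
    the cycle graph (num_dots >= 3). Its eigenvectors are Fourier modes with
    energies 2*cos(2*pi*k/num_dots), so the evolution has a closed form.
    """
    probabilities = []
    for j in range(num_dots):
        amplitude = 0j
        for k in range(num_dots):
            energy = 2.0 * math.cos(2.0 * math.pi * k / num_dots)
            phase = -energy * time + 2.0 * math.pi * k * (j - start) / num_dots
            amplitude += cmath.exp(1j * phase)
        probabilities.append(abs(amplitude / num_dots) ** 2)
    return probabilities


for t in (0.0, 0.5, 1.0, 2.0):
    row = " ".join(f"{p:.3f}" for p in walk_probabilities(6, t))
    print(f"t={t:.1f}: {row}")
```

At time zero the electron is certainly at the starting dot; as time passes the probability spreads symmetrically in both directions around the ring, and the total always sums to one.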

So far, scientists have been unable to connect a sufficient number of qubits for the development of a quantum computer. The work of the Russian researchers brings computer science one step closer to a future when quantum computations are commonplace.

(Source: Moscow Institute of Physics and Technology, 3Tags.)

Magic Leap – The Future?

Magic Leap is a US startup that was founded by Rony Abovitz in 2010 and is working on a head-mounted virtual retinal display which superimposes 3D computer-generated imagery over real-world objects by projecting a digital light field into the user’s eye. It is attempting to construct a light-field chip using silicon photonics.

Before Magic Leap, a head-mounted display using light fields was already demonstrated by Nvidia in 2013, and the MIT Media Lab has also constructed a 3D display using “compressed light fields”; however Magic Leap asserts that it achieves better resolution with a new proprietary technique that projects an image directly onto the user’s retina. According to a researcher who has studied the company’s patents, Magic Leap is likely to use stacked silicon waveguides.

magic-leap-lens-system.png

Virtual reality overlaid on the real world in this manner is called mixed reality, or MR. (The goggles are semi-transparent, allowing you to see your actual surroundings.) It is more difficult to achieve than the classic fully immersive virtual reality, or VR, where all you see are synthetic images, and in many ways MR is the more powerful of the two technologies.

Magic Leap is not the only company creating mixed-reality technology, but right now the quality of its virtual visions exceeds all others. Because of this lead, money is pouring into this Florida office park. Google was one of the first to invest. Andreessen Horowitz, Kleiner Perkins, and others followed. In the past year, executives from most major media and tech companies have made the pilgrimage to Magic Leap’s office park to experience for themselves its futuristic synthetic reality.

mleap.jpeg

The video below was shot directly through the Magic Leap technology, without any special effects or compositing. It gives us an idea of what things look like through the Magic Leap device.

On December 9, 2015, Forbes reported on documents filed in the state of Delaware, indicating a Series C funding round of $827m. This funding round could bring the company’s total funding to $1.4 billion, and its post-money valuation to $3.7 billion.

On February 2, 2016, the Financial Times reported that Magic Leap had raised another funding round of close to $800m, valuing the startup at $4.5 billion.

On February 11, 2016, Silicon Angle reported that Magic Leap had joined the Entertainment Software Association.

In April 2016, Magic Leap acquired Israeli cyber security company NorthBit.

Magic Leap has raised $1.4 billion from a list of investors including Google and China’s Alibaba Group.

On June 16, 2016, Magic Leap announced a partnership with Disney’s Lucasfilm and its ILMxLAB R&D unit. The two companies will form a joint research lab at Lucasfilm’s San Francisco campus.

Google’s AI Translation Tool Creates Its Own Secret Language

Google’s Neural Machine Translation system, which the company’s AI team calls GNMT, went live back in September. It uses deep learning to produce better, more natural translations between languages, and it initially provided a less resource-intensive way to ingest a sentence in one language and produce that same sentence in another language. Instead of digesting each word or phrase as a standalone unit, as prior methods do, GNMT takes in the entire sentence as a whole.

GNMT’s creators were curious about something. If you teach the translation system to translate English to Korean and vice versa, and also English to Japanese and vice versa… could it translate Korean to Japanese, without resorting to English as a bridge between them? They made this helpful gif to illustrate the idea of what they call “zero-shot translation” (it’s the orange one):

translate1.gif

As it turns out — the answer is yes! It produces “reasonable” translations between two languages that it has not explicitly linked in any way. Remember, no English allowed.

But this raised the second question. If the computer is able to make connections between concepts and words that have not been formally linked… does that mean that the computer has formed a concept of shared meaning for those words, meaning at a deeper level than simply that one word or phrase is the equivalent of another?

This could mean that the computer has developed its own internal language to represent the concepts it uses to translate between other languages.

transcape.png

A visualization of the translation system’s memory when translating a single sentence in multiple directions.

In some cases, Google says its GNMT system is even approaching human-level translation accuracy. That near-parity is restricted to translations between related languages, like from English to Spanish and French. However, Google is eager to gather more data for “notoriously difficult” use cases, all of which will help its system learn and improve over time thanks to machine learning techniques. So starting today, Google is using its GNMT system for 100 percent of Chinese to English machine translations in the Google Translate mobile and web apps, accounting for around 18 million translations per day.

Google admits that its approach still has a long way to go. “GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms,” Google researchers Quoc Le and Mike Schuster explain, “and translating sentences in isolation rather than considering the context of the paragraph or page. There is still a lot of work we can do to serve our users better.” Over time the system should improve, and it may become a lot more efficient.

 

Sources: (TechCrunch, The Verge)