Chip Flaws: Spectre and Meltdown Vulnerabilities

Processors are central to the digital age. The device you are reading this blog on, the smartwatch you check the time on: every one of them has a processor. These processors run the processes that show your notifications, run applications, play games, and fetch your email. Because they run every essential process on your computer, these silicon chips handle extremely sensitive data, including passwords and encryption keys, the fundamental tools for keeping your computer secure.

The Spectre and Meltdown vulnerabilities, revealed a few days ago, could let attackers capture information they shouldn’t be able to access, like your passwords and keys. As a result, a flaw in a computer chip can turn into a serious security concern.



Meltdown and Spectre


So what’s Spectre?

Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim’s confidential information via a side channel to the adversary. To make programs run faster, a chip essentially guesses which instructions it will need to execute next and starts working on them early. That’s called speculative execution. While the chip is working on a guess, sensitive information is momentarily easier to access. In brief, Spectre is a vulnerability in implementations of branch prediction that affects modern microprocessors with speculative execution. It tricks programs running on a user’s operating system into accessing arbitrary locations in their own memory space.

The Spectre paper displays the attack in four essential steps:

  1. First, it shows that branch prediction logic in modern processors can be trained to reliably hit or miss based on the internal workings of a malicious program.
  2. It then goes on to show that the subsequent difference between cache hits and misses can be reliably timed so that what should have been a simple non-functional difference can, in fact, be subverted into a covert channel which extracts information from an unrelated process’s inner workings.
  3. Thirdly, the paper synthesizes the results with return-oriented programming exploits and other principles with a simple example program and a JavaScript snippet run under a sandboxing browser; in both cases, the entire address space of the victim process (i.e. the contents of a running program) is shown to be readable by simply exploiting speculative execution of conditional branches in code generated by a stock compiler or the JavaScript machinery present in an extant browser.
  4. Finally, the paper concludes by generalizing the attack to any non-functional state of the victim process. It briefly discusses even such highly non-obvious non-functional effects as bus arbitration latency.
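The first step, training the branch predictor, is easier to picture with a toy model. The sketch below assumes a single 2-bit saturating counter, which is a textbook predictor design and not necessarily what any real CPU uses; the class and variable names are made up for illustration.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter: states 0-1 predict 'not taken', 2-3 predict 'taken'."""

    def __init__(self):
        self.state = 0  # start at strongly 'not taken'

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


predictor = TwoBitPredictor()

# Training phase: the attacker keeps supplying in-bounds indexes,
# so a bounds check like "if index < size" is taken every time.
for _ in range(5):
    predictor.update(taken=True)

# Attack phase: an out-of-bounds index arrives. The real outcome is 'not taken',
# but the predictor still guesses 'taken', so a real CPU would speculatively
# run the body of the if-statement before noticing the mistake.
mispredicted = predictor.predict()          # True
predictor.update(taken=False)
still_predicts_taken = predictor.predict()  # True: one miss does not retrain it
```

This only models the prediction logic. The leak itself comes from what the speculatively executed code leaves behind in the cache.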

And What’s Meltdown?

In this form of attack, the chip is fooled into loading secured data during a speculation window in such a way that it can later be viewed by an unauthorized attacker. The attack relies upon a commonly-used, industry-wide practice that separates loading in-memory data from the process of checking permissions. Again, the industry’s conventional wisdom operated under the assumption that the entire speculative execution process was invisible, so separating these pieces wasn’t seen as a risk.

In Meltdown, a carefully crafted branch of code first arranges to execute some attack code speculatively. This code loads some secure data to which the program doesn’t ordinarily have access. Because it’s happening speculatively, the permission check on that access will happen in parallel (and not fail until the end of the speculation window), and as a consequence special internal chip memory known as a cache becomes loaded with the privileged data. Then, a carefully constructed code sequence is used to perform other memory operations based upon the value of the privileged data. While the normally observable results of these operations aren’t visible following the speculation (which ultimately is discarded), a technique known as cache side-channel analysis can be used to determine the value of the secure data.
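The sequence above can be mimicked with a toy simulation. This is not an exploit: Python cannot observe real cache timing, so the cache is modelled as a plain set and the latencies (10 and 200 cycles) are invented numbers. The secret value and function names are likewise hypothetical.

```python
SECRET_BYTE = 0x2A   # hypothetical privileged value the attacker cannot read directly
cache = set()        # simulated cache: indexes of probe-array lines currently cached


def speculative_window():
    """Simulate the transient instructions: load the secret, then touch probe[secret]."""
    value = SECRET_BYTE   # the load proceeds while the permission check is still pending
    cache.add(value)      # the dependent access pulls probe line `value` into the cache
    # The permission check now fails: architectural results are discarded,
    # but the cache line loaded above stays where it is.
    return None


def access_time(line):
    """Simulated latency in cycles: cached lines are fast, uncached lines are slow."""
    return 10 if line in cache else 200


cache.clear()             # "flush" step: start with no probe lines cached
speculative_window()
timings = [access_time(line) for line in range(256)]   # "reload" step: time every line
recovered = min(range(256), key=lambda line: timings[line])   # 0x2A
```

The program never receives the secret as a return value. It recovers it purely from which line is fast, which is the essence of cache side-channel analysis.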

The basic difference between Spectre and Meltdown is that Spectre can be used to manipulate a process into revealing its own data. Meltdown, on the other hand, can be used to read privileged memory in a process’s address space which even the process itself would normally be unable to access (on some unprotected operating systems this includes data belonging to the kernel or other processes).

(via Wiki, cnet, spectreattack, meltdownattack, redhat, wired)




Capsule Nets

A few months ago, Geoffrey Hinton and his team published two papers that introduced a completely new type of neural network based on capsules. In support of these capsule networks, the team also published an algorithm, called dynamic routing between capsules, for training them.

With Hinton’s capsule network, layers are composed not of individual artificial neurons, but rather of small groups of neurons arranged in functional pods, or “capsules.” Each capsule detects a particular attribute of the object being classified, which reduces the need for massive input data sets. This makes capsule networks a departure from the “let them teach themselves” approach of traditional neural nets.

A layer is assigned the task of verifying the presence of some characteristic, and when enough capsules are in agreement on the meaning of their input data, the layer passes on its prediction to the next layer.



Capsule Net Architecture


A capsule is a nested set of neural layers. In a regular neural network, you keep adding more layers; in a CapsNet, you add more layers inside a single layer, nesting one neural layer inside another. The state of the neurons inside a capsule captures the properties of one entity inside an image. A capsule outputs a vector: the length of the vector represents the probability that the entity exists, and its orientation represents the entity’s properties.

The vector is sent to all possible parents in the layer above. For each possible parent, the capsule computes a prediction vector by multiplying its own output by a weight matrix. The parent whose output has the largest scalar product with that prediction vector gets a stronger bond with the capsule, and the bonds with the remaining parents are weakened. This routing-by-agreement method is presented as superior to current mechanisms like max pooling, which routes based only on the strongest feature detected in the lower layer.

Apart from dynamic routing, CapsNet adds squashing to a capsule. Squashing is a non-linearity. Instead of applying it to each individual unit as you would in a CNN, you apply it to the nested set of layers, so the squashing function acts on the vector output of each capsule.
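Squashing and routing by agreement can be sketched in a few lines of plain Python. This is a simplified reading of the dynamic routing paper, not the authors’ code: the prediction vectors are passed in directly instead of being computed from weight matrices, and all function names are my own.

```python
import math


def squash(s):
    """Shrink vector s to a length below 1 while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route(u_hat, iterations=3):
    """u_hat[i][j] is child capsule i's prediction vector for parent j.
    Returns the parent outputs and the coupling coefficients of the last iteration."""
    n_children, n_parents = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    logits = [[0.0] * n_parents for _ in range(n_children)]
    for _ in range(iterations):
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_parents):
            s = [sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_children))
                 for d in range(dim)]
            outputs.append(squash(s))
        # Strengthen the bond wherever a prediction agrees with the parent's output.
        for i in range(n_children):
            for j in range(n_parents):
                logits[i][j] += sum(a * b for a, b in zip(u_hat[i][j], outputs[j]))
    return outputs, coupling


# Two children agree about parent 0 but contradict each other about parent 1.
u_hat = [[[1.0, 0.0], [0.0, 1.0]],
         [[1.0, 0.0], [0.0, -1.0]]]
outputs, coupling = route(u_hat)
```

In this toy run both children end up routing most of their output to parent 0, because that is where their predictions agree.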

So far, capsule nets have proven as adept as traditional neural nets at recognizing handwriting, and have cut the error rate in half for identifying toy cars and trucks. Impressive, but it’s just a start. The current implementation of capsule networks is, according to Hinton, slower than it will have to be in the end.


(via arxiv, medium blogs, i-programmer, bigthink)

AI Helps in Making New Materials

The machine-learning system finds patterns in materials “recipes,” even when training data is lacking. Last month, three MIT materials scientists and their colleagues published a paper describing a new artificial intelligence system that can pore through scientific papers and extract “recipes” for producing particular types of materials.

The researchers envision a database that contains materials recipes extracted from millions of papers. Scientists and engineers could enter the name of a target material and any other criteria — precursor materials, reaction conditions, fabrication processes — and pull up suggested recipes. For instance, the new system was able to identify correlations between “precursor” chemicals used in materials recipes and the crystal structures of the resulting products. The same correlations, it turned out, had been documented in the literature.

The system also relies on statistical methods that provide a natural mechanism for generating original recipes. In the paper, the researchers use this mechanism to suggest alternative recipes for known materials, and the suggestions accord well with real recipes.

Like many of the best-performing artificial-intelligence systems of the past 10 years, the MIT researchers’ new system is a so-called neural network, which learns to perform computational tasks by analyzing huge sets of training data. Traditionally, attempts to use neural networks to generate materials recipes have run up against two problems, which the researchers describe as sparsity and scarcity.

Any recipe for a material can be represented as a vector, which is essentially a long string of numbers. Each number represents a feature of the recipe, such as the concentration of a particular chemical, the solvent in which it’s dissolved, or the temperature at which a reaction takes place.  To test the system’s accuracy, however, they had to rely on the labeled data, since they had no criterion for evaluating its performance on the unlabeled data. In those tests, the system was able to identify with 99 percent accuracy the paragraphs that contained recipes and to label with 86 percent accuracy the words within those paragraphs.
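To make the idea of a recipe vector concrete, here is a tiny sketch. The feature list is entirely made up (the article does not list the features the real system uses), and it is only meant to show why such vectors end up mostly zeros.

```python
# Hypothetical feature vocabulary, for illustration only.
FEATURES = ["precursor:A", "precursor:B", "solvent:water", "solvent:ethanol", "temperature_C"]


def recipe_to_vector(recipe):
    """Map a recipe dict to a fixed-length list of numbers, one per feature."""
    return [float(recipe.get(name, 0.0)) for name in FEATURES]


recipe = {"precursor:B": 1.0, "solvent:water": 1.0, "temperature_C": 160.0}
vector = recipe_to_vector(recipe)               # [0.0, 1.0, 1.0, 0.0, 160.0]
zero_fraction = vector.count(0.0) / len(vector)  # 0.4
```

With a realistic vocabulary of thousands of chemicals and conditions, almost every entry of a single recipe would be zero.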

The research was supported by the National Science Foundation, the Natural Sciences and Engineering Research Council of Canada, the U.S. Office of Naval Research, the MIT Energy Initiative, and the U.S. Department of Energy’s Basic Energy Science Program.


(via MitTechReview, MitNews, Wiki)

Neural Networks Are Learning What (Not) To Forget

Deep learning, which makes use of various types of artificial neural networks, is the paradigm with the potential to grow AI at an exponential rate. There’s a lot of hype around neural nets, and people are busy researching them and putting them to use.

The title of this blog might seem obscure to some readers, but consider an example: humans evolved from apes, and along the way we forgot information, knowledge, and experiences that were no longer useful or needed. In particular, humans have the extraordinary ability to constantly update their memories with the most important knowledge while overwriting information that is no longer useful. The world is full of trivial experiences and of knowledge that is irrelevant or of little help to us, so we forget those things, or rather ignore them.

The same cannot be said of machines. Any skill they learn is quickly overwritten, regardless of how important it is. There is currently no reliable mechanism they can use to prioritize these skills, deciding what to remember and what to forget. That may change thanks to Rahaf Aljundi and colleagues at the University of Leuven in Belgium and at Facebook AI Research, who have shown that the approach biological systems use to learn and to forget can work with artificial neural networks too.

The key is a process known as Hebbian learning, first proposed in the 1940s by the Canadian psychologist Donald Hebb to explain the way brains learn via synaptic plasticity. Hebb’s theory can be famously summarized as “Cells that fire together wire together.”

What is Hebbian Learning?

Hebbian learning is one of the oldest learning algorithms and is based in large part on the dynamics of biological systems. A synapse between two neurons is strengthened when the neurons on either side of the synapse (input and output) have highly correlated outputs. In essence, when an input neuron fires, if it frequently leads to the firing of the output neuron, the synapse is strengthened. Following the analogy to an artificial system, the tap weight is increased with high correlation between two sequential neurons.
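In its simplest form the rule boils down to one line of arithmetic: the weight grows in proportion to the product of input and output activity. Here is a minimal sketch, assuming the plain Hebb rule with a fixed learning rate (the function name and the numbers are my own):

```python
def hebbian_update(weights, inputs, output, learning_rate=0.1):
    """Plain Hebb rule: w_i <- w_i + learning_rate * x_i * y."""
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]


weights = [0.0, 0.0]
# Input 0 fires together with the output on every step; input 1 never fires.
for _ in range(10):
    weights = hebbian_update(weights, inputs=[1.0, 0.0], output=1.0)
# weights[0] is now close to 1.0, weights[1] is still 0.0
```

Note that the plain rule only ever strengthens correlated connections; practical variants add some form of decay or normalization to keep weights bounded.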

hebbian learning

Artificial neural nets being taught what to forget

In other words, the connections between neurons grow stronger if they fire together, and these connections are therefore more difficult to break. This is how we learn: repeated synchronized firing of neurons makes the connections between them stronger and harder to overwrite. The team carried this idea over to artificial networks by measuring the outputs from a neural network and monitoring how sensitive they are to changes in the connections within the network. This gave them a sense of which network parameters are most important and should, therefore, be preserved. “When learning a new task, changes to important parameters are penalized,” said the team. They say the resulting network has “memory aware synapses.”
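Here is a rough sketch of that idea as I understand it from the paper: estimate each parameter’s importance from how sensitive the output is to it, then penalize changes in proportion to that importance. To keep it short I assume a linear model with a single output, F(x) = sum(w_i * x_i), where the sensitivity of the output to w_i is simply x_i. A real network would need backpropagated gradients, and all names below are mine.

```python
def importance(samples):
    """Omega_i = average |dF/dw_i| over the samples.
    For the linear model F(x) = sum(w_i * x_i) that derivative is x_i."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(abs(x[i]) for x in samples) / n for i in range(dim)]


def penalty(weights, old_weights, omega, strength=1.0):
    """Regularizer added to the new task's loss:
    strength * sum(Omega_i * (w_i - w_old_i)^2)."""
    return strength * sum(o * (w - w_old) ** 2
                          for o, w, w_old in zip(omega, weights, old_weights))


# Unlabeled inputs from the old task: feature 0 is always active, feature 1 never is.
old_task_inputs = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
omega = importance(old_task_inputs)     # [2.0, 0.0]
old_weights = [0.5, 0.5]

cost_important = penalty([1.5, 0.5], old_weights, omega)     # 2.0
cost_unimportant = penalty([0.5, 1.5], old_weights, omega)   # 0.0
```

Moving the weight the old task relied on is expensive, while the unused weight is free to change, so the new task gets pushed toward the parameters the old task did not need.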

Neural networks with memory aware synapses turn out to perform better in these tests than other networks. In other words, they preserve more of the original skill than networks without this ability, although the results certainly allow room for improvement.

The Paper: Memory Aware Synapses: Learning What (Not) To Forget.

(via: MitTechReview, Wikipedia, arxiv)

Evolution of Poker with AI

Artificial Intelligence is happening all around us. With many businesses integrating AI into their technologies, AI is turning up everywhere: from self-driving cars to Go-playing AI and, most surprising of all, poker! A few months ago, I shared a blog on The Poker Playing AI (recommended read!). For those who didn’t know, AI has also set foot in poker, and it’s a major achievement!

To summarize the victory of AI in poker, a beautiful infographic has been created and designed meticulously by the folks at pokersites.



Source of the Infographic ->

What is Ethereum?

With all the hype around blockchain, and with its best-known users, Bitcoin and Ethereum, all over the media and magazines, I had to write a blog on Ethereum. Briefly, blockchain is to Bitcoin what the internet is to email: a big electronic system, on top of which you can build applications.

Then what’s Ethereum? It is a public database that keeps a permanent record of digital transactions. Importantly, this database doesn’t require any central authority to maintain and secure it.


Like Bitcoin, Ethereum is a distributed public blockchain network. Although there are some significant technical differences between the two, the most important distinction to note is that Bitcoin and Ethereum differ substantially in purpose and capability. Bitcoin offers one particular application of blockchain technology, a peer to peer electronic cash system that enables online Bitcoin payments. While the Bitcoin blockchain is used to track ownership of digital currency (bitcoins), the Ethereum blockchain focuses on running the programming code of any decentralized application.

In the Ethereum blockchain, instead of mining for bitcoin, miners work to earn Ether, a type of crypto token that fuels the network. Beyond a tradeable cryptocurrency, Ether is also used by application developers to pay for transaction fees and services on the Ethereum network.

The Ethereum blockchain is essentially a transaction-based state machine, which means that on a series of specific inputs, it will transition to a new state. The first state in Ethereum is the genesis state, which means a blank state, wherein no transactions have happened on the network. After a transaction occurs, the genesis state transitions to a new state, possibly a final state denoting the current state of Ethereum. Of course, there are millions of transactions occurring concurrently, whereby these transactions are grouped in Blocks. So to simplify, a Block contains a series of transactions.
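The state-machine view fits in a few lines of code. The sketch below is a toy: the state is just a dictionary of balances, a transaction is a simple transfer, and the account names and starting balances are invented so there is something to move around. Real Ethereum state and transactions carry much more (nonces, signatures, contract storage).

```python
def apply_transaction(state, tx):
    """Return the new state after one transfer, or raise ValueError if it is invalid."""
    sender, recipient, amount = tx["from"], tx["to"], tx["amount"]
    if amount <= 0 or state.get(sender, 0) < amount:
        raise ValueError("invalid transaction")
    new_state = dict(state)   # states are never modified in place
    new_state[sender] = new_state[sender] - amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def apply_block(state, block):
    """A block is a list of transactions applied in order."""
    for tx in block:
        state = apply_transaction(state, tx)
    return state


# Balances are pre-seeded here only so the toy transfers have funds to move.
genesis = {"alice": 10, "bob": 0}
block = [{"from": "alice", "to": "bob", "amount": 4},
         {"from": "bob", "to": "carol", "amount": 1}]
state = apply_block(genesis, block)   # {"alice": 6, "bob": 3, "carol": 1}
```

Each valid transaction maps one state to the next, and a block is simply a batch of such transitions.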

To cause a transition from one state to the next, a transaction must be valid. For a transaction to be considered valid, it must go through a validation process known as mining. Mining is when a group of nodes (i.e. computers) expend their compute resources to create a block of valid transactions. In the most basic sense, a transaction is a cryptographically signed piece of instruction that is generated by an externally owned account, serialized, and then submitted to the blockchain.

Any node on the network that declares itself as a miner can attempt to create and validate a block. Lots of miners from around the world try to create and validate blocks at the same time. Each miner provides a mathematical “proof” when submitting a block to the blockchain, and this proof acts as a guarantee: if the proof exists, the block must be valid.

For a block to be added to the main blockchain, the miner must produce its proof faster than any competing miner. The process of validating each block by having a miner provide a mathematical proof is known as a “proof of work.”
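A proof of work can be illustrated with a hash puzzle: keep trying nonces until the hash of the block data has a required prefix. The sketch below uses SHA-256 and a “leading zeros” target purely as a stand-in; Ethereum’s actual mining algorithm (Ethash) is different, and the block contents here are an invented string.

```python
import hashlib


def mine(block_data, difficulty=2):
    """Find a nonce so that SHA-256(block_data + nonce) starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data, nonce, difficulty=2):
    """Checking a proof takes a single hash, however long the search took."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


nonce, digest = mine("alice->bob:4")
is_valid = verify("alice->bob:4", nonce)   # True
```

The asymmetry is the point: finding the nonce takes many attempts, while anyone can verify it instantly. Raising the difficulty makes the search exponentially longer.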

A miner who validates a new block is rewarded with a certain amount of value for doing this work. What is that value? The Ethereum blockchain uses an intrinsic digital token called “Ether.” It is listed under the code ETH and traded on cryptocurrency exchanges. It is also used to pay for transaction fees and computational services on the Ethereum network. Every time a miner proves a block, new Ether tokens are generated and awarded. Ether can be transferred between accounts and used to compensate participant nodes for computations performed.

EVM – Ethereum Virtual Machine

Ethereum is a programmable blockchain. Rather than give users a set of pre-defined operations (e.g. bitcoin transactions), Ethereum allows users to create their own operations of any complexity they wish. In this way, it serves as a platform for many different types of decentralized blockchain applications, including but not limited to cryptocurrencies.

Ethereum in the narrow sense refers to a suite of protocols that define a platform for decentralized applications. At the heart of it is the Ethereum Virtual Machine (“EVM”), which can execute code of arbitrary algorithmic complexity. In computer science terms, Ethereum is “Turing complete”. Developers can create applications that run on the EVM using friendly programming languages modeled on existing languages like JavaScript and Python.

Smart Contracts

Smart contracts are deterministic exchange mechanisms controlled by digital means that can carry out the direct transaction of value between untrusted agents. They can be used to facilitate, verify, and enforce the negotiation or performance of economically-laden procedural instructions and potentially circumvent censorship, collusion, and counter-party risk. In Ethereum, smart contracts are treated as autonomous scripts or stateful decentralized applications that are stored in the Ethereum blockchain for later execution by the EVM. Instructions embedded in Ethereum contracts are paid for in ether (or more technically “gas”) and can be implemented in a variety of Turing complete scripting languages.
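The idea of paying for every instruction in gas can be shown with a toy stack machine. The instruction set and the gas prices below are invented for illustration and have nothing to do with the real EVM opcodes or their costs.

```python
# Made-up instruction set and gas prices, for illustration only.
GAS_COST = {"PUSH": 1, "ADD": 2, "MUL": 3}


class OutOfGas(Exception):
    pass


def run(program, gas_limit):
    """Execute (opcode, argument) pairs on a stack, charging gas per instruction.
    Returns the final stack and the gas consumed."""
    stack, gas_left = [], gas_limit
    for opcode, arg in program:
        cost = GAS_COST[opcode]
        if cost > gas_left:
            raise OutOfGas(f"needed {cost} gas, only {gas_left} left")
        gas_left -= cost
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif opcode == "MUL":
            stack.append(stack.pop() * stack.pop())
    return stack, gas_limit - gas_left


# Computes (2 + 3) * 4.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result, gas_used = run(program, gas_limit=10)   # result == [20], gas_used == 8
```

Because every step costs something and the sender sets a limit, even a program with an infinite loop is guaranteed to stop once its gas runs out.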

There’s still a lot more to Ethereum, but this blog will help you get some insights before delving deeper into the utopian world of blockchain and cryptocurrency.


(sources: Wiki, Telegraph, Medium, ethdocs)

Quantum Network

We all know the hype that currently surrounds quantum computers, how they work, and the plethora of discoveries happening in quantum computing. Recommended: Quantum Computing, Quantum Computer Memories.

Recently, there has been news about a Quantum Network or rather, a Quantum Internet. So what’s a Quantum Network? Quantum networks form an important element of quantum computing and quantum communication systems. In general, quantum networks allow for the transmission of quantum information (quantum bits, also called qubits) between physically separated quantum processors. A quantum processor is a small quantum computer that is able to perform quantum logic gates on a certain number of qubits.


Being able to send qubits from one quantum processor to another allows them to be connected to form a quantum computing cluster. This is often referred to as networked quantum computing or distributed quantum computing. Here, several less powerful quantum processors are connected together by a quantum network to form one much more powerful quantum computer. This is analogous to connecting several classical computers to form a computer cluster in classical computing. Networked quantum computing offers a path towards scalability for quantum computers since more and more quantum processors can naturally be added over time to increase the overall quantum computing capabilities. In networked quantum computing, the individual quantum processors are typically separated only by short distances.

Going back a year, Chinese physicists launched the world’s first quantum satellite. Unlike the dishes that deliver your television shows, this 1,400-pound behemoth doesn’t beam radio waves. Instead, the physicists designed it to send and receive bits of information encoded in delicate photons of infrared light. It’s a test of a budding technology known as quantum communications, which experts say could be far more secure than any existing information relay system. If quantum communications were like mailing a letter, entangled photons would be kind of like the envelope: they carry the message and keep it secure. Jian-Wei Pan of the University of Science and Technology of China, who leads the research on the satellite, has said that he wants to launch more quantum satellites in the next five years.

The basic structure of a quantum network and more generally a quantum internet is analogous to classical networks. First, we have end nodes on which applications can ultimately be run. These end nodes are quantum processors of at least one qubit. Some applications of a quantum internet require quantum processors of several qubits as well as a quantum memory at the end nodes.

Second, to transport qubits from one node to another, we need communication lines. For the purpose of quantum communication, standard telecom fibers can be used. For networked quantum computing, in which quantum processors are linked at short distances, one typically employs different wavelengths depending on the exact hardware platform of the quantum processor.

Third, to make maximum use of communication infrastructure, one requires optical switches capable of delivering qubits to the intended quantum processor. These switches need to preserve quantum coherence, which makes them more challenging to realize than standard optical switches.

Finally, to transport qubits over long distances one requires a quantum repeater. Since qubits cannot be copied, classical signal amplification is not possible and a quantum repeater works in a fundamentally different way than a classical repeater.

The quantum internet could also be useful for potential quantum computing schemes, says Fu. Companies like Google and IBM are developing quantum computers to execute specific algorithms faster than any existing computer. Instead of selling people personal quantum computers, they’ve proposed putting their quantum computer in the cloud, where users would log into the quantum computer via the internet. While running their computations, they might want to transmit quantum-encrypted information between their personal computer and the cloud-based quantum computer. “Users might not want to send their information classically, where it could be eavesdropped,” Fu says. But still, this technology is almost 13 years away and surely we will witness a lot more discoveries in the coming years.

(source : MitTechReview, Wikipedia, Wired)




DCGAN stands for Deep Convolutional Generative Adversarial Network. It works in the opposite direction to a Convolutional Neural Network (CNN): a CNN transforms an image into class labels, that is, a list of probabilities, whereas a DCGAN generates an image from random parameters.



Some of you might wonder what convolutions are. Convolutions are operations, or modifications, that we perform on an image. We modify the image by sliding an image kernel (the matrix of the operation we want to perform) across it, multiplying the kernel with the pixels underneath and summing the results.

Moving on, a CNN applies many filters to extract various features from a single image. The filters are arranged in multiple layers, and the features they extract build on each other as you move deeper into the layers.


Now the typical working of a CNN is that it starts from a single RGB image on the right, and multiple filtering layers are applied to produce a larger number of smaller images.


Flow of CNNs


Image generation flow of DCGANs


Flow of DCGANs


Now, the filters that we previously mentioned are convolutional in CNNs and transposed-convolutional in DCGANs; the two work in opposite directions.


Illustration of their working

Convolution (bottom up): 3×3 blue pixels contribute to generating a single green pixel. Each of the 3×3 blue pixels is multiplied by the corresponding filter value, and the results are summed up into a single green pixel.

Transposed convolution (top down): a single green pixel contributes to generating 3×3 blue pixels. Each green pixel is multiplied by each of the 3×3 filter values, and the results from different green pixels are summed up into a single blue pixel.
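Both directions can be written out in plain Python. This is a minimal sketch with stride 1 and no padding; the DCGAN paper uses strided transposed convolutions to upsample, which this toy does not reproduce. The function names and the all-ones filter are just for illustration.

```python
def conv2d(image, kernel):
    """'Valid' convolution as used in CNNs (no kernel flip):
    each k x k patch of the input becomes one output pixel."""
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
             for c in range(cols)] for r in range(rows)]


def conv2d_transpose(image, kernel):
    """Stride-1 transposed convolution: each input pixel spreads a
    k x k patch into the output, and overlapping patches are summed."""
    k = len(kernel)
    rows, cols = len(image) + k - 1, len(image[0]) + k - 1
    out = [[0.0] * cols for _ in range(rows)]
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            for i in range(k):
                for j in range(k):
                    out[r + i][c + j] += value * kernel[i][j]
    return out


kernel = [[1.0] * 3 for _ in range(3)]    # 3x3 filter of ones
blue = [[1.0] * 3 for _ in range(3)]      # 3x3 input
green = conv2d(blue, kernel)              # [[9.0]]: nine pixels collapse into one
back = conv2d_transpose(green, kernel)    # 3x3 grid, every entry 9.0
```

The shapes are what matter here: convolution shrinks 3×3 down to 1×1, and the transposed version expands 1×1 back to 3×3. It restores the shape, not the original values.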


It is suggested that the input parameters could use a semantic structure as in the following example:


Interpretation of Input Parameters


Training Strategies:

CNN: It is trained to classify images as authentic or fake. Here, authentic images are provided as training data to the model.

DCGAN: It is trained to generate images that the CNN classifies as authentic. By trying to fool the CNN, the DCGAN learns to generate images similar to the training data.
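The two training goals pull in opposite directions, which is easiest to see in the loss functions. The sketch below uses the standard GAN losses on a single pair of classifier outputs; it assumes the CNN outputs a probability between 0 and 1 that an image is authentic, and the function names are mine.

```python
import math


def discriminator_loss(d_real, d_fake):
    """d_real and d_fake are the classifier's 'authentic' probabilities
    for a real image and a generated image. Low when it tells them apart."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the classifier is fooled (d_fake near 1)."""
    return -math.log(d_fake)


# A confident, correct classifier: good for the CNN, bad for the generator.
sharp = (discriminator_loss(0.9, 0.1), generator_loss(0.1))
# A fooled classifier: bad for the CNN, good for the generator.
fooled = (discriminator_loss(0.5, 0.9), generator_loss(0.9))
```

Training alternates between lowering the first loss by updating the CNN and lowering the second by updating the generator.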


What is Meta-Learning in Machine Learning?

Meta-learning is a subfield of machine learning where automatic learning algorithms are applied to meta-data. In brief, it means learning to learn. The main goal is to use meta-data to understand how automatic learning can become flexible in solving different kinds of learning problems, and hence to improve the performance of existing learning algorithms. In other words, the question is how effectively we can speed up the learning of our algorithms.

Meta-Learning affects the hypothesis space for the learning algorithm by either:

  • Changing the hypothesis space of the learning algorithms (hyper-parameter tuning, feature selection)
  • Changing the way the hypothesis space is searched by the learning algorithms (learning rules)

Variations of Meta-Learning: 

  • Algorithm Learning (selection) – Select learning algorithms according to the characteristics of the instance.
  • Hyper-parameter Optimization – Select hyper-parameters for learning algorithms. The choice of the hyper-parameters influences how well you learn.
  • Ensemble Methods – Learn to learn “collectively” – Bagging, Boosting, Stacked Generalization.
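Hyper-parameter optimization is the easiest of these variations to show in code. The sketch below is a deliberately tiny grid search: the “base learner” is gradient descent on a made-up one-dimensional function, and the outer loop picks the learning rate that leaves the lowest final loss. All names and candidate values are invented for the example.

```python
def train(learning_rate, steps=20):
    """Base learner: gradient descent on f(w) = (w - 3)^2, returning the final loss."""
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * 2.0 * (w - 3.0)   # gradient of (w - 3)^2 is 2 * (w - 3)
    return (w - 3.0) ** 2


def grid_search(candidates):
    """Outer loop: try each hyper-parameter value and keep the one with the lowest final loss."""
    return min(candidates, key=train)


best_rate = grid_search([0.001, 0.01, 0.1, 0.5, 1.1])   # 0.5
```

Too small a rate barely moves, too large a rate diverges, and the outer loop finds the value in between. The same pattern, with smarter search strategies, underlies most hyper-parameter tuning.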

Flexibility is very important because each learning algorithm is based on a set of assumptions about the data, its inductive bias. This means that it will only learn well if the bias matches the data in the learning problem. A learning algorithm may perform very well on one learning problem, but very badly on the next. From a non-expert point of view, this poses strong restrictions on the use of machine learning or data mining techniques, since the relationship between the learning problem (often some kind of database) and the effectiveness of different learning algorithms is not yet understood.

By using different kinds of meta-data, like properties of the learning problem, algorithm properties (like performance measures), or patterns previously derived from the data, it is possible to select, alter or combine different learning algorithms to effectively solve a given learning problem. Critiques of meta-learning approaches bear a strong resemblance to the critique of metaheuristics, which can be said to be a related problem.

Metalearning may be the most ambitious but also the most rewarding goal of machine learning. There are few limits to what a good meta-learner will learn. Where appropriate it will learn to learn by analogy, by chunking, by planning, by subgoal generation, etc.

OpenAI’s Virtual Wrestling Bots

OpenAI, a firm backed by Elon Musk, has recently revealed one of its latest developments in the field of machine learning, demonstrated using virtual sumo wrestlers.


These are the bots inside the virtual world of RoboSumo, controlled by machine learning. The bots taught themselves through trial and error using reinforcement learning, a technique inspired by the way animals learn through feedback. It has proved useful for training computers to play games and to control robots. The virtual wrestlers might look slightly ridiculous, but they are using a very clever approach to learning in a fast-changing environment while dealing with an opponent. This game and its virtual world were created at OpenAI to show how forcing AI systems to compete can spur them to become more intelligent.

However, one of the disadvantages of reinforcement learning is that it doesn’t work well in realistic situations, or where things are more dynamic. OpenAI devised a solution to this problem by creating its own reinforcement learning algorithm called proximal policy optimization (PPO), which is especially well suited to changing environments.
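The core of PPO, as described in OpenAI’s paper, is a clipped objective that stops any single update from moving the policy too far. Here is a minimal sketch of that objective for one sample, where `ratio` is the new policy’s probability of the chosen action divided by the old policy’s, and `advantage` says how much better than expected the action turned out. The function name and the default epsilon of 0.2 are my choices for the example.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO's per-sample objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


# Good action, policy already shifted a lot toward it: the gain is capped at 1.2.
capped_gain = clipped_surrogate(ratio=1.5, advantage=1.0)
# Bad action, policy already shifted away from it: no extra credit beyond 0.8.
capped_penalty = clipped_surrogate(ratio=0.5, advantage=-1.0)
# No change in policy: the objective is just the advantage.
unchanged = clipped_surrogate(ratio=1.0, advantage=2.0)
```

Once the ratio leaves the range 1 ± epsilon in the direction that would improve the objective, the objective goes flat, so there is no incentive to push the policy further in one step.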

The latest work, done in collaboration with researchers from Carnegie Mellon University and UC Berkeley, demonstrates a way for AI agents to apply what the researchers call a “meta-learning” framework. This means the agents can take what they have already learned and apply it to a new situation.

Inside the RoboSumo environment (see video above), the agents started out behaving randomly. Through thousands of iterations of trial and error, they gradually developed the ability to move and, eventually, to fight. Through further iterations, the wrestlers developed the ability to avoid each other, and even to question their own actions. This learning happened on the fly, with the agents adapting even as they wrestled each other.

Flexible learning is a very important part of human intelligence, and it will be crucial if machines are going to become capable of performing anything other than very narrow tasks in the real world. This kind of learning is very difficult to implement in machines, and the latest work is a small but significant step in that direction.


(sources: MitTechReview, OpenAI Blog, Wired)