Weekly QuEST Discussion Topics and News, 22 Apr

QuEST 22 April 2016

We want to start this week with a discussion of a recent publication from MIT on

AI^2: Training a big data machine to defend:

Kalyan Veeramachaneni

CSAIL, MIT Cambridge, MA

Ignacio Arnaldo

PatternEx, San Jose, CA

Alfredo Cuesta-Infante, Vamsi Korrapati, Costas Bassias, Ke Li

PatternEx, San Jose, CA



  • We present an analyst-in-the-loop security system, where analyst intuition is put together with state-of-the-art machine learning to build an end-to-end active learning system.
  • The system has four key features:

–     a big data behavioral analytics platform,

–     an ensemble of outlier detection methods,

–     a mechanism to obtain feedback from security analysts,

–     and a supervised learning module.

  • When these four components are run in conjunction on a daily basis and are compared to an unsupervised outlier detection method, detection rate improves by an average of 3.41×, and false positives are reduced fivefold.
  • We validate our system with a real-world data set consisting of 3.6 billion log lines.
  • These results show that our system is capable of learning to defend against unseen attacks.

This AI^2 work was covered in one of the QuEST news articles from this week:



AI2 combs through data and detects suspicious activity using unsupervised machine-learning. It then presents this activity to human analysts, who confirm which events are actual attacks, and incorporate that feedback into its models for the next set of data.

Image: Kalyan Veeramachaneni/MIT CSAIL

System predicts 85 percent of cyber-attacks using input from human experts

Virtual artificial intelligence analyst developed by the Computer Science and Artificial Intelligence Lab and PatternEx reduces false positives by factor of 5.

Adam Conner-Simons | CSAIL
April 18, 2016


Today’s security systems usually fall into one of two categories: human or machine. So-called “analyst-driven solutions” rely on rules created by living experts and therefore miss any attacks that don’t match the rules. Meanwhile, today’s machine-learning approaches rely on “anomaly detection,” which tends to trigger false positives that both create distrust of the system and end up having to be investigated by humans, anyway.

But what if there were a solution that could merge those two worlds? What would it look like?

In a new paper, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the machine-learning startup PatternEx demonstrate an artificial intelligence platform called AI2 that predicts cyber-attacks significantly better than existing systems by continuously incorporating input from human experts. (The name comes from merging artificial intelligence with what the researchers call “analyst intuition.”)

The team showed that AI2 can detect 85 percent of attacks, which is roughly three times better than previous benchmarks, while also reducing the number of false positives by a factor of 5. The system was tested on 3.6 billion pieces of data known as “log lines,” which were generated by millions of users over a period of three months.

To predict attacks, AI2 combs through data and detects suspicious activity by clustering the data into meaningful patterns using unsupervised machine-learning. It then presents this activity to human analysts who confirm which events are actual attacks, and incorporates that feedback into its models for the next set of data.

“You can think about the system as a virtual analyst,” says CSAIL research scientist Kalyan Veeramachaneni, who developed AI2 with Ignacio Arnaldo, a chief data scientist at PatternEx and a former CSAIL postdoc. “It continuously generates new models that it can refine in as little as a few hours, meaning it can improve its detection rates significantly and rapidly.”

Veeramachaneni presented a paper about the system at last week’s IEEE International Conference on Big Data Security in New York City.

Creating cybersecurity systems that merge human- and computer-based approaches is tricky, partly because of the challenge of manually labeling cybersecurity data for the algorithms.

For example, let’s say you want to develop a computer-vision algorithm that can identify objects with high accuracy. Labeling data for that is simple: Just enlist a few human volunteers to label photos as either “objects” or “non-objects,” and feed that data into the algorithm.

But for a cybersecurity task, the average person on a crowdsourcing site like Amazon Mechanical Turk simply doesn’t have the skillset to apply labels like “DDOS” or “exfiltration attacks,” says Veeramachaneni. “You need security experts.”

That opens up another problem: Experts are busy, and they can’t spend all day reviewing reams of data that have been flagged as suspicious. Companies have been known to give up on platforms that are too much work, so an effective machine-learning system has to be able to improve itself without overwhelming its human overlords.

AI2’s secret weapon is that it fuses together three different unsupervised-learning methods, and then shows the top events to analysts for them to label. It then builds a supervised model that it can constantly refine through what the team calls a “continuous active learning system.”

Specifically, on day one of its training, AI2 picks the 200 most abnormal events and gives them to the expert. As it improves over time, it identifies more and more of the events as actual attacks, meaning that in a matter of days the analyst may only be looking at 30 or 40 events a day.
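The daily loop described above (fuse several unsupervised outlier scores, then hand the top events to an analyst for labels) can be sketched roughly as follows. This is a minimal illustration, not AI2's implementation: each event is reduced to a single number, the three detectors are simple stand-ins chosen for this sketch, the rank-averaging fusion is an assumption, and `ask_analyst` is a hypothetical callback that plays the role of the human expert. The supervised model that would be trained on the collected labels is left out.

```python
from statistics import mean, pstdev

def rank_scores(values):
    """Map raw scores to ranks in [0, 1]; a larger raw score gets a larger rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position / max(len(values) - 1, 1)
    return ranks

# Three stand-in unsupervised detectors (assumptions, not the paper's methods).
def zscore_detector(events):
    mu, sigma = mean(events), pstdev(events) or 1.0
    return [abs(x - mu) / sigma for x in events]

def median_detector(events):
    med = sorted(events)[len(events) // 2]
    return [abs(x - med) for x in events]

def neighbor_detector(events):
    # Distance from each event to its nearest other event.
    return [min(abs(x - y) for j, y in enumerate(events) if j != i)
            for i, x in enumerate(events)]

def top_k_outliers(events, detectors, k):
    """Average the per-detector ranks and return the indices of the k highest."""
    rank_lists = [rank_scores(detector(events)) for detector in detectors]
    fused = [sum(r[i] for r in rank_lists) / len(rank_lists)
             for i in range(len(events))]
    return sorted(range(len(events)), key=lambda i: fused[i], reverse=True)[:k]

def analyst_round(events, detectors, k, ask_analyst, labels):
    """Show the top-k events to the analyst and record any new labels."""
    for i in top_k_outliers(events, detectors, k):
        if i not in labels:
            labels[i] = ask_analyst(events[i])
    return labels

detectors = [zscore_detector, median_detector, neighbor_detector]
events = [10, 11, 9, 10, 12, 95, 10, 11]
labels = analyst_round(events, detectors, 2, lambda value: value > 50, {})
```

In the article's setting `k` would start at 200 and the labels gathered each day would feed the supervised model used on the next day's data.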

This week we also want to revisit the new QuEST framework, if time permits. The motivation for the twist we proposed two weeks ago came from advances some of our colleagues are making in ‘generative’ deep learning models. Generative models are also included in our ongoing DARPA-led effort on Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE). Most of the previous QuEST discussions have been associated with discriminative models. We recently discussed the DRAW (Google) system.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation. Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to continue discussing is a framework using a single cognitive system versus a dual (Stanovich/Evans) framework.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer bottom up (BU) sensory data and processing done in Sys1 (subconscious) with additional data also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues that advocate a very powerful Sys1 (example Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be run by the AFRL conscious content curation team (AC3). They will try the idea on a couple of caption-generation examples where the current deep learning system fails and produces unacceptable captions. The question becomes: ‘Is there a generative augmentation that, in a couple of understood examples, would result in a more acceptable caption?’

After that, the next experiments will be what we will call ‘Deep Sleep learning’ or maybe ‘Deep Dreaming Learning NN’. The idea is to let generative models augment the learning from training data, simulating a role dreaming may play. Training in which the BU stages receive the outputs of our generative models mimics this idea. The question becomes whether deep learning solutions that have been allowed to dream do better on ‘unexpected’ queries, that is, data that was NOT from the training set in terms of the composition of training set linguistic expressions. I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream, where the stages are provided generative model data to learn from. Then back to the training data … The composition of the dreams will be generative model constructed ‘realities’ ~ dreams.
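One way to picture the alternation between training data and dreams is the toy schedule below. It is only a sketch under strong assumptions: a per-label Gaussian stands in for the generative model, examples are `(value, label)` pairs, and the names `fit_generator`, `dream`, and `training_schedule` are made up for this illustration. The learner that would consume each batch is not shown.

```python
import random
from statistics import mean, pstdev

def fit_generator(examples):
    """Fit a per-label Gaussian (mean, std) to (value, label) pairs."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: (mean(vs), pstdev(vs)) for label, vs in by_label.items()}

def dream(generator, count, rng):
    """Sample `count` synthetic (value, label) pairs from the fitted Gaussians."""
    labels = sorted(generator)
    samples = []
    for _ in range(count):
        label = rng.choice(labels)
        mu, sigma = generator[label]
        samples.append((rng.gauss(mu, sigma), label))
    return samples

def training_schedule(real, cycles, dreams_per_cycle, seed=0):
    """Yield an 'awake' batch of real data, then a 'dream' batch, per cycle."""
    rng = random.Random(seed)
    generator = fit_generator(real)
    for _ in range(cycles):
        yield "awake", list(real)
        yield "dream", dream(generator, dreams_per_cycle, rng)

real = [(0.9, "a"), (1.1, "a"), (4.8, "b"), (5.2, "b")]
phases = list(training_schedule(real, cycles=2, dreams_per_cycle=3))
```

The experiment would then compare a learner trained only on the 'awake' batches against one trained on both kinds of batch, using queries that were not in the training set.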

Lastly the experiment we envision is a framework where in-use, when there is a query that will allow the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have the conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute), et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.
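The coarse-to-fine idea behind a Laplacian pyramid can be illustrated on a one-dimensional signal. The sketch below is a simplification and not the paper's code: pair-averaging and value-repetition stand in for the blur-and-resample operators normally used, and the stored residuals stand in for the detail that the paper's per-level GAN would generate.

```python
def downsample(signal):
    """Halve the resolution by averaging adjacent pairs (assumes even length)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def upsample(signal):
    """Double the resolution by repeating each value."""
    return [value for value in signal for _ in range(2)]

def build_pyramid(signal, levels):
    """Return (coarsest signal, residuals ordered from coarse to fine)."""
    residuals = []
    current = signal
    for _ in range(levels):
        coarse = downsample(current)
        # The residual is the detail lost when going down one level.
        residuals.append([a - b for a, b in zip(current, upsample(coarse))])
        current = coarse
    return current, residuals[::-1]

def reconstruct(coarsest, residuals):
    """Rebuild the signal coarse to fine: upsample, then add that level's detail."""
    current = coarsest
    for residual in residuals:
        current = [a + b for a, b in zip(upsample(current), residual)]
    return current

signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
coarsest, residuals = build_pyramid(signal, levels=2)
rebuilt = reconstruct(coarsest, residuals)
```

In the generative setting the loop in `reconstruct` stays the same, but each `residual` would be sampled from a model conditioned on the upsampled coarse signal rather than read from storage.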

The second article:

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Alec Radford & Luke Metz
indico Research
Boston, MA
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.
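Both articles build on the Generative Adversarial Nets approach. For reference, the standard two-player objective from the original GAN paper (Goodfellow et al., 2014) is reproduced below. The symbols are that paper's, not taken from the abstracts above: G is the generator, D the discriminator, p_data the data distribution, and p_z the noise prior.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_{z}(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]
```

The discriminator is trained to tell real samples from generated ones, while the generator is trained to produce samples the discriminator accepts as real.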

So again, the QuEST interest here is this: imagine we use the generative model, and use the data, not just the weights, to generate the data. That is, imagine that our previous idea of a conscious system that is separate from the subconscious system is wrong. Imagine one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that even as early as the visual cortex, much of the content is inferred and not measured by the visual sensing (eyes). This seems to me to be testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred rather than captured by the eyes. This could account for the 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data for populating the agent’s world model). The winning populated solution then gets processed by a bottom-up, experience-based deep learning solution.

Note that ‘blending’ is now only the competition between the top-down imagined information and the bottom-up sensory data, while the cognition is all in the bottom-up processing of the resulting world model.
