Weekly QuEST Discussion Topics and News, 20 May

QuEST 20 May 2016

This week we finally revisit the new QuEST framework.  The original motivation for the twist we proposed several weeks ago was associated with some of our colleagues making advances in ‘generative’ deep learning models.  But we have to discuss a recent article in the Atlantic that captures the thoughts of Prof Hoffman on consciousness – but independent of the holes in his formulation/use of quantum at macro scale some of the points he is focused on are consistent with our QuEST views of consciousness – they are highlighted in Carolina Blue below:

http://www.theatlantic.com/science/archive/2016/04/the-illusion-of-reality/479559

Case Against Reality

A professor of cognitive science argues that the world is nothing like the one we experience through our senses.

David McNew / Quanta

As we go about our daily lives, we tend to assume that our perceptions—sights, sounds, textures, tastes—are an accurate portrayal of the real world. Sure, when we stop and think about it—or when we find ourselves fooled by a perceptual illusion—we realize with a jolt that what we perceive is never the world directly, but rather our brain’s best guess at what that world is like, a kind of internal simulation of an external reality. ** we might not worry as much as he does about reality – we focus on the fitness functions of stability/consistency and usefulness versus reality ~ awareness **  Still, we bank on the fact that our simulation is a reasonably decent one. If it wasn’t, wouldn’t evolution have weeded us out by now? The true reality might be forever beyond our reach, but surely our senses give us at least an inkling of what it’s really like.

Not so, says Donald D. Hoffman, a professor of cognitive science at the University of California, Irvine. Hoffman has spent the past three decades studying perception, artificial intelligence, evolutionary game theory and the brain, and his conclusion is a dramatic one: The world presented to us by our perceptions is nothing like reality. What’s more, he says, we have evolution itself to thank for this magnificent illusion, as it maximizes evolutionary fitness by driving truth to extinction.

Getting at questions about the nature of reality, and disentangling the observer from the observed, is an endeavor that straddles the boundaries of neuroscience and fundamental physics. On one side you’ll find researchers scratching their chins raw trying to understand how a three-pound lump of gray matter obeying nothing more than the ordinary laws of physics can give rise to first-person conscious experience. This is the aptly named “hard problem.”

On the other side are quantum physicists, marveling at the strange fact that quantum systems don’t seem to be definite objects localized in space until we come along to observe them. Experiment after experiment has shown—defying common sense—that if we assume that the particles that make up ordinary objects have an objective, observer-independent existence, we get the wrong answers. ** we put this in our definition of qualia/situations – all from an agent centric perspective ** The central lesson of quantum physics is clear: There are no public objects sitting out there in some preexisting space. As the physicist John Wheeler put it, “Useful as it is under ordinary circumstances to say that the world exists ‘out there’ independent of us, that view can no longer be upheld.”

We will take this news article and attempt to take the discussion down the path of generative models – and how we can use those tools to generate this ‘stable, consistent and useful perception we call consciousness – from our QuEST perspective’.

Generative models are also included in our ongoing DARPA-led effort on Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE).  Most of the previous QuEST discussions have been associated with discriminative models.  We had recently discussed the DRAW (Google) system which was generative.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to continue discussing is a framework using a single cognitive system versus a dual (Stanovich/Evans) framework – to specifically address what we’ve termed the ‘yellow-frog’ issue.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer bottom up (BU) sensory data and processing done in Sys1 (subconscious) with additional data also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues that advocate a very powerful Sys1 (example Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be by the AFRL conscious content curation team – AC3 – they will attempt to use the idea on a couple of examples of caption generation where we currently are making unacceptable captions, examples where the current deep learning system fails.  The question becomes ‘is there a generative augmentation that, in a couple of understood examples, would result in a more acceptable caption?’

After we do that the next experiments will be what we will call ‘Deep Sleep learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is we let generative models augment the learning from training data.  This simulates a role dreaming may play.  The training with the BU stages being the outputs of our generative models mimics this idea.  The question becomes would deep learning solutions having been allowed to dream do better on ‘unexpected’ queries, data that was NOT from the training set in terms of composition of training set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream where the stages are provided generative model data to learn from.  Then back to the training data …  The composition of the dreams will be generative model constructed ‘realities’ ~ dreams.
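
The train / dream / retrain cycle described above can be sketched in a few lines.  This is only a toy illustration of the data flow, not the proposed experiment: the per-class Gaussian stands in for a real generative model, the nearest-centroid classifier stands in for the deep learning system, and every name here (fit_generator, dream, train_centroids, predict) is hypothetical.  It shows where the ‘dreamed’ data enters training; it does not show that dreaming helps on unexpected queries.

```python
import random
import statistics


def fit_generator(samples_by_label):
    """Fit a per-class Gaussian; a stand-in for a learned generative model."""
    return {label: (statistics.mean(xs), statistics.pstdev(xs))
            for label, xs in samples_by_label.items()}


def dream(generator, n_per_label, rng):
    """Sample synthetic ('dreamed') examples from the fitted generator."""
    return {label: [rng.gauss(mu, sigma) for _ in range(n_per_label)]
            for label, (mu, sigma) in generator.items()}


def train_centroids(samples_by_label):
    """Train a nearest-centroid classifier (stand-in for the deep network)."""
    return {label: statistics.mean(xs) for label, xs in samples_by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


rng = random.Random(0)
real = {"a": [0.9, 1.1, 1.0, 1.2], "b": [3.8, 4.1, 4.0, 4.3]}

generator = fit_generator(real)         # awake: learn from the training data
dreams = dream(generator, 20, rng)      # off-line: generate synthetic data
augmented = {label: real[label] + dreams[label] for label in real}
centroids = train_centroids(augmented)  # retrain on real + dreamed data
```

In the experiment as described, the interesting measurement would be performance on queries outside the training set, comparing a system trained with and without the dreaming stage.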

Lastly the experiment we envision is a framework where in-use, when there is a query that will allow the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have the conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton ( Courant Institute), et al (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.
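
To make the coarse-to-fine idea concrete, here is a minimal sketch of a Laplacian pyramid on a 1-D signal.  It is an illustration under simplifying assumptions, not the paper's code: the pair-averaging downsample and sample-repeating upsample are crude stand-ins for the blur and resize operators, and the function names are hypothetical.  In the paper, each level's residual is produced by a trained generative convnet; here the residuals are computed from the real signal, which only shows what those per-level models would need to produce.

```python
def downsample(signal):
    """Halve the length by averaging adjacent pairs (crude blur + decimate)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the length by repeating each sample."""
    out = []
    for value in signal:
        out.extend([value, value])
    return out


def build_pyramid(signal, levels):
    """Return (residuals, coarsest): one high-frequency residual per level."""
    residuals = []
    current = list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        residuals.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    return residuals, current


def reconstruct(residuals, coarsest):
    """Go coarse-to-fine: upsample, then add back each level's residual."""
    current = list(coarsest)
    for residual in reversed(residuals):
        current = [u + r for u, r in zip(upsample(current), residual)]
    return current


signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
residuals, coarsest = build_pyramid(signal, levels=2)
rebuilt = reconstruct(residuals, coarsest)
```

The reconstruct loop is the part that corresponds to generation: start from a small coarse image and repeatedly upsample and add detail.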

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.
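
For reference, both papers above build on the adversarial training setup of the original GAN paper.  The standard form of that objective is reproduced below from memory of the GAN literature, not from the two abstracts: a generator G maps noise z to samples, and a discriminator D outputs the probability that its input came from the data rather than from G.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D is trained to push V up (tell real from generated), while G is trained to push it down (fool D).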

So again the QuEST interest here is this: imagine we use the generative model and use the data it generates, not just the weights that generate the data – that is, imagine that our previous idea of a conscious system that is separate from the subconscious system is wrong – imagine one system – but with processes that populate the sensory BU paths being what we call conscious and subconscious –

Imagine that, even as early as the visual cortex, much of the content is inferred and not measured by the visual sensing (eyes) – this seems to me to be testable – electrode studies could confirm/refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes – this could account for the 10:1 feedback versus feedforward connections –

Here is the implication – we take Bernard’s generative models – and have them generate additional information (competing with the bottom up sensory data for populating the agent’s world model) – and then the winning populated solution gets processed by a bottom up deep learning experience-based solution –

 

Note ‘blending’ is now only the competition of the top down imagined information and the bottom up sensory data – but the cognition is all in the bottom up processing of the resulting world model
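
One way to picture this competition is as a per-slot contest between sensed and imagined candidates.  The sketch below is purely hypothetical: the slot names, the (value, confidence) representation, and the highest-confidence-wins rule are all assumptions made for illustration, since the discussion above does not specify how the winner is chosen.

```python
def populate_world_model(bottom_up, top_down):
    """Fill each slot with whichever candidate has the higher confidence.

    Both inputs map slot name -> (value, confidence).  Ties go to the
    bottom-up (sensed) candidate.  Each output value is tagged with its
    source: "BU" for sensed, "TD" for imagined.
    """
    world_model = {}
    for slot in set(bottom_up) | set(top_down):
        sensed = bottom_up.get(slot)
        imagined = top_down.get(slot)
        if imagined is None or (sensed is not None and sensed[1] >= imagined[1]):
            world_model[slot] = ("BU", sensed[0])
        else:
            world_model[slot] = ("TD", imagined[0])
    return world_model


# Hypothetical inputs: sensor output (BU) and generative-model output (TD).
bottom_up = {"object": ("cup", 0.9), "color": ("red", 0.4)}
top_down = {"color": ("blue", 0.7), "background": ("table", 0.6)}

world_model = populate_world_model(bottom_up, top_down)
```

The resulting world model would then be handed to the single bottom-up system for all further processing, which is the point of the note above: the top-down path only supplies data.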

The next topic in the lineup is the Deep Speech 2 system by Baidu.

Deep Speech 2: End-to-End Speech Recognition in
English and Mandarin
Baidu Research – Silicon Valley AI Lab

arXiv:1512.02595v1 [cs.CL] 8 Dec 2015

  • We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech—two vastly different languages.
  • Because it replaces entire pipelines of hand-engineered components with neural networks, end-to-end learning allows us to handle a diverse variety of speech including noisy environments, accents and different languages.
  • Key to our approach is our application of HPC techniques, resulting in a 7x speedup over our previous system [26].
  • Because of this efficiency, experiments that previously took weeks now run in days.
  • This enables us to iterate more quickly to identify superior architectures and algorithms.
  • As a result, in several cases, our system is competitive with the transcription of human workers when benchmarked on standard datasets.
  • Finally, using a technique called Batch Dispatch with GPUs in the data center, we show that our system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

news summary (13)

Also as a heads up for this week, please see the attached flyer from Dr. John Flach for a workshop next week at WSU.  If you are interested please reply to him ASAP so they can get a head count for seats and food.

WSU_codesign_flyer_final.ppt

Weekly QuEST Discussion Topics and News, 11 May

This meeting will be an open wide-ranging discussion – for participants in person or on the phone we will review QuEST goals (see the What is QuEST document 2016 – cut and pasted below) and some details for people to chime in where their interests lie in advancing the goals of QuEST – we will have present in the discussion a program manager from AFOSR, an expert in machine based cognition, who can help with the discussion and it also is an opportunity for those who are currently being funded by AFOSR to explain where their work fits in the QuEST context and those who would like to have AFOSR new starts where that work would fit.

The discussion will also formulate the basis of a talk Cap has to give up at AFOSR in late May on QuEST and a deck of slides we promised to DARPA for their use in funding QuEST related efforts.

 

QUalia Exploitation of Sensing Technology (QuEST) – Cognitive Exoskeleton

 

PURPOSE

 

– QuEST is an innovative analytical and software development approach to improve human-machine team decision quality over a wide range of stimuli (handling unexpected queries) by providing computer-based decision aids that are engineered to provide both intuitive reasoning and “conscious” deliberative thinking.

 

– QuEST provides a mathematical framework to understand what can be known by a group of people and their computer-based decision aids about situations to facilitate prediction of when more people (different training) or computer aids are necessary to make a particular decision.

 

 

DISCUSSION

 

– QuEST defines a new set of processes that will be implemented in computer agents.

 

– Decision quality is dominated by the appropriate level of situation awareness.  Situation awareness is the perception of environmental elements with respect to time/space, logical connection, comprehension of their meaning, and the projection of their future status.

 

– QuEST is an approach to situation assessment (processes that are used to achieve situation awareness) and situation understanding (comprehension of the meaning of the information) integrated with each other and the decision maker’s goals.

 

– QuEST solutions help humans understand the “so what” of the data {sensemaking ~ “a motivated, continuous effort to understand connections (among people, places and events) in order to anticipate their trajectories and act effectively” for decision quality performance}.1

 

– QuEST agents implement blended dual process cognitive models (have both artificial conscious and artificial subconscious/intuition processes) for situation assessment.

 

— Artificial conscious processes implement in working memory the QuEST Theory of Consciousness (structural coherence, situation based, simulation/cognitively decoupled).

 

— Subconscious/intuition processes do not use working memory and are thus considered autonomous (do not require consciousness to act) – current approaches to data driven, artificial intelligence provide a wide range of options for implementing instantiations of capturing experiential knowledge used by these processes.

 

– QuEST is developing a ‘Theory of Knowledge’ to provide the theoretical foundations to understand what an agent or group of agents can know, which fundamentally changes human-computer decision making from an empirical effort to a scientific effort.

 

Dr. Steven Rogers/AFRL/RY/RI/528-8838/5 Jan 16/Distribution A : Cleared for Release (88ABW-2016-0324)

1 Klein, G., Moon, B. and Hoffman, R.R., “Making Sense of Sensemaking I: Alternative Perspectives,” IEEE Intelligent Systems, 21(4), Jul/Aug 2006, pp. 70-73.

news summary (11)

What Matters – some notes from Dr. Flach

April 28, 2016
See below for some notes in response to this week’s announcement from Dr. Flach, who recently authored a book titled ‘What Matters’, which you can download for free from the link below.
I think you will find much overlap between what Hoffman is saying and our book “What Matters.”
However, we would not frame it as a case ‘against reality’ but rather for the fact that experience is the only reality.  That is,
there is no reality independent of experience.  Experience isn’t the “illusion,”  rather the idea of an absolute
reality or truth independent from experience is the illusion. This is fundamental to quantum mechanics.
In chapter 2 of the book we talk about the surprise version of the 20 questions game that John Wheeler (physicist mentioned in article)
used to illustrate the fact that reality does not exist “out there” independent of our experiences. 
Note that although Hoffman has a computational model consistent with this view – the view is not new – it is exactly what William James
argued for in his Radical Empiricism (100 years ago) and also what Pirsig later described as a Metaphysics of Quality.  (note the connections with qualia).
There is also a link in the attached flyer.

Weekly QuEST Discussion Topics and News, 29 Apr

April 28, 2016

QuEST 29 April 2016

This week we finally revisit the new QuEST framework.  The original motivation for the twist we proposed several weeks ago was associated with some of our colleagues making advances in ‘generative’ deep learning models.  But we have to discuss a recent article in the Atlantic that captures the thoughts of Prof Hoffman on consciousness:

http://www.theatlantic.com/science/archive/2016/04/the-illusion-of-reality/479559

Case Against Reality

A professor of cognitive science argues that the world is nothing like the one we experience through our senses.

David McNew / Quanta

As we go about our daily lives, we tend to assume that our perceptions—sights, sounds, textures, tastes—are an accurate portrayal of the real world. Sure, when we stop and think about it—or when we find ourselves fooled by a perceptual illusion—we realize with a jolt that what we perceive is never the world directly, but rather our brain’s best guess at what that world is like, a kind of internal simulation of an external reality. Still, we bank on the fact that our simulation is a reasonably decent one. If it wasn’t, wouldn’t evolution have weeded us out by now? The true reality might be forever beyond our reach, but surely our senses give us at least an inkling of what it’s really like.

Not so, says Donald D. Hoffman, a professor of cognitive science at the University of California, Irvine. Hoffman has spent the past three decades studying perception, artificial intelligence, evolutionary game theory and the brain, and his conclusion is a dramatic one: The world presented to us by our perceptions is nothing like reality. What’s more, he says, we have evolution itself to thank for this magnificent illusion, as it maximizes evolutionary fitness by driving truth to extinction.

Getting at questions about the nature of reality, and disentangling the observer from the observed, is an endeavor that straddles the boundaries of neuroscience and fundamental physics. On one side you’ll find researchers scratching their chins raw trying to understand how a three-pound lump of gray matter obeying nothing more than the ordinary laws of physics can give rise to first-person conscious experience. This is the aptly named “hard problem.”

On the other side are quantum physicists, marveling at the strange fact that quantum systems don’t seem to be definite objects localized in space until we come along to observe them. Experiment after experiment has shown—defying common sense—that if we assume that the particles that make up ordinary objects have an objective, observer-independent existence, we get the wrong answers. The central lesson of quantum physics is clear: There are no public objects sitting out there in some preexisting space. As the physicist John Wheeler put it, “Useful as it is under ordinary circumstances to say that the world exists ‘out there’ independent of us, that view can no longer be upheld.”

We will take this news article and attempt to take the discussion down the path of generative models – and how we can use those tools to generate this ‘stable, consistent and useful perception we call consciousness – from our QuEST perspective’.

Generative models are also included in our ongoing DARPA-led effort on Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE).  Most of the previous QuEST discussions have been associated with discriminative models.  We had recently discussed the DRAW (Google) system which was generative.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to continue discussing is a framework using a single cognitive system versus a dual (Stanovich/Evans) framework – to specifically address what we’ve termed the ‘yellow-frog’ issue.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer bottom up (BU) sensory data and processing done in Sys1 (subconscious) with additional data also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues that advocate a very powerful Sys1 (example Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be by the AFRL conscious content curation team – AC3 – they will attempt to use the idea on a couple of examples of caption generation where we currently are making unacceptable captions, examples where the current deep learning system fails.  The question becomes ‘is there a generative augmentation that, in a couple of understood examples, would result in a more acceptable caption?’

After we do that the next experiments will be what we will call ‘Deep Sleep learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is we let generative models augment the learning from training data.  This simulates a role dreaming may play.  The training with the BU stages being the outputs of our generative models mimics this idea.  The question becomes would deep learning solutions having been allowed to dream do better on ‘unexpected’ queries, data that was NOT from the training set in terms of composition of training set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream where the stages are provided generative model data to learn from.  Then back to the training data …  The composition of the dreams will be generative model constructed ‘realities’ ~ dreams.

Lastly the experiment we envision is a framework where in-use, when there is a query that will allow the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have the conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton ( Courant Institute), et al (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.

So again the QuEST interest here is this: imagine we use the generative model and use the data it generates, not just the weights that generate the data – that is, imagine that our previous idea of a conscious system that is separate from the subconscious system is wrong – imagine one system – but with processes that populate the sensory BU paths being what we call conscious and subconscious –

Imagine that, even as early as the visual cortex, much of the content is inferred and not measured by the visual sensing (eyes) – this seems to me to be testable – electrode studies could confirm/refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes – this could account for the 10:1 feedback versus feedforward connections –

Here is the implication – we take Bernard’s generative models – and have them generate additional information (competing with the bottom up sensory data for populating the agent’s world model) – and then the winning populated solution gets processed by a bottom up deep learning experience-based solution –

 

Note ‘blending’ is now only the competition of the top down imagined information and the bottom up sensory data – but the cognition is all in the bottom up processing of the resulting world model

The next topic in the lineup is the Deep Speech 2 system by Baidu.

Deep Speech 2: End-to-End Speech Recognition in
English and Mandarin
Baidu Research – Silicon Valley AI Lab

arXiv:1512.02595v1 [cs.CL] 8 Dec 2015

  • We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech—two vastly different languages.
  • Because it replaces entire pipelines of hand-engineered components with neural networks, end-to-end learning allows us to handle a diverse variety of speech including noisy environments, accents and different languages.
  • Key to our approach is our application of HPC techniques, resulting in a 7x speedup over our previous system [26].
  • Because of this efficiency, experiments that previously took weeks now run in days.
  • This enables us to iterate more quickly to identify superior architectures and algorithms.
  • As a result, in several cases, our system is competitive with the transcription of human workers when benchmarked on standard datasets.
  • Finally, using a technique called Batch Dispatch with GPUs in the data center, we show that our system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.

news summary (10)

Weekly QuEST Discussion Topics and News, 22 Apr

April 21, 2016

QuEST 22 April 2016

We want to start this week with a discussion of a recent publication from MIT on

AI^2: Training a big data machine to defend:

Kalyan Veeramachaneni

CSAIL, MIT Cambridge, MA

Ignacio Arnaldo

PatternEx, San Jose, CA

Alfredo Cuesta-Infante, Vamsi Korrapati, Costas Bassias, Ke Li

PatternEx, San Jose, CA

 

Abstract:

  • We present an analyst-in-the-loop security system, where analyst intuition is put together with state-of- the-art machine learning to build an end-to-end active learning system.
  • The system has four key features:

–     a big data behavioral analytics platform,

–     an ensemble of outlier detection methods,

–     a mechanism to obtain feedback from security analysts,

–     and a supervised learning module.

  • When these four components are run in conjunction on a daily basis and are compared to an unsupervised outlier detection method, detection rate improves by an average of 3.41×, and false positives are reduced fivefold.
  • We validate our system with a real-world data set consisting of 3.6 billion log lines.
  • These results show that our system is capable of learning to defend against unseen attacks.
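
The daily loop in the abstract (score outliers, show the top few to an analyst, fold the labels into a supervised model) can be sketched as follows.  This is a heavily simplified toy, not the AI^2 system: a single z-score replaces the ensemble of outlier detectors, a one-dimensional threshold replaces the supervised learning module, the analyst is a stand-in function, and all names and numbers are hypothetical.

```python
import statistics


def outlier_scores(events):
    """Unsupervised step: absolute z-score of each event's feature value."""
    mu = statistics.mean(events)
    sigma = statistics.pstdev(events) or 1.0  # avoid dividing by zero
    return [abs(x - mu) / sigma for x in events]


def top_k(scores, k):
    """Indices of the k highest-scoring events."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def fit_threshold(labeled):
    """Supervised step: midpoint between largest benign and smallest attack."""
    attacks = [x for x, is_attack in labeled if is_attack]
    benign = [x for x, is_attack in labeled if not is_attack]
    if not attacks or not benign:
        return None  # not enough feedback yet to fit anything
    return (max(benign) + min(attacks)) / 2.0


def daily_round(events, analyst, labeled, k=2):
    """Show the k most anomalous events to the analyst, then refit."""
    for i in top_k(outlier_scores(events), k):
        labeled.append((events[i], analyst(events[i])))
    return fit_threshold(labeled)


def analyst(value):
    """Stand-in for a human expert: flags values above 50 as attacks."""
    return value > 50


labeled = []
threshold_day1 = daily_round([1, 2, 3, 2, 1, 90, 2, 40], analyst, labeled)
threshold_day2 = daily_round([3, 60, 2, 1, 2, 3, 2, 70], analyst, labeled)
```

The property the sketch tries to capture is that the analyst only ever sees k events per day, while the supervised model keeps improving as labels accumulate.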

This AI^2 work was covered in one of the QuEST news articles from this week:

 

http://news.mit.edu/2016/ai-system-predicts-85-percent-cyber-attacks-using-input-human-experts-0418

AI2 combs through data and detects suspicious activity using unsupervised machine-learning. It then presents this activity to human analysts, who confirm which events are actual attacks, and incorporate that feedback into its models for the next set of data.

Image: Kalyan Veeramachaneni/MIT CSAIL

System predicts 85 percent of cyber-attacks using input from human experts

Virtual artificial intelligence analyst developed by the Computer Science and Artificial Intelligence Lab and PatternEx reduces false positives by factor of 5.

Adam Conner-Simons | CSAIL
April 18, 2016

Today’s security systems usually fall into one of two categories: human or machine. So-called “analyst-driven solutions” rely on rules created by living experts and therefore miss any attacks that don’t match the rules. Meanwhile, today’s machine-learning approaches rely on “anomaly detection,” which tends to trigger false positives that both create distrust of the system and end up having to be investigated by humans, anyway.

But what if there were a solution that could merge those two worlds? What would it look like?

In a new paper, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the machine-learning startup PatternEx demonstrate an artificial intelligence platform called AI2 that predicts cyber-attacks significantly better than existing systems by continuously incorporating input from human experts. (The name comes from merging artificial intelligence with what the researchers call “analyst intuition.”)

The team showed that AI2 can detect 85 percent of attacks, which is roughly three times better than previous benchmarks, while also reducing the number of false positives by a factor of 5. The system was tested on 3.6 billion pieces of data known as “log lines,” which were generated by millions of users over a period of three months.

To predict attacks, AI2 combs through data and detects suspicious activity by clustering the data into meaningful patterns using unsupervised machine-learning. It then presents this activity to human analysts who confirm which events are actual attacks, and incorporates that feedback into its models for the next set of data.

“You can think about the system as a virtual analyst,” says CSAIL research scientist Kalyan Veeramachaneni, who developed AI2 with Ignacio Arnaldo, a chief data scientist at PatternEx and a former CSAIL postdoc. “It continuously generates new models that it can refine in as little as a few hours, meaning it can improve its detection rates significantly and rapidly.”

Veeramachaneni presented a paper about the system at last week’s IEEE International Conference on Big Data Security in New York City.

Creating cybersecurity systems that merge human- and computer-based approaches is tricky, partly because of the challenge of manually labeling cybersecurity data for the algorithms.

For example, let’s say you want to develop a computer-vision algorithm that can identify objects with high accuracy. Labeling data for that is simple: Just enlist a few human volunteers to label photos as either “objects” or “non-objects,” and feed that data into the algorithm.

But for a cybersecurity task, the average person on a crowdsourcing site like Amazon Mechanical Turk simply doesn’t have the skillset to apply labels like “DDOS” or “exfiltration attacks,” says Veeramachaneni. “You need security experts.”

That opens up another problem: Experts are busy, and they can’t spend all day reviewing reams of data that have been flagged as suspicious. Companies have been known to give up on platforms that are too much work, so an effective machine-learning system has to be able to improve itself without overwhelming its human overlords.

AI2’s secret weapon is that it fuses together three different unsupervised-learning methods, and then shows the top events to analysts for them to label. It then builds a supervised model that it can constantly refine through what the team calls a “continuous active learning system.”

Specifically, on day one of its training, AI2 picks the 200 most abnormal events and gives them to the expert. As it improves over time, it identifies more and more of the events as actual attacks, meaning that in a matter of days the analyst may only be looking at 30 or 40 events a day.
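
The loop described above (score events without labels, show the top few to an analyst, fold the labels into a supervised model, repeat) can be sketched roughly as follows. This is only an illustration under stated assumptions: the three detectors, the logistic model, the equal weighting, the synthetic data, and every function name here are hypothetical stand-ins, not the methods used in AI2.

```python
import numpy as np

def rank01(scores):
    """Convert raw scores to ranks in [0, 1] so detectors can be averaged."""
    return np.argsort(np.argsort(scores)) / (len(scores) - 1)

def fused_outlier_score(X, k=5):
    """Average three stand-in unsupervised detectors (assumed, not AI2's own)."""
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)
    s1 = np.abs(Z).max(axis=1)                         # largest per-feature z-score
    s2 = np.linalg.norm(Z, axis=1)                     # distance from the centroid
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    s3 = np.sort(D, axis=1)[:, 1:k + 1].mean(axis=1)   # mean distance to k nearest neighbours
    return (rank01(s1) + rank01(s2) + rank01(s3)) / 3

def with_bias(X):
    """Append a constant column so the model can learn an intercept."""
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logistic(X, y, steps=300, lr=0.1):
    """Tiny supervised model: logistic regression by gradient descent."""
    Xb, w = with_bias(X), np.zeros(X.shape[1] + 1)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def one_day(X, analyst, labels, w, budget):
    """Rank events, ask the analyst about the top `budget`, then refit."""
    score = fused_outlier_score(X)
    if w is not None:
        # Blend the unsupervised score with the supervised attack probability.
        score = 0.5 * score + 0.5 / (1.0 + np.exp(-with_bias(X) @ w))
    ranked = [int(i) for i in np.argsort(-score) if int(i) not in labels]
    for i in ranked[:budget]:
        labels[i] = analyst(i)                         # analyst returns 1 for an attack
    idx = sorted(labels)
    y = np.array([labels[i] for i in idx], dtype=float)
    if 0 < y.sum() < len(y):                           # need both classes to fit
        w = fit_logistic(X[idx], y)
    return w

# Synthetic stand-in for log-line features: 500 normal events, 10 attacks.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (500, 4)), rng.normal(6, 1, (10, 4))])
truth = np.array([0] * 500 + [1] * 10)

labels, w = {}, None
for day in range(3):
    w = one_day(X, lambda i: int(truth[i]), labels, w, budget=20)
print(len(labels), "events reviewed,", sum(labels.values()), "confirmed attacks")
```

The design point the sketch tries to capture is that the analyst only ever sees a small daily budget of events, and each day's labels change what gets ranked highest the next day.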

This week we also want to revisit the new QuEST framework if time permits.  The motivation for the twist we proposed two weeks ago was associated with some of our colleagues making advances in ‘generative’ deep learning models.  Generative models are also included in our ongoing DARPA-led effort on Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE).  Most of the previous QuEST discussions have been associated with discriminative models.  We had recently discussed the DRAW (Google) system.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to continue discussing is a framework using a single cognitive system versus a dual (Stanovich/Evans) framework.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer bottom up (BU) sensory data and processing done in Sys1 (subconscious) with additional data also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues that advocate a very powerful Sys1 (example Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be by the AFRL conscious content curation team (AC3).  They will attempt to use the idea on a couple of examples of caption generation where we currently are making unacceptable captions, examples where the current deep learning system fails.  The question becomes: ‘is there a generative augmentation that, in a couple of understood examples, would result in a more acceptable caption?’

After we do that the next experiments will be what we will call ‘Deep Sleep learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is we let generative models augment the learning from training data.  This simulates a role dreaming may play.  The training with the BU stages being the outputs of our generative models mimics this idea.  The question becomes would deep learning solutions having been allowed to dream do better on ‘unexpected’ queries, data that was NOT from the training set in terms of composition of training set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream where the stages are provided generative model data to learn from.  Then back to the training data …  The composition of the dreams will be generative model constructed ‘realities’ ~ dreams.
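
To make the awake/dream alternation concrete, here is a minimal sketch, assuming a toy two-class problem. The logistic classifier, the per-class Gaussian used as the ‘generative model’, the cycle counts, and all names are hypothetical choices for illustration only; they are not the deep learning or generative systems discussed above, and the sketch makes no claim about whether dreaming helps on unexpected queries.

```python
import numpy as np

rng = np.random.default_rng(0)

def gradient_step(w, X, y, lr=0.1):
    """One full-batch gradient step of logistic regression (bias is the last weight)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    return w - lr * Xb.T @ (p - y) / len(y)

def fit_generator(X, y):
    """Hypothetical generative model: one diagonal Gaussian per class."""
    return {c: (X[y == c].mean(axis=0), X[y == c].std(axis=0) + 1e-6) for c in (0, 1)}

def dream(gen, n):
    """Sample n labelled 'dream' examples from the generative model."""
    labels = rng.integers(0, 2, size=n)
    X = np.stack([rng.normal(*gen[int(c)]) for c in labels])
    return X, labels.astype(float)

# Toy training set: two overlapping classes in two dimensions.
X_real = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y_real = np.array([0.0] * 50 + [1.0] * 50)

gen = fit_generator(X_real, y_real)
w = np.zeros(3)
for cycle in range(20):
    for _ in range(10):                  # awake: learn from the training data
        w = gradient_step(w, X_real, y_real)
    X_dream, y_dream = dream(gen, 100)   # off-line: generate a fresh dream
    for _ in range(10):                  # asleep: learn from the generated data
        w = gradient_step(w, X_dream, y_dream)

pred = (np.hstack([X_real, np.ones((100, 1))]) @ w) > 0
accuracy = float((pred == (y_real == 1)).mean())
print("training accuracy after awake/dream cycles:", accuracy)
```

The actual experiment would compare a system trained this way against one trained only on the real data, using queries that were not in the training set.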

Lastly the experiment we envision is a framework where in-use, when there is a query that will allow the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have the conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute), et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.
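
The coarse-to-fine idea in the bullets above rests on the Laplacian pyramid: an image is stored as a small coarse image plus one residual (detail) image per level, and is rebuilt by repeatedly upsampling and adding the residual. The sketch below shows only that decomposition and reconstruction. It assumes 2x2 block averaging and nearest-neighbour upsampling as simple stand-ins for the filters, and it reuses the true residuals where the paper's approach would have a trained generative model at each level produce them.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for blur + decimate)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double the resolution by repeating each pixel (nearest neighbour)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    """Return the coarsest image and the residuals, ordered fine to coarse."""
    residuals, current = [], img
    for _ in range(levels):
        low = downsample(current)
        residuals.append(current - upsample(low))  # detail lost by downsampling
        current = low
    return current, residuals

def reconstruct(coarse, residuals):
    """Coarse-to-fine: upsample, then add back the detail for that level."""
    img = coarse
    for residual in reversed(residuals):
        img = upsample(img) + residual
    return img

rng = np.random.default_rng(0)
image = rng.random((32, 32))
coarse, residuals = build_pyramid(image, levels=3)
rebuilt = reconstruct(coarse, residuals)
print(coarse.shape, [r.shape for r in residuals])
```

In a generative setting, `reconstruct` would start from a generated coarse image and each residual would be sampled rather than stored.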

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.
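
For reference, both articles build on the adversarial training objective from the original GAN paper (reference [10] in the first article). In that formulation a generator G maps noise z to samples, a discriminator D estimates the probability that its input came from the data, and the two play the following minimax game. The symbols p_data and p_z denote the data and noise distributions.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```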

So again the QuEST interest here is this: imagine we use the generative model, and use the data, not just the weights, to generate the data.  That is, imagine that our previous idea of a conscious system that is separate from the subconscious system is wrong.  Imagine one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that, even as early as the visual cortex, much of the content is inferred and not measured by the visual sensing (eyes).  This seems to me to be testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data for populating the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

Note ‘blending’ is now only the competition of the top down imagined information and the bottom up sensory data – but the cognition is all in the bottom up processing of the resulting world model
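
One very simple way to picture this competition is sketched below, purely as an assumption for discussion: each slot of the world model is filled by whichever source, sensed or imagined, carries the higher confidence, and a single bottom-up stage then processes the result. The confidence values, the per-slot winner-take-all rule, and the function names are all hypothetical; the framework itself has not settled on any of them.

```python
import numpy as np

def populate_world_model(sensed, sensed_conf, imagined, imagined_conf):
    """Per-slot competition: the more confident source fills each slot."""
    use_imagined = imagined_conf > sensed_conf
    world = np.where(use_imagined, imagined, sensed)
    return world, use_imagined

def bottom_up_process(world, weights):
    """Placeholder for the single bottom-up (Sys1) stage: a linear readout."""
    return float(world @ weights)

# Bottom-up sensory data, with two slots poorly measured (low confidence).
sensed = np.array([0.9, 0.0, 0.4, 0.0])
sensed_conf = np.array([0.8, 0.1, 0.7, 0.0])

# Top-down imagined data from a generative model (values are made up).
imagined = np.array([0.5, 0.6, 0.5, 0.3])
imagined_conf = np.array([0.3, 0.6, 0.2, 0.5])

world, from_top_down = populate_world_model(sensed, sensed_conf, imagined, imagined_conf)
response = bottom_up_process(world, weights=np.ones(4) / 4)
print(world, from_top_down, response)
```

Note that all of the ‘cognition’ in the sketch sits in `bottom_up_process`; the top-down path only changes what that stage gets to see.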

news summary (9)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 15 Apr

April 14, 2016

QuEST 15 April 2016

For the first topic, I want to use the meeting to quickly review, for comment, the intended content of my plenary talk next week at the Defense Sensing Symposia:

The QuEST for multi-sensor big data ISR situation understanding

Steven “Cap” Rogers (a)*, Jared Culbertson (a), Mark Oxley (b), Scott Clouse (a), Bernard Abayowa (a), James Patrick (a), Erik Blasch (c), John Trumpfheller (d)

ABSTRACT

The challenges for providing war fighters with the best possible actionable information from diverse sensing modalities using advances in big-data and machine learning are addressed in this paper.  We start by presenting ISR related big-data challenges associated with the Third Offset Strategy.  Current approaches to big-data are shown to be limited with respect to reasoning / understanding.  We present a discussion of what meaning making and understanding require.  We posit that for human-machine collaborative solutions to address the requirements for the strategy a new approach, QuEST, will be required.  The requirements for developing a QuEST theory of knowledge are discussed and finally an engineering approach for achieving situation understanding is presented.

The written article has now been approved by PA for ‘A’ distribution, so for anyone who wants a copy we will post it on the VDL, or you can ask Cathy.

This week we also want to revisit the new QuEST framework.  The motivation for the twist we proposed last week was associated with some of our colleagues making advances in ‘generative’ deep learning models.  Generative models are also included in our ongoing DARPA-led effort on Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE).  Most of the previous QuEST discussions have been associated with discriminative models.  We had recently discussed the DRAW (Google) system.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to continue discussing is a framework using a single cognitive system versus a dual (Stanovich/Evans) framework.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer bottom up (BU) sensory data and processing done in Sys1 (subconscious) with additional data also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues that advocate a very powerful Sys1 (example Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be by the AFRL conscious content curation team (AC3).  They will attempt to use the idea on a couple of examples of caption generation where we currently are making unacceptable captions, examples where the current deep learning system fails.  The question becomes: ‘is there a generative augmentation that, in a couple of understood examples, would result in a more acceptable caption?’

After we do that the next experiments will be what we will call ‘Deep Sleep learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is we let generative models augment the learning from training data.  This simulates a role dreaming may play.  The training with the BU stages being the outputs of our generative models mimics this idea.  The question becomes would deep learning solutions having been allowed to dream do better on ‘unexpected’ queries, data that was NOT from the training set in terms of composition of training set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream where the stages are provided generative model data to learn from.  Then back to the training data …  The composition of the dreams will be generative model constructed ‘realities’ ~ dreams.

Lastly the experiment we envision is a framework where in-use, when there is a query that will allow the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have the conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute), et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.

So again the QuEST interest here is this: imagine we use the generative model, and use the data, not just the weights, to generate the data.  That is, imagine that our previous idea of a conscious system that is separate from the subconscious system is wrong.  Imagine one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that, even as early as the visual cortex, much of the content is inferred and not measured by the visual sensing (eyes).  This seems to me to be testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data for populating the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

Note ‘blending’ is now only the competition of the top down imagined information and the bottom up sensory data – but the cognition is all in the bottom up processing of the resulting world model

news summary (8)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 8 Apr

QuEST 8 April 2016

This week has been spent working on a new QuEST framework.  The motivation for the twist is associated with some of our colleagues making advances in ‘generative’ deep learning models.  Most of the QuEST discussions have been associated with discriminative models, other than our recent discussion of the DRAW (Google) system.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to discuss is the idea of a single cognitive system versus a dual (Stanovich/Evans) framework.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer bottom up (BU) sensory data and processing done in Sys1 (subconscious) with additional data at the appropriate level of abstraction.

Such a formalism will answer many of the concerns of some of our QuEST colleagues that advocate a very powerful Sys1 (example Robert P).  From an engineering perspective we want to discuss some staged experiments that could investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.

The first experiment will be by the AFRL conscious content curation team (AC3).  They will attempt to use the idea on a couple of examples of caption generation where we currently are making unacceptable captions, examples where the current deep learning system fails.  The question becomes: ‘is there a generative augmentation that, in a couple of understood examples, would result in a more acceptable caption?’

After we do that the next experiments will be what we will call ‘Deep Sleep learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is we let generative models augment the learning from training data.  This simulates a role dreaming may play.  The training with the BU stages being the outputs of our generative models mimics this idea.  The question becomes would deep learning solutions having been allowed to dream do better on ‘unexpected’ queries, data that was NOT from the training set in terms of composition of training set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream where the stages are provided generative model data to learn from.  Then back to the training data …  The composition of the dreams will be generative model constructed ‘realities’ ~ dreams.

Lastly the experiment we envision is a framework where in-use, when there is a query that will allow the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have the conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We have two articles on generative models we want to discuss this week that are the current best ones we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute), et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.

So again the QuEST interest here is this: imagine we use the generative model, and use the data, not just the weights, to generate the data.  That is, imagine that our previous idea of a conscious system that is separate from the subconscious system is wrong.  Imagine one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that, even as early as the visual cortex, much of the content is inferred and not measured by the visual sensing (eyes).  This seems to me to be testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data for populating the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

Note ‘blending’ is now only the competition of the top down imagined information and the bottom up sensory data – but the cognition is all in the bottom up processing of the resulting world model

news summary (7)

Categories: Uncategorized