What Matters – some notes from Dr. Flach

April 28, 2016
See below for some notes in response to this week’s announcement from Dr. Flach, who recently authored a book titled ‘What Matters’, which you can download for free from the link below.
I think you will find much overlap between what Hoffman is saying and our book “What Matters.”
However, we would not frame it as a case ‘against reality’ but rather for the fact that experience is the only reality.  That is,
there is no reality independent of experience.  Experience isn’t the “illusion,”  rather the idea of an absolute
reality or truth independent from experience is the illusion. This is fundamental to quantum mechanics.
In chapter 2 of the book we talk about the surprise version of the 20 questions game that John Wheeler (physicist mentioned in article)
used to illustrate the fact that reality does not exist “out there” independent of our experiences. 
Note that although Hoffman has a computational model consistent with this view – the view is not new – it is exactly what William James
argued for in his Radical Empiricism (100 years ago) and also what Pirsig later described as a Metaphysics of Quality.  (note the connections with qualia).
There is also a link in the attached flyer.

Weekly QuEST Discussion Topics and News, 29 Apr

April 28, 2016

QuEST 29 April 2016

This week we finally revisit the new QuEST framework.  The original motivation for the twist we proposed several weeks ago was associated with some of our colleagues making advances in ‘generative’ deep learning models.  But first we want to discuss a recent article in The Atlantic that captures the thoughts of Prof. Hoffman on consciousness:

http://www.theatlantic.com/science/archive/2016/04/the-illusion-of-reality/479559

The Case Against Reality

A professor of cognitive science argues that the world is nothing like the one we experience through our senses.


As we go about our daily lives, we tend to assume that our perceptions—sights, sounds, textures, tastes—are an accurate portrayal of the real world. Sure, when we stop and think about it—or when we find ourselves fooled by a perceptual illusion—we realize with a jolt that what we perceive is never the world directly, but rather our brain’s best guess at what that world is like, a kind of internal simulation of an external reality. Still, we bank on the fact that our simulation is a reasonably decent one. If it wasn’t, wouldn’t evolution have weeded us out by now? The true reality might be forever beyond our reach, but surely our senses give us at least an inkling of what it’s really like.

Not so, says Donald D. Hoffman, a professor of cognitive science at the University of California, Irvine. Hoffman has spent the past three decades studying perception, artificial intelligence, evolutionary game theory and the brain, and his conclusion is a dramatic one: The world presented to us by our perceptions is nothing like reality. What’s more, he says, we have evolution itself to thank for this magnificent illusion, as it maximizes evolutionary fitness by driving truth to extinction.

Getting at questions about the nature of reality, and disentangling the observer from the observed, is an endeavor that straddles the boundaries of neuroscience and fundamental physics. On one side you’ll find researchers scratching their chins raw trying to understand how a three-pound lump of gray matter obeying nothing more than the ordinary laws of physics can give rise to first-person conscious experience. This is the aptly named “hard problem.”

On the other side are quantum physicists, marveling at the strange fact that quantum systems don’t seem to be definite objects localized in space until we come along to observe them. Experiment after experiment has shown—defying common sense—that if we assume that the particles that make up ordinary objects have an objective, observer-independent existence, we get the wrong answers. The central lesson of quantum physics is clear: There are no public objects sitting out there in some preexisting space. As the physicist John Wheeler put it, “Useful as it is under ordinary circumstances to say that the world exists ‘out there’ independent of us, that view can no longer be upheld.”

We will take this news article and attempt to take the discussion down the path of generative models – and how we can use those tools to generate the stable, consistent and useful perception we call consciousness, from our QuEST perspective.

Generative models are also included in our ongoing DARPA-led effort to develop Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE).  Most of the previous QuEST discussions have been associated with discriminative models.  We had recently discussed the DRAW (Google) system, which is generative.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.
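
To make the iterative-construction point concrete, here is a minimal numpy sketch of the canvas-refinement loop at the heart of DRAW.  It is an assumption-laden toy, not the published architecture: the read/write attention and the LSTM encoder/decoder are reduced to fixed random linear maps, and the "image" is just a random vector.

```python
import numpy as np

# Toy sketch of DRAW's iterative canvas idea (not the published architecture):
# at each step the network "reads" the residual error, encodes it, samples a
# latent, decodes, and additively "writes" to a canvas that converges toward
# the target image. Real DRAW uses LSTMs and learned Gaussian attention.

rng = np.random.default_rng(0)
H = W = 28                      # assumed image size (MNIST-like)
T = 10                          # number of glimpses / write steps
z_dim, h_dim = 16, 64

x = rng.random((H * W,))        # stand-in for an input image
canvas = np.zeros(H * W)

W_enc = rng.normal(0, 0.01, (h_dim, H * W))   # placeholder "encoder"
W_mu  = rng.normal(0, 0.01, (z_dim, h_dim))
W_dec = rng.normal(0, 0.01, (H * W, z_dim))   # placeholder "writer"

for t in range(T):
    error = x - 1 / (1 + np.exp(-canvas))     # what the canvas still gets wrong
    h_enc = np.tanh(W_enc @ error)            # "read + encode" the residual
    mu = W_mu @ h_enc
    z = mu + rng.normal(size=z_dim)           # reparameterised latent sample
    canvas += W_dec @ z                       # "write": additive canvas update

reconstruction = 1 / (1 + np.exp(-canvas))    # final image after T refinements
print("reconstruction error:", float(np.mean((reconstruction - x) ** 2)))
```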

The question we want to continue discussing is whether to use a single cognitive system or a dual-system (Stanovich/Evans) framework – specifically to address what we’ve termed the ‘yellow-frog’ issue.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer the bottom-up (BU) sensory data and processing done in Sys1 (the subconscious) with additional data, also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues who advocate a very powerful Sys1 (for example, Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be by the AFRL conscious content curation team (AC3).  They will attempt to use the idea on a couple of examples of caption generation where we are currently producing unacceptable captions – examples where the current deep learning system fails.  The question becomes: is there a generative augmentation that, for a couple of well-understood examples, would result in a more acceptable caption?

After we do that, the next experiments will be what we will call ‘Deep Sleep Learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is that we let generative models augment the learning from training data, simulating a role dreaming may play: the BU stages are trained on the outputs of our generative models.  The question becomes whether deep learning solutions that have been allowed to dream do better on ‘unexpected’ queries – data that was NOT from the training set in terms of the composition of training-set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream, where the stages are provided generative-model data to learn from, then returning to the training data, and so on.  The content of the dreams will be generative-model-constructed ‘realities’ ~ dreams.
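
A minimal sketch of what such a dream-augmented training schedule might look like, under toy assumptions: the "deep" model is a logistic regressor, and real_batch and dream_batch are hypothetical stand-ins for the training set and the generative model.

```python
import numpy as np

# Toy sketch of the proposed 'Deep Dreaming Learning' schedule: alternate
# ordinary supervised training on real data with off-line "sleep" phases in
# which a generative model fabricates extra training examples.

rng = np.random.default_rng(1)
w = np.zeros(8)                                   # toy linear model

def real_batch(n=32):                             # stand-in for the training set
    X = rng.normal(size=(n, 8))
    y = (X @ np.ones(8) > 0).astype(float)
    return X, y

def dream_batch(n=32):                            # stand-in generative model:
    X = rng.normal(loc=0.5, size=(n, 8))          # samples shifted away from the
    y = (X @ np.ones(8) > 0).astype(float)        # training distribution, labelled
    return X, y                                   # by the same toy rule

def sgd_step(w, X, y, lr=0.05):
    p = 1 / (1 + np.exp(-(X @ w)))
    return w - lr * X.T @ (p - y) / len(y)

for epoch in range(20):
    X, y = real_batch()                           # "awake": learn from real data
    w = sgd_step(w, X, y)
    if epoch % 5 == 4:                            # periodic "sleep" phase
        for _ in range(3):
            Xd, yd = dream_batch()                # learn from generated realities
            w = sgd_step(w, Xd, yd)
```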

Lastly, the experiment we envision is a framework where, in use, when a query allows the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute) et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.
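
To illustrate the coarse-to-fine sampling procedure the bullets describe, here is a toy numpy sketch of a Laplacian-pyramid generator chain.  The generators are random stand-ins (assumptions), not the trained models from the paper.

```python
import numpy as np

# Sketch of LAPGAN-style sampling: a low-resolution image is generated first,
# then each pyramid level upsamples it and adds a generated residual
# ("Laplacian band") conditioned on that upsampled image.

rng = np.random.default_rng(2)

def upsample(img):                      # nearest-neighbour 2x upsample
    return img.repeat(2, axis=0).repeat(2, axis=1)

def coarse_generator(z):                # G_K: noise -> 4x4 coarse image
    return np.tanh(z.reshape(4, 4))

def residual_generator(low, z):         # G_k: (upsampled image, noise) -> residual
    return 0.1 * np.tanh(z.reshape(low.shape)) * low

img = coarse_generator(rng.normal(size=16))
for level in range(3):                  # 4x4 -> 8x8 -> 16x16 -> 32x32
    low = upsample(img)
    z = rng.normal(size=low.size)
    img = low + residual_generator(low, z)

print(img.shape)                        # (32, 32) sample built coarse-to-fine
```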

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.
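
As a concrete reference point for the DCGAN architectural constraints (all-convolutional, strided transposed convolutions, batch norm, ReLU/Tanh), here is a minimal PyTorch generator sketch; the layer widths and output resolution are illustrative assumptions rather than the exact Radford et al. configuration.

```python
import torch
import torch.nn as nn

# Minimal DCGAN-style generator: strided transposed convolutions upsample a
# latent code to an image, with batch norm and ReLU in hidden layers and Tanh
# at the output, and no fully connected hidden layers.

class Generator(nn.Module):
    def __init__(self, z_dim=100, ngf=64, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ngf * 4, 4, 1, 0, bias=False),    # 1x1 -> 4x4
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 4x4 -> 8x8
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 8x8 -> 16x16
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            nn.ConvTranspose2d(ngf, channels, 4, 2, 1, bias=False),     # 16x16 -> 32x32
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

z = torch.randn(16, 100, 1, 1)          # batch of latent codes
fake_images = Generator()(z)            # -> (16, 3, 32, 32)
```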

So again the QuEST interest here: imagine we use the generative model – the data it generates, not just its weights.  That is, imagine that our previous idea of a conscious system separate from the subconscious system is wrong.  Imagine instead one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that, even as early as the visual cortex, much of the content is inferred rather than measured by the visual sensing (the eyes).  This seems testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the roughly 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data to populate the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

 

Note that ‘blending’ is now only the competition between the top-down imagined information and the bottom-up sensory data – the cognition is all in the bottom-up processing of the resulting world model.
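
A toy sketch of that competition, under stated assumptions: the generator, sensor, confidences, and downstream classifier below are hypothetical stand-ins, used only to show top-down content and bottom-up evidence competing element-by-element to populate a world model that is then processed bottom-up.

```python
import numpy as np

# Sketch of the proposed single-system view: top-down (TD) generated content
# competes with bottom-up (BU) sensory evidence to populate the agent's world
# model, and all cognition is then ordinary BU processing of whatever won.

rng = np.random.default_rng(3)
D = 32

bu_evidence = rng.normal(size=D)                  # noisy sensory measurement
bu_confidence = np.abs(bu_evidence)               # proxy for measurement quality

td_imagined = np.tanh(rng.normal(size=D))         # generative-model "inference"
td_confidence = np.full(D, 0.6)                   # prior confidence in imagination

# 'Blending' is only this competition: each world-model element is filled by
# whichever source is more confident about it.
world_model = np.where(bu_confidence >= td_confidence, bu_evidence, td_imagined)

def bu_classifier(x):                             # stand-in for the deep BU pipeline
    return "threat" if x.mean() > 0 else "benign"

print(bu_classifier(world_model))                 # cognition runs on the blend
```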

The next topic in the lineup is the Deep Speech 2 system from Baidu.

Deep Speech 2: End-to-End Speech Recognition in
English and Mandarin
Baidu Research – Silicon Valley AI Lab

arXiv:1512.02595v1 [cs.CL] 8 Dec 2015

  • We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech—two vastly different languages.
  • Because it replaces entire pipelines of hand-engineered components with neural networks, end-to-end learning allows us to handle a diverse variety of speech including noisy environments, accents and different languages.
  • Key to our approach is our application of HPC techniques, resulting in a 7x speedup over our previous system [26].
  • Because of this efficiency, experiments that previously took weeks now run in days.
  • This enables us to iterate more quickly to identify superior architectures and algorithms.
  • As a result, in several cases, our system is competitive with the transcription of human workers when benchmarked on standard datasets.
  • Finally, using a technique called Batch Dispatch with GPUs in the data center, we show that our system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
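
For intuition about what ‘end-to-end’ means here, the sketch below wires a small convolutional/recurrent acoustic model to a CTC loss in PyTorch, so the network maps spectrogram frames straight to character probabilities without hand-engineered pipeline stages.  It is a toy illustration with assumed dimensions, not the Baidu system.

```python
import torch
import torch.nn as nn

# Toy end-to-end speech model in the spirit of Deep Speech 2: conv front end
# over spectrogram frames, recurrent layers, CTC loss (no frame alignments).

class TinySpeechNet(nn.Module):
    def __init__(self, n_mels=80, hidden=128, n_chars=29):   # 26 letters + ' + space + blank
        super().__init__()
        self.conv = nn.Conv1d(n_mels, hidden, kernel_size=5, padding=2)
        self.rnn = nn.GRU(hidden, hidden, num_layers=2, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_chars)

    def forward(self, spectrograms):              # (batch, n_mels, time)
        x = torch.relu(self.conv(spectrograms))   # (batch, hidden, time)
        x, _ = self.rnn(x.transpose(1, 2))        # (batch, time, 2*hidden)
        return self.fc(x).log_softmax(dim=-1)     # per-frame character log-probs

model = TinySpeechNet()
ctc = nn.CTCLoss(blank=0)

specs = torch.randn(4, 80, 200)                   # 4 utterances, 200 frames each
targets = torch.randint(1, 29, (4, 30))           # toy character-index transcripts
log_probs = model(specs).transpose(0, 1)          # CTC wants (time, batch, chars)
loss = ctc(log_probs, targets,
           input_lengths=torch.full((4,), 200),
           target_lengths=torch.full((4,), 30))
loss.backward()                                   # trainable end to end
```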

news summary (10)


Weekly QuEST Discussion Topics and News, 22 Apr

April 21, 2016

QuEST 22 April 2016

We want to start this week with a discussion of a recent publication from MIT on

AI^2: Training a big data machine to defend:

Kalyan Veeramachaneni

CSAIL, MIT Cambridge, MA

Ignacio Arnaldo

PatternEx, San Jose, CA

Alfredo Cuesta-Infante, Vamsi Korrapati, Costas Bassias, Ke Li

PatternEx, San Jose, CA

 

Abstract:

  • We present an analyst-in-the-loop security system, where analyst intuition is put together with state-of-the-art machine learning to build an end-to-end active learning system.
  • The system has four key features:

–     a big data behavioral analytics platform,

–     an ensemble of outlier detection methods,

–     a mechanism to obtain feedback from security analysts,

–     and a supervised learning module.

  • When these four components are run in conjunction on a daily basis and are compared to an unsupervised outlier detection method, detection rate improves by an average of 3.41x, and false positives are reduced fivefold.
  • We validate our system with a real-world data set consisting of 3.6 billion log lines.
  • These results show that our system is capable of learning to defend against unseen attacks.
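
A toy sketch of the analyst-in-the-loop cycle those bullets describe: unsupervised outlier scores rank the day’s events, an analyst labels the top of the list, and a supervised model trained on those labels is blended back in the next day.  The scoring, labeling, and model functions are hypothetical stand-ins, not the MIT/PatternEx components.

```python
import numpy as np

# Sketch of AI^2-style continuous active learning: rank events by an
# unsupervised outlier score, show the top-k to an "analyst", then fold a
# supervised model trained on those labels into the next day's ranking.

rng = np.random.default_rng(4)

def outlier_ensemble(X):                          # stand-in for the unsupervised ensemble
    return np.abs(X - X.mean(axis=0)).sum(axis=1) # crude distance-from-normal score

def analyst_labels(X):                            # stand-in for the human expert
    return (X[:, 0] > 2.0).astype(int)            # "attacks" follow a hidden rule here

labeled_X, labeled_y = [], []
k = 200                                           # events shown to the analyst per day

for day in range(5):
    X = rng.normal(size=(10_000, 8))              # one day of log-line features
    scores = outlier_ensemble(X)
    if labeled_X:                                 # add supervised scores once labels exist
        Xl, yl = np.vstack(labeled_X), np.concatenate(labeled_y)
        w = np.linalg.lstsq(Xl, yl, rcond=None)[0]
        scores = scores + 10 * (X @ w)            # blend unsupervised and supervised views
    top = np.argsort(scores)[-k:]                 # most suspicious events of the day
    labeled_X.append(X[top])
    labeled_y.append(analyst_labels(X[top]))      # analyst feedback for tomorrow's model
```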

This AI^2 work was covered in one of the QuEST news articles from this week:

 

http://news.mit.edu/2016/ai-system-predicts-85-percent-cyber-attacks-using-input-human-experts-0418

AI2 combs through data and detects suspicious activity using unsupervised machine-learning. It then presents this activity to human analysts, who confirm which events are actual attacks, and incorporate that feedback into its models for the next set of data. (Image: Kalyan Veeramachaneni/MIT CSAIL)

System predicts 85 percent of cyber-attacks using input from human experts


Virtual artificial intelligence analyst developed by the Computer Science and Artificial Intelligence Lab and PatternEx reduces false positives by factor of 5.

Adam Conner-Simons | CSAIL
April 18, 2016


Today’s security systems usually fall into one of two categories: human or machine. So-called “analyst-driven solutions” rely on rules created by living experts and therefore miss any attacks that don’t match the rules. Meanwhile, today’s machine-learning approaches rely on “anomaly detection,” which tends to trigger false positives that both create distrust of the system and end up having to be investigated by humans, anyway.

But what if there were a solution that could merge those two worlds? What would it look like?

In a new paper, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the machine-learning startup PatternEx demonstrate an artificial intelligence platform called AI2 that predicts cyber-attacks significantly better than existing systems by continuously incorporating input from human experts. (The name comes from merging artificial intelligence with what the researchers call “analyst intuition.”)

The team showed that AI2 can detect 85 percent of attacks, which is roughly three times better than previous benchmarks, while also reducing the number of false positives by a factor of 5. The system was tested on 3.6 billion pieces of data known as “log lines,” which were generated by millions of users over a period of three months.

To predict attacks, AI2 combs through data and detects suspicious activity by clustering the data into meaningful patterns using unsupervised machine-learning. It then presents this activity to human analysts who confirm which events are actual attacks, and incorporates that feedback into its models for the next set of data.

“You can think about the system as a virtual analyst,” says CSAIL research scientist Kalyan Veeramachaneni, who developed AI2 with Ignacio Arnaldo, a chief data scientist at PatternEx and a former CSAIL postdoc. “It continuously generates new models that it can refine in as little as a few hours, meaning it can improve its detection rates significantly and rapidly.”

Veeramachaneni presented a paper about the system at last week’s IEEE International Conference on Big Data Security in New York City.

Creating cybersecurity systems that merge human- and computer-based approaches is tricky, partly because of the challenge of manually labeling cybersecurity data for the algorithms.

For example, let’s say you want to develop a computer-vision algorithm that can identify objects with high accuracy. Labeling data for that is simple: Just enlist a few human volunteers to label photos as either “objects” or “non-objects,” and feed that data into the algorithm.

But for a cybersecurity task, the average person on a crowdsourcing site like Amazon Mechanical Turk simply doesn’t have the skillset to apply labels like “DDOS” or “exfiltration attacks,” says Veeramachaneni. “You need security experts.”

That opens up another problem: Experts are busy, and they can’t spend all day reviewing reams of data that have been flagged as suspicious. Companies have been known to give up on platforms that are too much work, so an effective machine-learning system has to be able to improve itself without overwhelming its human overlords.

AI2’s secret weapon is that it fuses together three different unsupervised-learning methods, and then shows the top events to analysts for them to label. It then builds a supervised model that it can constantly refine through what the team calls a “continuous active learning system.”

Specifically, on day one of its training, AI2 picks the 200 most abnormal events and gives them to the expert. As it improves over time, it identifies more and more of the events as actual attacks, meaning that in a matter of days the analyst may only be looking at 30 or 40 events a day.

This week we also want to revisit the new QuEST framework if time permits.  The motivation for the twist we proposed two weeks ago was associated with some of our colleagues making advances in ‘generative’ deep learning models.  Generative models are also included in our ongoing DARPA-led effort to develop Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE).  Most of the previous QuEST discussions have been associated with discriminative models.  We had recently discussed the DRAW (Google) system.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to continue discussing is whether to use a single cognitive system or a dual-system (Stanovich/Evans) framework.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer the bottom-up (BU) sensory data and processing done in Sys1 (the subconscious) with additional data, also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues who advocate a very powerful Sys1 (for example, Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be by the AFRL conscious content curation team (AC3).  They will attempt to use the idea on a couple of examples of caption generation where we are currently producing unacceptable captions – examples where the current deep learning system fails.  The question becomes: is there a generative augmentation that, for a couple of well-understood examples, would result in a more acceptable caption?

After we do that, the next experiments will be what we will call ‘Deep Sleep Learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is that we let generative models augment the learning from training data, simulating a role dreaming may play: the BU stages are trained on the outputs of our generative models.  The question becomes whether deep learning solutions that have been allowed to dream do better on ‘unexpected’ queries – data that was NOT from the training set in terms of the composition of training-set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream, where the stages are provided generative-model data to learn from, then returning to the training data, and so on.  The content of the dreams will be generative-model-constructed ‘realities’ ~ dreams.

Lastly, the experiment we envision is a framework where, in use, when a query allows the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute) et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.

So again the QuEST interest here: imagine we use the generative model – the data it generates, not just its weights.  That is, imagine that our previous idea of a conscious system separate from the subconscious system is wrong.  Imagine instead one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that, even as early as the visual cortex, much of the content is inferred rather than measured by the visual sensing (the eyes).  This seems testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the roughly 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data to populate the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

Note that ‘blending’ is now only the competition between the top-down imagined information and the bottom-up sensory data – the cognition is all in the bottom-up processing of the resulting world model.

news summary (9)


Weekly QuEST Discussion Topics and News, 15 Apr

April 14, 2016

QuEST 15 April 2016

The first topic: I want to use the meeting to quickly review the intended content of my plenary talk next week at the Defense Sensing Symposia, for comment:

The QuEST for multi-sensor big data ISR situation understanding

Steven “Cap” Rogers, Jared Culbertson, Mark Oxley, Scott Clouse, Bernard Abayowa, James Patrick, Erik Blasch, John Trumpfheller

ABSTRACT

The challenges for providing war fighters with the best possible actionable information from diverse sensing modalities using advances in big-data and machine learning are addressed in this paper.  We start by presenting ISR related big-data challenges associated with the Third Offset Strategy.  Current approaches to big-data are shown to be limited with respect to reasoning / understanding.  We present a discussion of what meaning making and understanding require.  We posit that for human-machine collaborative solutions to address the requirements for the strategy a new approach, QuEST, will be required.  The requirements for developing a QuEST theory of knowledge are discussed and finally an engineering approach for achieving situation understanding is presented.

The written article has now been approved by PA for ‘A’ distribution – so for anyone who wants a copy, we will post it on the VDL, or ask Cathy.

This week we also want to revisit the new QuEST framework.  The motivation for the twist we proposed last week was associated with some of our colleagues making advances in ‘generative’ deep learning models.  Generative models are also included in our ongoing DARPA-led effort to develop Synthetic Aperture Radar Target Recognition and Adaption in Contested Environments (TRACE).  Most of the previous QuEST discussions have been associated with discriminative models.  We had recently discussed the DRAW (Google) system.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to continue discussing is whether to use a single cognitive system or a dual-system (Stanovich/Evans) framework.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer the bottom-up (BU) sensory data and processing done in Sys1 (the subconscious) with additional data, also generated at the appropriate level of abstraction.

Such a formalism will address many of the concerns of some of our QuEST colleagues who advocate a very powerful Sys1 (for example, Robert P).  From an engineering perspective we discussed some staged experiments that will investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.  The team is maturing these ideas and we will open the floor again for comments.

The first experiment will be by the AFRL conscious content curation team (AC3).  They will attempt to use the idea on a couple of examples of caption generation where we are currently producing unacceptable captions – examples where the current deep learning system fails.  The question becomes: is there a generative augmentation that, for a couple of well-understood examples, would result in a more acceptable caption?

After we do that, the next experiments will be what we will call ‘Deep Sleep Learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is that we let generative models augment the learning from training data, simulating a role dreaming may play: the BU stages are trained on the outputs of our generative models.  The question becomes whether deep learning solutions that have been allowed to dream do better on ‘unexpected’ queries – data that was NOT from the training set in terms of the composition of training-set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream, where the stages are provided generative-model data to learn from, then returning to the training data, and so on.  The content of the dreams will be generative-model-constructed ‘realities’ ~ dreams.

Lastly, the experiment we envision is a framework where, in use, when a query allows the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We also still have two articles on generative models we want to discuss that we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute) et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.

So again the QuEST interest here: imagine we use the generative model – the data it generates, not just its weights.  That is, imagine that our previous idea of a conscious system separate from the subconscious system is wrong.  Imagine instead one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that, even as early as the visual cortex, much of the content is inferred rather than measured by the visual sensing (the eyes).  This seems testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the roughly 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data to populate the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

 

Note that ‘blending’ is now only the competition between the top-down imagined information and the bottom-up sensory data – the cognition is all in the bottom-up processing of the resulting world model.

news summary (8)


Weekly QuEST Discussion Topics and News, 8 Apr

QuEST 8 April 2016

This week has been spent working on a new QuEST framework.  The motivation for the twist is associated with some of our colleagues making advances in ‘generative’ deep learning models.  Most of the QuEST discussions have been associated with discriminative models, other than our recent discussion of the DRAW (Google) system.

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.  Recall that DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

The question we want to discuss is the idea of a single cognitive system versus a dual (Stanovich/Evans) framework.  In the new proposed formulation the role of consciousness is NOT to provide a second deliberative cognitive system but to complement / infer bottom up (BU) sensory data and processing done in Sys1 (subconscious) with additional data at the appropriate level of abstraction.

Such a formalism will answer many of the concerns of some of our QuEST colleagues who advocate a very powerful Sys1 (for example, Robert P).  From an engineering perspective we want to discuss some staged experiments that could investigate the potential power of the new approach using generative models for the ‘conscious’ augmentation of sensory BU processing.

The first experiment will be by the AFRL conscious content curation team (AC3).  They will attempt to use the idea on a couple of examples of caption generation where we currently are producing unacceptable captions – examples where the current deep learning system fails.  The question becomes: is there a generative augmentation that, for a couple of well-understood examples, would result in a more acceptable caption?

After we do that, the next experiments will be what we will call ‘Deep Sleep Learning’ or maybe ‘Deep Dreaming Learning NN’.  The idea is that we let generative models augment the learning from training data, simulating a role dreaming may play: the BU stages are trained on the outputs of our generative models.  The question becomes whether deep learning solutions that have been allowed to dream do better on ‘unexpected’ queries – data that was NOT from the training set in terms of the composition of training-set linguistic expressions.  I would envision training using our current best deep learning algorithms, then allowing the system to go off-line and dream, where the stages are provided generative-model data to learn from, then returning to the training data, and so on.  The content of the dreams will be generative-model-constructed ‘realities’ ~ dreams.

Lastly, the experiment we envision is a framework where, in use, when a query allows the online intervention of the ‘conscious’ augmentation (think of the Hammond formalism on dimensionality, time to respond, …), we have conscious top-down (TD) conditioning of the BU subconscious Sys1 deliberation.

We have two articles on generative models we want to discuss this week that are the current best ones we’ve reviewed for use in the conscious TD framework – both related to generative adversarial networks:

Deep Generative Image Models using a
Laplacian Pyramid of Adversarial Networks
Denton (Courant Institute) et al. (Facebook)

arXiv:1506.05751v1 [cs.CV] 18 Jun 2015

  • In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
  • Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion.
  • At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach [10].
  • Samples drawn from our model are of significantly higher quality than alternate approaches.
  • In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model.
  • We also show samples from models trained on the higher resolution images of the LSUN scene dataset.

The second article:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.

So again the QuEST interest here: imagine we use the generative model – the data it generates, not just its weights.  That is, imagine that our previous idea of a conscious system separate from the subconscious system is wrong.  Imagine instead one system, but with the processes that populate the sensory BU paths being what we call conscious and subconscious.

Imagine that, even as early as the visual cortex, much of the content is inferred rather than measured by the visual sensing (the eyes).  This seems testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the roughly 10:1 ratio of feedback to feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data to populate the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

 

Note that ‘blending’ is now only the competition between the top-down imagined information and the bottom-up sensory data – the cognition is all in the bottom-up processing of the resulting world model.

news summary (7)


Weekly QuEST Discussion Topics and News, 1 Apr

March 31, 2016

QuEST April 1, 2016

The topic this week is a discussion about the unexpected query – specifically ‘zero-shot learning’.  We will use an article by Socher / Manning / Ng from NIPS 2013:

Zero-Shot Learning Through Cross-Modal Transfer
Richard Socher, Milind Ganjoo, Christopher D. Manning, Andrew Y. Ng


  • This work introduces a model that can recognize objects in images even if no training data is available for the object class.
  • The only necessary knowledge about unseen visual categories comes from unsupervised text corpora.

This relates to the question of the unexpected query – but unexpected with respect to the image classification system, not to the word / text processing system – so it is a sort of transfer learning issue: transfer between systems.

  • Unlike previous zero-shot learning models, which can only differentiate between unseen classes, our model can operate on a mixture of seen and unseen classes, simultaneously obtaining state of the art performance on classes with thousands of training images and reasonable performance on unseen classes.
  • This is achieved by seeing the distributions of words in texts as a semantic space for understanding what objects look like.
  • Our deep learning model does not require any manually defined semantic or visual features for either words or images.
  • Images are mapped to be close to semantic word vectors corresponding to their classes, and the resulting image embeddings can be used to distinguish whether an image is of a seen or unseen class.
  • We then use novelty detection methods to differentiate unseen classes from seen classes.
  • We demonstrate two novelty detection strategies;
  • the first gives high accuracy on unseen classes,
  • while the second is conservative in its prediction of novelty and keeps the seen classes’ accuracy high.
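
A toy sketch of the cross-modal transfer idea the bullets describe: image features are projected into the word-vector space, a simple distance-based novelty test separates seen from unseen classes, and unseen images are labelled by the nearest unseen-class word vector.  The mapping, word vectors, and threshold are hypothetical stand-ins, not the trained Socher et al. model.

```python
import numpy as np

# Sketch of zero-shot classification through a semantic word-vector space:
# seen-class images train a mapping into word space; at test time a novelty
# check decides seen vs. unseen, then nearest word vector gives the label.

rng = np.random.default_rng(5)
d_img, d_word = 64, 50

word_vecs = {c: rng.normal(size=d_word) for c in ["cat", "dog", "truck"]}
seen, unseen = ["cat", "dog"], ["truck"]

M = rng.normal(0, 0.1, (d_word, d_img))           # learned image->word mapping (stand-in)

def embed(image_features):
    return M @ image_features                     # project image into semantic space

def classify(image_features, novelty_threshold=5.0):
    v = embed(image_features)
    d_seen = {c: np.linalg.norm(v - word_vecs[c]) for c in seen}
    if min(d_seen.values()) < novelty_threshold:  # looks like a seen class
        return min(d_seen, key=d_seen.get)
    d_unseen = {c: np.linalg.norm(v - word_vecs[c]) for c in unseen}
    return min(d_unseen, key=d_unseen.get)        # otherwise label in word space

print(classify(rng.normal(size=d_img)))
```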

news summary (6)


Weekly QuEST Discussion Topics and News, 25 Mar

March 24, 2016

The Dynamic Memory Network, out of MetaMind, will be discussed.  Although we started this discussion two weeks ago, the importance of their effort warrants a more in-depth consideration of its implications for QuEST.

 

http://www.nytimes.com/2016/03/07/technology/taking-baby-steps-toward-software-that-reasons-like-humans.html?_r=0

Taking Baby Steps Toward Software That Reasons Like Humans

Bits

By JOHN MARKOFF MARCH 6, 2016

Richard Socher, founder and chief executive of MetaMind, a start-up developing artificial intelligence software. Credit Jim Wilson/The New York Times

Richard Socher appeared nervous as he waited for his artificial intelligence program to answer a simple question: “Is the tennis player wearing a cap?”

The word “processing” lingered on his laptop’s display for what felt like an eternity. Then the program offered the answer a human might have given instantly: “Yes.”

Mr. Socher, who clenched his fist to celebrate his small victory, is the founder of one of a torrent of Silicon Valley start-ups intent on pushing variations of a new generation of pattern recognition software, which, when combined with increasingly vast sets of data, is revitalizing the field of artificial intelligence.

His company MetaMind, which is in crowded offices just off the Stanford University campus in Palo Alto, Calif., was founded in 2014 with $8 million in financial backing from Marc Benioff, chief executive of the business software company Salesforce, and the venture capitalist Vinod Khosla.

MetaMind is now focusing on one of the most daunting challenges facing A.I. software. Computers are already on their way to identifying objects in digital images or converting sounds uttered by human voices into natural language. But the field of artificial intelligence has largely stumbled in giving computers the ability to reason in ways that mimic human thought.

Now a variety of machine intelligence software approaches known as “deep learning” or “deep neural nets” are taking baby steps toward solving problems like a human.

On Sunday, MetaMind published a paper describing advances its researchers have made in creating software capable of answering questions about the contents of both textual documents and digital images.

The new research is intriguing because it indicates that steady progress is being made toward “conversational” agents that can interact with humans. The MetaMind results also underscore how far researchers have to go to match human capabilities.

Other groups have previously made progress on discrete problems, but generalized systems that approach human levels of understanding and reasoning have not been developed.

Five years ago, IBM’s Watson system demonstrated that it was possible to outperform humans on “Jeopardy!”

Last year, Microsoft developed a “chatbot” program known as Xiaoice (pronounced Shao-ice) that is designed to engage humans in extended conversation on a diverse set of general topics.

To add to Xiaoice’s ability to offer realistic replies, the company developed a huge library of human question-and-answer interactions mined from social media sites in China. This made it possible for the program to respond convincingly to typed questions or statements from users.

In 2014, computer scientists at Google, Stanford and other research groups made significant advances in what is described as “scene understanding,” the ability to understand and describe a scene or picture in natural language, by combining the output of different types of deep neural net programs.

These programs were trained on images that humans had previously described. The approach made it possible for the software to examine a new image and describe it with a natural-language sentence.

While even machine vision is not yet a solved problem, steady, if incremental, progress continues to be made by start-ups like Mr. Socher’s; giant technology companies such as Facebook, Microsoft and Google; and dozens of research groups.

In their recent paper, the MetaMind researchers argue that the company’s approach, known as a dynamic memory network, holds out the possibility of simultaneously processing inputs including sound, sight and text. ** fusion **

The design of MetaMind software is evidence that neural network software technologies are becoming more sophisticated, in this case by adding the ability both to remember a sequence of statements and to focus on portions of an image. For example, a question like “What is the pattern on the cat’s fur on its tail?” might yield the answer “stripes” and show that the program had focused only on the cat’s tail to arrive at its answer.

“Another step toward really understanding images is, are you actually able to answer questions that have a right or wrong answer?” Mr. Socher said.

MetaMind is using the technology for commercial applications like automated customer support, he said. For example, insurance companies have asked if the MetaMind technology could respond to an email with an attached photo — perhaps of damage to a car or other property — he said.

There are two papers that we will use for the technical detail:

Ask Me Anything: Dynamic Memory Networks
for Natural Language Processing:

  • Most tasks in natural language processing can be cast into question answering (QA) problems over language input.  ** way we cast QuEST  Query response**
  • We introduce the dynamic memory network (DMN), a unified neural network framework which processes input sequences and questions, forms semantic and episodic memories, and generates relevant answers.
  • The DMN can be trained end-to-end and obtains state of the art results on several types of tasks and datasets:
  • question answering (Facebook’s bAbI dataset),
  • sequence modeling for part of speech tagging (WSJ-PTB),
  • and text classification for sentiment analysis (Stanford Sentiment Treebank).
  • The model relies exclusively on trained word vector representations and requires no string matching or manually engineered features.
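
For intuition about the ‘episodic memory’ the abstract mentions, here is a toy numpy sketch of the multi-pass attention loop at the heart of a DMN.  The encodings, attention rule, and answer module are crude stand-ins (assumptions), not MetaMind’s trained model.

```python
import numpy as np

# Sketch of a DMN-style episodic memory loop: attention over encoded input
# facts is recomputed on each pass, conditioned on the question and the current
# memory, and the final memory vector feeds an answer module.

rng = np.random.default_rng(6)
d = 32
facts = rng.normal(size=(5, d))          # encoded input sentences
question = rng.normal(size=d)            # encoded question

memory = question.copy()
for episode in range(3):                 # multiple passes let later facts re-weight earlier ones
    scores = facts @ (question + memory) # crude attention: relevance to question and memory
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()
    episode_vec = attn @ facts           # soft retrieval of the relevant facts
    memory = np.tanh(memory + episode_vec)

W_answer = rng.normal(0, 0.1, (d, d))    # stand-in answer module
answer_logits = W_answer @ memory
```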

 

The second paper:

Dynamic Memory Networks for Visual and Textual Question Answering
Xiong, Merity, Socher – arXiv:1603.01417v1 [cs.NE] 4 Mar 2016

  • Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering.
  • One such architecture, the dynamic memory network (DMN), obtained high accuracy on a variety of language tasks.

–     However, it was not shown whether the architecture achieves strong results for question answering when supporting facts are not marked during training or whether it could be applied to other modalities such as images.

–     Based on an analysis of the DMN, we propose several improvements to its memory and input modules.

–     Together with these changes we introduce a novel input module for images in order to be able to answer visual questions.

–     Our new DMN+ model improves the state of the art on both the

  • Visual Question Answering dataset and
  • the bAbI-10k text question-answering dataset without supporting fact supervision.

news summary (5)

 

 
