Weekly QuEST Discussion Topics and News, 22 Sept

September 22, 2017 Leave a comment

QuEST 22 Sept 2017

We might start with a discussion of two recent Rodney Brooks articles: one on the seven deadly sins of predicting AI, and one on the self-driving car's people problem.

We also want to examine some recent work on representations that capture relationships between entities.  Assuming the recent breakthroughs in finding and identifying objects lead us to a representation of what is present, there is a series of questions we seek to answer – grounding in particular:

  • Great progress has been made on object detection, the task of localizing visual entities belonging to a pre-defined set of categories [8, 24, 23, 6, 17]. But the more general and challenging task of localizing entities based on arbitrary natural language expressions remains far from solved.
  • This task, sometimes known as grounding or referential expression comprehension, has been explored by recent work in both computer vision and natural language processing [20, 11, 25].

As an example: given an image and a natural language expression referring to a visual entity, such as "the young man wearing a green shirt and riding a black bicycle," these approaches localize, with a bounding box, the image region corresponding to the entity that the expression refers to.

 

Modeling Relationships in Referential Expressions with Compositional Modular Networks
Hu – UC Berkeley

  • People often refer to entities in an image in terms of their relationships with other entities.
  • For example, the black cat sitting under the table refers to both a black cat entity and its relationship with another table entity.
  • Understanding these relationships is essential for interpreting and grounding such natural language expressions.
  • Most prior work focuses on either grounding entire referential expressions holistically to one region, or localizing relationships based on a fixed set of categories.
  • In this paper we instead present a modular deep architecture capable of analyzing referential expressions into their component parts, identifying entities and relationships mentioned in the input expression and grounding them all in the scene.
  • We call this approach Compositional Modular Networks (CMNs): a novel architecture that learns linguistic analysis and visual inference end-to-end.
  • Our approach is built around two types of neural modules that inspect local regions and pairwise interactions between regions.
  • We evaluate CMNs on multiple referential expression datasets, outperforming state-of-the-art approaches on all tasks.
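To make the two module types concrete, here is a minimal Python sketch (our illustration, not the authors' code) of a localization module that scores single regions against a phrase embedding and a relationship module that scores ordered region pairs. The feature dimensions, the random phrase embeddings, and the toy example at the end are illustrative assumptions; the real CMN also learns a soft attention over the expression to produce the subject / relationship / object embeddings, which is omitted here, and the networks below are untrained, so the output is only structural.

```python
# Hedged sketch of the two CMN module types (assumed shapes, untrained weights).
import torch
import torch.nn as nn

class LocalizationModule(nn.Module):
    """Scores each region feature against a subject/object phrase embedding."""
    def __init__(self, region_dim=2048, text_dim=300, hidden=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(region_dim + text_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, regions, phrase):            # regions: (N, Dr), phrase: (Dt,)
        phrase = phrase.expand(regions.size(0), -1)
        return self.fuse(torch.cat([regions, phrase], dim=1)).squeeze(1)  # (N,)

class RelationshipModule(nn.Module):
    """Scores ordered region pairs against a relationship phrase embedding."""
    def __init__(self, region_dim=2048, text_dim=300, hidden=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(2 * region_dim + text_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, regions, phrase):            # returns (N, N) pairwise scores
        n = regions.size(0)
        subj = regions.unsqueeze(1).expand(n, n, -1)
        obj = regions.unsqueeze(0).expand(n, n, -1)
        phrase = phrase.expand(n, n, -1)
        return self.fuse(torch.cat([subj, obj, phrase], dim=2)).squeeze(2)

# Grounding "black cat sitting under the table": combine subject scores,
# pairwise relationship scores, and object scores, then pick the best pair.
regions = torch.randn(10, 2048)                    # pooled CNN features per box
subj_emb, rel_emb, obj_emb = (torch.randn(300) for _ in range(3))
loc, rel = LocalizationModule(), RelationshipModule()
pair_scores = (loc(regions, subj_emb).unsqueeze(1)   # subject ("black cat")
               + rel(regions, rel_emb)               # relation ("sitting under")
               + loc(regions, obj_emb).unsqueeze(0)) # object ("the table")
best = pair_scores.flatten().argmax().item()
subj_box, obj_box = divmod(best, regions.size(0))
print(subj_box, obj_box)
```

The point of the sketch is only the factorization: a single unary module reused for subject and object, plus one pairwise module, with the final grounding chosen jointly over region pairs.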

We also want to spend time this week discussing visual reasoning.  Specifically, some of our colleagues at Facebook recently published:

  • arXiv:1705.03633v1 [cs.CV] 10 May 2017
  • Inferring and Executing Programs for Visual Reasoning
  • Justin Johnson1 Bharath Hariharan2 Laurens van der Maaten2
  • Judy Hoffman1 Li Fei-Fei1 C. Lawrence Zitnick2 Ross Girshick2
  • 1Stanford University 2Facebook AI Research
  • Abstract
  • Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning.
  • Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.
  • Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.
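To make the two-stage design concrete, here is a hedged Python sketch of a program generator that emits a sequence of function tokens and an execution engine that chains one small neural module per token. The toy function vocabulary, the module shapes, and the one-shot generator are illustrative stand-ins, not the paper's implementation.

```python
# Hedged sketch: generator produces a program, engine composes modules to run it.
import torch
import torch.nn as nn

FUNCS = ["filter_red", "filter_cube", "count"]     # toy program vocabulary

class ProgramGenerator(nn.Module):
    """Seq2seq stand-in: encodes the question and emits function-token logits."""
    def __init__(self, vocab=100, hidden=64, n_funcs=len(FUNCS), prog_len=3):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_funcs)
        self.prog_len = prog_len

    def forward(self, question_ids):               # (B, T) integer word tokens
        _, (h, _) = self.rnn(self.embed(question_ids))
        logits = self.out(h[-1])                   # (B, n_funcs), reused per step
        return logits.unsqueeze(1).expand(-1, self.prog_len, -1)

class ExecutionEngine(nn.Module):
    """One tiny convolutional module per function; composed per program."""
    def __init__(self, feat_ch=32):
        super().__init__()
        self.modules_by_name = nn.ModuleDict({
            f: nn.Conv2d(feat_ch, feat_ch, 3, padding=1) for f in FUNCS})
        self.classifier = nn.Linear(feat_ch, 11)   # e.g. answers "0".."10"

    def forward(self, feats, program):             # feats: (B, C, H, W)
        x = feats
        for name in program:                       # chain modules in program order
            x = torch.relu(self.modules_by_name[name](x))
        return self.classifier(x.mean(dim=[2, 3]))

gen, engine = ProgramGenerator(), ExecutionEngine()
question = torch.randint(0, 100, (1, 8))
program = [FUNCS[i] for i in gen(question).argmax(-1)[0].tolist()]
answer_logits = engine(torch.randn(1, 32, 14, 14), program)
print(program, answer_logits.shape)
```

The explicit program is what distinguishes this from a black-box mapping: the reasoning steps are inspectable tokens, and the engine's computation is composed from them rather than learned end to end as one monolithic function.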

news summary (69)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 8 Sept

September 7, 2017 Leave a comment

QuEST 8 Sept 2017

We want to spend time this week discussing visual reasoning.  Specifically, some of our colleagues at Facebook recently published:

arXiv:1705.03633v1 [cs.CV] 10 May 2017

Inferring and Executing Programs for Visual Reasoning

Justin Johnson1 Bharath Hariharan2 Laurens van der Maaten2

Judy Hoffman1 Li Fei-Fei1 C. Lawrence Zitnick2 Ross Girshick2

1Stanford University 2Facebook AI Research

Abstract

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning.

Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.

Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.
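The mixed training signal mentioned in the abstract is worth pausing on: the execution engine receives ordinary backpropagated gradients from the answer loss, while the program generator, which samples discrete function tokens, receives a REINFORCE gradient with answer correctness as reward. Below is a minimal sketch of that split; the tiny linear "generator" and "engine" and the shapes are illustrative stand-ins, not the paper's networks.

```python
# Hedged sketch of joint training: backprop for the engine, REINFORCE for the generator.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_funcs, prog_len = 8, 4
generator = nn.Linear(16, n_funcs)                 # question encoding -> token logits
engine = nn.Linear(prog_len, 10)                   # toy "execution" over token ids
opt = torch.optim.Adam(list(generator.parameters()) + list(engine.parameters()))

question_enc = torch.randn(32, 16)
answers = torch.randint(0, 10, (32,))

# Sample a program token-by-token (same logits reused per step for brevity).
logits = generator(question_enc)
dist = torch.distributions.Categorical(logits=logits)
tokens = torch.stack([dist.sample() for _ in range(prog_len)], dim=1)        # (32, 4)
log_prob = torch.stack([dist.log_prob(tokens[:, i])
                        for i in range(prog_len)], dim=1).sum(1)             # (32,)

# Execute and score: backprop trains the engine, REINFORCE trains the generator.
pred = engine(tokens.float())
engine_loss = F.cross_entropy(pred, answers)
reward = (pred.argmax(1) == answers).float()                # 1 if answer correct
reinforce_loss = -((reward - reward.mean()) * log_prob).mean()  # baseline-subtracted

opt.zero_grad()
(engine_loss + reinforce_loss).backward()
opt.step()
print(float(engine_loss), float(reinforce_loss))
```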

 

We then want to expand our discussion to a topic we’ve previously covered:

Human-level concept learning through probabilistic program induction
Brenden M. Lake,1* Ruslan Salakhutdinov,2 Joshua B. Tenenbaum3

11 DECEMBER 2015 • VOL 350 ISSUE 6266 Science

  • People learning new concepts can often generalize successfully from just a single example, yet machine learning algorithms typically require tens or hundreds of examples to perform with similar accuracy.
  • People can also use learned concepts in richer ways than conventional algorithms—for action, imagination, and explanation.
  • We present a computational model that captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world’s alphabets.
  • The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion.
  • On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches.
  • We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.
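As a heavily simplified illustration of one-shot classification under a Bayesian criterion: each concept is represented by a generative model fit to a single example, and a query is assigned to the concept under which it is most probable. The Bernoulli pixel-flip model below is an illustrative stand-in for the paper's stroke-level probabilistic programs, not the authors' method.

```python
# Toy one-shot classification by Bayesian scoring (illustrative stand-in).
import numpy as np

rng = np.random.default_rng(0)

def log_likelihood(query, template, flip_prob=0.1):
    """log p(query | template) under independent pixel-flip noise."""
    agree = (query == template)
    return np.sum(np.where(agree, np.log(1 - flip_prob), np.log(flip_prob)))

# One training example per concept (20x20 binary "characters").
concepts = {name: rng.integers(0, 2, size=(20, 20)) for name in "ABC"}

# A query drawn from concept "B" with a little noise.
query = concepts["B"].copy()
noise = rng.random(query.shape) < 0.05
query[noise] = 1 - query[noise]

# One-shot classification: pick the concept with the highest posterior
# (uniform prior over concepts, so the likelihood decides).
scores = {name: log_likelihood(query, tmpl) for name, tmpl in concepts.items()}
print(max(scores, key=scores.get))                 # expected: "B"
```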
news summary (68)
Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 1 Sept

August 31, 2017 Leave a comment

QuEST 1 Sept 2017

We will start this week by talking about a video/paper provided by our colleague Adam R. It is an example of a trend of combining deep learning with model-based solutions. Below is a link to a news article with a video of the author showing some results; it also links to the technical CVPR paper.

https://www.datascience.com/blog/beyond-deep-learning-a-case-study-in-sports-analytics

The CVPR paper was entitled:

Learning Online Smooth Predictors for Realtime Camera Planning

using Recurrent Decision Trees, by J. Chen et al.

We study the problem of online prediction for realtime camera planning, where the goal is to predict smooth trajectories that correctly track and frame objects of interest (e.g., players in a basketball game). The conventional approach for training predictors does not directly consider temporal consistency, and often produces undesirable jitter. Although post-hoc smoothing (e.g., via a Kalman filter) can mitigate this issue to some degree, it is not ideal due to overly stringent modeling assumptions (e.g., Gaussian noise). We propose a recurrent decision tree framework that can directly incorporate temporal consistency into a data-driven predictor, as well as a learning algorithm that can efficiently learn such temporally smooth models. Our approach does not require any post-processing, making online smooth predictions much easier to generate when the noise model is unknown. We apply our approach to sports broadcasting: given noisy player detections, we learn where the camera should look based on human demonstrations. Our experiments exhibit significant improvements over conventional baselines and showcase the practicality of our approach.
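A toy sketch of the core idea (not the paper's algorithm): make the predictor recurrent by feeding its previous output back in as a feature, so temporal smoothness is learned rather than imposed by post-hoc filtering. The synthetic pan trajectory, the features, and the use of a scikit-learn decision tree are illustrative assumptions.

```python
# Hedged sketch: a "recurrent" tree predictor whose input includes its own last output.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
t = np.linspace(0, 6 * np.pi, 600)
smooth_pan = np.sin(t)                                      # ideal camera pan angle
noisy_detection = smooth_pan + rng.normal(0, 0.3, t.size)   # jittery player track

# Train on (current detection, previous pan) -> current pan.
X = np.column_stack([noisy_detection[1:], smooth_pan[:-1]])
y = smooth_pan[1:]
tree = DecisionTreeRegressor(max_depth=8).fit(X, y)

# Online rollout: at test time the tree sees its *own* previous prediction.
pred, prev = [], smooth_pan[0]
for det in noisy_detection[1:]:
    prev = tree.predict([[det, prev]])[0]
    pred.append(prev)

jitter = np.mean(np.abs(np.diff(pred)))
raw_jitter = np.mean(np.abs(np.diff(noisy_detection)))
print(f"frame-to-frame jitter: raw {raw_jitter:.3f} vs recurrent tree {jitter:.3f}")
```

The design choice being illustrated is exactly the one the abstract argues for: temporal consistency enters through the predictor's inputs during learning, so no Kalman-style post-processing (with its Gaussian noise assumptions) is needed at deployment.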

We also want to continue our discussion from last week on:

Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman

Metacognition, or “knowing that you know”, is a core component of consciousness. ** to many this is part of the definition of consciousness – the ability for introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from what is being thought about – perceptual / conceptual – we would suggest that even perceptual decisions that are done consciously are put into the conceptual representation **

However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.

In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.
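To ground the type I / type II distinction, here is a minimal sketch computing both from simulated trial-level outcomes and binary confidence ratings. The d' formula is standard; treating "high confidence given correct / incorrect" as type II hits and false alarms follows a common convention, but the chapter's preferred measures (e.g., meta-d') may differ, so take this as an illustration only.

```python
# Hedged sketch: type I d' (signal vs noise) and type II d' (confidence vs accuracy).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Simulated experiment: signal present on half the trials.
n = 1000
signal = rng.integers(0, 2, n).astype(bool)
evidence = rng.normal(loc=signal.astype(float), scale=1.0)   # true d' ~ 1 by design
respond_yes = evidence > 0.5                                  # decision criterion
correct = respond_yes == signal
confidence_high = np.abs(evidence - 0.5) > 0.8                # distance from criterion

def dprime(hit_rate, fa_rate):
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Type I: discriminating signal from noise.
type1 = dprime(respond_yes[signal].mean(), respond_yes[~signal].mean())

# Type II: how well confidence discriminates correct from incorrect decisions.
type2 = dprime(confidence_high[correct].mean(), confidence_high[~correct].mean())

print(f"type I d' = {type1:.2f}, type II d' = {type2:.2f}")
```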

Cross-modal prediction changes the timing of conscious access
during the motion-induced blindness
Acer Y … Consciousness and Cognition 31 (2015) 139–147

Post-decision wagering objectively measures awareness
Navindra Persaud, Peter McLeod & Alan Cowey

2007 Nature Publishing Group http://www.nature.com/natureneuroscience

NATURE NEUROSCIENCE VOLUME 10 | NUMBER 2 | FEBRUARY 2007

The lack of an accepted measure of awareness (consciousness) has made claims that accurate decisions can be made without awareness (consciousness) controversial. Here we introduce a new objective measure of awareness (consciousness), post-decision wagering.

We show that participants fail to maximize cash earnings by wagering high following correct decisions in blindsight, the Iowa gambling task and an artificial grammar task.

This demonstrates, without the uncertainties associated with the conventional subjective measures of awareness (consciousness), i.e. verbal reports and confidence ratings, that the participants were not aware that their decisions were correct.

Post-decision wagering may be used to study the neural correlates of consciousness.

Evaluation of a ‘bias-free’ measure of awareness
SIMON EVANS and PAUL AZZOPARDI
Department of Experimental Psychology, University of Oxford

Spatial Vision, Vol. 20, No. 1–2, pp. 61–77 (2007)

© VSP 2007.

Also available online – http://www.brill.nl/sv

Abstract: The derivation of a reliable, subjective measure of awareness that is not contaminated by observers’ response bias is a problem that has long occupied researchers. Kunimoto et al. (2001) proposed a measure of awareness (a’) which apparently meets this criterion: a’ is derived from confidence ratings and is based on the intuition that confidence should reflect awareness.

The aim of this paper is to explore the validity of this measure. Some calculations suggested that, contrary to Kunimoto et al.’s intention, a’ can vary as a result of changes in response bias affecting the relative proportions of high- and low-confidence responses.

This was not evident in the results of Kunimoto et al.’s original experiments because their method may have artificially ‘clamped’ observers’ response bias close to zero.

A predicted consequence of allowing response bias to vary freely is that it can result in a’ varying from negative, through zero, to positive values, for a given value of discriminability (d’).

We tested whether such variations are likely to occur in practice by employing Kunimoto et al.’s paradigm with various modifications, notably the removal of constraints upon the proportions of low- and high-confidence responses, in a visual discrimination task.

As predicted, a’ varied with response bias in all participants. Similar results were found when a’ was calculated from pre-existing data obtained from a patient with blindsight: a’ varied through a range of positive results without approaching zero, which is inconsistent with his well-documented lack of awareness.

A second experiment showed how response bias could be manipulated to yield elevated values of a’. On the basis of these findings we conclude that Kunimoto’s measure is not as impervious to response bias as was originally assumed.
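The Evans and Azzopardi point can be illustrated in a few lines of simulation: a confidence-based awareness measure computed in the spirit of Kunimoto et al.'s a' moves when only the observer's willingness to report high confidence changes, while discriminability stays fixed. The exact a' formula is simplified here to a d'-style contrast between high-confidence rates on correct versus incorrect trials, so treat this as a schematic of the argument, not a reproduction of either paper's analysis.

```python
# Hedged sketch: an a'-like measure shifting with the confidence criterion alone.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 20000
signal = rng.integers(0, 2, n).astype(bool)
evidence = rng.normal(loc=signal.astype(float))        # fixed sensitivity
respond_yes = evidence > 0.5
correct = respond_yes == signal

def a_prime_like(conf_criterion):
    """Confidence is 'high' when evidence is far from the decision criterion."""
    high = np.abs(evidence - 0.5) > conf_criterion
    h = np.clip(high[correct].mean(), 1e-3, 1 - 1e-3)
    f = np.clip(high[~correct].mean(), 1e-3, 1 - 1e-3)
    return norm.ppf(h) - norm.ppf(f)

# Same trials, same accuracy; only the proportion of high-confidence reports
# changes, yet the awareness estimate changes with it.
for c in (0.2, 0.8, 1.4):
    print(f"confidence criterion {c:.1f}: a'-like measure = {a_prime_like(c):.2f}")
```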

news summary (67)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 25 Aug

August 24, 2017 Leave a comment

QuEST Aug 25 2017
We want to continue our discussion from last week on:

Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman

• Metacognition, or "knowing that you know", is a core component of consciousness. ** to many this is part of the definition – introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from what is being thought about – perceptual / conceptual – we would suggest that even perceptual decisions that are done consciously are put into the conceptual representation **
• However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.
• In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.

Cross-modal prediction changes the timing of conscious access during the motion-induced blindness
Acer Y …
Consciousness and Cognition 31 (2015) 139–147

• Metacognition, or "knowing that you know", is a core component of consciousness. ** to many this is part of the definition – introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from what is being thought about – perceptual / conceptual – we would suggest that even perceptual decisions that are done consciously are put into the conceptual representation **
• However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.
Post-decision wagering objectively measures awareness
Navindra Persaud, Peter McLeod & Alan Cowey
2007 Nature Publishing Group http://www.nature.com/natureneuroscience
NATURE NEUROSCIENCE VOLUME 10 | NUMBER 2 | FEBRUARY 2007

• The lack of an accepted measure of awareness (consciousness) has made claims that accurate decisions can be made without awareness (consciousness) controversial. Here we introduce a new objective measure of awareness (consciousness), post-decision wagering.
• We show that participants fail to maximize cash earnings by wagering high following correct decisions in blindsight, the Iowa gambling task and an artificial grammar task.
• This demonstrates, without the uncertainties associated with the conventional subjective measures of awareness (consciousness), i.e. verbal reports and confidence ratings, that the participants were not aware that their decisions were correct.
• Post-decision wagering may be used to study the neural correlates of consciousness.

Evaluation of a ‘bias-free’ measure of awareness
SIMON EVANS and PAUL AZZOPARDI
Department of Experimental Psychology, University of Oxford
Spatial Vision, Vol. 20, No. 1–2, pp. 61–77 (2007)
© VSP 2007.
Also available online – http://www.brill.nl/sv

• Abstract: The derivation of a reliable, subjective measure of awareness that is not contaminated by observers’ response bias is a problem that has long occupied researchers. Kunimoto et al. (2001) proposed a measure of awareness (a’) which apparently meets this criterion: a’ is derived from confidence ratings and is based on the intuition that confidence should reflect awareness.
• The aim of this paper is to explore the validity of this measure. Some calculations suggested that, contrary to Kunimoto et al.’s intention, a’ can vary as a result of changes in response bias affecting the relative proportions of high- and low-confidence responses.
• This was not evident in the results of Kunimoto et al.’s original experiments because their method may have artificially ‘clamped’ observers’ response bias close to zero.
• A predicted consequence of allowing response bias to vary freely is that it can result in a’ varying from negative, through zero, to positive values, for a given value of discriminability (d’).
• We tested whether such variations are likely to occur in practice by employing Kunimoto et al.’s paradigm with various modifications, notably the removal of constraints upon the proportions of low- and high-confidence responses, in a visual discrimination task.
• As predicted, a’ varied with response bias in all participants. Similar results were found when a’ was calculated from pre-existing data obtained from a patient with blindsight: a’ varied through a range of positive results without approaching zero, which is inconsistent with his well-documented lack of awareness.
• A second experiment showed how response bias could be manipulated to yield elevated values of a’. On the basis of these findings we conclude that Kunimoto’s measure is not as impervious to response bias as was originally assumed.

We also want to discuss:

From Science Daily:
Chimpanzees learn rock-paper-scissors
New study shows that chimps' ability to learn simple circular relationships is on a par with that of 4-year-old children
Date: August 10, 2017
Source: Springer

The Future of Humans & Machines: Partnership, Fusion, or Fear?
Summer 2017
The Intelligent Systems Center
Johns Hopkins Applied Physics Laboratory

arXiv:1705.08168v2 [cs.CV] 1 Aug 2017
Look, Listen and Learn
Relja Arandjelović
relja@google.com
Andrew Zisserman
zisserman@google.com
I read an interesting paper this week that is related to your question regarding "what is necessary for such systems to form and share meaning?" This paper describes a way to combine learning from different modalities (audio and visual), primarily using unsupervised methods.

In the attached paper, the authors trained an auditory and a visual network on unlabeled one-second video clips. They started by creating a training set in which half the video clips did not have the true audio associated with the visual portion of the clip (they randomly assigned a different clip's audio to the visual portion). They then trained a single visual-and-auditory network (each part takes separate inputs, but the network segments are joined later in the network) to predict whether the visual and auditory clips belong together or were taken from different clips. Half of the samples belonged together, while the other half did not.

What made their results interesting was that the networks learned a representation without supervised categorical class labels (playing piano, rock concert, sporting event). This learned representation, however, was clearly associated with real-world concepts.

I think their results demonstrate a jump forward in the way we can train computers to learn representations, because they show that large supervised sets may not be needed to learn a representation that is capable of understanding visual and auditory inputs. The network merely learned to associate correlations between different modalities. I would contend this is a more "human" approach to learning: we learn to associate correlations and then assign meaning to those associations. It is also unique because they did not try to assign a label and then combine modalities based on the labels. Rather, they combined the information in the representation space. Then they were able to tear off the top layer and transfer the learned representation to a supervised setting and still get good results on traditional classification problems.
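Below is a schematic sketch of the training setup described above: a vision branch and an audio branch are fused and trained to say whether a frame and a one-second audio clip come from the same video, with negatives made by shuffling the audio within the batch. The architecture, shapes, and random stand-in inputs are assumptions for illustration, not the paper's exact network.

```python
# Hedged sketch of audio-visual correspondence training (assumed architecture).
import torch
import torch.nn as nn

class AVCNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision = nn.Sequential(                 # frame -> embedding
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64))
        self.audio = nn.Sequential(                  # log-spectrogram -> embedding
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64))
        self.head = nn.Linear(128, 2)                # correspond / don't correspond

    def forward(self, frames, spectrograms):
        return self.head(torch.cat([self.vision(frames),
                                    self.audio(spectrograms)], dim=1))

net = AVCNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

frames = torch.randn(8, 3, 224, 224)                 # stand-in video frames
audio = torch.randn(8, 1, 128, 100)                  # stand-in spectrograms

# Positives: aligned pairs. Negatives: audio rolled to a different clip.
x_frames = torch.cat([frames, frames])
x_audio = torch.cat([audio, audio.roll(1, dims=0)])
labels = torch.cat([torch.ones(8), torch.zeros(8)]).long()

loss = loss_fn(net(x_frames, x_audio), labels)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```

The only supervision here is the free correspondence signal, which is the point made above: the categorical structure (piano, concert, sporting event) emerges in the shared representation without class labels.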

There was also an AI colloquium recently. The video is now available at: https://vimeo.com/album/4721595.

One of the speakers was Prof. Russell from Berkeley. He pointed out that in the deep learning area there are some obvious issues that many overlook. He used as an example the picture below, with the caption from Google:

Many look at it in amazement, while others in more traditional AI areas note: there is no fruit stand, there is no group of people, there is no shopping going on.

news summary (66)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 18 Aug

August 17, 2017 Leave a comment

QuEST Aug 18 2017

Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman

  • Metacognition, or “knowing that you know”, is a core component of consciousness. ** to many this is part of the definition – introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from what is being thought about – perceptual / conceptual – we would suggest that even perceptual decisions that are done consciously are put into the conceptual representation **
  • However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.
  • In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.

 

Cross-modal prediction changes the timing of conscious access
during the motion-induced blindness
Acer Y …

Consciousness and Cognition 31 (2015) 139–147

  • Metacognition, or “knowing that you know”, is a core component of consciousness. ** to many this is part of the definition – introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from what is being thought about – perceptual / conceptual – we would suggest that even perceptual decisions that are done consciously are put into the conceptual representation **
  • However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.

Post-decision wagering objectively measures awareness
Navindra Persaud, Peter McLeod & Alan Cowey

2007 Nature Publishing Group http://www.nature.com/natureneuroscience

NATURE NEUROSCIENCE VOLUME 10 | NUMBER 2 | FEBRUARY 2007

  • The lack of an accepted measure of awareness (consciousness) has made claims that accurate decisions can be made without awareness (consciousness) controversial. Here we introduce a new objective measure of awareness (consciousness), post-decision wagering.
  • We show that participants fail to maximize cash earnings by wagering high following correct decisions in blindsight, the Iowa gambling task and an artificial grammar task.
  • This demonstrates, without the uncertainties associated with the conventional subjective measures of awareness (consciousness), i.e. verbal reports and confidence ratings, that the participants were not aware that their decisions were correct.
  • Post-decision wagering may be used to study the neural correlates of consciousness.

We had some exciting work this summer and we want to give people quick overviews of some of the efforts. We had two deep learning efforts of particular note by our colleagues Oliver and Washington, and then there were at least four projects that might be good to bring up:

Context-Learning Deep Neural Networks for Kinematic Prediction by Dr. Kyle Tarplee (Anderson Univ.) – Combines a DNN that learns patterns in traffic with a mixture density network (MDN), a DNN designed for motion prediction (cf. Kalman filters).

Estimating Posteriors from Deep Learning Networks by Nicole Eikmeier (Purdue) – Exploits the stochasticity of drop-out regularization to compute “confidence” values for the network’s performance.

Predictive Simulation for UAV Flight Planning by Saniyah Shaikh (UPenn) – Uses Monte-Carlo Tree Search (MCTS) a la AlphaGo to plan UAV flight paths.

Practical Applications of Graph Convolutional Neural Networks in Sensor Exploitation by Mela Hardin (ASU) – She presented highlights from some recent work in the hot field of graph CNNs, i.e. bringing the power of CNNs to relational data.
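Related to the posterior-estimation project above, here is a generic sketch of the dropout idea (Monte Carlo dropout as an approximate posterior), not the intern's actual code: keep dropout active at inference, run several stochastic forward passes, and read the spread of the softmax outputs as a confidence estimate. The tiny network and input sizes are placeholders.

```python
# Hedged sketch: MC-dropout "confidence" from repeated stochastic forward passes.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                      nn.Dropout(p=0.5), nn.Linear(64, 3))

def mc_dropout_predict(model, x, n_samples=50):
    model.train()                                  # keep dropout stochastic at test time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(n_samples)])
    return probs.mean(0), probs.std(0)             # predictive mean and spread

x = torch.randn(5, 20)
mean, spread = mc_dropout_predict(model, x)
for m, s in zip(mean, spread):
    cls = m.argmax().item()
    print(f"predicted class {cls} (p={m.max().item():.2f}, +/- {s[cls].item():.2f})")
```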

 

For those who have VDL access, we have wiki pages with details for following up.

news summary (65)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 11 Aug

August 10, 2017 Leave a comment

QuEST 11Aug 2017

A recent study by the Harvard Kennedy School’s Belfer Center for Science and International Affairs, “Artificial Intelligence and National Security,” done for IARPA, concluded among other things:

“By looking at four prior cases of transformative military technology—nuclear, aerospace, cyber, and biotech—we develop lessons learned and recommendations for national security policy toward AI. Future progress in AI has the potential to be a transformative national security technology, on a par with nuclear weapons, aircraft, computers, and biotech.

−− Each of these technologies led to significant changes in the strategy, organization, priorities, and allocated resources of the U.S. national security community.

−− We argue future progress in AI will be at least equally impactful.”

 

That is an amazing statement, and some discussion is warranted. For example, take the lessons from AlphaGo and extrapolate to the implications for multi-domain Command and Control.

 

A second set of topics from our colleague Teresa H:

 

How We Save Face—Researchers Crack the Brain’s Facial-Recognition Code

A Caltech team has deciphered the way we identify faces, re-creating what the brain sees from its electrical activity

 By Knvul Sheikh | Scientific American August 2017 Issue

 

The brain has evolved to recognize and remember many different faces. We can instantly identify a friend’s countenance among dozens in a crowded restaurant or on a busy street. And a brief glance tells us whether that person is excited or angry, happy or sad.

Brain-imaging studies have revealed that several blueberry-size regions in the temporal lobe—the area under the temple—specialize in responding to faces. Neuroscientists call these areas “face patches.” But neither brain scans nor clinical studies of patients with implanted electrodes explained exactly how the cells in these patches work.

The Code for Facial Identity in the Primate Brain

Authors

Le Chang, Doris Y. Tsao

Correspondence

lechang@caltech.edu (L.C.),

dortsao@caltech.edu (D.Y.T.)

In Brief

Facial identity is encoded via a remarkably simple neural code that relies on the ability of neurons to distinguish facial features along specific axes in face space, disavowing the long-standing assumption that single face cells encode individual faces.
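A toy numpy sketch of that axis-coding idea: each simulated "face cell" responds to a linear projection of a face's coordinates in a low-dimensional face space, so the face can be recovered from the population response by linear regression. The dimensions, number of cells, and noise level are arbitrary illustrative choices, not the values from the Chang and Tsao experiments.

```python
# Hedged sketch: linear axis coding and linear decoding of faces from "cell" responses.
import numpy as np

rng = np.random.default_rng(4)
n_dims, n_cells, n_faces = 50, 200, 1000

faces = rng.normal(size=(n_faces, n_dims))             # face-space coordinates
axes = rng.normal(size=(n_dims, n_cells))              # each cell's preferred axis
responses = faces @ axes + rng.normal(0, 0.5, (n_faces, n_cells))  # noisy firing rates

# Fit a linear decoder on most faces, then reconstruct a held-out face.
decoder, *_ = np.linalg.lstsq(responses[:-1], faces[:-1], rcond=None)
reconstruction = responses[-1] @ decoder

corr = np.corrcoef(reconstruction, faces[-1])[0, 1]
print(f"held-out face reconstruction correlation: {corr:.2f}")
```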

 

Cross-modal prediction changes the timing of conscious access
during the motion-induced blindness
Acer Y …

Consciousness and Cognition 31 (2015) 139–147

  • Metacognition, or “knowing that you know”, is a core component of consciousness. ** to many this is part of the definition – introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from what is being thought about – perceptual / conceptual – we would suggest that even perceptual decisions that are done consciously are put into the conceptual representation **
  • However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.

 

Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman

 

Metacognition, or “knowing that you know”, is a core component of consciousness. Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. However when assessing metacognitive performance care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area and there now exist promising approaches. In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.

 

news summary (64)

Categories: Uncategorized

Weekly QuEST Meeting Discussion Topics and News, 4 Aug

August 3, 2017 Leave a comment

QuEST 4 Aug 2017:

This week we will have a guest lecture by colleagues from UCLA to discuss the paper by Achille / Soatto UCLA, arXiv:1706.01350v1 [cs.LG] 5 Jun 2017

On the emergence of invariance and disentangling in deep representations

There is lots of interesting analysis in this article, but what caught my eye was the discussion of properties of representations:

  • In many applications, the observed data x is high dimensional (e.g., images or video), while the task y is low-dimensional, e.g., a label or a coarsely quantized location. ** what if the task was a simulation – that was stable, consistent and useful – low dimensional?**
  • For this reason, instead of working directly with x, we want to use a representation z that captures all the information the data x contains about the task y, while also being simpler than the data itself.  ** and are there a range of tasks y that can be serviced by a representation z – how do we address the tension between the representation and the tasks – how do we define what tasks can be serviced by a given representation?**
  • Ideally, such a representation should be
  • (a) sufficient for the task y, i.e. I(y; z) = I(y; x), so that information about y is not lost; among all sufficient representations, it should be
  • (b) minimal, i.e. I(z; x) is minimized, so that it retains as little about x as possible, simplifying the role of the classifier; finally, it should be
  • (c) invariant to the effect of nuisances I(z; n) = 0, so that decisions based on the representation z will not overfit to spurious correlations between nuisances n and labels y present in the training dataset
  • Assuming such a representation exists, it would not be unique, since any bijective function preserves all these properties.
  • We can use this fact to our advantage and further aim to make the representation
  • (d) maximally disentangled, i.e., TC(z) is minimal, where disentanglement is often measured as the correlation of the network weights… the paper appears to use total correlation, which is the (presumably one-sided) KL divergence between the joint PDF of the weights and the Naïve Bayes estimate, i.e. KL( f(w1, w2, …, wn), f(w1)f(w2)…f(wn) ). ** a schematic sketch combining the sufficiency (a) and minimality (b) terms appears after this list **
  • This simplifies the classifier rule, since no information is present in the complicated higher-order correlations between the components of z, a.k.a. “features.”
  • In short, an ideal representation of the data is a minimal sufficient invariant representation that is disentangled.
  • Inferring a representation that satisfies all these properties may seem daunting. However, in this section we show that we only need to enforce (a) sufficiency and (b) minimality, from which invariance and disentanglement follow naturally.
  • Between this and the next section, we will then show that sufficiency and minimality of the learned representation can be promoted easily through implicit or explicit regularization during the training process.

As we mature our view of how to work toward these rich representations, it brings up the discussion point of QuEST as a platform:

 

I would like to think through a QuEST solution that is a platform: one that uses existing front ends (application-dependent, from observation vendors), existing big-data back ends such as standard solutions like Amazon Web Services … , and possibly a series of knowledge-creation vendors. It is helpful here to consider the Cross Industry Standard Process for Data Mining (commonly known by its acronym CRISP-DM, a data mining process model that describes the steps data mining experts commonly use to tackle data mining problems) to show how QuEST fits within, and can enable, all aspects of the CRISP-DM process.

Independent of the representation used by a front-end system that captures the observables and provides them to the QuEST agent, it becomes the QuEST agent's job to take them and create two uses for them. The first is to put them in a form usable by a big-data solution (following CRISP-DM, this would entail Data Understanding and Data Preparation), but to do so based on an understanding of the relevant QuEST model (CRISP-DM Modeling), and in a way that supports CRISP-DM Business Understanding (e.g., perhaps inferring it based on its 'Sys2 Artificial Consciousness', the next piece), in order to find whether there exist stored experiences close enough to them to provide the appropriate response in the CRISP-DM Deployment phase. The second form has to be consistent with our situated / simulation tenets, so the observables are provided to a 'simulation' system that attempts to 'constrain' the simulation that will generate the artificially conscious 'imagined' present that can complement the 'big-data' response. In fact, the simulated data might be fed as 'imagined observables' into the back end, infer gaps in CRISP-DM Business Understanding that then also feed the big-data response, and offer more valuable contributions to users in CRISP-DM Deployment. I would like to expand on this discussion.

news summary (63)

Categories: Uncategorized