
Author Archive

Weekly QuEST Discussion Topics and News, 13 Oct

October 12, 2017

QuEST Friday 13th October

Some of us have been having a side-bar discussion on meaning – specifically related to the idea of ‘conversations as a platform’.  To expound on that issue we want to revisit some prior discussions on ‘big-data’.

For example:  Big Data – QuEST perspectives v11 short deck – AC3 inserts (AFRL conscious content curation).

  • From the IQT (In-Q-Tel) Quarterly (vol. 7, no. 2), Fall 2015 issue, which discusses “Artificial Intelligence gets Real”.
  • Predictions with Big Data, by Devavrat Shah:

–     We know how to collect massive amounts of data (e.g., web scraping, social media, mobile phones),

–     how to store it efficiently to enable queries at scale (e.g., Hadoop File System, Cassandra) and

–     how to perform computation (analytics) at scale with it (e.g., Hadoop, MapReduce).

–     And we can sometimes visualize it (e.g., New York Times visualizations).
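To make the computation-at-scale bullet above concrete, here is a minimal in-memory sketch of the map/reduce pattern. Hadoop distributes these same two phases across a cluster; this toy version runs locally, and the word-count task and helper names are purely illustrative.

```python
# Minimal in-memory sketch of the map/reduce pattern: a map phase emits
# (key, value) pairs and a reduce phase aggregates values per key.
from collections import defaultdict

def map_phase(records):
    """Emit (word, 1) pairs for a word count."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Group the pairs by key and sum the values."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

if __name__ == "__main__":
    docs = ["Big data needs scale", "Scale needs big infrastructure"]
    print(reduce_phase(map_phase(docs)))
    # {'big': 2, 'data': 1, 'needs': 2, 'scale': 2, 'infrastructure': 1}
```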

But from a QuEST perspective:

  • Current approaches to big-data bring extremely valuable insights – even in very large data sets with low information density
  • These approaches do so by finding correlations
  • Most often they don’t attempt to answer questions on causation
  • QuEST seeks to deliver a simulation-based deliberation approach (neither correlation nor causation)

–     degrees of freedom for simulation possibly chosen via ‘big-data’ infrastructure

  • Using the situated simulation that consciousness provides gives an alternative to the issues above – you don’t have to have had the experiences, and been able to articulate a model, to understand causation – BUT – you also don’t have to have experienced all of the data to be able to relate to prior data – the simulation approach provides something between, or maybe outside of, those – better than both?

The second topic follows this line of reasoning, specifically with respect to modeling relationships:

Modeling Relationships in Referential Expressions
with Compositional Modular Networks
Hu – UC Berkeley

  • People often refer to entities in an image in terms of their relationships with other entities.
  • For example, the black cat sitting under the table refers to both a black cat entity and its relationship with another table entity.
  • Understanding these relationships is essential for interpreting and grounding such natural language expressions.

Most prior work focuses on either grounding entire referential expressions holistically to one region, or localizing relationships based on a fixed set of categories.

From our prior discussions on meaning:

  • Meaning, value and such like, are not intrinsic properties of things in the way that their mass or shape is.
  • They are relational properties.
  • Meaning is use, as Wittgenstein put it.
  • Meaning is not intrinsic, as Dennett has put it.
  • And here’s the point: if you know everything there is to know about that web, then you know everything there is to know about the data.

And the precursor to that work:

Neural Module Networks
Jacob Andreas Marcus Rohrbach Trevor Darrell Dan Klein
University of California, Berkeley
{jda,rohrbach,trevor,klein}@eecs.berkeley.edu

  • Visual question answering is fundamentally compositional in nature – a question like “where is the dog?” shares substructure with questions like “what color is the dog?” and “where is the cat?”
  • This paper seeks to simultaneously exploit the representational capacity of deep networks and the compositional linguistic structure of questions.
  • We describe a procedure for constructing and learning neural module networks, which compose collections of jointly-trained neural “modules” into deep networks for question answering.
  • Our approach decomposes questions into their linguistic substructures, and uses these structures to dynamically instantiate modular networks (with reusable components for recognizing dogs, classifying colors, etc.). 
  • The resulting compound networks are jointly trained.
  • We evaluate our approach on two challenging datasets for visual question answering, achieving state-of-the-art results on both the VQA natural image dataset and a new dataset of complex questions about abstract shapes.
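To make the modular idea concrete, a minimal numpy sketch follows: a question’s parse selects reusable modules (a ‘find’ module that attends over image regions and a ‘describe’ module that reads out an answer), which are composed into one network for that question. The module internals, feature sizes, and hand-written parses are stand-in assumptions for illustration, not the authors’ architecture.

```python
# Toy sketch of composing reusable neural modules according to a question's
# structure. Module internals here are placeholders, not the paper's layouts.
import numpy as np

rng = np.random.default_rng(0)
NUM_REGIONS, FEAT_DIM = 5, 8
image_feats = rng.normal(size=(NUM_REGIONS, FEAT_DIM))  # one feature vector per region

def find(concept_vec):
    """find[concept]: attention over regions for a concept embedding."""
    logits = image_feats @ concept_vec
    att = np.exp(logits - logits.max())
    return att / att.sum()

def describe(attention, readout):
    """describe[...]: pool attended features and map them to answer logits."""
    pooled = attention @ image_feats
    return readout @ pooled

# Hypothetical learned parameters (jointly trained in the paper).
dog_vec, cat_vec = rng.normal(size=FEAT_DIM), rng.normal(size=FEAT_DIM)
color_readout = rng.normal(size=(3, FEAT_DIM))     # e.g. {black, brown, white}
where_readout = rng.normal(size=(2, FEAT_DIM))     # e.g. {left, right}

# "what color is the dog?"  ->  describe[color](find[dog])
print(describe(find(dog_vec), color_readout))
# "where is the cat?"       ->  describe[where](find[cat])
print(describe(find(cat_vec), where_readout))
```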

news summary

Categories: Uncategorized

Weekly QuEST Discussion Topics, 6 Oct

October 5, 2017

QuEST 6 Oct 2017

We want to start this week by discussing the paper:

The Consciousness Prior
Yoshua Bengio
Université de Montréal, MILA
September 26, 2017

arXiv:1709.08568v1 [cs.LG] 25 Sep 2017

  • A new prior is proposed for representation learning, which can be combined with other priors in order to help disentangle abstract factors from each other.
  • It is inspired by the phenomenon of consciousness seen as the formation of a low-dimensional combination of a few concepts constituting a conscious thought, i.e., consciousness as awareness at a particular time instant. ** very consistent with our position of qualia as the vocabulary of conscious thoughts – and that it is a lower dimensional representation versus the data space **
  • This provides a powerful constraint on the representation in that such low-dimensional thought vectors can correspond to statements about reality which are either true, highly probable, or very useful for taking decisions.  ** to get a stable consistent and useful representation is the objective **
  • The fact that a few elements of the current state can be combined into such a predictive or useful statement is a strong constraint and deviates considerably from the maximum likelihood approaches to modeling data and how states unfold in the future based on an agent’s actions.

Instead of making predictions in the sensory (e.g. pixel) space, the consciousness prior allows the agent to make predictions in the abstract space, with only a few dimensions of that space being involved in each of these predictions.

  • The consciousness prior also makes it natural to map conscious states to natural language utterances or to express classical AI knowledge in the form of facts and rules, although the conscious states may be richer than what can be expressed easily in the form of a sentence, a fact or a rule.
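A minimal sketch of the bottleneck intuition: an attention mechanism picks out a few elements of a high-dimensional representation state to form a low-dimensional ‘conscious’ state, and prediction is made in that abstract space rather than in the sensory space. The dimensions, the top-k selection rule, and the linear predictor below are illustrative assumptions, not Bengio’s formulation.

```python
# Sketch: select a few attended elements of a large state h to form a
# low-dimensional "conscious" state c, then predict in that abstract space.
import numpy as np

rng = np.random.default_rng(1)
H_DIM, K = 256, 4                         # full state size vs. conscious bottleneck

h = rng.normal(size=H_DIM)                # high-dimensional representation state
attention = rng.normal(size=H_DIM)        # would come from a trained attention net

top_k = np.argsort(attention)[-K:]        # the few attended factors
c = h[top_k]                              # low-dimensional "conscious" state

# Prediction in the abstract space: a tiny linear predictor over c, rather
# than a model of every sensory (pixel) dimension.
W = rng.normal(size=(K, K))
predicted_next_c = W @ c
print(top_k, predicted_next_c)
```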
Categories: Uncategorized

Weekly QuEST Discussion Topics, 29 Sept

September 28, 2017

QuEST 29 Sept 2017

First, apologies for last week.  We had major email issues and a new room, so the phone lines never opened up – annoying, as the in-room group had a very productive discussion.

This week we want to have a shortened QuEST meeting – we have to break no later than 45 minutes.  We will focus on the two Rodney Brooks articles we mentioned in last week’s call but never got to in the meeting.

The Self-Driving Car’s People Problem:  Rodney Brooks – Aug 2017 IEEE Spectrum, p. 34

Robotic cars won’t understand us – and we won’t cut them much slack

  • If I am walking on a country road on a moonless night and a car approaches, I get out of the road and climb a tree – I don’t trust that the driver will see me and not mow me down
  • In daylight I can look at the driver’s eyes – current autonomous cars can’t ‘tell if the two people talking on the sidewalk are going to step out into the road or are just having a conversation – or if it is a mother and a child waiting for a school bus’ –
  • In Cambridge – small streets – people cross anywhere – eye contact and body language
  • Autonomous cars that work well in one area might not in another

This article ends with Amara’s law:

  • “WE TEND TO OVERESTIMATE THE EFFECT OF TECHNOLOGY IN THE SHORT RUN AND UNDERESTIMATE THE EFFECT IN THE LONG RUN”

That leads to the second Brooks article:

The Seven Deadly Sins of Predicting the Future of AI

https://rodneybrooks.com/the-seven-deadly-sins-of-predicting-the-future-of-ai/

We are surrounded by hysteria about the future of Artificial Intelligence and Robotics. There is hysteria about how powerful they will become how quickly, and there is hysteria about what they will do to jobs.

The claims are ludicrous. [I try to maintain professional language, but sometimes…] For instance, it appears to say that we will go from 1 million grounds and maintenance workers in the US to only 50,000 in 10 to 20 years, because robots will take over those jobs. How many robots are currently operational in those jobs? ZERO. How many realistic demonstrations have there been of robots working in this arena? ZERO. Similar stories apply to all the other job categories in this diagram where it is suggested that there will be massive disruptions of 90%, and even as much as 97%, in jobs that currently require physical presence at some particular job site.

Below I outline seven ways of thinking that lead to mistaken predictions about robotics and Artificial Intelligence. We find instances of these ways of thinking in many of the predictions about our AI future.

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 22 Sept

September 22, 2017

QuEST 22 Sept 2017

We might start with a discussion of a couple of recent Rodney Brooks articles – seven deadly sins (predicting AI) and one on the self-driving car’s People problem.

We also want to examine some recent work in representations that capture relationships between entities.  Assuming the recent breakthroughs in finding and identifying objects lead us to a representation of what is present, there is a series of questions we seek to answer – grounding is one in particular:

  • Great progress has been made on object detection, the task of localizing visual entities belonging to a pre-defined set of categories [8, 24, 23, 6, 17]. But the more general and challenging task of localizing entities based on arbitrary natural language expressions remains far from solved.
  • This task, sometimes known as grounding or referential expression comprehension, has been explored by recent work in both computer vision and natural language processing [20, 11, 25].

As an example:  Given an image and a natural language expression referring to a visual entity, such as the young man wearing a green shirt and riding a black bicycle, these approaches localize the image region corresponding to the entity that the expression refers to with a bounding box.

 

Modeling Relationships in Referential Expressions with Compositional Modular Networks
Hu – UC Berkeley

  • People often refer to entities in an image in terms of their relationships with other entities.
  • For example, the black cat sitting under the table refers to both a black cat entity and its relationship with another table entity.
  • Understanding these relationships is essential for interpreting and grounding such natural language expressions.
  • Most prior work focuses on either grounding entire referential expressions holistically to one region, or localizing relationships based on a fixed set of categories.
  • In this paper we instead present a modular deep architecture capable of analyzing referential expressions into their component parts, identifying entities and relationships mentioned in the input expression and grounding them all in the scene.
  • We call this approach Compositional Modular Networks (CMNs): a novel architecture that learns linguistic analysis and visual inference end-to-end.
  • Our approach is built around two types of neural modules that inspect local regions and pairwise interactions between regions.
  • We evaluate CMNs on multiple referential expression datasets, outperforming state-of-the-art approaches on all tasks.
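As a toy illustration of the two module types in the abstract, the sketch below uses a unary module to score individual regions against the subject and object phrases and a pairwise module to score relationships between region pairs; the grounding is the best-scoring (subject region, object region) pair. The random features, phrase embeddings, and bilinear relationship score are stand-ins for the learned components.

```python
# Toy unary + pairwise scoring in the spirit of the CMN abstract.
import numpy as np

rng = np.random.default_rng(2)
N, D = 4, 6                                   # candidate regions, feature size
region_feats = rng.normal(size=(N, D))

def unary_score(phrase_vec):
    """How well each single region matches a phrase (e.g. 'the black cat')."""
    return region_feats @ phrase_vec           # shape (N,)

def pairwise_score(relation_mat):
    """How well each ordered region pair matches a relationship ('under')."""
    return region_feats @ relation_mat @ region_feats.T   # shape (N, N)

subject_vec = rng.normal(size=D)        # embedding for "black cat"
object_vec = rng.normal(size=D)         # embedding for "table"
relation_mat = rng.normal(size=(D, D))  # embedding for "sitting under"

# Ground "the black cat sitting under the table" to the best (subject, object) pair.
total = (unary_score(subject_vec)[:, None]
         + unary_score(object_vec)[None, :]
         + pairwise_score(relation_mat))
best_subject, best_object = np.unravel_index(np.argmax(total), total.shape)
print(best_subject, best_object)
```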

We also want to spend time this week discussing visual reasoning.  Specifically some of our colleagues at Facebook recently published:

  • arXiv:1705.03633v1 [cs.CV] 10 May 2017
  • Inferring and Executing Programs for Visual Reasoning
  • Justin Johnson1 Bharath Hariharan2 Laurens van der Maaten2
  • Judy Hoffman1 Li Fei-Fei1 C. Lawrence Zitnick2 Ross Girshick2
  • 1Stanford University 2Facebook AI Research
  • Abstract
  • Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning.
  • Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.
  • Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.

news summary (69)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 8 Sept

September 7, 2017

QuEST 8 Sept 2017

We want to spend time this week discussing visual reasoning.  Specifically some of our colleagues at Facebook recently published:

arXiv:1705.03633v1 [cs.CV] 10 May 2017

Inferring and Executing Programs for Visual Reasoning

Justin Johnson1 Bharath Hariharan2 Laurens van der Maaten2

Judy Hoffman1 Li Fei-Fei1 C. Lawrence Zitnick2 Ross Girshick2

1Stanford University 2Facebook AI Research

Abstract

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning.

Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.

Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.
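To make the generate-then-execute split concrete, here is a minimal sketch in which the ‘program’ is a hand-written sequence of primitive operations and the ‘execution engine’ is a symbolic interpreter over a toy CLEVR-like scene. In the paper both parts are neural networks trained with backpropagation and REINFORCE; this sketch only shows the two-stage structure.

```python
# Toy program + execution engine for a CLEVR-style question.
scene = [
    {"shape": "cube",   "color": "red"},
    {"shape": "sphere", "color": "blue"},
    {"shape": "cube",   "color": "blue"},
]

def filter_color(objs, color):
    return [o for o in objs if o["color"] == color]

def filter_shape(objs, shape):
    return [o for o in objs if o["shape"] == shape]

def count(objs):
    return len(objs)

def execute(program, objs):
    """Run a straight-line program; each step is (function, *args)."""
    result = objs
    for fn, *args in program:
        result = fn(result, *args)
    return result

# "How many blue cubes are there?" -> a program a generator might emit.
program = [(filter_color, "blue"), (filter_shape, "cube"), (count,)]
print(execute(program, scene))   # 1
```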

 

We also then want to expand our discussion to a topic we’ve previously covered:

Human-level concept learning through probabilistic program induction
Brenden M. Lake,1* Ruslan Salakhutdinov,2 Joshua B. Tenenbaum3

11 DECEMBER 2015 • VOL 350 ISSUE 6266 Science

  • People learning new concepts can often generalize successfully from just a single example, yet machine learning algorithms typically require tens or hundreds of examples to perform with similar accuracy.
  • People can also use learned concepts in richer ways than conventional algorithms – for action, imagination, and explanation.
  • We present a computational model that captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world’s alphabets.
  • The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion.
  • On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches.
  • We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.
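A very reduced sketch of the one-shot classification idea: each candidate class is represented by a generative model fit to its single example, and a new item goes to the class whose model explains it best. The paper’s concepts are stroke-level programs scored under a Bayesian criterion; the isotropic-Gaussian ‘model’ below is an illustrative stand-in that keeps only the scoring logic.

```python
# One-shot classification by comparing likelihoods under per-class models.
import numpy as np

rng = np.random.default_rng(3)
D = 16                                     # toy feature dimension

one_shot_examples = {                      # one training example per novel class
    "class_a": rng.normal(loc=0.0, size=D),
    "class_b": rng.normal(loc=2.0, size=D),
}

def log_likelihood(x, example, sigma=1.0):
    """log N(x | example, sigma^2 I), dropping constants shared by classes."""
    return -0.5 * np.sum((x - example) ** 2) / sigma**2

test_item = rng.normal(loc=2.0, size=D)    # drawn near class_b
scores = {c: log_likelihood(test_item, ex) for c, ex in one_shot_examples.items()}
print(max(scores, key=scores.get))         # expected: class_b
```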
news summary (68)
Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 1 Sept

August 31, 2017

QuEST 1 Sept 2017

We will start this week by talking about a video / paper provided by our colleague Adam R.  It is an example of a trend of combining deep learning with model-based solutions.  Below is the link for a news article that has a video of the author showing some results.  It also has a link to a technical CVPR paper.

https://www.datascience.com/blog/beyond-deep-learning-a-case-study-in-sports-analytics

The CVPR paper was entitled:

Learning Online Smooth Predictors for Realtime Camera Planning

Using Recurrent Decision Trees – by J. Chen et al.

We study the problem of online prediction for realtime camera planning, where the goal is to predict smooth trajectories that correctly track and frame objects of interest (e.g., players in a basketball game). The conventional approach for training predictors does not directly consider temporal consistency, and often produces undesirable jitter. Although post-hoc smoothing (e.g., via a Kalman filter) can mitigate this issue to some degree, it is not ideal due to overly stringent modeling assumptions (e.g., Gaussian noise). We propose a recurrent decision tree framework that can directly incorporate temporal consistency into a data-driven predictor, as well as a learning algorithm that can efficiently learn such temporally smooth models. Our approach does not require any post-processing, making online smooth predictions much easier to generate when the noise model is unknown. We apply our approach to sports broadcasting: given noisy player detections, we learn where the camera should look based on human demonstrations. Our experiments exhibit significant improvements over conventional baselines and showcase the practicality of our approach.
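For reference, here is a sketch of the kind of post-hoc smoothing the abstract contrasts against: a jittery per-frame camera prediction is filtered after the fact with a scalar Kalman filter under a random-walk, Gaussian-noise assumption. The trajectory and noise levels are made up; the paper’s point is that learning temporal consistency inside the predictor avoids committing to such a noise model.

```python
# Post-hoc Kalman smoothing of a jittery per-frame camera-angle prediction.
import numpy as np

rng = np.random.default_rng(4)
true_pan = np.linspace(0.0, 30.0, 100)                             # smooth reference pan angle
noisy_pred = true_pan + rng.normal(scale=2.0, size=true_pan.size)  # per-frame jitter

def kalman_smooth(z, process_var=0.05, meas_var=4.0):
    """Filter a 1-D signal with a random-walk Kalman filter."""
    x, p = z[0], 1.0                       # state estimate and its variance
    out = [x]
    for meas in z[1:]:
        p += process_var                   # predict step
        k = p / (p + meas_var)             # Kalman gain
        x += k * (meas - x)                # update step
        p *= (1.0 - k)
        out.append(x)
    return np.array(out)

smoothed = kalman_smooth(noisy_pred)
# Frame-to-frame jitter (std of first differences) drops after smoothing.
print(np.std(np.diff(noisy_pred)), np.std(np.diff(smoothed)))
```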

We also want to continue our discussion from last week on:

Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman

Metacognition, or “knowing that you know”, is a core component of consciousness. ** to many this is part of the definition of consciousness – the ability for introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from decisions based on what is being thought about (perceptual / conceptual) – we would suggest that even perceptual decisions that are made consciously are put into the conceptual representation **

However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.

In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.
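For readers new to the SDT machinery, a minimal sketch of the type I / type II quantities the chapter builds on: type I d′ indexes sensitivity to the stimulus from hit and false-alarm rates, and the type II analogue treats ‘high confidence’ as the response and ‘correct’ as the signal, which is the basic move behind confidence-based measures of metacognition. The rates below are made up.

```python
# Type I and type II sensitivity from hit and false-alarm rates (probit transform).
from statistics import NormalDist

z = NormalDist().inv_cdf      # inverse normal CDF

def d_prime(hit_rate, fa_rate):
    return z(hit_rate) - z(fa_rate)

# Type I: detecting the stimulus itself.
type1 = d_prime(hit_rate=0.80, fa_rate=0.20)

# Type II: "hits" are high-confidence correct trials and "false alarms" are
# high-confidence incorrect trials.
type2 = d_prime(hit_rate=0.70, fa_rate=0.35)

print(round(type1, 2), round(type2, 2))   # 1.68 0.91
```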

Cross-modal prediction changes the timing of conscious access
during the motion-induced blindness
Acer Y … Consciousness and Cognition 31 (2015) 139–147

Post-decision wagering objectively measures awareness
Navindra Persaud, Peter McLeod & Alan Cowey

2007 Nature Publishing Group http://www.nature.com/natureneuroscience

Nature Neuroscience, Volume 10, Number 2, February 2007

The lack of an accepted measure of awareness [consciousness] has made claims that accurate decisions can be made without awareness [consciousness] controversial. Here we introduce a new objective measure of awareness [consciousness], post-decision wagering.

We show that participants fail to maximize cash earnings by wagering high following correct decisions in blindsight, the Iowa gambling task and an artificial grammar task.

This demonstrates, without the uncertainties associated with the conventional subjective measures of awareness [consciousness] (verbal reports and confidence ratings), that the participants were not aware that their decisions were correct.

Post-decision wagering may be used to study the neural correlates of consciousness.

Evaluation of a ‘bias-free’ measure of awareness
SIMON EVANS and PAUL AZZOPARDI
Department of Experimental Psychology, University of Oxford

Spatial Vision, Vol. 20, No. 1–2, pp. 61–77 (2007)

© VSP 2007.

Also available online – http://www.brill.nl/sv

Abstract: The derivation of a reliable, subjective measure of awareness that is not contaminated by observers’ response bias is a problem that has long occupied researchers. Kunimoto et al. (2001) proposed a measure of awareness (a’) which apparently meets this criterion: a’ is derived from confidence ratings and is based on the intuition that confidence should reflect awareness.

The aim of this paper is to explore the validity of this measure. Some calculations suggested that, contrary to Kunimoto et al.’s intention, a’ can vary as a result of changes in response bias affecting the relative proportions of high- and low-confidence responses.

This was not evident in the results of Kunimoto et al.’s original experiments because their method may have artificially ‘clamped’ observers’ response bias close to zero.

A predicted consequence of allowing response bias to vary freely is that it can result in a’ varying from negative, through zero, to positive values, for a given value of discriminability (d’).

We tested whether such variations are likely to occur in practice by employing Kunimoto et al.’s paradigm with various modifications, notably the removal of constraints upon the proportions of low- and high-confidence responses, in a visual discrimination task.

As predicted, a’ varied with response bias in all participants. Similar results were found when a’ was calculated from pre-existing data obtained from a patient with blindsight: a’ varied through a range of positive results without approaching zero, which is inconsistent with his well-documented lack of awareness.

A second experiment showed how response bias could be manipulated to yield elevated values of a’. On the basis of these findings we conclude that Kunimoto’s measure is not as impervious to response bias as was originally assumed.

news summary (67)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 25 Aug

August 24, 2017

QuEST Aug 25 2017
We want to continue our discussion from last week on:
Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman
• Metacognition, or "knowing that you know", is a core component of consciousness. ** to many this is part of the definition – introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed from decisions based on what is being thought about (perceptual / conceptual) – we would suggest that even perceptual decisions that are made consciously are put into the conceptual representation **
• However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.
• In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.

Cross-modal prediction changes the timing of conscious access
during the motion-induced blindness
Acer Y …
Consciousness and Cognition 31 (2015) 139–147
Post-decision wagering objectively measures awareness
Navindra Persaud, Peter McLeod & Alan Cowey
2007 Nature Publishing Group http://www.nature.com/natureneuroscience
Nature Neuroscience, Volume 10, Number 2, February 2007
• The lack of an accepted measure of awareness [consciousness] has made claims that accurate decisions can be made without awareness [consciousness] controversial. Here we introduce a new objective measure of awareness [consciousness], post-decision wagering.
• We show that participants fail to maximize cash earnings by wagering high following correct decisions in blindsight, the Iowa gambling task and an artificial grammar task.
• This demonstrates, without the uncertainties associated with the conventional subjective measures of awareness [consciousness] (verbal reports and confidence ratings), that the participants were not aware that their decisions were correct.
• Post-decision wagering may be used to study the neural correlates of consciousness.

Evaluation of a ‘bias-free’ measure of awareness
SIMON EVANS and PAUL AZZOPARDI
Department of Experimental Psychology, University of Oxford
Spatial Vision, Vol. 20, No. 1–2, pp. 61–77 (2007)
© VSP 2007.
Also available online – http://www.brill.nl/sv
• Abstract: The derivation of a reliable, subjective measure of awareness that is not contaminated by observers’ response bias is a problem that has long occupied researchers. Kunimoto et al. (2001) proposed a measure of awareness (a’) which apparently meets this criterion: a’ is derived from confidence ratings and is based on the intuition that confidence should reflect awareness.
• The aim of this paper is to explore the validity of this measure. Some calculations suggested that, contrary to Kunimoto et al.’s intention, a’ can vary as a result of changes in response bias affecting the relative proportions of high- and low-confidence responses.
• This was not evident in the results of Kunimoto et al.’s original experiments because their method may have artificially ‘clamped’ observers’ response bias close to zero.
• A predicted consequence of allowing response bias to vary freely is that it can result in a’ varying from negative, through zero, to positive values, for a given value of discriminability (d’).
• We tested whether such variations are likely to occur in practice by employing Kunimoto et al.’s paradigm with various modifications, notably the removal of constraints upon the proportions of low- and high-confidence responses, in a visual discrimination task.
• As predicted, a’ varied with response bias in all participants. Similar results were found when a’ was calculated from pre-existing data obtained from a patient with blindsight: a’ varied through a range of positive results without approaching zero, which is inconsistent with his well-documented lack of awareness.
• A second experiment showed how response bias could be manipulated to yield elevated values of a’. On the basis of these findings we conclude that Kunimoto’s measure is not as impervious to response bias as was originally assumed.

We also want to discuss:
From Science Daily:
Chimpanzees learn rock-paper-scissors
New study shows that chimps' ability to learn simple circular relationships is on a par with that of 4-year-old children
Date: August 10, 2017
Source: Springer

The Future of Humans & Machines: Partnership, Fusion, or Fear?
Summer 2017
The Intelligent Systems Center, Johns Hopkins Applied Physics Laboratory

arXiv:1705.08168v2 [cs.CV] 1 Aug 2017
Look, Listen and Learn
Relja Arandjelović – relja@google.com
Andrew Zisserman – zisserman@google.com
I read an interesting paper this week that is related to your question regarding "what is necessary for such systems to form and share meaning?" This paper describes a way to combine learning from different modalities (audio and visual) primarily using unsupervised methods.

In the attached paper, the authors trained an auditory and visual network based on unlabeled 1-second video clips. They started by creating a training set where half the video clips did not have the true audio associated with the visual portion of the clip (they randomly assigned a different clip's audio to the visual portion of the clip). They then trained a single visual and auditory network (each part takes separate inputs, but the network segments are joined later in the network) to make predictions on whether the visual and auditory clips belong together (or if they were taken from different clips). Half of the samples belonged together while the other half did not.

What made their results interesting was that the networks learned a representation without supervised categorical class labels (playing piano, rock concert, sporting event). This learned representation, however, was clearly associated with real-world concepts.

I think their results demonstrate a jump forward in the way we can train computers to learn representations, because it shows that large supervised sets may not be needed to learn a representation that is capable of understanding visual and auditory inputs. The network merely learned to associate correlations between different modalities. I would contend this is a more "human" approach to learning. We learn to associate correlations and then assign meaning to those associations. It is also unique because they did not try to assign a label and then combine modalities based on the labels. Rather, they combined the information in the representation space. Then they were able to tear off the top layer and transfer the learned representation to a supervised setting and still get good results in traditional classification problems.
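A minimal sketch of that training setup: half of the (video, audio) pairs keep a clip’s true audio and half swap in audio from another clip, and a two-stream network is asked to predict whether the streams correspond. The random linear ‘feature extractors’ and the dot-product fusion are stand-ins for the paper’s vision and audio subnetworks, and no actual training loop is shown.

```python
# Self-supervised audio-visual correspondence: build matched/mismatched pairs
# and score them with a two-stream network.
import numpy as np

rng = np.random.default_rng(5)
N, VIS_DIM, AUD_DIM, EMB = 8, 32, 20, 10

video_feats = rng.normal(size=(N, VIS_DIM))
audio_feats = rng.normal(size=(N, AUD_DIM))

pairs, labels = [], []
for i in range(N):
    pairs.append((video_feats[i], audio_feats[i]))   # true pairing -> label 1
    labels.append(1)
    j = (i + rng.integers(1, N)) % N                 # audio taken from a different clip
    pairs.append((video_feats[i], audio_feats[j]))   # mismatched pairing -> label 0
    labels.append(0)

# Two-stream forward pass: separate embeddings, then a fused correspondence score.
W_vis = rng.normal(size=(EMB, VIS_DIM))
W_aud = rng.normal(size=(EMB, AUD_DIM))

def correspondence_prob(v, a):
    v_emb, a_emb = W_vis @ v, W_aud @ a
    score = v_emb @ a_emb / EMB            # fusion by a scaled dot product
    return 1.0 / (1.0 + np.exp(-score))    # would be trained against the 0/1 labels

print([round(correspondence_prob(v, a), 2) for v, a in pairs[:4]], labels[:4])
```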

There also was an AI colloquium recently – the video from the AI Colloquium is now available and can be found at: https://vimeo.com/album/4721595. One of the speakers was Prof. Russell from Berkeley – he pointed out that in the deep learning area there are some obvious issues that many overlook. He used as an example a picture with its caption from Google.

Many look at this in amazement – others who are in more traditional AI areas note: there is no fruit stand, there is no group of people, there is no shopping going on.

news summary (66)

Categories: Uncategorized