Weekly QuEST Discussion Topics and News, 18 Aug

August 17, 2017 Leave a comment

QuEST Aug 18 2017

Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman

  • Metacognition, or “knowing that you know”, is a core component of consciousness. ** to many this is part of the definition – introspection ** Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. ** they seem to distinguish decisions made based on what is being sensed versus what is being thought about – perceptual / conceptual – we would suggest that even perceptual decisions that are made consciously are put into the conceptual representation **
  • However, when assessing metacognitive performance, care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area, and there now exist promising approaches.
  • In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.
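The type I sensitivity measure at the heart of SDT can be sketched in a few lines. This is a generic illustration, not the chapter's own code: it computes d' from a yes/no detection task, applying a log-linear correction for extreme (0 or 1) rates, one of the transformations of extreme data scores the chapter discusses. The function name and counts are illustrative.

```python
# Sketch of a type I SDT sensitivity (d') calculation for a yes/no task.
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Type I d' with a log-linear correction for extreme (0 or 1) rates."""
    # Log-linear correction: add 0.5 to each cell so z-scores stay finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

print(round(d_prime(40, 10, 10, 40), 3))
```

Type II (metacognitive) sensitivity is computed analogously, but over confidence ratings conditioned on decision accuracy rather than over the stimulus classes themselves.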

 

Cross-modal prediction changes the timing of conscious access
during the motion-induced blindness
Acer Y …

Consciousness and Cognition 31 (2015) 139–147


Post-decision wagering objectively measures awareness
Navindra Persaud, Peter McLeod & Alan Cowey

© 2007 Nature Publishing Group, http://www.nature.com/natureneuroscience

Nature Neuroscience, Volume 10, Number 2, February 2007

  • The lack of an accepted measure of awareness [consciousness] has made claims that accurate decisions can be made without awareness [consciousness] controversial. Here we introduce a new objective measure of awareness [consciousness], post-decision wagering.
  • We show that participants fail to maximize cash earnings by wagering high following correct decisions in blindsight, the Iowa gambling task and an artificial grammar task.
  • This demonstrates, without the uncertainties associated with the conventional subjective measures of awareness [consciousness] (verbal reports and confidence ratings), that the participants were not aware that their decisions were correct.
  • Post-decision wagering may be used to study the neural correlates of consciousness.
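The logic of the wagering measure can be illustrated with a toy simulation. Everything here is hypothetical (the 70% accuracy, the unit stakes, the idealized "aware" wagering policy); it only shows why an accurate-but-unaware observer leaves money on the table.

```python
# Toy simulation of post-decision wagering: an observer who decides
# correctly 70% of the time, either wagering high exactly when correct
# (idealized awareness) or wagering high at random (the "accurate but
# unaware" pattern reported for blindsight).
import random

def simulate(trials, p_correct, aware):
    random.seed(0)
    earnings = 0
    for _ in range(trials):
        correct = random.random() < p_correct
        # Aware: wager high after correct decisions; unaware: at random.
        wager_high = correct if aware else random.random() < 0.5
        if wager_high:
            earnings += 1 if correct else -1
    return earnings

print(simulate(1000, 0.7, aware=True), simulate(1000, 0.7, aware=False))
```

The aware observer's earnings approach the number of correct trials, while the unaware observer's earnings are far lower despite identical decision accuracy; that gap is what the measure exploits.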

We had some exciting work this summer and we want to give people quick overviews of some of the efforts – we had two deep learning efforts of particular note by our colleagues Oliver and Washington, and then there were at least four projects that might be good to bring up:

Context-Learning Deep Neural Networks for Kinematic Prediction by Dr. Kyle Tarplee (Anderson Univ.) – Combines a DNN that learns patterns in traffic with a mixture density network (MDN), a DNN designed for motion prediction (cf. Kalman filters).

Estimating Posteriors from Deep Learning Networks by Nicole Eikmeier (Purdue) – Exploits the stochasticity of drop-out regularization to compute “confidence” values for the network’s performance.
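Nicole's specific method isn't detailed here, but the general Monte Carlo dropout idea can be sketched: keep dropout active at inference and average many stochastic forward passes, using the spread of the samples as a confidence proxy. The toy network and weights below are illustrative assumptions, not hers.

```python
# Minimal Monte Carlo dropout sketch on a toy one-layer ReLU network.
import random, statistics

def forward(x, weights, p_drop=0.5):
    # One stochastic pass: each hidden unit is dropped with probability
    # p_drop and the survivors are rescaled by 1/(1 - p_drop).
    hidden = [max(0.0, w * x) * (0 if random.random() < p_drop else 1 / (1 - p_drop))
              for w in weights]
    return sum(hidden) / len(hidden)

def mc_dropout_predict(x, weights, passes=200):
    random.seed(1)
    samples = [forward(x, weights) for _ in range(passes)]
    # Mean is the prediction; the standard deviation serves as "confidence".
    return statistics.mean(samples), statistics.stdev(samples)

mean, spread = mc_dropout_predict(2.0, [0.1, 0.4, 0.3, 0.2])
print(round(mean, 2), round(spread, 2))
```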

Predictive Simulation for UAV Flight Planning by Saniyah Shaikh (UPenn) – Uses Monte-Carlo Tree Search (MCTS) a la AlphaGo to plan UAV flight paths.
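The core of an AlphaGo-style planner is the UCT (Upper Confidence bounds applied to Trees) selection rule; the flight-planning specifics are not shown here, only the generic select-best-child step, with illustrative statistics.

```python
# UCT child selection: balance exploitation (mean value) against
# exploration (visit-count bonus), as in MCTS planners like AlphaGo's.
import math

def uct_select(children, c=1.4):
    """children: list of (visits, total_value) pairs, one per child."""
    parent_visits = sum(v for v, _ in children)
    def score(child):
        visits, total = child
        if visits == 0:
            return float("inf")  # always try unvisited children first
        return total / visits + c * math.sqrt(math.log(parent_visits) / visits)
    return max(range(len(children)), key=lambda i: score(children[i]))

# The rarely visited but promising child (index 1) wins the trade-off here.
print(uct_select([(10, 6.0), (3, 2.4), (7, 3.5)]))
```

A full MCTS loop repeats selection, expansion, rollout (or value-network evaluation), and backpropagation of the result up the tree.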

Practical Applications of Graph Convolutional Neural Networks in Sensor Exploitation by Mela Hardin (ASU) – She presented highlights from some recent work in the hot field of graph CNNs, i.e. bringing the power of CNNs to relational data.
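A single graph convolution layer in the common Kipf & Welling style can be sketched as propagating node features through the symmetrically normalized adjacency, A_hat = D^(-1/2)(A + I)D^(-1/2). The three-node graph and scalar weight below are toy assumptions, not from the talk.

```python
# One graph convolution layer: normalize the adjacency (with self-loops),
# aggregate neighbor features, then apply a weight and ReLU.
import math

def gcn_layer(adj, features, weight):
    n = len(adj)
    # Add self-loops, then symmetrically normalize by node degrees.
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    a_hat = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbor features, apply the weight and a ReLU.
    agg = [sum(a_hat[i][j] * features[j] for j in range(n)) for i in range(n)]
    return [max(0.0, weight * x) for x in agg]

out = gcn_layer([[0, 1, 0], [1, 0, 1], [0, 1, 0]], [1.0, 2.0, 3.0], 0.5)
print([round(x, 2) for x in out])
```

Stacking such layers lets information flow across multi-hop neighborhoods, which is how CNN-style locality is brought to relational data.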

 

For those who have VDL access, we have wiki pages with details for follow-up.

news summary (65)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 11 Aug

August 10, 2017 Leave a comment

QuEST 11Aug 2017

A recent study by the Harvard Kennedy School’s Belfer Center for Science and International Affairs – “Artificial Intelligence and National Security,” done for IARPA – concluded, among other things:

“By looking at four prior cases of transformative military technology—nuclear, aerospace, cyber, and biotech—we develop lessons learned and recommendations for national security policy toward AI. Future progress in AI has the potential to be a transformative national security technology, on a par with nuclear weapons, aircraft, computers, and biotech.

−− Each of these technologies led to significant changes in the strategy, organization, priorities, and allocated resources of the U.S. national security community.

−− We argue future progress in AI will be at least equally impactful.”

 

—- that is an amazing statement – some discussion is warranted. For example, take the lessons from AlphaGo and extrapolate to multi-domain Command and Control implications.

 

A second set of topics from our Colleague Teresa H:

 

How We Save Face—Researchers Crack the Brain’s Facial-Recognition Code

A Caltech team has deciphered the way we identify faces, re-creating what the brain sees from its electrical activity

 By Knvul Sheikh | Scientific American August 2017 Issue

 

The brain has evolved to recognize and remember many different faces. We can instantly identify a friend’s countenance among dozens in a crowded restaurant or on a busy street. And a brief glance tells us whether that person is excited or angry, happy or sad.

Brain-imaging studies have revealed that several blueberry-size regions in the temporal lobe—the area under the temple—specialize in responding to faces. Neuroscientists call these areas “face patches.” But neither brain scans nor clinical studies of patients with implanted electrodes explained exactly how the cells in these patches work.

The Code for Facial Identity in the Primate Brain

Authors: Le Chang, Doris Y. Tsao

Correspondence: lechang@caltech.edu (L.C.), dortsao@caltech.edu (D.Y.T.)

In Brief: Facial identity is encoded via a remarkably simple neural code that relies on the ability of neurons to distinguish facial features along specific axes in face space, disavowing the long-standing assumption that single face cells encode individual faces.

 

Cross-modal prediction changes the timing of conscious access
during the motion-induced blindness
Acer Y …

Consciousness and Cognition 31 (2015) 139–147


 

Inferences about Consciousness Using Subjective Reports of Confidence
Maxine Sherman

 

Metacognition, or “knowing that you know”, is a core component of consciousness. Insight into a perceptual or conceptual decision permits us to infer perceptual or conscious knowledge underlying that decision. However when assessing metacognitive performance care must be taken to avoid confounds from decisional and/or confidence biases. There has recently been substantial progress in this area and there now exist promising approaches. In this chapter we introduce type I and II signal detection theory (SDT), and describe and evaluate signal detection theoretic measures of metacognition. We discuss practicalities for empirical research with these measures, for example, alternative methods of transforming extreme data scores and of collecting confidence ratings, with the aim of encouraging the use of SDT in research on metacognition. We conclude by discussing metacognition in the context of consciousness.

 

news summary (64)

Categories: Uncategorized

Weekly QuEST Meeting Discussion Topics and News, 4 Aug

August 3, 2017 Leave a comment

QuEST 4 Aug 2017:

This week we will have a guest lecture by colleagues from UCLA to discuss the paper by Achille / Soatto UCLA, arXiv:1706.01350v1 [cs.LG] 5 Jun 2017

On the emergence of invariance and disentangling in deep representations

Lots of interesting analysis in this article but what caught my eye was the discussion on properties of representations:

  • In many applications, the observed data x is high dimensional (e.g., images or video), while the task y is low-dimensional, e.g., a label or a coarsely quantized location. ** what if the task was a simulation – that was stable, consistent and useful – low dimensional?**
  • For this reason, instead of working directly with x, we want to use a representation z that captures all the information the data x contains about the task y, while also being simpler than the data itself.  ** and are there a range of tasks y that can be serviced by a representation z – how do we address the tension between the representation and the tasks – how do we define what tasks can be serviced by a given representation?**
  • Ideally, such a representation should be
  • (a) sufficient for the task y, i.e. I(y; z) = I(y; x), so that information about y is not lost; among all sufficient representations, it should be
  • (b) minimal, i.e. I(z; x) is minimized, so that it retains as little about x as possible, simplifying the role of the classifier; finally, it should be
  • (c) invariant to the effect of nuisances I(z; n) = 0, so that decisions based on the representation z will not overfit to spurious correlations between nuisances n and labels y present in the training dataset
  • Assuming such a representation exists, it would not be unique, since any bijective function preserves all these properties.
  • We can use this fact to our advantage and further aim to make the representation
  • (d) maximally disentangled, i.e., TC(z) is minimal, where disentanglement is often measured as the correlation of the network weights… the paper appears to use total correlation, which is the (presumably one-sided) KL divergence between the joint PDF of the weights and the naïve Bayes estimate → KL(f(w1, w2, …, wn) ‖ f(w1)f(w2)…f(wn))
  • This simplifies the classifier rule, since no information is present in the complicated higher-order correlations between the components of z, a.k.a. “features.”
  • In short, an ideal representation of the data is a minimal sufficient invariant representation that is disentangled.
  • Inferring a representation that satisfies all these properties may seem daunting. However, in this section we show that we only need to enforce (a) sufficiency and (b) minimality, from which invariance and disentanglement follow naturally.
  • Between this and the next section, we will then show that sufficiency and minimality of the learned representation can be promoted easily through implicit or explicit regularization during the training process.
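The total correlation used for property (d) can be made concrete on a toy discrete case. This is a generic illustration of TC(z) = KL(p(z) ‖ Π_i p(z_i)), not the paper's code; the two-bit distributions are assumptions for the example.

```python
# Total correlation of a discrete joint distribution over (z1, z2):
# KL divergence between the joint and the product of its marginals.
import math

def total_correlation(joint):
    """joint: dict mapping (z1, z2) tuples to probabilities."""
    p1, p2 = {}, {}
    for (a, b), p in joint.items():
        p1[a] = p1.get(a, 0) + p
        p2[b] = p2.get(b, 0) + p
    return sum(p * math.log2(p / (p1[a] * p2[b]))
               for (a, b), p in joint.items() if p > 0)

# Perfectly correlated bits carry 1 bit of redundancy; independent bits, 0.
print(round(total_correlation({(0, 0): 0.5, (1, 1): 0.5}), 3))
print(round(total_correlation({(0, 0): 0.25, (0, 1): 0.25,
                               (1, 0): 0.25, (1, 1): 0.25}), 3))
```

Minimizing this quantity over the representation's components is exactly what "disentangled features" means in the bullet above: no information hides in higher-order correlations between components of z.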

As we mature our view of how to work toward these rich representations, it brings up the discussion point of QuEST as a platform:

 

I would like to think through a QuEST solution that is a platform that uses existing front ends (application-dependent, by observation vendors) and existing big-data back ends (standard big-data solutions such as Amazon Web Services …), and possibly a series of knowledge-creation vendors. It is helpful here to consider the Cross Industry Standard Process for Data Mining (commonly known by its acronym CRISP-DM, a data mining process model that describes the steps data mining experts commonly use to tackle data mining problems) to show how QuEST fits within, and can enable, all aspects of the CRISP-DM process.

Independent of the representation used by a front-end system that captures the observables and provides them to the QuEST agent, it becomes the QuEST agent’s job to take them and create two uses for them. The first is to put them in a form usable by a big-data solution (following CRISP-DM, this would entail the Data Understanding and Data Preparation phases), but do so based on an understanding of the relevant QuEST model (CRISP-DM Modeling), and in a way that supports CRISP-DM Business Understanding (e.g., perhaps infer it based on its ‘Sys2 Artificial Consciousness’ – the next piece) to find if there exist stored experiences close enough to them to provide the appropriate response in the CRISP-DM Deployment phase. The second form has to be consistent with our situated / simulation tenets, so the observables are provided to a ‘simulation’ system that attempts to ‘constrain’ the simulation that will generate the artificially conscious ‘imagined’ present that can complement the ‘big-data’ response. In fact, the simulated data might be fed as ‘imagined observables’ into the back end, infer gaps in CRISP-DM Business Understanding that then also feed the big-data response, and offer more valuable contributions to users in CRISP-DM Deployment. I would like to expand on this discussion.

news summary (63)

Categories: Uncategorized

No QUEST Meeting 28 July

There will be no QuEST again this week due to overlapping commitments.  We will pick back up next week 4 Aug 2017 with a guest lecture from colleagues from UCLA on advances in deep learning

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 14 July

QuEST 14 July 2017

There will not be a QuEST 21 July 2017 – Cap traveling.

Welcome back our essential colleague Cathy!  For those who experienced us going ‘comms-out’ my apologies but without Cathy I’m helpless.

We have two great things to work through this week.  First we want to continue the discussion we’ve been having via emails/meetings on cognitive flexibility.  Recall we’ve defined cognition in the FAQ: Cognition is the process of creating knowledge and understanding through thought, experience, and sensing.  Many of the AI/ML systems we build today have all the knowledge created for the systems ‘off-line’ and provided at ‘birth’ (evolution).  Recall we define understanding and reasoning:

  • Understanding is an estimation of whether an AS’s (autonomous system) Meaning will result in it acceptably accomplishing a task
  • Reasoning is the ability to think about what is perceived in order to accomplish a task (thinking was the manipulation of the AS’s representation)

Our challenge is to define for the purpose of advancing our cognitive flexibility capabilities a set of ‘stages’ of cognitive flexibility.

The second topic this week is agile collaboration.  I don’t envision the autonomy solution to be a big ‘gold-plated’ AI; I suspect it will be the result of a set of agents that can form and dissolve rapidly.  The idea is to quickly and efficiently form collaboration teams of agents.  Each agent has its own observation sources (vendors / sensors) and its own knowledge that it has represented in whatever form it uses (and has the ability to create new knowledge), and thus can create its agent-centric meaning and its own effects (its own effectors / vendors of effects – to include being able to generate observations for the other agents, possibly including its meaning).  Since I’ve used the term collaboration, there is an assumption of some tailored effect that is sought by one or more of the collaborative set of agents, which as a system they will contribute to achieving.

That was a mouthful – so let’s talk through it on Friday – some of the material we will use to have the discussion includes:

arXiv:1605.07736v2 [cs.LG] 31 Oct 2016

29th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain

Learning Multiagent Communication with Backpropagation
Sainbayar Sukhbaatar
Dept. of Computer Science
Courant Institute, New York University
sainbar@cs.nyu.edu …

  • Many tasks in AI require the collaboration of multiple agents.
  • Typically, the communication protocol between agents is manually specified and not altered during training.
  • In this paper we explore a simple neural model, called CommNet, that uses continuous communication for fully cooperative tasks.
  • The model consists of multiple agents and the communication between them is learned alongside their policy. We apply this model to a diverse set of tasks, demonstrating the ability of the agents to learn to communicate amongst themselves, yielding improved performance over non-communicative agents and baselines.
  • In some cases, it is possible to interpret the language devised by the agents, revealing simple but effective strategies for solving the task at hand.
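CommNet's central idea, learned continuous communication, reduces at each step to combining an agent's own hidden state with an average of the other agents' states. The sketch below uses scalar states and weights for clarity; in the paper these are learned matrices trained by backpropagation alongside the policy.

```python
# One CommNet-style communication step: each agent's next hidden state
# mixes its own state with the mean of the other agents' hidden states.

def commnet_step(hidden, w_self=0.6, w_comm=0.4):
    n = len(hidden)
    out = []
    for i, h in enumerate(hidden):
        # Communication vector: average of all *other* agents' states.
        c = sum(hidden[j] for j in range(n) if j != i) / (n - 1)
        out.append(w_self * h + w_comm * c)
    return out

print([round(h, 2) for h in commnet_step([1.0, 2.0, 3.0])])
```

Because the averaging is differentiable, gradients flow through the communication channel, which is how the protocol itself gets learned rather than manually specified.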

news summary (62)

Categories: Uncategorized

Weekly QuEST News and Discussion Topics, 7 July

QuEST 7 July 2017

We will have our colleague Igor T provide us an update on research he has previously talked to us about: Transparent Autonomous Hierarchical Learning using a 3D Visualization Engine.

Learning with a visualization engine can be used for real-time complex situation assessment.

Situations and objects were successfully learned in a high-clutter environment; results can be fed to a visualization engine in real time to greatly improve performance.

The algorithm gradually improves classification performance for both objects and situations.

If there is time remaining in the meeting, Cap will revisit his Autonomy FAQ. The goal is to converge on / refine some of the previously provided answers for a broader distribution. It was the core of his plenary talk at NASA last week and will also be included in an autonomy vision document, so we want to give everyone a chance to chime in on the content.

news summary (61)

Categories: Uncategorized

Weekly QuEST Discussion Topics and News, 30 June

QuEST 30 June 2017

 

One interesting email thread from the week was with our colleague Namita – so I provide some of her points / questions for consideration / discussion:

 

While there is a debate around whether or not artificial general intelligence is achievable, is there a debate on whether a general knowledge complexity representation is possible? ** cap would suggest you can’t have one without the other – representation is defined (by QuEST) as how an agent/ autonomous system structures its knowledge – if the autonomous system is an AGI then it had to have solved the challenge of a general purpose representation **

 

Just as there are different types of intelligence, there is diversity in knowledge and different perspectives on the knowledge at the agent level. ** QuEST would say intelligence is the ability to capture observations and create knowledge and appropriately use that knowledge later – and since there are different types of observations and different tasks agents attempt to do we don’t mind saying there are types of intelligence ** Just as synthesis of intelligence (AI, EI, human intelligence, network intelligence etc.) appears as distributed intelligence across agents, knowledge synthesis can arise from distributed knowledge across agents and plays out in complexity science. Globally, various entities are developing knowledge platforms to represent this diversity of knowledge in fundamental states before higher levels of abstraction or completely different states arise from knowledge emergence. ** a topic of real QuEST interest – we have avoided the ‘abstraction’ issue by our focus on qualia / situations – an abstraction is just a compound qualia made up of a bunch of more primitive qualia – but it is created / processes the same **

 

Will models continue to develop globally until one ultimate “base” model/representation appears? ** QuEST position is that is not going to be the case – BUT – QuEST also believes that all the models that are successful in general representation will have characteristics similar to our tenets ** Is this somehow connected to your idea of qualia and consciousness? ** yes – it is the one solution we know works – nature found it **

Will multiple models always exist and the need for multiple translators across them persist? ** yes just like we need google translate to communicate with other human critters – we will need translators – and some means of grounding terms **

Will models and thus translators continue to evolve as knowledge emerges? ** yes – just as your qualia set grows / changes – so will the vocabularies of these systems **

 

With respect to internal Air Force platforms and external actor platforms in various domains, how do we translate one knowledge representation into another? ** we don’t – what we do is facilitate making meaning relevant to our representation from observations we can get from the other systems – we develop a vocabulary of communication to evoke in those other agents aspects of their representation that are ‘similar’ to mine, but it really only has to result in the desired behavior – this gets sticky when the other system is attempting to deceive us ** This is critical for intelligence to truly understand intent and threat across cultures/languages. How do we do this effectively without introducing error and loss of data? ** intent from activity is possible – but it takes the development of a pretty sophisticated theory of mind ** As you brought up today, the problem with current approaches (in machine translation, big data, etc.) is that indexing, conditioning (OCR, ASR, etc.) and processing may introduce error/loss at each step you move away from the original raw data. These errors and losses compound with each new layer of processing. This not only limits inputs but also outputs – queries and further analytics on the data.

 

Eastern writings on the Data, Information, Knowledge, Wisdom, Enlightenment structures may be valuable to draw similarities and differences in representations. ** a translation from our use of those terms might be valuable ** The knowledge platform, or as the Japanese say, knowledge “ba” (a platform for knowledge creation, sharing, exploitation and interaction), can leverage many different models (the SECI spiral process, Nakamori’s i-System, etc.). Ikujiro Nonaka and Yoshiteru Nakamori have eastern perspectives on these topics. Knowledge Synthesis: Western and Eastern Cultural Perspectives and Knowledge Emergence cover some of these topics.

 

 

The incremental process chart you showed today was very helpful to practically achieve these bigger goals, and deliver something in the short term to customers for testing and feedback.  ** we believe this strategy to task is reasonable **

 

Another email thread this week is metrics for judging the goodness of machine translations / captions – it comes down to meaning – we will discuss BLEU / METEOR …:

Difference Between Human and Machine

The idea behind BLEU is the closer a machine translation is to a professional human translation, the better it is. The BLEU score basically measures the difference between human and machine translation output, explained Will Lewis, Principal Technical Project Manager of the Microsoft Translator team.

In a 2013 interview with colleague Chris Wendt, Lewis said, “[BLEU] looks at the presence or absence of particular words, as well as the ordering and the degree of distortion—how much they actually are separated in the output.”

BLEU’s evaluation system requires two inputs: (i) a numerical translation closeness metric, which is then assigned and measured against (ii) a corpus of human reference translations.
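Those two inputs combine as a clipped n-gram precision scaled by a brevity penalty. The sketch below is a simplified, single-reference illustration (real BLEU averages log-precisions over n = 1..4 across a whole corpus); the sentences are made up.

```python
# Simplified sentence-level BLEU: modified (clipped) n-gram precision
# for n = 1..max_n, geometrically averaged, times a brevity penalty.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, r[g]) for g, count in c.items())
        total = max(1, sum(c.values()))
        log_prec += math.log(max(clipped, 1e-9) / total) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec)

print(round(bleu("the cat sat on the mat", "the cat sat on the mat"), 2))
print(round(bleu("the the the the the the", "the cat sat on the mat"), 2))
```

The clipping is what stops a degenerate candidate like "the the the…" from scoring well on unigram precision alone.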

Neural’s Advent May Spell Trouble for BLEU

As alternative metrics, Tinsley named METEOR, TER (Translation Edit Rate), and GTM (General Text Matcher). According to Tinsley, these have proven more effective for specific tasks (e.g., TER correlates better with post-editing effort). He said, “Most commercial MT providers will use all of these metrics, and maybe more when developing internally to get the full picture.”

Among these other metrics could be TAUS’ DQF (Dynamic Quality Framework), which offers bespoke benchmarking, albeit at a price point. But no matter how bespoke, it is not hard to argue that, as Tinsley pointed out, “There is obviously no substitute for manual evaluations.”

As Rico Sennrich said, “Human evaluations in the past have shown that BLEU systematically underestimates the quality of some translation systems, in particular, rule-based systems.”

Another topic – pursuing the thread that we need some means to generate the ‘imagined’ present/past/future – is associated with a relatively recent article on video prediction.

Deep Multi-Scale Video Prediction Beyond Mean Square Error

Michael Mathieu, Camille Couprie & Yann LeCun

arXiv:1511.05440v6 [cs.LG] 26 Feb 2016

 

ABSTRACT

Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics. This is why pixel-space video prediction may be viewed as a promising avenue for unsupervised feature learning. In addition, while optical flow has been a very studied problem in computer vision for a long time, future frame prediction is rarely approached. Still, many vision applications could benefit from the knowledge of the next frames of videos, that does not require the complexity of tracking every pixel trajectory. In this work, we train a convolutional network to generate future frames given an input sequence. To deal with the inherently blurry predictions obtained from the standard Mean Squared Error (MSE) loss function, we propose three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function. We compare our predictions to different published results based on recurrent neural networks on the UCF101 dataset
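The image gradient difference loss (GDL) from the abstract is easy to sketch: penalize mismatches between the spatial gradients of the predicted and true frames, which pushes predictions toward sharp edges instead of the blur MSE favors. The toy 2-row "frames" below are illustrative assumptions.

```python
# Image gradient difference loss (GDL) on toy 2D arrays: sum of
# |  |∇true| - |∇pred|  |^alpha over horizontal and vertical neighbors.

def gdl(pred, true, alpha=1):
    h, w = len(pred), len(pred[0])
    loss = 0.0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:  # horizontal gradient term
                gp = abs(pred[i][j + 1] - pred[i][j])
                gt = abs(true[i][j + 1] - true[i][j])
                loss += abs(gt - gp) ** alpha
            if i + 1 < h:  # vertical gradient term
                gp = abs(pred[i + 1][j] - pred[i][j])
                gt = abs(true[i + 1][j] - true[i][j])
                loss += abs(gt - gp) ** alpha
    return loss

sharp = [[0, 0, 1, 1], [0, 0, 1, 1]]
blurry = [[0, 0.33, 0.66, 1], [0, 0.33, 0.66, 1]]
print(gdl(sharp, sharp), round(gdl(blurry, sharp), 2))
```

A blurry prediction can have low MSE against a sharp target while its gradients are badly wrong; the GDL term exposes exactly that failure, which is why the paper combines it with adversarial training.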

 

Another email thread included an article proposing a mathematical model of embodied consciousness:

 

David Rudrauf, Daniel Bennequin, Isabela Granic, Gregory Landini, Karl Friston, Kenneth Williford, “A mathematical model of embodied consciousness,” Journal of Theoretical Biology (2017), doi: 10.1016/j.jtbi.2017.05.032

 

Abstract

We introduce a mathematical model of embodied consciousness, the Projective Consciousness Model (PCM), which is based on the hypothesis that the spatial field of consciousness (FoC) is structured by a projective geometry and under the control of a process of active inference. The FoC in the PCM combines multisensory evidence with prior beliefs in memory, and frames them by selecting points of view and perspectives according to preferences. The choice of projective frames governs how expectations are transformed by consciousness. Violations of expectation are encoded as free energy. Free energy minimization drives perspective taking, and controls the switch between perception, imagination and action. In the PCM, consciousness functions as an algorithm for the maximization of resilience, using projective perspective taking and imagination in order to escape local minima of free energy. The PCM can explain a variety of psychological phenomena: the manifestation of subjective experience with its characteristic spatial phenomenology, the distinctions and integral relationships between perception, imagination, and action, the role of affective processes in intentionality, but also perceptual phenomena such as the dynamics of bistable figures and body swap illusions in virtual reality. It relates phenomenology to function, showing its computational and adaptive advantages. It suggests that changes of brain states from unconscious to conscious reflect the action of projective transformations, and suggests specific neurophenomenological hypotheses about the brain, guidelines for designing artificial systems, and formal principles for psychology.

 

One last email thread:

 

Paper from MIT claims a universal defense against adversarial attacks is within reach. http://arxiv.org/pdf/1706.06083v1.pdf

 

Towards Deep Learning Models Resistant to Adversarial Attacks

——————————————————————————–

Recent work has demonstrated that neural networks are vulnerable to adversarial examples, i.e., inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete, general guarantee to provide. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. This suggests that adversarially resistant deep learning models might be within our reach after all.
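The robust-optimization view in this paper centers on the projected gradient descent (PGD) attack: repeatedly step the input in the direction that increases the loss, then project back into an epsilon-ball around the original. The sketch below applies PGD to a toy one-dimensional logistic classifier; all numbers and names are illustrative, not from the paper.

```python
# PGD attack on a toy linear classifier: sign-of-gradient ascent steps
# on the loss, each followed by projection into the eps-ball around x0.
import math

def loss(x, w, y):
    # Logistic loss of a linear classifier on scalar input x, label y in {-1, +1}.
    return math.log(1 + math.exp(-y * w * x))

def pgd_attack(x0, w, y, eps=0.5, step=0.1, iters=20):
    x = x0
    for _ in range(iters):
        # Gradient of the loss w.r.t. the input; ascend it (sign step).
        grad = -y * w / (1 + math.exp(y * w * x))
        x += step * (1 if grad > 0 else -1)
        x = max(x0 - eps, min(x0 + eps, x))  # project back into the eps-ball
    return x

x0, w, y = 1.0, 2.0, 1
x_adv = pgd_attack(x0, w, y)
print(round(x_adv, 2), loss(x_adv, w, y) > loss(x0, w, y))
```

Adversarial training in the paper's sense then minimizes the loss at such worst-case perturbed points rather than at the clean inputs, i.e., a min-max problem over the eps-ball.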

news summary (60)

Categories: Uncategorized