1.) The first topic is a discussion led by our colleague Kirk W. on the topic of autonomy, free will, and situation representation. We also posted an article that he provided on panexperientialism by Charles Birch, ‘Why I Became a Panexperientialist.’ He promises to take this slow. As with the discussion last week led by our colleague Sean M. on ‘non-local aspects of consciousness,’ I need to establish an understanding of a position I can relate to (whether I agree with it or not) before I can understand how QuEST relates, or potentially could relate, to it. Keep in mind that QuEST is an effort to engineer solutions that are instantiations of insights into consciousness; those insights may or may not really explain consciousness, BUT they are demonstrated through objective assessment to provide an engineering advantage over existing alternative engineering solutions.
2.) The second topic is a brief mention of an article on Behavioral Finance by Benartzi. It is a nice reference on applying the ‘two minds’ view of human cognition to solve real problems and understand real issues in the world of finance.
3.) The last topic is some more articles on transfer learning. Recall the QuEST interest is in using transfer learning ideas to attack what we’ve termed the unexpected query. One of our colleagues, Olga M-S, is working on her dissertation in the area and has a nice set of references on the topic on the VDL for those interested. She has offered to give us a talk, and we will also have our colleague ‘Sam’ S. come by in a couple of weeks to give us a talk on transfer learning in deep learning systems. In the meantime I want to point those interested to a couple of noteworthy articles, specifically the 2010 survey article by Pan: ‘A Survey on Transfer Learning,’ IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, Oct. 2010.
QuEST 20 March 2015
1.) I want to start this week with a discussion on the point our colleague Mike Y was making, “we are NOT talking about consciousness.” I will tee up this discussion by briefly revisiting an IEEE Spectrum article from 2008, ‘Can Machines Be Conscious?’ by Koch and Tononi, and also John Searle’s review of the Koch book Consciousness: Confessions of a Romantic Reductionist. The title of Searle’s book review is Can Information Theory Explain Consciousness? In the review Searle writes:
The problem of consciousness remains with us. What exactly is it and why is it still with us? The single most important question is: How exactly do neurobiological processes in the brain cause human and animal consciousness? Related problems are: How exactly is consciousness realized in the brain? That is, where is it and how does it exist in the brain? Also, how does it function causally in our behavior?
To answer these questions we have to ask: What is it? Without attempting an elaborate definition, we can say the central feature of consciousness is that for any conscious state there is something that it feels like to be in that state, some qualitative character to the state. For example, the qualitative character of drinking beer is different from that of listening to music or thinking about your income tax. This qualitative character is subjective in that it only exists as experienced by a human or animal subject. It has a subjective or first-person existence (or “ontology”), unlike mountains, molecules, and tectonic plates that have an objective or third-person existence. Furthermore, qualitative subjectivity always comes to us as part of a unified conscious field. At any moment you do not just experience the sound of the music and the taste of the beer, but you have both as part of a single, unified conscious field, a subjective awareness of the total conscious experience. So the feature we are trying to explain is qualitative, unified subjectivity.
That review resulted in a reply from the authors:
Can a Photodiode Be Conscious?
Christof Koch and Giulio Tononi, reply by John R. Searle
Can Information Theory Explain Consciousness? from the January 10, 2013 issue
To the Editors:
The heart of John Searle’s criticism in his review of Consciousness: Confessions of a Romantic Reductionist [NYR, January 10] is that while information depends on an external observer, consciousness is ontologically subjective and observer-independent. *** I’m conscious whether you think I am or not *** That is to say, experience exists as an absolute fact, not relative to an observer: as recognized by Descartes, je pense donc je suis is an undeniable certainty. Instead, the information of Claude Shannon’s theory of communication is always observer-relative: signals are communicated over a channel more or less efficiently, but their meaning is in the eye of the beholder, not in the signals themselves. So, thinks Searle, a theory with the word “information” in it, like the integrated information theory (IIT) discussed in Confessions, cannot possibly begin to explain consciousness.
Except for the minute detail that the starting point of IIT is exactly the same as Searle’s! Consciousness exists and is observer-independent, says IIT, and it is both integrated (each experience is unified) and informative (each experience is what it is by differing, in its particular way, from trillions of other experiences). IIT introduces a novel, non-Shannonian notion of information—integrated information—which can be measured as “differences that make a difference” to a system from its intrinsic perspective, ** very similar to our definition of meaning ** not relative to an observer. *** reasonable answer to Searle – but has some holes *** Such a novel notion of information is necessary for quantifying and characterizing consciousness as it is generated by brains and perhaps, one day, by machines.
And it also led to a sequence of emails between Capt Amerika / Mike Y / Bob E. We might want to spend a little time discussing the points, as doing so forces us to state more clearly what we are doing in the QuEST group.
2.) The second topic I would like to hit is on transfer learning – We are particularly interested in QuEST on the ability to design agents that can respond when the knowledge of the environment (awareness) and/or the applicability of the current inference model (or models) is not appropriate for the environmental state. These are the sources of unexpected queries and we deem being able to respond acceptably to unexpected queries is required for meaningful autonomy.
With respect to revising the inference models during execution one approach might be ‘transfer learning’ – this document begins a discussion on the topic. The topic was a result of a sequence of interactions with our colleague Dean W.
In addition to talking about this topic in general we would also like to point to some particularly interesting work in this space – example NIPS article
Is Learning the n-th Thing Any Easier Than Learning the First? Sebastian Thrun, Computer Science Department, Carnegie Mellon University
This paper investigates learning in a lifelong context. Lifelong learning addresses situations in which a learner faces a whole stream of learning tasks. Such scenarios provide the opportunity to transfer knowledge across multiple learning tasks, in order to generalize more accurately from less training data. In this paper, several different approaches to lifelong learning are described, and applied in an object recognition domain. It is shown that across the board, lifelong learning approaches generalize consistently more accurately from less training data, by their ability to transfer knowledge across learning tasks.
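The knowledge-transfer idea in the abstract, generalizing more accurately from less data by reusing what was learned on earlier tasks, can be sketched in plain numpy. This is only a toy warm-start scheme under invented data and sizes; Thrun's paper uses richer mechanisms (e.g., learned distance metrics), not this particular trick.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, w=None, lr=0.1, steps=200):
    """Gradient-descent logistic regression; `w` lets us warm-start
    from weights learned on a previous, related task."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Task A: plenty of data for a related problem.
Xa = rng.normal(size=(500, 10))
ya = (Xa @ np.ones(10) > 0).astype(float)
w_a = train_logreg(Xa, ya)

# Task B: only a handful of examples, similar underlying structure.
Xb = rng.normal(size=(20, 10))
yb = (Xb @ (np.ones(10) + 0.1 * rng.normal(size=10)) > 0).astype(float)

w_scratch = train_logreg(Xb, yb)                 # learn from scratch
w_transfer = train_logreg(Xb, yb, w=w_a.copy())  # warm-start from task A

def accuracy(w, X, y):
    return np.mean(((X @ w) > 0) == (y > 0.5))

print("scratch :", accuracy(w_scratch, Xb, yb))
print("transfer:", accuracy(w_transfer, Xb, yb))
```

The warm-started learner begins near a good solution, which is the simplest form of the "generalize from less training data" effect the abstract describes.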
QuEST 13 March 2015
1.) After the meeting last week there was a series of virtual discussions that I want to review. For example, I never liked the position on behavior (chess playing, etc.) that as soon as something is achieved the goal post is moved, so I want to revisit the position I took during QuEST: Deep Mind (with Capt Amerika as the evaluating agent) IS AUTONOMOUS for the task of learning a model to allow it to play some arbitrary Atari game, AND (with Capt Amerika as the evaluating agent) IS AUTONOMOUS for the task of playing an Atari game. It is NOT autonomous with respect to playing an Atari game for which it has not yet generated a model (an unexpected query, but a query that some forms of autonomous agents in this domain might be able to acceptably respond to). NOTE how humans who have developed internal models for Atari games can immediately take on a new game and function at some level of performance without the extensive learning period, so the transfer learning of the human Atari player is far better. Incidentally, looking at its performance on some of the games, I might not give the Deep Mind solution my proxy; it would not meet my level of acceptable performance, so it would not be autonomous from my perspective for those games. Now the next question should be: for Deep Mind, which of our tenets did it have to implement to be able to achieve autonomy for those tasks? As you point out, it generates hypothetical ‘imagined’ next states and refines its models until it can reliably predict the score resulting from a particular input/output pair. Is its representation situated? Probably yes: the pixels’ relative locations are maintained, the association with the output score is maintained, and it is certainly structurally coherent in the way it closes the loop with reality through its reinforcement learning. INTERESTING… there are more points in this email chain…
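For context on the learning loop discussed above: the Bellman-style update at the heart of the Deep Mind system can be shown in tabular form. The five-state corridor task, constants, and ε-greedy policy below are invented for illustration; Deep Mind's contribution is replacing the lookup table with a deep network over raw pixels (plus experience replay), but the update rule is the same bootstrapped target.

```python
import random

random.seed(0)

# Tabular Q-learning on a toy 5-state corridor: actions move left or
# right, and the only reward is at the rightmost state.
N_STATES, ACTIONS = 5, (-1, 1)
alpha, gamma, eps = 0.5, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy(s):
    """Best-valued action, with ties broken at random."""
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

for _ in range(200):
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS) if random.random() < eps else greedy(s)
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: bootstrap from the best next action.
        target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# The learned greedy policy heads right, toward the reward.
policy = {s: greedy(s) for s in range(N_STATES - 1)}
print(policy)
```

The refinement-until-reliable-prediction behavior noted above is visible here: Q values converge toward the discounted score each state/action pair actually yields.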
2.) Along a similar line, there was the full article ‘An Introduction to Autonomy in Weapon Systems’ by Scharre and Horowitz. I want to review some of the topics in that article, including the definitions for autonomous weapon systems. We want to discuss these definitions for their applicability, or their potential modification, for use in our chapter on cyber autonomy with respect to offensive cyber operations.
3.) Next we want to discuss Sequence to Sequence Learning with Neural Networks by Ilya Sutskever and colleagues at Google: Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT’14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM’s BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM’s performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier… We want to brainstorm on the applicability of the approach for processing cyber big data.
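Two mechanics from the abstract, the fixed-dimensional bottleneck vector and the source-reversal trick, can be illustrated in a few lines of numpy. This is a toy tanh RNN standing in for the paper's multilayer LSTM; the weights are random and untrained, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_h = 8, 16  # toy embedding and hidden sizes

Wxh = rng.normal(scale=0.1, size=(d_h, d_in))
Whh = rng.normal(scale=0.1, size=(d_h, d_h))

def encode(seq):
    """Run a plain tanh RNN over the reversed source sequence and
    return the final hidden state: the fixed-size vector the decoder
    LSTM would condition on."""
    h = np.zeros(d_h)
    for x in reversed(seq):  # the paper's source-reversal trick
        h = np.tanh(Wxh @ x + Whh @ h)
    return h

# Source "sentences" of different lengths map to same-size vectors.
s1 = [rng.normal(size=d_in) for _ in range(4)]
s2 = [rng.normal(size=d_in) for _ in range(9)]
print(encode(s1).shape, encode(s2).shape)  # (16,) (16,)
```

Reversing the source puts its first words last, adjacent to the start of decoding, which is the "many short term dependencies" effect the abstract credits for the easier optimization.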
4.) There is also an article – The Mystery Behind Anesthesia – Mapping how our neural circuits change under the influence of anesthesia could shed light on one of neuroscience’s most perplexing riddles: consciousness… by Courtney Humphries in MIT Technology Review
QuEST 6 March 2015
This week we have several discussions we would like to have, to let those attempting to understand some of the material we’ve been considering make use of our diversity.
1.) The first topic is a proposed DARPA effort on CwC – communicating with computers. See for example http://www.foxnews.com/tech/2015/03/02/chatty-machines-future-computers-could-communicate-like-humans/ Chatty machines? Future computers could communicate like humans. In the future, you might be able to talk to computers and robots the same way you talk to your friends. Researchers are trying to break down the language barrier between humans and computers, as part of a new program from the Defense Advanced Research Projects Agency (DARPA), which is responsible for developing new technologies for the U.S. military. The program, dubbed Communicating with Computers (CwC), aims to get computers to express themselves more like humans by enabling them to use spoken language, facial expressions and gestures to communicate…. One of the problem-solving technologies that CwC could help further is the computer-based modeling used in cancer research. Computers previously developed by DARPA are already tasked with creating models of the complicated molecular processes that cause cells to become cancerous. But while these computers can churn out models quickly, they’re not so adept at judging if the models are actually plausible and worthy of further research. If the computers could somehow seek the opinions of flesh-and-blood biologists, the work they do would likely be more useful for cancer researchers…. ** sounds like the work we’ve discussed on joint cognitive systems ** … To get computers up to the task of communicating with people, CwC researchers have devised several tasks that require computers and humans to work together toward a common goal. One of the tasks, known as “collaborative composition,” involves storytelling.
In this exercise, humans and computers take turns contributing sentences until they’ve composed a short story… ** this seems intimately related to our interest in narratives ** … Another assignment that the CwC is planning is known as “block world,” which would require humans and computers to communicate to build structures out of toy blocks. There’s a tricky part, though: neither humans nor computers will be told what to build. Instead, they’ll have to work together to make a structure that can stand up of its own accord…. Better communications technologies could help robot operators use natural language to describe missions and give directions to the machines they operate both before and during operations. And in addition to making life easier for human operators, CwC could make it possible for robots to request advice or information from humans when they get into sticky situations. ** this last area is exactly where we’ve positioned QuEST solutions to help, for example the human analyst as a teammate. This is where our search for a Theory of Knowledge comes in: what can be known by a set of agents (humans / computers) with respect to some situation in the environment being considered and a task that is sought to be accomplished **
2.) Next I would like to have a discussion on any questions that were left over after the guest lecture by Prof Cybenko last week. Specifically, we had a series of virtual interactions discussing the application of the approach to cyber / EW, so I want to ensure that if there are other questions or points we need to clear up, we allow that to run its course. The priming question is how to use the Cybenko formalism for ‘deep learning of behaviors’ in cyber/EW, and how a QuEST formalism can complement such an implementation. We would also like to extend this conversation to include the weeks of Deep Learning material we’ve covered, to speculate on how we envision Deep Learning / big data approaches communicating with and complementing QuEST solutions.
3.) Other topics from prior meetings for which I want to release my notes, to orient those who are looking for them, include:
- Deep visual semantic alignments for generating image descriptions – the work out of Stanford similar to the work we discussed by Google on describing images in natural language. Some of the subtle approach differences I want people to be aware of include the use of bi-directional recurrent neural networks, so I want to say a word about them so that people who follow that path have some high-level understanding of their use.
- We also were able to pull down a technical article on the topic of the ‘automatic statistician’ – Automatic Construction and Natural-Language Description of Nonparametric Regression Models. I would like to give my take on that work for those who want to pursue it.
- We were also able to pull down a technical article on the Deep Mind work – that is, the use of deep reinforcement learning, demonstrated by providing pixel inputs from a video game along with the resultant scores, from which the system learned to play a range of games.
- Lastly we have an article on Graph-Based Data Mining – we’ve attempted to cover this before – now I just want to release the notes so people can review them for applicability to their respective issues.
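For those following up on the bi-directional recurrent networks mentioned in the image-description bullet above, a minimal untrained forward pass shows the key idea: each position's representation concatenates a left-to-right and a right-to-left hidden state, so it carries context from both sides of the sequence. All sizes and weights below are arbitrary toy values, not the Stanford model.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_h = 4, 5

# Separate parameters for the forward and backward passes.
Wf, Uf = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))
Wb, Ub = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))

def birnn(seq):
    """One tanh RNN reads left-to-right, another right-to-left; each
    timestep's output concatenates the two hidden states."""
    T = len(seq)
    hf, hb = np.zeros(d_h), np.zeros(d_h)
    fwd, bwd = [None] * T, [None] * T
    for t in range(T):
        hf = np.tanh(Wf @ seq[t] + Uf @ hf)
        fwd[t] = hf
    for t in reversed(range(T)):
        hb = np.tanh(Wb @ seq[t] + Ub @ hb)
        bwd[t] = hb
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

seq = [rng.normal(size=d_in) for _ in range(6)]
out = birnn(seq)
print(len(out), out[0].shape)  # 6 (10,)
```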
QuEST 27 Feb 2015
This week it is our honor to have Prof George Cybenko from Dartmouth leading a discussion on his work related to the topic we have been discussing for the last several weeks – deep learning.
Deep Learning of Behaviors for Security
Abstract: Deep learning has generated much research and commercialization interest recently. In a way, it is the third incarnation of neural networks as pattern classifiers, using insightful algorithms and architectures that act as unsupervised auto-encoders which learn hierarchies of features in a dataset. After a short review of that work, we will discuss computational approaches for deep learning of behaviors as opposed to just static patterns. Our approach is based on structured non-negative matrix factorizations of matrices that encode observation frequencies of behaviors. Example security applications and covert channel detection and coding will be presented.
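The factorization machinery the abstract names can be illustrated with the basic (unstructured) multiplicative-update NMF of Lee and Seung; the structure Cybenko imposes on the factors is not reproduced here, and the behavior-frequency matrix below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

def nmf(V, k, iters=500, eps=1e-9):
    """Non-negative matrix factorization V ~ W @ H via the classic
    multiplicative updates, which preserve non-negativity of W, H."""
    n, m = V.shape
    W = rng.random((n, k))
    H = rng.random((k, m))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Rows = observed behaviors, columns = time windows; entries are
# observation frequencies, so everything is naturally non-negative.
V = rng.poisson(3.0, size=(12, 30)).astype(float)
W, H = nmf(V, k=4)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", round(err, 3))
```

The columns of W act as latent "behavior motifs" and H gives their activation over time, which is the sense in which a factorization of observation frequencies can encode behaviors rather than static patterns.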
If time allows, I’ve also asked our colleague Ox to present a quick overview of a math formalism that might be applicable to our need to measure similarity and potentially infer content (inference) in our instantiations of qualia in our QuEST agents.
We want to focus this week on the technical article that we discussed as a news story last week.
Google’s Brain-Inspired Software Describes What It Sees in Complex Images
Experimental Google software that can describe a complex scene could lead to better image search or apps to help the visually impaired. *** I would extend to say if a machine based agent can generate a more expansive ‘meaning’ of a stimulus image or video then the deliberation that can be accomplished by that agent potentially greatly increases in value **
- By Tom Simonite on November 18, 2014
Why It Matters
Computers are usually far worse than humans at interpreting complex information, but new techniques are making them better.
Experimental software from Google can accurately describe scenes in photos, like the two on the left. But it still makes mistakes, as seen with the two photos on the right.
Researchers at Google have created software that can use complete sentences to accurately describe scenes shown in photos—a significant advance in the field of computer vision. When shown a photo of a game of ultimate Frisbee, for example, the software responded with the description “A group of young people playing a game of frisbee.” The software can even count, giving answers such as “Two pizzas sitting on top of a stove top oven.”
The technical article on the topic:
Show and Tell: A Neural Image Caption Generator
- Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing.
- In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image.
- The model is trained to maximize the likelihood of the target description sentence given the training image
- Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions.
- Our model is often quite accurate, which we verify both qualitatively and quantitatively.
- For instance, while the current state-of-the-art BLEU score (the higher the better) on the Pascal dataset is 25, our approach yields 59, to be compared to human performance around 69.
- We also show BLEU score improvements on Flickr30k, from 55 to 66, and on SBU, from 19 to 27.
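Since BLEU scores anchor the results quoted above, here is a toy single-sentence version of the metric: clipped n-gram precisions combined by a geometric mean, times a brevity penalty. Reported scores like those above are corpus-level and typically use multiple references, so this sketch only conveys the mechanics.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Toy single-reference sentence BLEU (floor on zero precisions
    so the geometric mean stays defined)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum(min(c, ref[g]) for g, c in cand.items())  # clipping
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    # Brevity penalty: punish candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "a group of young people playing a game of frisbee".split()
print(round(bleu(ref, ref), 2))  # identical sentences score 1.0
print(round(bleu("a group of people playing".split(), ref), 2))
```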
We have spent considerable bandwidth defining meaning for the purpose of understanding what an artificially conscious agent might generate for the meaning of a stimulus that could complement current approaches to making intelligent machine-based agents. Those discussions led us back (along with the off-cycle discussions with the QuEST research students at AFIT and between Capt Amerika and Andres R.) to a discussion of what consciousness is and how we will know whether a machine (or, for that matter, a particular critter) is conscious. We have mentioned our think piece on ‘What Alan Turing meant to say,’ for example. To address this question in a different way, I propose we return to some previously discussed topics/articles.
First there is the IEEE Spectrum article from June 2008 by Koch / Tononi, ‘Can Machines Be Conscious? Yes – and a new Turing Test might prove it.’ In that article the authors conclude:
- Consciousness is part of the natural world. It depends, we believe, only on mathematics and logic and on the imperfectly known laws of physics, chemistry, and biology; it does not arise from some magical or otherworldly quality.
- That’s good news, because it means there’s no reason why consciousness can’t be reproduced in a machine—in theory, anyway.
They start by explaining what they believe consciousness does NOT require:
- Remarkably, consciousness does not seem to require many of the things we associate most deeply with being human: emotions, memory, self-reflection, language, sensing the world, and acting in it.
We want to discuss these points. They then adopt the approach championed by one of them, Tononi:
- To be conscious, then, you need to be a single integrated entity with a large repertoire of states.
- Let’s take this one step further: your level of consciousness has to do with how much integrated information you can generate.
- That’s why you have a higher level of consciousness than a tree frog or a supercomputer.
Whether we adopt the Tononi formalism or not I like the idea of the amount of integrated information being related to the level of consciousness. That resonates with many of our ideas. In my mind I map ‘integrated’ to situated. So the more of the contributing processes we can situate the more exformation can be generated and thus the more power such a representation can bring to deliberation.
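Tononi's actual Φ is defined over cause-effect repertoires and minimum-information partitions and is far more involved than anything shown here. Purely as an intuition pump for "integrated vs. reducible," the toy below uses ordinary mutual information to ask whether a two-unit system's joint state distribution can be described by its parts independently; the example distributions are invented.

```python
import math
from itertools import product

def mutual_information(joint):
    """Mutual information (bits) between the two halves of a joint
    distribution over pairs of binary variables."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    mi = 0.0
    for (x, y), p in joint.items():
        if p > 0:
            mi += p * math.log2(p / (px[x] * py[y]))
    return mi

# Coupled units: states are mostly aligned, so information is carried
# across the partition. Independent units: the joint factorizes.
coupled = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
independent = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}

print(round(mutual_information(coupled), 3))      # > 0: integrated
print(round(mutual_information(independent), 3))  # 0.0: reducible
```

A system whose parts carry no information about each other contributes nothing "integrated," which is the spirit of the repertoire-of-states point above, not its formal content.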
They then go on to define a revised Turing Test:
- One test would be to ask the machine to describe a scene in a way that efficiently differentiates the scene’s key features from the immense range of other possible scenes.
– Humans are fantastically good at this: presented with a photo, a painting, or a frame from a movie, a normal adult can describe what’s going on, no matter how bizarre or novel the image is.
One of the reasons I want to review this position is because of the: