
Archive for December, 2016

Weekly QuEST Discussion Topics and News, 9 Dec

December 8, 2016

QuEST 9 Dec 2016:

This time of year we traditionally review all the topics from the year in an attempt to capture the big lessons and incorporate them into the Kabrisky lecture. The first QuEST meeting of any calendar year (it will be 6 Jan 2017) is the Kabrisky memorial lecture, where we capture the current answer to ‘What is QuEST?’ in honor of our late esteemed colleague Prof. Matthew Kabrisky.

But this year we also have the task, before the end of the calendar year, of capturing the answers to the questions that would lead to a ‘funded’ effort (either inside or outside the government) to build a QuEST agent (a conscious computer).

  • I will be formulating the ‘pitch’ before the end of the year – the pitch has to answer:

–     What is it we suggest?

–     How will we do what we suggest?

–     Why could we do this now?

–     Why are we the right people to do it – what in our approach is new/different?

–     What will be the result – if successful what will be different? 

–     How long will it take and what will it cost? 

–     What are our mid-term and final exams that will tell us/others we are proceeding successfully?

Taking the second topic first: this week I will continue giving my current ‘what’ answer and a first cut at the ‘how’/‘why now’ answers for the QuEST pitch.  The ‘what’ answer is wrapped around the idea of making a conscious computer (one that is emotionally intelligent and can increase the emotional intelligence of its human partners), as that is the key to group intelligence.  Last week I attempted to capture the ‘what’/‘how’ but focused on the QuEST agent having a dual-process architecture and thus generating emotional intelligence with respect to its own subconscious calculations.  This week we will focus on how to make the QuEST agent have emotional intelligence with respect to the subconscious calculations of the human partner who is attempting to make higher-quality decisions.

The ‘how’ last week was wrapped around generating a gist of the representation that the machine agent builds – for example, the representation a deep learning agent converges to in a given application area.  The idea is that deep learning (in fact all big-data approaches) extracts and memorizes at far too high a resolution to respond robustly to irrelevant variations in the stimuli; we therefore posit that via unsupervised processing of the representation used to do the output classification we will generate ‘gists’.  The idea is to use the ‘gists’ of those representation vectors to provide a lower-bit view of what is necessary to get acceptable accuracy.  My idea for the how is that this new ‘gists’ vocabulary (~ qualia) can be used as the vocabulary for a simulation (either GAN- or RL-based) to complement the higher-resolution current deep learning answers.  The challenge will then be to appropriately blend the two.  An alternative to blending is to use the qualia in a single-pass cognitive system in which the bottom-up, data-evoked activations are replaced by some set of the ‘imagined’ qualia.
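As a concrete illustration of the gist idea, here is a minimal sketch: take the high-dimensional representation vectors a trained deep network produces (penultimate-layer activations, say) and cluster them without supervision into a small discrete vocabulary. Everything below is illustrative – the activation array is a random stand-in and the cluster count is arbitrary – but it shows the kind of low-bit re-encoding we have in mind.

```python
# Minimal sketch of the 'gist' idea: compress a deep network's
# high-resolution representation vectors into a small discrete
# vocabulary via unsupervised clustering (k-means here).
# `features` is assumed to hold penultimate-layer activations for a
# set of examples; here it is a random placeholder.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 512))   # placeholder for real activations

n_gists = 32                              # size of the low-bit 'gist' vocabulary
kmeans = KMeans(n_clusters=n_gists, n_init=10, random_state=0).fit(features)

# Each example is now summarized by a single gist index (5 bits)
# instead of a 512-dimensional float vector.
gist_ids = kmeans.predict(features)

# A downstream simulation (GAN- or RL-based) could operate over
# sequences of gist_ids rather than raw activations.
print(gist_ids[:10])
```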

Let’s assume that instead of just using the action/behavior of the human we take some text input (via speech or typing).  To be clear: the human consumes the output of the machine learning solution in some space, for example a recommender system.  The human then does something – buys, watches, reads, or clicks to another option.  Most machine learning recommender systems use this action and attempt to find correlations in the actions, and thus capture a model of user responses that can be used later.  Now, instead of just monitoring what the human did in response to the recommendation, we also gather some information about how they felt, via analysis of the text (either typed or spoken to the system, or of course, if available, any human state sensing means).  Now we have a set of words/measurements that we can use to extract a set of emotional states for a model of the human that can be used by the QuEST agent.
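A minimal sketch of what that could look like in code follows. The sentiment lexicon, the UserModel fields, and the update rule are all assumptions made for illustration – they are not part of any existing QuEST or recommender codebase – but they show how an emotional-state estimate extracted from free text could sit alongside the usual click/action history.

```python
# Minimal sketch: augment a recommender's user model with an
# emotional-state estimate extracted from the user's free text.
# The lexicon and the UserModel fields are illustrative assumptions.
from dataclasses import dataclass, field

POSITIVE = {"love", "great", "helpful", "thanks", "perfect"}
NEGATIVE = {"hate", "useless", "wrong", "annoying", "confusing"}

def emotion_score(text: str) -> float:
    """Crude valence estimate in [-1, 1] from word counts."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

@dataclass
class UserModel:
    clicks: list = field(default_factory=list)   # what the user did
    valence: float = 0.0                          # how the user felt (running average)

    def update(self, action: str, feedback_text: str, alpha: float = 0.3):
        self.clicks.append(action)
        self.valence = (1 - alpha) * self.valence + alpha * emotion_score(feedback_text)

user = UserModel()
user.update("watched_item_42", "thanks, that was a great pick")
user.update("skipped_item_43", "this recommendation was useless")
print(user.valence, user.clicks)
```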

I look forward to the discussion on this view of the what and the how.  The ‘why now’ part of the pitch is centered around the spectacular recent breakthroughs in deep learning.  There are many applications – any time a human uses a recommendation from a computer and needs to understand that recommendation so it can be used appropriately, AND the recommender can be improved by a better understanding of how the human felt about the prior recommendations.

So that leads to some definable steps – what to do, how specifically to proceed, how much it will cost, and how long it will take – we have those in hand now – it’s been a great week!  Next week Scott and I will be discussing this in NY with potential collaborators.  More to come on that – but for now assume there will NOT be a QuEST meeting on the 16th of Dec.

On the first topic, reviewing the material we covered this calendar year that should be considered for inclusion in the Kabrisky lecture series, I will briefly remind everyone of the major topics we hit early this calendar year that may need to be included in our Kabrisky lecture.

In March we hit the dynamic memory networks of MetaMind:

http://www.nytimes.com/2016/03/07/technology/taking-baby-steps-toward-software-that-reasons-like-humans.html?_r=0

Taking Baby Steps Toward Software That Reasons Like Humans

Bits

By JOHN MARKOFF MARCH 6, 2016

Richard Socher, founder and chief executive of MetaMind, a start-up developing artificial intelligence software. Credit Jim Wilson/The New York Times

Richard Socher appeared nervous as he waited for his artificial intelligence program to answer a simple question: “Is the tennis player wearing a cap?”

The word “processing” lingered on his laptop’s display for what felt like an eternity. Then the program offered the answer a human might have given instantly: “Yes.”

Mr. Socher, who clenched his fist to celebrate his small victory, is the founder of one of a torrent of Silicon Valley start-ups intent on pushing variations of a new generation of pattern recognition software, which, when combined with increasingly vast sets of data, is revitalizing the field of artificial intelligence.

His company MetaMind, which is in crowded offices just off the Stanford University campus in Palo Alto, Calif., was founded in 2014 with $8 million in financial backing from Marc Benioff, chief executive of the business software company Salesforce, and the venture capitalist Vinod Khosla.

MetaMind is now focusing on one of the most daunting challenges facing A.I. software. Computers are already on their way to identifying objects in digital images or converting sounds uttered by human voices into natural language. But the field of artificial intelligence has largely stumbled in giving computers the ability to reason in ways that mimic human thought.

Now a variety of machine intelligence software approaches known as “deep learning” or “deep neural nets” are taking baby steps toward solving problems like a human.

On Sunday, MetaMind published a paper describing advances its researchers have made in creating software capable of answering questions about the contents of both textual documents and digital images.

The new research is intriguing because it indicates that steady progress is being made toward “conversational” agents that can interact with humans. The MetaMind results also underscore how far researchers have to go to match human capabilities.

Other groups have previously made progress on discrete problems, but generalized systems that approach human levels of understanding and reasoning have not been developed.

Five years ago, IBM’s Watson system demonstrated that it was possible to outperform humans on “Jeopardy!”

Last year, Microsoft developed a “chatbot” program known as Xiaoice (pronounced Shao-ice) that is designed to engage humans in extended conversation on a diverse set of general topics.

To add to Xiaoice’s ability to offer realistic replies, the company developed a huge library of human question-and-answer interactions mined from social media sites in China. This made it possible for the program to respond convincingly to typed questions or statements from users.

In 2014, computer scientists at Google, Stanford and other research groups made significant advances in what is described as “scene understanding,” the ability to understand and describe a scene or picture in natural language, by combining the output of different types of deep neural net programs.

These programs were trained on images that humans had previously described. The approach made it possible for the software to examine a new image and describe it with a natural-language sentence.

While even machine vision is not yet a solved problem, steady, if incremental, progress continues to be made by start-ups like Mr. Socher’s; giant technology companies such as Facebook, Microsoft and Google; and dozens of research groups.

In their recent paper, the MetaMind researchers argue that the company’s approach, known as a dynamic memory network, holds out the possibility of simultaneously processing inputs including sound, sight and text. ** fusion **

The design of MetaMind software is evidence that neural network software technologies are becoming more sophisticated, in this case by adding the ability both to remember a sequence of statements and to focus on portions of an image. For example, a question like “What is the pattern on the cat’s fur on its tail?” might yield the answer “stripes” and show that the program had focused only on the cat’s tail to arrive at its answer.

“Another step toward really understanding images is, are you actually able to answer questions that have a right or wrong answer?” Mr. Socher said.

MetaMind is using the technology for commercial applications like automated customer support, he said. For example, insurance companies have asked if the MetaMind technology could respond to an email with an attached photo — perhaps of damage to a car or other property — he said.

There are two papers that we will use for the technical detail:

Ask Me Anything: Dynamic Memory Networks for Natural Language Processing:

  • Most tasks in natural language processing can be cast into question answering (QA) problems over language input.  ** way we cast QuEST  Query response**
  • We introduce the dynamic memory network (DMN), a unified neural network framework which processes input sequences and questions, forms semantic and episodic memories, and generates relevant answers.
  • The DMN can be trained end-to-end and obtains state of the art results on several types of tasks and datasets:
  • question answering (Facebook’s bAbI dataset),
  • sequence modeling for part of speech tagging (WSJ-PTB),
  • and text classification for sentiment analysis (Stanford Sentiment Treebank).
  • The model relies exclusively on trained word vector representations and requires no string matching or manually engineered features.
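To make the episodic-memory idea concrete, here is a highly simplified sketch of the attention-over-facts loop: repeated passes over ‘fact’ vectors, conditioned on a question vector, each pass updating a memory. The vectors are random placeholders and the update rule is a simple blend standing in for the GRU the paper actually uses – see the paper for the real architecture.

```python
# Highly simplified sketch of the dynamic memory network idea:
# repeated attention passes over "fact" vectors, conditioned on a
# question vector, updating an episodic memory each pass.
# Vectors here are random placeholders; the real DMN uses GRUs and
# learned word embeddings.
import numpy as np

rng = np.random.default_rng(0)
d = 16
facts = rng.normal(size=(8, d))      # one vector per input sentence
question = rng.normal(size=d)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

memory = question.copy()
for episode in range(3):                       # multiple passes over the input
    # attention: how relevant is each fact to the question and current memory?
    scores = facts @ question + facts @ memory
    weights = softmax(scores)
    episode_summary = weights @ facts          # attention-weighted reading
    # memory update (the paper uses a GRU here; a simple blend stands in)
    memory = 0.5 * memory + 0.5 * episode_summary

# an answer module would decode `memory` into words
print(memory)
```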

 

The second paper:

Dynamic Memory Networks for Visual and Textual Question Answering
Xiong, Merity, Socher – arXiv:1603.01417v1 [cs.NE] 4 Mar 2016

  • Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering.
  • One such architecture, the dynamic memory network (DMN), obtained high accuracy on a variety of language tasks.

–     However, it was not shown whether the architecture achieves strong results for question answering when supporting facts are not marked during training or whether it could be applied to other modalities such as images.

–     Based on an analysis of the DMN, we propose several improvements to its memory and input modules.

–     Together with these changes we introduce a novel input module for images in order to be able to answer visual questions.

–     Our new DMN+ model improves the state of the art on both the

  • Visual Question Answering dataset and
  • the bAbI-10k text question-answering dataset without supporting fact supervision.
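The visual input module can be sketched in the same spirit: treat the cells of a convolutional feature map as a sequence of region ‘facts’ so the same episodic memory that attends over sentences can attend over image regions. The feature map and projection below are random placeholders, not the trained DMN+ components.

```python
# Sketch of the DMN+ notion of an image input module: treat local
# CNN feature-map cells as a sequence of "facts" so the episodic
# memory used for text can attend over image regions.
# The feature map here is a random placeholder for real CNN output.
import numpy as np

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(14, 14, 512))   # e.g. a conv-layer output
d = 16

# flatten the 14x14 grid into 196 region vectors, then project them
regions = feature_map.reshape(-1, 512)
projection = rng.normal(size=(512, d)) * 0.01  # a learned matrix in the real model
image_facts = regions @ projection             # shape (196, d)

# `image_facts` can now be fed to the same attention/memory loop
# sketched above for textual facts.
print(image_facts.shape)
```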

 

 

We then hit a discussion about the unexpected query – specifically ‘zero-shot learning’ – using an article by Socher / Manning / Ng from NIPS 2013:

Zero-Shot Learning Through Cross-Modal Transfer
Richard Socher, Milind Ganjoo, Christopher D. Manning, Andrew Y. Ng

  • This work introduces a model that can recognize objects in images even if no training data is available for the object class.
  • The only necessary knowledge about unseen visual categories comes from unsupervised text corpora.

This is related to the question of the unexpected query – but unexpected with respect to the image classification system, not the word/text processing system – so it is a sort of transfer learning issue: transfer between systems.

  • Unlike previous zero-shot learning models, which can only differentiate between unseen classes, our model can operate on a mixture of seen and unseen classes, simultaneously obtaining state of the art performance on classes with thousands of training images and reasonable performance on unseen classes.
  • This is achieved by seeing the distributions of words in texts as a semantic space for understanding what objects look like.
  • Our deep learning model does not require any manually defined semantic or visual features for either words or images.
  • Images are mapped to be close to semantic word vectors corresponding to their classes, and the resulting image embeddings can be used to distinguish whether an image is of a seen or unseen class.
  • We then use novelty detection methods to differentiate unseen classes from seen classes.
  • We demonstrate two novelty detection strategies;
  • the first gives high accuracy on unseen classes,
  • while the second is conservative in its prediction of novelty and keeps the seen classes’ accuracy high.
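A toy sketch of the cross-modal mapping may help: project an image feature into the word-vector space, label it by the nearest class word vector, and use distance from the seen-class embeddings as a crude novelty test. All vectors and the threshold below are random or illustrative stand-ins for the learned quantities in the paper.

```python
# Minimal sketch of the zero-shot idea: map image features into the
# word-vector space, then label an image by its nearest class word
# vector; a distance threshold acts as a crude novelty detector.
# All vectors below are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
d_img, d_word = 64, 50
word_vectors = {"cat": rng.normal(size=d_word),
                "dog": rng.normal(size=d_word),
                "truck": rng.normal(size=d_word)}   # includes an unseen class

W = rng.normal(size=(d_img, d_word)) * 0.1           # learned image->word map
image_feature = rng.normal(size=d_img)
projected = image_feature @ W

# nearest class in semantic space
distances = {c: np.linalg.norm(projected - v) for c, v in word_vectors.items()}
label = min(distances, key=distances.get)

# novelty detection: if the image is far from all *seen* class embeddings,
# hand it to the zero-shot branch instead of the standard classifier
seen_classes = {"cat", "dog"}
is_novel = min(distances[c] for c in seen_classes) > 8.0   # threshold is illustrative
print(label, is_novel)
```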

Then there was our diving into generative adversarial networks:

UNSUPERVISED REPRESENTATION LEARNING WITH DEEP CONVOLUTIONAL
GENERATIVE ADVERSARIAL NETWORKS
Alec Radford & Luke Metz
indico Research
Boston, MA
{alec,luke}@indico.io
Soumith Chintala
Facebook AI Research

  • In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.
  • Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.
  • We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.
  • Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.
  • Additionally, we use the learned features for novel tasks – demonstrating their applicability as general image representations.
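For reference, the adversarial training loop itself is simple enough to sketch. The toy below trains a small fully connected generator/discriminator pair on a 1-D Gaussian – it illustrates the training dynamic of a GAN, not the convolutional DCGAN architecture of the paper – and all sizes and learning rates are arbitrary choices for illustration (assumes PyTorch is available).

```python
# Conceptual sketch of adversarial training (generator vs. discriminator)
# on a toy 1-D Gaussian, not the convolutional DCGAN architecture itself.
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))   # noise -> sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0          # "real" data: N(3, 0.5)
    fake = G(torch.randn(64, 8))

    # discriminator step: tell real from generated
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # generator step: try to fool the discriminator
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()

# draw a few samples from the trained generator
print(G(torch.randn(5, 8)).detach().squeeze())
```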

So again the QuEST interest here: imagine we use the generative model and use the data it generates, not just the weights – that is, imagine that our previous idea of a conscious system that is separate from the subconscious system is wrong – imagine one system, but with the processes that populate the sensory bottom-up (BU) paths being what we call conscious and subconscious –

Imagine that even as early as the visual cortex much of the content is inferred and not measured by the visual sensing (the eyes).  This seems to me to be testable: electrode studies could confirm or refute the idea that much of what is present even early in the visual chain of processing is inferred versus captured by the eyes.  This could account for the 10:1 ratio of feedback versus feedforward connections.

Here is the implication: we take Bernard’s generative models and have them generate additional information (competing with the bottom-up sensory data for populating the agent’s world model), and then the winning populated solution gets processed by a bottom-up, experience-based deep learning solution.

 

Note that ‘blending’ is now only the competition between the top-down imagined information and the bottom-up sensory data – the cognition is all in the bottom-up processing of the resulting world model.
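One way to picture this ‘blending as competition’ is the toy sketch below: per feature, the more confident of the two sources (bottom-up sensory evidence vs. top-down imagined content) wins the slot in the world model, and only the winning populated model is handed to a bottom-up recognizer. The confidences, vectors, and recognizer are placeholders for the generative and discriminative models discussed above.

```python
# Illustrative sketch of 'blending as competition': top-down imagined
# content and bottom-up sensory data compete to populate a single
# world model; only the winning populated model is passed to the
# bottom-up (experience-based) recognizer. Everything is a placeholder.
import numpy as np

rng = np.random.default_rng(0)
d = 32
sensory = rng.normal(size=d)                 # bottom-up evidence
sensory_conf = rng.uniform(0.3, 1.0, size=d) # per-feature confidence (e.g. SNR)

imagined = rng.normal(size=d)                # top-down generative prediction
imagined_conf = rng.uniform(0.3, 1.0, size=d)

# competition: each slot of the world model is filled by the more
# confident source rather than averaged between them
world_model = np.where(sensory_conf >= imagined_conf, sensory, imagined)

def bottom_up_recognizer(x):
    """Stand-in for a trained deep network consuming the world model."""
    return int(x.sum() > 0)

print(bottom_up_recognizer(world_model))
```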

 

In May we hit:

One topic I want to remind people of: I’m extremely interested in applying QuEST ideas to social and medical issues – specifically, what to do about inner-city violence and how to do predictive intelligence (e.g., for predicting shock onset).  One article I will post for potential discussion is:

Am J Community Psychol (2009) 44:273–286

DOI 10.1007/s10464-009-9268-2

Researching a Local Heroin Market as a Complex Adaptive System

Lee D. Hoffer, Georgiy Bobashev, Robert J. Morris

Abstract This project applies agent-based modeling (ABM) techniques to better understand the operation, organization, and structure of a local heroin market. The simulation detailed was developed using data from an 18- month ethnographic case study. The original research, collected in Denver, CO during the 1990s, represents the historic account of users and dealers who operated in the Larimer area heroin market. Working together, the authors studied the behaviors of customers, private dealers, streetsellers, brokers, and the police, reflecting the core elements pertaining to how the market operated. After evaluating the logical consistency between the data and agent behaviors, simulations scaled-up interactions to observe their aggregated outcomes. While the concept and findings from this study remain experimental, these methods represent a novel way in which to understand illicit drug markets and the dynamic adaptations and outcomes they generate. Extensions of this research perspective, as well as its strengths and limitations, are discussed.
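For those unfamiliar with agent-based modeling, the mechanics can be sketched in a few lines: individual agents follow simple local rules and the market-level behavior emerges from their interactions. The toy below (customers and dealers with random matching) is only an illustration of the ABM technique – it is not the authors’ Larimer-area model, which was grounded in 18 months of ethnographic data.

```python
# Minimal agent-based sketch of a market simulation: customers seek a
# dealer each tick, transactions deplete supply and cash, and
# aggregate sales emerge from individual interactions.
# A toy illustration of the ABM technique, not the authors' model.
import random

random.seed(0)

class Customer:
    def __init__(self):
        self.cash = 100

class Dealer:
    def __init__(self):
        self.supply = 50
        self.sales = 0

customers = [Customer() for _ in range(30)]
dealers = [Dealer() for _ in range(5)]

for tick in range(100):
    for c in customers:
        if c.cash >= 10:
            d = random.choice(dealers)          # broker/street-seller matching omitted
            if d.supply > 0:
                d.supply -= 1
                d.sales += 1
                c.cash -= 10

print([d.sales for d in dealers])               # emergent per-dealer outcomes
```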

And also the work of our colleague Sandy V:

A Novel Machine Learning Classifier Based on a Qualia Modeling Agent (QMA)

 

This dissertation addresses a problem found in standard machine learning (ML) supervised classifiers, that the target variable, i.e., the variable a classifier predicts, has to be identified before training begins and cannot change during training and testing. This research develops a computational agent, which overcomes this problem.

 

The Qualia Modeling Agent (QMA) is modeled after two cognitive theories:

Stanovich’s tripartite framework, which proposes learning results from interactions between conscious and unconscious processes; and, the Integrated Information Theory (IIT) of Consciousness, which proposes that the fundamental structural elements of consciousness are qualia.

 

By modeling the informational relationships of qualia, the QMA allows for retaining and reasoning over data sets in a non-ontological, non-hierarchical qualia space (QS). This novel computational approach supports concept drift, by allowing the target variable to change ad infinitum without re-training, resulting in a novel Transfer Learning (TL) methodology, while achieving classification accuracy comparable to or greater than benchmark classifiers. Additionally, the research produced a functioning model of Stanovich’s framework, and a computationally tractable working solution for a representation of qualia, which when exposed to new examples is able to match the causal structure and generate new inferences.
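One property claimed above – that the target variable can change at query time without re-training – can be illustrated with a toy that is much simpler than the QMA itself: retain whole examples rather than a model tied to one label, and predict any chosen attribute from the others on demand. The nearest-neighbor store below is only an analogy for that property; it is not the qualia-space representation the dissertation develops.

```python
# Toy illustration of one QMA property described above: the target
# variable can change at query time without re-training, because whole
# examples are retained rather than a model tied to one label.
import numpy as np

rng = np.random.default_rng(0)
# columns are arbitrary attributes; all are stored alike
data = rng.normal(size=(200, 5))

def classify(store, query, target_col, k=5):
    """Predict any chosen column from the others, no retraining needed."""
    other = [i for i in range(store.shape[1]) if i != target_col]
    dists = np.linalg.norm(store[:, other] - query[other], axis=1)
    nearest = np.argsort(dists)[:k]
    return store[nearest, target_col].mean()

query = rng.normal(size=5)
print(classify(data, query, target_col=3))   # predict column 3 today...
print(classify(data, query, target_col=4))   # ...and column 4 tomorrow
```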

news-summary-36


Weekly QuEST Discussion Topics and News, 2 Dec

December 1, 2016

QuEST 2 Dec 2016:

Sorry it has been a while – we have a lot to do.  This time of year we traditionally review all the topics from the year in an attempt to capture the big lessons and incorporate them into the Kabrisky lecture.  The first QuEST meeting of any calendar year (it will be 6 Jan 2017) is the Kabrisky memorial lecture, where we capture the current answer to ‘What is QuEST?’ in honor of our late esteemed colleague Prof. Matthew Kabrisky.  But this year we also have the task, before the end of the calendar year, of capturing the answers to the questions that would lead to a ‘funded’ effort (either inside or outside the government) to build a QuEST agent (a conscious computer).

  • I will be formulating the ‘pitch’ before the end of the year – the pitch has to answer:

–     What is it we suggest?

–     How will we do what we suggest?

–     Why could we do this now?

–     Why are we the right people to do it – what in our approach is new/different?

–     What will be the result – if successful what will be different? 

–     How long will it take and what will it cost? 

–     What are our mid-term and final exams that will tell us/others we are proceeding successfully?

Each task (the pitch, and reviewing the material we covered during calendar year 2016) is individually daunting in the short time left with the holidays approaching – but we will boldly go where we have never gone and do both in the remaining weeks of the calendar year.

Taking the second topic first: this week I will give my current ‘what’ answer and a first cut at the ‘how’/‘why now’ answers for the QuEST pitch.  The ‘what’ answer is wrapped around the idea of making a conscious computer (one that is emotionally intelligent and can increase the emotional intelligence of its human partners), as that is the key to group intelligence.

The ‘how’ answer is wrapped around generating a gist of the representation that deep learning converges to in a given application area.  The idea is that deep learning (in fact all big-data approaches) extracts and memorizes at far too high a resolution to respond robustly to irrelevant variations in the stimuli; via unsupervised processing of the representation used to do the output classification we will generate ‘gists’.  The idea is to use the ‘gists’ of those representation vectors to provide a lower-bit view of what is necessary to get acceptable accuracy.  My idea for the how is that this new ‘gists’ vocabulary (~ qualia) can be used as the vocabulary for a simulation (either GAN- or RL-based) to complement the higher-resolution current deep learning answers.  The challenge will then be to appropriately blend the two.  I look forward to the discussion on this view of the what and the how.  The ‘why now’ part of the pitch is centered around the spectacular recent breakthroughs in deep learning.

On the first topic, reviewing the material we covered this calendar year that should be considered for inclusion in the Kabrisky lecture series, I will briefly remind everyone of the major topics we hit early this calendar year that may need to be included in our Kabrisky lecture.

For example, this year we covered the DeepMind breakthrough in January.  We might go back over its implications and how it was accomplished.  This came up several times during the year, so instead of sticking with a linear (chronologically faithful) approach to the review we will attempt to hit all the related topics we touched throughout the year.  On DeepMind we started with the Atari material:

DeepMind article on Deep Reinforcement Learning.  arXiv:1312.5602v1 [cs.LG] 19 Dec 2013.  Playing Atari with Deep Reinforcement Learning:

 

Abstract:  We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
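The core of that result is the Q-learning update, which is compact enough to sketch. Below, a small fully connected network stands in for the convolutional network over stacked Atari frames, and the transitions are random placeholders for a replay-memory batch; only the Bellman-target regression is the point (assumes PyTorch is available).

```python
# Minimal sketch of the Q-learning update at the heart of the DQN work:
# the network predicts action values from (here random) state tensors
# and is regressed toward the Bellman target r + gamma * max_a' Q(s', a').
# Shapes and hyper-parameters are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_actions, gamma = 4, 0.99
q_net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.RMSprop(q_net.parameters(), lr=1e-3)

# one batch standing in for samples from a replay memory
state = torch.randn(32, 16)
action = torch.randint(0, n_actions, (32,))
reward = torch.randn(32)
next_state = torch.randn(32, 16)

with torch.no_grad():
    target = reward + gamma * q_net(next_state).max(dim=1).values

q_sa = q_net(state).gather(1, action.unsqueeze(1)).squeeze(1)
loss = nn.functional.mse_loss(q_sa, target)

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())
```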

Later in the year we hit the changes necessary for the AlphaGo effort:

http://www.bloomberg.com/news/articles/2016-01-27/google-computers-defeat-human-players-at-2-500-year-old-board-game

Google Computers Defeat Human Players at 2,500-Year-Old Board Game

The seemingly uncrackable Chinese game Go has finally met its match: a machine.

Jack Clark

January 27, 2016 — 1:00 PM EST

Computers have learned to master backgammon, chess, and Atari’s Breakout, but one game has always eluded them. It’s a Chinese board game called Go invented more than 2,500 years ago. The artificial-intelligence challenge has piqued the interest of researchers at Google and Facebook, and the search giant has recently made a breakthrough.

Google has developed the first AI software that learns to play Go and is able to beat some professional human players, according to an article to be published Wednesday in the science journal Nature. Google DeepMind, the London research group behind the project, is now getting the software ready for a competition in Seoul against the world’s best Go player in March.

The event harks back to the highly publicized chess match in 1996 when IBM’s Deep Blue computer defeated the world chess champion. However, Go is a much more complex game. It typically consists of a 19-by-19-square board, where players attempt to capture empty areas and surround an opponent’s pieces. Whereas chess offers some 20 possible choices per move, Go has about 200, said Demis Hassabis, co-founder of Google DeepMind. “There’s still a lot of uncertainty over this match, whether we win,” he said. IBM demonstrated the phenomenal processing power available to modern computers. DeepMind should highlight how these phenomenally powerful machines are beginning to think in a more human way.

Also from DeepMind, in this week’s QuEST news we provide a story that discusses a topic we’ve been pursuing – dreaming as part of the solution to more robust performance:

Google’s DeepMind AI gives robots the ability to dream

Following in the wake of recent neuroscientific discoveries revealing the importance of dreams for memory consolidation, Google’s AI company DeepMind is pioneering a new technology which allows robots to dream in order to improve their rate of learning.  Not surprisingly given the company behind the project, the substance of these AI dreams consists primarily of scenes from Atari video games. DeepMind’s earliest success involved teaching AI to play ancient video games like Breakout and Asteroids.  But the end game here is for robots to dream about much the same things humans do – challenging real-world situations that play important roles in learning and memory formation.

To understand the importance of dreaming for robots, it’s useful to understand how dreams function in mammalian minds such as our own (assuming the ET readership doesn’t include any aliens eavesdropping on the tech journalism scene). One of the primary discoveries scientists made when seeking to understand the role of dreams from a neuroscientific perspective was that the content of dreams is primarily negative or threatening.  Try keeping a dream journal for a month and you will likely find your dreams consist inordinately of threatening or awkward situations. It turns out the age-old nightmare of turning up to school naked is the rule rather than the exception when it comes to dreams. Such inordinate negative content makes little sense until viewed through the lens of neuroscience. One of the leading theories from this field posits that dreams strengthen the neuronal traces of recent events. It could be that negative or threatening feelings encountered in the dream help to lodge memories deeper into the brain, thereby enhancing memory formation.  DeepMind is using dreams in a parallel fashion, accelerating the rate at which an AI learns by focusing on the negative or challenging content of a situation within a game.

So what might a challenging situation look like for a robot? At the moment the world’s most sophisticated AIs are just cutting their teeth on more sophisticated video games like StarCraft II and Labyrinth, so a threatening situation might consist of a particularly challenging boss opponent, or a tricky section of a maze. Rather than pointlessly rehearsing entire sections of the game that have little bearing on the player’s overall score, “dreams” allow the AI to highlight certain sections of the game that are especially challenging and repeat them ad nauseam until expertise is achieved.  Using this technique, the researchers at DeepMind were able to achieve an impressive 10x speed increase in the rate of learning.

A snapshot of the method published by the DeepMind researchers to enable AI “dreams”. Image courtesy of Deepmind.
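The mechanism described reads like a form of prioritized replay: store what the agent experienced, and replay the surprising or difficult moments more often. The toy below samples stored transitions in proportion to a recorded error term – a rough analogy to the published method, with all names and numbers invented for illustration.

```python
# Toy sketch of replaying difficult experiences more often, in the
# spirit of the 'dreaming' described above: stored transitions are
# sampled with probability proportional to how surprising they were
# (their temporal-difference error at storage time). An analogy only.
import numpy as np

rng = np.random.default_rng(0)

class ReplayMemory:
    def __init__(self):
        self.transitions = []     # whatever the agent experienced
        self.priorities = []      # e.g. |TD error| recorded at storage time

    def add(self, transition, td_error):
        self.transitions.append(transition)
        self.priorities.append(abs(td_error) + 1e-3)   # small floor so nothing is ignored

    def sample(self, k):
        p = np.array(self.priorities)
        p = p / p.sum()
        idx = rng.choice(len(self.transitions), size=k, p=p)
        return [self.transitions[i] for i in idx]

memory = ReplayMemory()
for i in range(100):
    # challenging moments (large error) get stored with high priority
    memory.add(transition=f"step_{i}", td_error=rng.normal())
print(memory.sample(5))
```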

So we want to hit all this material from the perspective of its implications in the ‘how’ for building a conscious computer.

Another topic we hit early in 2016 was deep compositional captioning:

I’ve been poking around in the DCC paper that Andres sent us the link to (arXiv:1511.05284v1 [cs.CV] 17 Nov 2015 )–

  • I had great hope that it would give me insight into our fundamental challenge – the unexpected query – and it may have, but as far as I can currently tell (I intend to spend more time digging through it) they point out that current caption systems just spit back previously learned associations (image–caption pairs).  When you don’t have something in your training set (image–caption pairs) that can account for the meaning of a test image or video snippet, you lose, because the system will give you its best previously experienced linguistic expression from the image–caption pair data!

The major implication from this material is where the state of the art is on the unexpected query.

We also hit hypnosis in February:

This popular representation bears little resemblance to actual hypnotism, of course. In fact, modern understanding of hypnosis contradicts this conception on several key points. Subjects in a hypnotic trance are not slaves to their “masters” — they have absolute free will. And they’re not really in a semi-sleep state — they’re actually hyperattentive. *** My suspicion is that this is really associated with the sensitivity to suggestions as if they are real sensory data and true – my hypothesis is that the hypnotist is providing input to the subject’s sys1 – facts as if they are true – and the subject then forms a narrative to make them corroborated/confirmed ***

 

A twist to maybe discuss this week: ‘using our knowledge/QuEST model of how hypnosis works as a unique twist on human-computer collaboration’.  The idea is that we’ve proposed QuEST agents could be better ‘wingman’ solutions since they will be constructed with a two-system approach (a subconscious and an artificial conscious).  The key to having them ‘hyperalign’ in both directions (agent to the human and human to the agent) is to use the lessons from our view of hypnosis.  This could overcome the current bandwidth limit, where human-computer interfaces are all designed to work only through conscious manipulation of the interfaces.  The idea is to let the human directly impact the computer’s subconscious as well as the conscious interface connection, and similarly in reverse (this will be very controversial – to in some sense hypnotize the human partner to facilitate directly connecting to the human’s subconscious).

 

Another topic we hit early in the year was RNNs:

The Neural Network That Remembers

With short-term memory, recurrent neural networks gain some amazing abilities

Bottom’s Up: A standard feed-forward network has the input at the bottom. The base layer feeds into a hidden layer, which in turn feeds into the output.

 

Loop the Loop: A recurrent neural network includes connections between neurons in the hidden layer [yellow arrows], some of which feed back on themselves.

Time After Time: The added connections in the hidden layer link one time step with the next, which is seen more clearly when the network is “unfolded” in time.
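The recurrence those figures describe fits in a few lines of code: at each time step the hidden layer receives both the current input and its own previous state, reusing the same weights – which is exactly what ‘unfolding in time’ makes explicit. Weights and inputs below are random placeholders.

```python
# Minimal numpy sketch of the recurrence described above: the hidden
# layer at each time step receives both the current input and its own
# previous state (short-term memory), reusing the same weights.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 8, 16
W_xh = rng.normal(size=(d_in, d_hid)) * 0.1   # input -> hidden
W_hh = rng.normal(size=(d_hid, d_hid)) * 0.1  # hidden -> hidden (the loop)
inputs = rng.normal(size=(5, d_in))           # a sequence of 5 time steps

h = np.zeros(d_hid)                           # hidden state carried across steps
for x_t in inputs:
    h = np.tanh(x_t @ W_xh + h @ W_hh)        # same weights reused every step

print(h)
```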

 

Again we want to ensure we have captured the implications for the QuEST conscious computer.  That led us to:

In March we hit the dynamic memory networks of MetaMind:

The Dynamic Memory Network out of MetaMind will be discussed.  Although we started this discussion two weeks ago, the importance of their effort warrants a more in-depth consideration of its implications for QuEST.

http://www.nytimes.com/2016/03/07/technology/taking-baby-steps-toward-software-that-reasons-like-humans.html?_r=0

Taking Baby Steps Toward Software That Reasons Like Humans

Bits

By JOHN MARKOFF MARCH 6, 2016

(The full excerpt of this article is quoted in the 9 Dec entry above.)

news-summary-35
