
Weekly QuEST Discussion Topics and News, 20 Feb

We want to focus this week on the technical article that we discussed as a news story last week.


Google’s Brain-Inspired Software Describes What It Sees in Complex Images

Experimental Google software that can describe a complex scene could lead to better image search or apps to help the visually impaired.  *** I would extend this: if a machine-based agent can generate a more expansive ‘meaning’ of a stimulus image or video, then the deliberation that agent can accomplish potentially increases greatly in value. ***

Why It Matters

Computers are usually far worse than humans at interpreting complex information, but new techniques are making them better.

Experimental software from Google can accurately describe scenes in photos, like the two on the left. But it still makes mistakes, as seen with the two photos on the right.

Researchers at Google have created software that can use complete sentences to accurately describe scenes shown in photos—a significant advance in the field of computer vision. When shown a photo of a game of ultimate Frisbee, for example, the software responded with the description “A group of young people playing a game of frisbee.” The software can even count, giving answers such as “Two pizzas sitting on top of a stove top oven.”

The technical article on the topic:

Show and Tell: A Neural Image Caption Generator
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

  • Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing.
  • In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image.
  • The model is trained to maximize the likelihood of the target description sentence given the training image.
  • Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions.
  • Our model is often quite accurate, which we verify both qualitatively and quantitatively.
  • For instance, while the current state-of-the-art BLEU score (the higher the better) on the Pascal dataset is 25, our approach yields 59, to be compared to human performance around 69.
  • We also show BLEU score improvements on Flickr30k, from 55 to 66, and on SBU, from 19 to 27.
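The training objective in the abstract can be sketched very compactly: the model scores a caption by summing the log-probability of each word given the image and the words before it. The toy decoder below is a minimal, illustrative sketch (a vanilla RNN with made-up sizes; the paper actually uses a CNN image encoder feeding an LSTM decoder), just to show how log p(S | I) = Σ_t log p(s_t | I, s_1..s_{t-1}) is computed:

```python
import numpy as np

# Illustrative sketch only: sizes, weights, and the vanilla-RNN step are
# assumptions; the paper's model uses a CNN encoder and an LSTM decoder.
rng = np.random.default_rng(0)
V, D, H = 10, 8, 16          # vocab size, embedding dim, hidden dim

W_img = rng.normal(0, 0.1, (H, D))   # projects the image feature into h_0
W_emb = rng.normal(0, 0.1, (V, D))   # word embeddings
W_hh  = rng.normal(0, 0.1, (H, H))   # recurrent weights
W_xh  = rng.normal(0, 0.1, (H, D))   # input-to-hidden weights
W_out = rng.normal(0, 0.1, (V, H))   # hidden-to-vocab logits

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def caption_log_likelihood(image_feat, caption_ids):
    """log p(caption | image) under the toy decoder."""
    h = np.tanh(W_img @ image_feat)   # the image conditions the initial state
    prev = np.zeros(D)                # stands in for a start-of-sentence token
    total = 0.0
    for t in caption_ids:
        h = np.tanh(W_hh @ h + W_xh @ prev)
        p = softmax(W_out @ h)
        total += np.log(p[t])         # log-prob of the target word at step t
        prev = W_emb[t]               # feed the target word back in (teacher forcing)
    return total

image = rng.normal(size=D)
caption = [3, 1, 4, 1, 5]             # toy token ids
loss = -caption_log_likelihood(image, caption)   # training minimizes this NLL
```

Training would adjust all the weight matrices by gradient descent to drive this negative log-likelihood down over an image–caption dataset; at test time the decoder is sampled or beam-searched to generate a sentence.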
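The BLEU numbers quoted above measure n-gram overlap between a generated caption and human references. As a rough illustration of what is being scored, here is a simplified unigram-only variant (clipped precision plus a brevity penalty; real BLEU combines precisions up to 4-grams, so this is an assumption-laden sketch, not the metric used in the paper):

```python
from collections import Counter
import math

def bleu1(candidate, reference):
    """Simplified BLEU: clipped unigram precision times a brevity penalty."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    ref_counts = Counter(ref)
    # Each candidate word can only match as many times as it appears in the reference.
    matches = sum(min(c, ref_counts[w]) for w, c in Counter(cand).items())
    precision = matches / len(cand)
    # Penalize captions shorter than the reference so precision can't be gamed.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * precision

score = bleu1("a group of young people playing frisbee",
              "a group of young people playing a game of frisbee")
```

Here every candidate word appears in the reference, so precision is perfect and only the brevity penalty pulls the score below 100; that asymmetry is why short but accurate captions still score reasonably well on the datasets cited above.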

news summary (8)
