SBU Captions Explorer

The SBU Captions Dataset contains 1 million images with captions obtained from Flickr circa 2011, as documented in Ordonez, Kulkarni, and Berg, NeurIPS 2011. The captions were written by real users and pre-filtered to keep only those that have at least two nouns, a noun-verb pair, or a verb-adjective pair; many noisy and trivial captions were also excluded. The final set still contains noise, which might be significant for some use cases. Nevertheless, this dataset has been used in research on several tasks, e.g. Google's Show-and-Tell and Microsoft's UNITER. Here we provide a search tool to find images in this dataset. Researchers often want to test their systems on specific images, and this tool allows searching for images that match human-written text descriptions. If you're interested in downloading the whole dataset, go here instead.

Try entering queries such as "a person holding a cat" or "a bird on top of a boat".
the cat under the tree

cat under the tree

cat under tree

cat under the tree

under the cat tree

cat under tree

cat under tree

cat under tree

cat under the tree

cat under tree

under the cat tree

A cat under the tree

A cat under a tree

cat under tree

cat under a tree

cat under a tree

Cat under bird under tree

cat under tree IMG_0564

cat under ginkgo tree

Our cat under the tree
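
For readers who want to try caption matching offline on a downloaded copy, here is a minimal sketch of one way to rank captions against a text query. It is an assumption for illustration, not how this search tool is implemented: the ranking used by the tool is not documented here. The function names (`tokenize`, `search`), the stopword list, and the scoring rule (fraction of query words found in the caption, with shorter captions winning ties) are all hypothetical choices.

```python
import re

# Hypothetical stopword list; a real system might use a larger one or none.
STOPWORDS = {"a", "an", "the", "of", "on", "in"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping stopwords."""
    return [t for t in re.findall(r"[a-z0-9_]+", text.lower()) if t not in STOPWORDS]


def search(query, captions, top_k=5):
    """Return up to top_k captions ranked by query-word overlap.

    Score = fraction of distinct query tokens present in the caption.
    Ties are broken by caption length (shorter first), then by input order.
    """
    query_tokens = set(tokenize(query))
    scored = []
    for index, caption in enumerate(captions):
        caption_tokens = tokenize(caption)
        overlap = len(query_tokens & set(caption_tokens))
        if overlap == 0:
            continue  # no shared words: not a match
        score = overlap / len(query_tokens)
        scored.append((-score, len(caption_tokens), index, caption))
    scored.sort()
    return [caption for _, _, _, caption in scored[:top_k]]


# A few captions taken from the results above, plus one unrelated caption.
captions = [
    "cat under ginkgo tree",
    "Cat under bird under tree",
    "Our cat under the tree",
    "a bird on top of a boat",
]

results = search("the cat under the tree", captions)
for caption in results:
    print(caption)
```

Word overlap ignores word order, so a caption such as "under the cat tree" scores the same as "cat under the tree" for this query, which is consistent with both appearing in the results above.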