SCoRD: Visual Relation Predictor

Our group has developed an auto-regressive model for improved Scene Graph Generation (SGG). It is based on our WACV 2024 paper, SCoRD: Subject-Conditional Relation Detection With Text-Augmented Data. The paper proposes a new approach, Subject-Conditional Relation Detection (SCoRD), which predicts all relations of a given subject within a scene, along with the locations of the related objects. Using the Open Images dataset, we introduce the OIv6-SCoRD benchmark, which challenges existing models with a shift in the distribution of subject-relation-object triplets between the training and test splits. Our main contribution is an auto-regressive model that, given a subject, predicts its relations to other objects and their spatial locations as a sequence of tokens. Try uploading an image below to see the SCoRD model in action.

This demo uses the model to generate token sequences relating the objects in an image to a given subject.

Subject:
Name the main object here and draw a box below.
Input Image
Annotated Output Image

Predicting for: players

  • from teams
    (0, 0, 1192, 1165)
  • front of crowd
    (0, 0, 1192, 1165)
  • dribbling ball
    (328, 306, 454, 446)
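
The listing above pairs each predicted relation with a bounding box. As a minimal sketch of how such output could be turned into structured records, the Python below parses the displayed text. It is not the demo's own code: the `Relation` class and `parse_predictions` function are hypothetical names, the boxes are assumed to be (x_min, y_min, x_max, y_max) in pixels, and the object is assumed to be the last word of each bullet.

```python
import re
from dataclasses import dataclass


@dataclass
class Relation:
    predicate: str
    obj: str
    box: tuple  # assumed to be (x_min, y_min, x_max, y_max) in pixels


# A bullet line such as "  • front of crowd".
BULLET = re.compile(r"^\s*•\s*(.+?)\s*$")
# A box line such as "    (328, 306, 454, 446)".
BOX = re.compile(r"^\s*\((\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)\s*$")


def parse_predictions(text):
    """Pair each bullet line with the box line that follows it."""
    relations = []
    pending = None
    for line in text.splitlines():
        bullet = BULLET.match(line)
        if bullet:
            pending = bullet.group(1)
            continue
        box = BOX.match(line)
        if box and pending is not None:
            # Assumption: the object is the last word, the rest is the predicate.
            predicate, _, obj = pending.rpartition(" ")
            coords = tuple(int(v) for v in box.groups())
            relations.append(Relation(predicate, obj, coords))
            pending = None
    return relations


sample = """\
  • from teams
    (0, 0, 1192, 1165)
  • front of crowd
    (0, 0, 1192, 1165)
  • dribbling ball
    (328, 306, 454, 446)
"""

for rel in parse_predictions(sample):
    print(rel.predicate, "|", rel.obj, "|", rel.box)
```

Splitting on the last word is only a heuristic for this sample; multi-word object names would need a different rule.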
Demo by Micah
Gallery of examples
Technical Notes: This demo runs on a CPU-only instance. Images uploaded through this demo are not stored on our server or anywhere else, not even temporarily. We hold images only in volatile server-side memory while they are being processed, and return the resulting image directly to the user's browser as a base64-encoded string. The images presented here as examples were uploaded by our team, not by users. This demo does not aim to collect any data from users.
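
The in-memory handling described in the notes can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the demo's server code: `to_data_uri` is a hypothetical helper, and wrapping the base64 string in a `data:` URI is one possible way to hand the result to a browser.

```python
import base64
import io


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode image bytes held in memory as a base64 data URI.

    Nothing is written to disk: the bytes live only in an in-memory
    buffer for the duration of the call.
    """
    buffer = io.BytesIO()       # volatile, in-memory buffer
    buffer.write(image_bytes)   # stands in for the annotated output image
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


print(to_data_uri(b"abc"))
```

A string of this form can be assigned to an image element's `src` attribute in the browser, so no file needs to be saved on the server.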