DECONSTRUCTING BINARY CLASSIFIERS IN COMPUTER VISION

This paper develops the novel notion of deconstructive learning and proposes a practical model for deconstructing a broad class of binary classifiers commonly used in vision applications. Specifically, the problem studied in this paper is: given an image-based binary classifier C as a black-box oracle, how much can we learn of its internal workings by simply querying it? In particular, we demonstrate that it is possible to ascertain the type of kernel function used by…

Continue reading

MORE FOR LESS: INSIGHTS INTO CONVOLUTIONAL NETS FOR 3D POINT CLOUD RECOGNITION

With recent breakthroughs in commodity 3D imaging solutions such as depth sensing, photogrammetry, stereoscopic vision and structured light, 3D shape recognition is becoming an increasingly important problem. A longstanding question is what the format of the 3D shape should be (voxel, mesh, point cloud, etc.) and what could serve as a good generic feature representation for shape recognition. This question is particularly important in the context of convolutional neural networks (CNNs), whose efficacy and…

Continue reading

MLSL: MULTI-LEVEL SELF-SUPERVISED LEARNING FOR DOMAIN ADAPTATION WITH SPATIALLY INDEPENDENT AND SEMANTICALLY CONSISTENT LABELING

Most recent deep semantic segmentation algorithms suffer from large generalization errors, even when powerful hierarchical representation models based on convolutional neural networks are employed. This can be attributed to limited training data and a large distribution gap between the training and test domain datasets. In this paper, we propose a multi-level self-supervised learning model for domain adaptation of semantic segmentation. Exploiting the idea that an object (and most of the stuff given context) should be labeled…

Continue reading

TWIN-NET DESCRIPTOR: TWIN NEGATIVE MINING WITH QUAD LOSS FOR PATCH BASED MATCHING

Local keypoint matching is an important step for computer vision based tasks. In recent years, deep convolutional neural network (CNN) based strategies have been employed to learn descriptor generation to enhance keypoint matching accuracy. Recent state-of-the-art works in this direction primarily rely upon a triplet-based loss function (and its variations) utilizing three samples: an anchor, a positive and a negative. In this work we propose a novel “Twin Negative Mining” based sampling strategy coupled with…

Continue reading

MOVING OBJECT DETECTION IN COMPLEX SCENE USING SPATIOTEMPORAL STRUCTURED-SPARSE RPCA

Moving object detection is a fundamental step in various computer vision applications. Robust principal component analysis (RPCA)-based methods have often been employed for this task. However, the performance of these methods deteriorates in the presence of dynamic background scenes, camera jitter, camouflaged moving objects, and/or variations in illumination. This is because of an underlying assumption that the elements in the sparse component are mutually independent, and thus the spatiotemporal structure of the moving objects is…

Continue reading

EMOTIONAL FILTERS: AUTOMATIC IMAGE TRANSFORMATION FOR INDUCING AFFECT

Current image transformation and recoloring algorithms try to introduce artistic effects into photographed images, based on the user's input of target image(s) or selection of pre-designed filters. In this paper we present an automatic image-transformation method that transforms the source image such that it induces an emotional affect on the viewer, as desired by the user. Our method can handle a much more diverse set of images than previous methods. A discussion and reasoning of failure…

Continue reading

USING SATELLITE IMAGERY FOR GOOD: DETECTING COMMUNITIES IN DESERT AND MAPPING VACCINATION ACTIVITIES

Deep convolutional neural networks (CNNs) have outperformed existing object recognition and detection algorithms. This paper describes a deep learning approach that analyzes a geo-referenced satellite image and efficiently detects built structures in it. A Fully Convolutional Network (FCN) is trained on low-resolution Google Earth satellite imagery in order to achieve the end result. The detected built communities are then correlated with the vaccination activity.

Anza Shakeel, Mohsen Ali. arXiv, 2017.

Continue reading

HIGH-LEVEL CONCEPTS FOR AFFECTIVE UNDERSTANDING OF IMAGES

This paper aims to bridge the affective gap between image content and the emotional response it elicits in the viewer by using High-Level Concepts (HLCs). In contrast to previous work that relied solely on low-level features or used a convolutional neural network (CNN) as a black box, we use HLCs generated by pre-trained CNNs in an explicit way to investigate the relations/associations between these HLCs and a (small) set of Ekman’s emotional classes. Experimental results have demonstrated that our…

Continue reading

LOCALIZING FIREARM CARRIERS BY IDENTIFYING HUMAN-OBJECT PAIRS

Localizing gunmen in a crowd is a challenging problem that requires resolving the association of a person with an object (firearm). We present a novel approach to address this problem by defining human-object interaction (and non-interaction) bounding boxes. In a given image, humans and firearms are separately detected. Each detected human is paired with each detected firearm, allowing us to create a paired bounding box that contains both the object and the human. A network is trained to…

Continue reading

EXPLOITING GEOMETRIC CONSTRAINTS ON DENSE TRAJECTORIES FOR MOTION SALIENCY

Existing approaches to salient motion segmentation are unable to explicitly learn geometric cues and often give false detections on prominent static objects. We exploit multiview geometric constraints to avoid such mistakes. To handle nonrigid backgrounds such as the sea, we also propose a robust fusion mechanism between motion and appearance-based features. We find dense trajectories, covering every pixel in the video, and propose trajectory-based epipolar distances to distinguish between background and foreground regions. Trajectory epipolar distances…

Continue reading