Description: |
Visual tracking of unknown objects in unconstrained video sequences is extremely challenging due to a number of unsolved issues. This thesis explores several of these and examines possible approaches to tackle them. The unconstrained nature of real-world input sequences creates huge variation in the appearance of the target object due to changes in pose and lighting. Additionally, the object can be occluded by parts of itself, by other elements of the scene, or by the frame boundaries. Observations may also be corrupted by low resolution, motion blur, large frame-to-frame displacement, or incorrect exposure or focus of the camera. Finally, some objects are inherently difficult to track due to their lack of texture, specular or transparent surfaces, non-rigid deformations, etc. Conventional trackers depend heavily on the texture of the target, which causes issues with transparent or untextured objects. Edge points can be used in cases where standard feature points are scarce; these, however, suffer from the aperture problem. To address this, the first contribution of this thesis explores the idea of virtual corners, using pairs of non-adjacent line correspondences, tangent to edges in the image. Furthermore, the chapter investigates the possibility of long-term tracking, introducing a re-detection scheme to handle occlusions while limiting drift of the object model. The outcome of this research is an edge-based tracker that can handle untextured objects, full occlusions and very long sequences. The tracker, besides reporting excellent results in standard benchmarks, is demonstrated to successfully track the longest sequence published to date. Some of the issues in visual tracking are caused by suboptimal utilisation of the image information. The object of interest can easily occupy as little as ten or even one percent of the video frame area. This causes difficulties in challenging scenarios such as sudden camera shakes or full occlusions.
To improve tracking in such cases, the next major contribution of this thesis explores relationships within the context of visual tracking, with a focus on causality. These include causal links between the tracked object and other elements of the scene such as the camera motion or other objects. Properties of such relationships are identified in a framework based on information theory. The resulting technique can be employed as a causality-based motion model to improve the results of virtually any tracker. Significant effort has previously been devoted to rapid learning of object properties on the fly. However, state-of-the-art approaches still often fail in cases such as rapid out-of-plane rotations, when the appearance changes suddenly. One of the major contributions of this thesis is a radical rethinking of the conventional wisdom of modelling 3D motion as appearance change. Instead, 3D motion is modelled as 3D motion. This intuitive but previously unexplored approach provides new possibilities in visual tracking research. Firstly, 3D tracking is more general: large out-of-plane motion is often fatal for 2D trackers, but helps 3D trackers to build better models. Secondly, the tracker’s internal model of the object can be used in many different applications, and building it could even become the main motivation, with tracking supporting reconstruction rather than vice versa. This effectively bridges the gap between visual tracking and Structure from Motion. The proposed method is capable of successfully tracking sequences with extreme out-of-plane rotation, which poses a considerable challenge to 2D trackers. This is done by creating realistic 3D models of the targets, which then aid in tracking. Most of the thesis assumes that the target’s 3D shape is rigid. This is, however, a relatively strong limitation.
In the final chapter, tracking and dense modelling of non-rigid targets are explored, demonstrating results in even more generic (and therefore challenging) scenarios. This final advancement generalises the tracking problem further, supporting long-term tracking of low-texture and non-rigid objects in sequences with camera shake, shot cuts and significant rotation. Taken together, these contributions address some of the major sources of failure in visual tracking. The presented research advances the field of visual tracking, facilitating tracking in scenarios which were previously infeasible. Excellent results are demonstrated in these challenging scenarios. Finally, this thesis demonstrates that 3D reconstruction and visual tracking can be used together to tackle difficult tasks. |