Large-Scale Intelligent Video Surveillance
A key step towards automating the monitoring of video from many cameras is to generate an understanding of the paths which targets may take between their fields of view. This activity topology information is the foundation for many fundamental tasks in networked surveillance, such as tracking an object across the network. Although it could be derived manually for small sets of cameras, this approach does not scale to large networks, where cameras may frequently be added, malfunction or be moved.
The aim is not only to estimate the relative positions of surveillance cameras with overlapping fields of view, but also to determine an approximate distance between non-overlapping pairs of cameras. This distance is measured in terms of the time taken for a target to transit between a pair of cameras. Cameras whose fields of view overlap have zero transit time, whereas those at opposite ends of a corridor may have a transit time of tens of seconds. In line with this goal, we use a novel representation for activity topology estimation that is not based on tracking objects within each camera. Instead it relies on information that is easier to derive: the presence or absence of objects within each field of view. This information is adequate for topology determination, and is fast to work with, enabling the method to scale to large camera networks.
The present invention represents a novel method for estimating the activity topology of a set of cameras. This activity topology represents the critical aspects of the layout of fields of view of the cameras. The method is computationally fast, and does not rely on the tracking of objects within each camera view. In contrast to most existing methods for activity topology determination, it does not attempt to build up evidence for camera proximity over time. Instead, it uses observed activity to rule out topologies over time.
The proliferation of surveillance cameras throughout public places has far outpaced the development of software to monitor the video they generate. Human interrogation of such vast volumes of video is infeasible. As a result, although networks of cameras have been installed to monitor large facilities, their effectiveness is limited by the lack of any means of dealing with the volume of video generated. Dealing effectively with this volume of data requires an understanding of the relationship between the fields of view of the cameras. Techniques exist to extract information which might be used for this purpose, but they do not scale to the extent required by current hardware installations. Large-scale video surveillance systems are already being installed and can be expected to become far more prevalent. A method for effectively analysing the connections between cameras, and thus driving higher-level video interpretation, is crucial to their effective exploitation.
Technical Benefits and Impact
The primary technical benefit of the invention is the ability to quickly and efficiently determine the activity topology of a large network of video cameras. The determined activity topology represents:
- the relationships between the fields of view of a set of cameras
- the routes taken by targets both within, and between, fields of view.
This has the following effects:
- The results of the method are crucial to higher-level processing of activity within the network, such as analysis of target behaviour.
- The efficiency of the method allows it to be applied to networks of thousands of cameras, whereas current methods scale only to tens of cameras. This allows effective application of video surveillance to large infrastructure, and to dispersed assets.
- The efficiency also means that it can be used in situations where cameras are frequently moved, such as public transport, systems with pan-tilt-zoom cameras, and vehicle-mounted cameras (such as those used on police cars).
The exclusion approach
We have invented a method called exclusion for determining camera overlap that is designed to quickly home in on cameras that may overlap. The method is computationally fast, and does not rely on accurate tracking of objects within each camera view. In contrast to most existing methods, it does not attempt to build up evidence for camera overlap over time. Instead, it starts by assuming all cameras are connected and uses observed activity to rule out connections over time. This is an easier decision to make, especially when a limited amount of data is available. It is also based on the observation that it is impossible to prove a positive connection between cameras (any correlation of events could be coincidence), whereas it is possible to prove a negative connection by observing an object in one camera while not observing it at all in another.
Consider a set of c cameras that generates c images at time t. By applying foreground detection to all images we obtain a set of foreground blobs, each of which can be summarised by an image position and camera index. Each image is partitioned into a grid of windows, and each window can be labelled "occupied" or "unoccupied" depending on whether it contains a foreground object. Exclusion is based on the observation that a window which is occupied at time t cannot be an image of the same area as any other window that is simultaneously unoccupied. Given that windows tend to be unoccupied more often than they are occupied, this observation can be used to eliminate a large number of window pairs as potentially viewing the same area. The process of elimination can be repeated for each frame of video to rapidly reduce the number of pairs of image windows that could possibly be connected. This is the opposite of most previous approaches: rather than accumulate positive information over time about links between windows, we seek negative information allowing the instant elimination of impossible connections. Such connections are referred to as having been excluded.
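The elimination step described above can be illustrated with a minimal Python sketch. It is not the implementation used in the project: the function names (occupancy, exclude, run_exclusion), the use of NumPy, and the toy two-camera data are all assumptions made for illustration. The sketch also assumes that every camera has the same image size, that each blob is reduced to a single image position, and that foreground detection is error-free, so a single contradictory frame is enough to exclude a pair.

```python
import numpy as np


def occupancy(blobs, num_cameras, image_size, grid):
    """Label every window of every camera as occupied (True) or unoccupied.

    blobs: iterable of (camera_index, x, y) blob positions in pixels.
    image_size: (width, height), assumed identical for all cameras.
    grid: (cols, rows), the number of windows per image.
    Returns a flat boolean vector of length num_cameras * rows * cols.
    """
    width, height = image_size
    cols, rows = grid
    occ = np.zeros((num_cameras, rows, cols), dtype=bool)
    for cam, x, y in blobs:
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        occ[cam, row, col] = True
    return occ.ravel()


def exclude(possible, occ):
    """Rule out every window pair where one is occupied and the other is not."""
    # A pair survives this frame only if both windows have the same label.
    possible &= occ[:, None] == occ[None, :]
    return possible


def run_exclusion(frames, num_cameras, image_size, grid):
    """Start with all window pairs connected, then exclude frame by frame."""
    n = num_cameras * grid[0] * grid[1]
    possible = np.ones((n, n), dtype=bool)  # assume everything overlaps
    for blobs in frames:
        exclude(possible, occupancy(blobs, num_cameras, image_size, grid))
    return possible


# Hypothetical data: two cameras, each split into a left and a right window.
# Window indices: 0 and 1 belong to camera 0, 2 and 3 to camera 1.
frames = [
    [(0, 75, 50), (1, 25, 50)],  # target seen by both cameras at once
    [(0, 25, 50)],               # target seen only in camera 0, left window
]
possible = run_exclusion(frames, 2, (100, 100), (2, 1))
pairs = [(int(i), int(j)) for i, j in np.argwhere(np.triu(possible, k=1))]
print(pairs)  # [(1, 2)]
```

After two frames only the pair (1, 2) remains, that is, the right window of camera 0 and the left window of camera 1. In keeping with the exclusion idea, this pair has not been proven to overlap; it is simply the only one not yet contradicted by the observations. Treating a single frame as decisive is a simplification, since a missed detection in real video would wrongly exclude a genuine overlap.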
The exclusion approach to topology estimation has been licensed to SNAP Network Surveillance for commercialisation, but work in this area continues within the ACVT. Current projects are looking at improving the tracking of targets between cameras, and improving the topology estimation process.