Large-Scale Video Surveillance
A key step towards automating the monitoring of video from many cameras is to generate an understanding of the paths which targets may take between their fields of view. This activity topology information is the foundation for many fundamental tasks in networked surveillance, such as tracking an object across the network. Although it could be derived manually for small sets of cameras, this approach does not scale to large networks, where cameras may frequently be added, malfunction or be moved.
The aim is not only to estimate the relative positions of surveillance cameras with overlapping fields of view, but also to determine an approximate distance between non-overlapping pairs of cameras. This distance is measured in terms of the time taken for a target to transit between a pair of cameras. Cameras whose fields of view overlap have zero transit time, whereas those at opposite ends of a corridor may have a transit time of tens of seconds. In line with this goal, we use a novel representation for activity topology estimation that is not based on tracking objects within each camera. Instead it relies on information that is easier to derive: the presence or absence of objects within each field of view. This information is adequate for topology determination, and is fast to work with, enabling the method to scale to large camera networks.

A diagram of the topology of a small camera network.
The technology
The present invention represents a novel method for estimating the activity topology of a set of cameras. This activity topology represents the critical aspects of the layout of fields of view of the cameras. The method is computationally fast, and does not rely on the tracking of objects within each camera view. In contrast to most existing methods for activity topology determination, it does not attempt to build up evidence for camera proximity over time. Instead, it uses observed activity to rule out topologies over time.
Economic Benefit
The proliferation of surveillance cameras throughout public places has far outpaced the development of software to monitor the video they generate. Human inspection of such vast volumes of video is infeasible. As a result, although networks of cameras have been installed to monitor large facilities, their effectiveness is limited by the lack of any means to deal with the volumes of video generated. Dealing effectively with this volume of data requires an understanding of the relationship between the fields of view of the cameras. Techniques exist to extract information which might be used for this purpose, but they do not scale to the extent required by current hardware installations. Large-scale video surveillance systems are already being installed and can be expected to become far more prevalent. A method for effectively analyzing the connections between cameras, and thus driving higher-level video interpretation, is crucial to their effective exploitation.
Technical Benefits and Impact
The primary technical benefit of the invention is the ability to quickly and efficiently determine the activity topology of a large network of video cameras. The determined activity topology represents:
- the relationships between the fields of view of a set of cameras
- the routes taken by targets both within, and between, fields of view.
This has the following effects:
- The results of the method are crucial to higher-level processing of activity within the network, such as analysis of target behaviour.
- The efficiency of the method allows it to be applied to networks of thousands of cameras, whereas current methods scale only to tens of cameras. This allows effective application of video surveillance to large infrastructure, and to dispersed assets.
- The efficiency also means that it can be used in situations where cameras are frequently moved, such as in public transport, in systems with pan-tilt-zoom cameras, and with vehicle-mounted cameras (such as those used on police cars).
The exclusion approach
We have invented a method called exclusion for
determining camera overlap that is designed to quickly home in on cameras
that may overlap. The method is computationally fast, and does not rely on
accurate tracking of objects within each camera view. In contrast to most existing
methods, it does not attempt to build up evidence for camera overlap
over time. Instead, it starts by assuming all cameras are connected and uses
observed activity to rule out connections over time. This is an easier decision
to make, especially when a limited amount of data is available. It is also based
on the observation that it is impossible to prove a positive connection between
cameras (any correlation of events could be coincidence), whereas it is possible
to prove a negative connection by observing an object in one camera while not
observing it at all in another.
Consider a set of c cameras that generates c images at time t. By applying
foreground detection to all images we obtain a set of foreground blobs, each
of which can be summarised by an image position and camera index. Each image
is partitioned into a grid of windows, and each window can be labelled "occupied"
or "unoccupied" depending on whether it contains a foreground object.
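
The labelling step can be illustrated with a minimal Python sketch. It is not
the implementation behind the method: the function name `occupancy`, the blob
tuple layout, the image size and the grid dimensions are all assumptions made
for illustration, and foreground detection is taken as already done.

```python
from typing import Iterable, List, Tuple

# Hypothetical blob summary: (camera index, x, y) in pixel coordinates.
Blob = Tuple[int, float, float]


def occupancy(blobs: Iterable[Blob], num_cameras: int,
              width: int, height: int, cols: int, rows: int) -> List[bool]:
    """Return one occupied/unoccupied flag per window, over all cameras.

    Windows are numbered camera by camera, row by row within each camera.
    All cameras are assumed to share one image size and one grid layout.
    """
    flags = [False] * (num_cameras * rows * cols)
    for cam, x, y in blobs:
        # Map the blob position to a grid cell, clamping the far image edge.
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        flags[(cam * rows + row) * cols + col] = True
    return flags


# Two cameras, 640x480 images, a 4x3 grid of windows per camera.
flags = occupancy([(0, 10.0, 10.0), (1, 630.0, 470.0)],
                  num_cameras=2, width=640, height=480, cols=4, rows=3)
```

Each frame then reduces to one flat list of booleans, which is all the
information the exclusion step needs.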
Exclusion is based on the observation that a window which is occupied at
time t cannot be an image of the same area as any other window that is simultaneously
unoccupied. Given that windows tend to be unoccupied more often than
they are occupied, this observation can be used to eliminate a large number
of window pairs as potentially viewing the same area. The process of elimination
can be repeated for each frame of video to rapidly reduce the number of
pairs of image windows that could possibly be connected. This is the opposite
of most previous approaches: rather than accumulate positive information over
time about links between windows, we seek negative information allowing the
instant elimination of impossible connections. Such connections are referred to
as having been excluded.
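
The elimination process described above can be sketched as follows. This is a
simplified illustration under stated assumptions, not the patented
implementation: the names `initial_candidates` and `exclude` are hypothetical,
window pairs are treated as unordered, and detections are assumed to be
noise-free (a practical system would presumably need some tolerance for missed
or spurious detections).

```python
from itertools import combinations
from typing import Sequence, Set, Tuple

Pair = Tuple[int, int]


def initial_candidates(num_windows: int) -> Set[Pair]:
    """Start by assuming every pair of distinct windows may be connected."""
    return set(combinations(range(num_windows), 2))


def exclude(candidates: Set[Pair], occupied: Sequence[bool]) -> Set[Pair]:
    """Keep only the pairs that one frame of occupancy flags does not rule out.

    A pair is excluded when one window is occupied while the other is
    simultaneously unoccupied, since they cannot then show the same area.
    """
    return {(i, j) for (i, j) in candidates if occupied[i] == occupied[j]}


# Illustrative data: four windows, where windows 0 and 2 view the same area.
frames = [
    [True, False, True, False],
    [False, True, False, False],
    [False, False, False, True],
]

candidates = initial_candidates(4)
for occupied in frames:
    candidates = exclude(candidates, occupied)

print(sorted(candidates))  # [(0, 2)]
```

In this toy example three frames are enough to reduce six candidate pairs to
the single pair that was never contradicted, which reflects why ruling
connections out can converge quickly when windows are usually unoccupied.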
Activity Topology-Based Tracking
Please click the links below to view videos showing various aspects of the
technology in use.
- This video illustrates manual tracking of a target by clicking on adjacent
cameras displayed on the left and right of the current camera. The
adjacent cameras are determined by the activity topology. The video may be
paused, rewound and played back at any speed during manual tracking.
- The activity topology not only gives information on which cameras are
connected, but which areas of which cameras are connected. This video
demonstrates how a visualisation of this information can be used to further
facilitate manual tracking. Areas of the current camera connected to other
cameras are shaded, and cameras connected to the area under the cursor are
highlighted. A shaded area may be clicked to browse to the associated
camera in cases where there is no ambiguity. This can be particularly useful
in areas with many overlapping cameras or confusing arrangements of cameras.
- This video is similar to the previous example, but demonstrates how
activity topology-assisted manual tracking is effective even in crowded
scenes, where automatic tracking is not feasible with currently available
techniques.
- Activity topology can also support automatic tracking techniques. This video
shows unaided automatic tracking of a target moving through over ten cameras.
- The footage displayed during both manual and automatic tracking can be saved
for the purpose of producing a summary of evidence. This video illustrates
saving footage from manual tracking and playing the recorded footage back
in a standard video player (Windows Media Player).
- An additional example of automatic tracking. Should the tracking heuristic
select an incorrect target, the correct target may easily be clicked to resolve
the problem and continue with automatic tracking. Note also that the user may
choose to switch to manual tracking at any time if desired.
- Unlike some other techniques available, the presented tracking technique
operates equally well for video being played back in reverse. This enables
the path of a target to be traced back from the location of an incident, as
demonstrated in this video.
- This video shows the activity topology of a sample surveillance network of 26
cameras being automatically discovered and refined based on normal activity in
the area over the course of a four-hour period.
- This video shows the activity topology discovered for a larger surveillance
network of over 170 cameras being automatically divided into connected groups
of cameras and laid out for easy viewing.
More information
The method is covered by a patent application in the US and Australia. A number of papers have been published, including the following