3D video surveillance: tracking people in busy scenes using multiple cameras
Please note that this article was automatically translated from Russian into English, so the translation may not be accurate.
Nikolai Ptitsyn
Automatic recognition of objects and situations in streaming video from security surveillance systems is an important scientific and engineering challenge. Combining video analytics with three-dimensional modeling makes it possible to analyze people's behavior more accurately in busy scenes, where standard video analytics operating in 2D or 2.5D space fails.
Why is 3D video surveillance needed?
We have already touched on multi-camera tracking in the article "The future of video surveillance: multiple camera tracking", where we analyzed technical approaches to automatically determining people's trajectories in three-dimensional space, for example in a high-rise building. A key feature of such a system is its ability to "hand over" a tracked object from one camera to another. Here we consider how to build video surveillance with overlapping camera views for accurate tracking of people in a busy scene, such as a terminal hall or a shopping mall.
The problem of computing and analyzing individual trajectories of human movement arises in many areas of city life: preventing and investigating crime, traffic management, marketing, retail sales, and targeting of advertising on digital displays (see the table).
The use of video analytics algorithms in public places is significantly complicated by the density of people, their mutual occlusion and the complex geometry of the space (Fig. 1). Compared with monitoring a perimeter or the entrances of residential buildings, automated monitoring of public places calls for a more intelligent video analytics system, one that can extract the relevant information from a large data stream on the one hand and fill in missing data on the other.
Fig. 1. The camera system in the arrivals hall of London Gatwick Airport. The fields of view of cameras 2, 3 and 4 overlap in the shaded areas [1]
Algorithms that process the stream from only one camera in 2D space (see the article "Video motion detectors") or 2.5D space (see the article "2.5D space: restoring image depth parameters from a single camera") cannot cope with groups of people, still less with a dense flow of people. Fig. 2 shows an example from a 2.5D tracking system in which the people in a group are recognized as a single object. The desired result, an individual trajectory for each person, is shown in Fig. 3. A major shortcoming of 2.5D systems is the significant error in measuring scene "depth" and the real size of the observed targets (Fig. 4).
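One way to see why single-camera depth estimates are fragile is a simple flat-floor model. The sketch below is an illustration only: the article does not say how its 2.5D systems estimate depth, so the pinhole-camera geometry, the function names `ground_distance` and `one_pixel_error`, and the camera parameters are all assumptions. It estimates the distance to a person from the image row where their feet touch the floor, and shows how much that estimate moves when the foot point is misplaced by a single row.

```python
import math

def ground_distance(cam_height_m, tilt_deg, focal_px, row_offset_px):
    """Horizontal distance to the floor point seen at a given image row.

    Assumes a pinhole camera over a flat floor. tilt_deg is the downward
    tilt of the optical axis; row_offset_px counts pixel rows below the
    image centre (negative values are above the centre).
    """
    angle = math.radians(tilt_deg) + math.atan2(row_offset_px, focal_px)
    if angle <= 0:
        raise ValueError("viewing ray does not reach the floor")
    return cam_height_m / math.tan(angle)

def one_pixel_error(cam_height_m, tilt_deg, focal_px, row_offset_px):
    """Change in estimated distance if the foot point is off by one row."""
    d0 = ground_distance(cam_height_m, tilt_deg, focal_px, row_offset_px)
    d1 = ground_distance(cam_height_m, tilt_deg, focal_px, row_offset_px - 1)
    return abs(d1 - d0)

# Hypothetical setup (not from the article): camera 4 m above the floor,
# tilted 10 degrees down, focal length of 1000 pixels.
for row in (300, 0, -100):
    d = ground_distance(4.0, 10.0, 1000.0, row)
    err = one_pixel_error(4.0, 10.0, 1000.0, row)
    print(f"row {row:+5d}: distance {d:6.2f} m, one-pixel error {err:.3f} m")
```

Under these assumptions the one-row error grows quickly for targets far from the camera. In a group of people the foot point is often hidden, so errors of many rows are plausible, which is consistent with the depth and size errors described above.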
Operating principle
Deploying cameras with overlapping fields of view (Fig. 5) goes a long way toward solving the problem of insufficient information and the ambiguity of 2D video. First, a multi-camera system increases the likelihood of successfully segmenting and tracking an object in the field of view of at least one camera, because more viewing angles are available. Second, the system can calculate depth and the real dimensions of objects with high accuracy.
Because the computational load is high, centralized processing on a video server is not a promising architecture. What is required is a decentralized scheme in which the video stream is processed by a processor built directly into the camera or into the encoder that compresses the image. In this scheme the server's role is to manage the processors and exchange information among them. An important task is to synchronize time across the processors and to keep their spatial calibration up to date.
The processors embedded in the cameras must exchange the 2D coordinates and attributes of tracked objects with one another at high speed. The delay in transferring this data should not exceed the processing time of a few frames. For a dense flow of people it is also important to capture and process color images at a high frame rate. This makes direct data exchange between cameras relevant. The topology of this P2P interaction is determined by the three-dimensional model of camera placement at the monitored site.
Consider a possible sequence of algorithmic operations in a multi-camera tracking system:
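The geometric core of such a sequence, recovering a 3D position from the 2D observations of two calibrated cameras, can be sketched as follows. This is a minimal illustration and not the article's algorithm: it assumes each camera has already converted its 2D detection into a viewing ray in a shared world frame (which is what spatial calibration provides), and it uses the midpoint method, taking the point halfway along the shortest segment between the two rays. The name `triangulate_midpoint` and the camera positions are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate a 3D point from two viewing rays.

    c1, c2 are camera centres and d1, d2 are ray directions in a shared
    world frame (they need not be unit length). Returns the midpoint of
    the shortest segment joining the rays and the length of that segment.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    midpoint = [(u + v) / 2 for u, v in zip(p1, p2)]
    gap = math.sqrt(sum((u - v) ** 2 for u, v in zip(p1, p2)))
    return midpoint, gap

# Hypothetical layout: two cameras mounted 3 m high and 10 m apart,
# both looking at the same point on a person.
cam_a, cam_b = (0.0, 0.0, 3.0), (10.0, 0.0, 3.0)
ray_a, ray_b = (4.0, 5.0, -1.3), (-6.0, 5.0, -1.3)
point, gap = triangulate_midpoint(cam_a, ray_a, cam_b, ray_b)
print(point, gap)
```

The returned gap can serve as a rough consistency check: if the two rays pass far from each other, the two 2D detections probably do not belong to the same person, or the observations were not taken at the same moment, which is one reason time synchronization between cameras matters.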
Using high-definition (HD) cameras can reduce the number of cameras needed, but it significantly increases the load on the processor in each camera. It is sometimes justified to use a larger number of cameras and/or a higher frame rate at a lower resolution. To detect people at different distances, and to optimize performance, it is advisable to apply multiscale analysis methods at all stages of image processing.
Current tasks for 3D-video developers
Technologies for capturing motion with video cameras have been used successfully in related fields. For example, in the film and computer game industries, believable character animation is obtained by recording the movements of a live actor in a studio (Fig. 6).
Fig. 6. Motion capture of an actor using a system of cameras and fluorescent markers
However, introducing 3D-video technology into the security field requires considerable adaptation of the mathematical algorithms, software and hardware. Security applications impose more stringent requirements on fault tolerance, tracking accuracy, cost and scalability. These are the problems that the organizations leading the development of 3D video are now working to solve.
[1] Materials were used from the website of the UK Home Office Scientific Development Branch (HOSDB), section Imagery Library for Intelligent Detection Systems (i-LIDS), http://scienceandresearch.homeoffice.gov.uk/hosdb/cctv-imaging-technology/video-based-detection-systems/i-lids/








