Abstract:
Video classification has been extensively researched in computer vision due to its widespread use in many important applications, such as human action recognition and dynamic scene classification. An end-to-end learning framework that can establish effective video representations while simultaneously conducting efficient video classification is therefore highly desirable.
A 3-D convolutional network (C3D) and a VGG (Visual Geometry Group) network are first deployed to extract temporal and spatial features from the input videos cooperatively, establishing comprehensive and informative video representations. VGG and C3D are chosen for their strong ability to extract complementary spatial and temporal features. The introduced RLE layer is then deployed to encode the initial outputs of the classifiers, followed by a weighting layer, learned jointly within the end-to-end framework, that combines the classification results. In the proposed system, we propose a deep belief network (DBN) method for action recognition in video.
The input video is converted into a number of frames, which are stored in a database. An ROI (region of interest) segmentation algorithm is then used to detect the relevant part of each frame. Finally, the deep belief network classifier determines whether the given input video is abnormal or not.
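The ROI segmentation step can be illustrated with a minimal sketch. The function below is a hypothetical stand-in, not the system's actual algorithm: it isolates the region of interest in a grayscale frame by simple intensity thresholding and returns its bounding box (a real pipeline would typically use motion cues or background subtraction).

```python
import numpy as np

def roi_bounding_box(frame, threshold=128):
    """Return (row_min, row_max, col_min, col_max) of pixels above threshold.

    A minimal illustration of ROI segmentation: pixels brighter than the
    threshold are treated as the region of interest, and the tightest
    bounding box around them is returned.
    """
    mask = frame > threshold
    if not mask.any():
        return None  # no region of interest found in this frame
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return rows[0], rows[-1], cols[0], cols[-1]

# Synthetic 100x100 grayscale frame with one bright 20x30 patch.
frame = np.zeros((100, 100), dtype=np.uint8)
frame[40:60, 10:40] = 200
print(roi_bounding_box(frame))  # -> (40, 59, 10, 39)
```

The bounding box can then be used to crop each stored frame before feature extraction, so that later stages operate only on the detected region.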
Proposed System:
In the proposed system, we propose a deep belief network (DBN) method for action recognition in video. The input video is first converted into a number of frames, which are stored in a database. Each frame is fed into a feature extraction stage that extracts features from the video frames. An ROI (region of interest) segmentation algorithm is then used to detect the relevant part of each frame. Finally, the deep belief network classifier determines whether the given input video is abnormal or not.
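A deep belief network is conventionally built by greedily stacking restricted Boltzmann machines (RBMs) pretrained with contrastive divergence, then adding a supervised output layer. The sketch below shows that idea on toy data; the network sizes, training loop, and synthetic "frame features" are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """One restricted Boltzmann machine layer trained with CD-1."""
    def __init__(self, n_vis, n_hid):
        self.W = rng.normal(0.0, 0.1, (n_vis, n_hid))
        self.b_vis = np.zeros(n_vis)
        self.b_hid = np.zeros(n_hid)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def train(self, data, epochs=30, lr=0.1):
        for _ in range(epochs):
            v0 = data
            h0 = self.hidden_probs(v0)
            h0_sample = (rng.random(h0.shape) < h0).astype(float)
            v1 = sigmoid(h0_sample @ self.W.T + self.b_vis)  # reconstruction
            h1 = self.hidden_probs(v1)
            # Contrastive-divergence (CD-1) parameter updates
            self.W += lr * ((v0.T @ h0) - (v1.T @ h1)) / len(data)
            self.b_vis += lr * (v0 - v1).mean(axis=0)
            self.b_hid += lr * (h0 - h1).mean(axis=0)

# Toy binary "frame features" and normal/abnormal labels (illustrative only).
X = (rng.random((64, 16)) < 0.5).astype(float)
y = (X[:, 0] > 0.5).astype(float)

# Greedy layer-wise pretraining of two stacked RBMs.
rbm1 = RBM(16, 8); rbm1.train(X)
H1 = rbm1.hidden_probs(X)
rbm2 = RBM(8, 4); rbm2.train(H1)
H2 = rbm2.hidden_probs(H1)

# Simple logistic output layer on top of the pretrained features.
w = np.zeros(4); b = 0.0
for _ in range(200):
    p = sigmoid(H2 @ w + b)
    w += 0.5 * H2.T @ (y - p) / len(y)
    b += 0.5 * (y - p).mean()

preds = sigmoid(H2 @ w + b)  # probability that each video frame is abnormal
```

In the full system, the input to the first RBM would be the features extracted from the segmented ROI of each frame rather than random binary vectors.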