Article


Abstract:

Multiclass action detection in complex scenes is a challenging problem because of cluttered backgrounds and the large intra-class variations in each type of action. To achieve efficient and robust action detection, we characterize a video as a collection of spatio-temporal interest points, and locate actions by finding the spatio-temporal video subvolumes with the highest mutual information score towards each action class. A random forest is constructed to efficiently generate discriminative votes from individual interest points, and a fast top-K subvolume search algorithm is developed to find all action instances in a single round of search. Without significantly degrading the performance, such a top-K search can be performed on down-sampled score volumes for more efficient localization. Experiments on the challenging MSR Action Dataset II validate the effectiveness of our proposed multiclass action detection method. The detection speed is several orders of magnitude faster than existing methods. © 2011 IEEE.
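
The search objective described in the abstract (find the K subvolumes whose summed per-point scores are highest) can be illustrated with a minimal sketch. This is not the paper's branch-and-bound algorithm, which finds all instances in a single round of search: the sketch uses exhaustive enumeration over a tiny score volume and a greedy repeat-and-clear loop, and it assumes the per-voxel scores (for example, the votes produced by the random forest) are already given as a nested list `scores[t][y][x]`. All function names are hypothetical.

```python
from itertools import product


def integral_volume(scores):
    """3D prefix sums: I[t][y][x] is the sum over scores[:t][:y][:x]."""
    T, H, W = len(scores), len(scores[0]), len(scores[0][0])
    I = [[[0.0] * (W + 1) for _ in range(H + 1)] for _ in range(T + 1)]
    for t, y, x in product(range(T), range(H), range(W)):
        I[t + 1][y + 1][x + 1] = (
            scores[t][y][x]
            + I[t][y + 1][x + 1] + I[t + 1][y][x + 1] + I[t + 1][y + 1][x]
            - I[t][y][x + 1] - I[t][y + 1][x] - I[t + 1][y][x]
            + I[t][y][x]
        )
    return I


def box_sum(I, t0, t1, y0, y1, x0, x1):
    """Sum of scores in the half-open box [t0,t1) x [y0,y1) x [x0,x1)."""
    return (
        I[t1][y1][x1]
        - I[t0][y1][x1] - I[t1][y0][x1] - I[t1][y1][x0]
        + I[t0][y0][x1] + I[t0][y1][x0] + I[t1][y0][x0]
        - I[t0][y0][x0]
    )


def best_subvolume(scores):
    """Exhaustively find the box with the highest summed score."""
    T, H, W = len(scores), len(scores[0]), len(scores[0][0])
    I = integral_volume(scores)
    best_score, best_box = float("-inf"), None
    for t0 in range(T):
        for t1 in range(t0 + 1, T + 1):
            for y0 in range(H):
                for y1 in range(y0 + 1, H + 1):
                    for x0 in range(W):
                        for x1 in range(x0 + 1, W + 1):
                            s = box_sum(I, t0, t1, y0, y1, x0, x1)
                            if s > best_score:
                                best_score = s
                                best_box = (t0, t1, y0, y1, x0, x1)
    return best_score, best_box


def top_k_subvolumes(scores, k):
    """Greedy top-K: take the best box, zero it out, and repeat."""
    # Work on a copy so the caller's volume is not modified.
    vol = [[row[:] for row in frame] for frame in scores]
    results = []
    for _ in range(k):
        score, box = best_subvolume(vol)
        if score <= 0:  # no remaining box has positive evidence
            break
        results.append((score, box))
        t0, t1, y0, y1, x0, x1 = box
        for t, y, x in product(range(t0, t1), range(y0, y1), range(x0, x1)):
            vol[t][y][x] = 0.0
    return results
```

Exhaustive enumeration costs on the order of (T·H·W)² box evaluations, so it is only practical for toy volumes; that cost is what motivates a branch-and-bound search and the use of down-sampled score volumes mentioned in the abstract.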

Record:

Document: Article
Title: Fast action detection via discriminative random forest voting and top-K subvolume search
Authors: Yu, G.; Goussies, N.A.; Yuan, J.; Liu, Z.
Affiliations: School of Electrical and Electronic Engineering, Nanyang Technological University, 639798 Singapore, Singapore
Universidad de Buenos Aires, C1428EGA Buenos Aires, Argentina
Microsoft Research, Redmond, WA 98052-6399, United States
Keywords: Action detection; branch and bound; random forest; top-K search; Complex scenes; Data sets; Detection methods; Interest points; Intra-class variation; Multi-class; Mutual informations; Orders of magnitude; Random forests; Search Algorithms; Spatio-temporal; Subvolumes; Linear programming; Decision trees
Year: 2011
Volume: 13
Issue: 3
Start page: 507
End page: 517
DOI: http://dx.doi.org/10.1109/TMM.2011.2128301
Journal title: IEEE Transactions on Multimedia
Abbreviated journal title: IEEE Trans Multimedia
ISSN: 1520-9210
CODEN: ITMUF
Record: https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_15209210_v13_n3_p507_Yu

References:

  • Laptev, I., On space-time interest points (2005) International Journal of Computer Vision, 64 (2-3), pp. 107-123, DOI 10.1007/s11263-005-1838-7
  • Brodal, G., Jørgensen, A., A linear time algorithm for the k maximal sums problem (2007) Math. Foundations Comput. Sci., pp. 442-453
  • Schuldt, C., Laptev, I., Caputo, B., Recognizing human actions: A local SVM approach (2004) Proc. IEEE Conf. Pattern Recognit.
  • Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B., Learning realistic human actions from movies (2008) Proc. IEEE Conf. Comput. Vis. Pattern Recognit.
  • Gall, J., Lempitsky, V., Class-specific Hough forests for object detection (2009) Proc. IEEE Conf. Comput. Vis. Pattern Recognit.
  • Reddy, K.K., Liu, J., Shah, M., Incremental action recognition using feature-tree (2009) Proc. IEEE Int. Conf. Comput. Vis.
  • Shechtman, E., Irani, M., Space-time behavior based correlation (2005) Proc. IEEE Conf. Comput. Vis. Pattern Recognit.
  • Ke, Y., Sukthankar, R., Hebert, M., Event detection in crowded videos (2007) Proc. IEEE Int. Conf. Comput. Vis.
  • Yuan, J., Liu, Z., Wu, Y., Discriminative subvolume search for efficient action detection (2009) Proc. IEEE Conf. Comput. Vis. Pattern Recognit.
  • Hu, Y., Cao, L., Lv, F., Yan, S., Gong, Y., Huang, T.S., Action detection in complex scenes with spatial and temporal ambiguities (2009) Proc. IEEE Int. Conf. Comput. Vis.
  • Bobick, A.F., Davis, J.W., The recognition of human movement using temporal templates (2001) IEEE Transactions on Pattern Analysis and Machine Intelligence, 23 (3), pp. 257-267, DOI 10.1109/34.910878
  • Yuan, J., Liu, Z., Wu, Y., Zhang, Z., Speeding up spatio-temporal sliding-window search for efficient event detection in crowded videos (2009) Proc. ACM Multimedia Workshop on Events in Multimedia
  • Rodriguez, M.D., Ahmed, J., Shah, M., Action MACH: A spatio-temporal maximum average correlation height filter for action recognition (2008) Proc. IEEE Conf. Comput. Vis. Pattern Recognit.
  • Boyer, E., Weinland, D., Ronfard, R., Free viewpoint action recognition using motion history volumes (2006) Comput. Vis. Image Understanding, 104 (2-3), pp. 207-229
  • Ke, Y., Sukthankar, R., Hebert, M., Efficient visual event detection using volumetric features (2005) Proc. IEEE Int. Conf. Comput. Vis.
  • Yang, M., Lv, F., Xu, W., Yu, K., Gong, Y., Human action detection by boosting efficient motion features (2009) Proc. IEEE Workshop Video-oriented Object and Event Classification in Conjunction With ICCV, Kyoto, Japan, Sep. 29-Oct. 2
  • Lin, Z., Jiang, Z., Davis, L.S., Recognizing actions by shape-motion prototype trees (2009) Proc. IEEE Int. Conf. Comput. Vis.
  • Jiang, H., Drew, M.S., Li, Z.N., Action detection in cluttered video with successive convex matching (2010) IEEE Trans. Circuits Syst. Video Technol., 20 (1), pp. 50-64, Jan
  • Cao, L., Liu, Z., Huang, T.S., Cross-dataset action recognition (2010) Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR)
  • Norbert, A., Liu, Z., Yuan, J., Efficient search of top-K video subvolumes for multi-instance action detection (2010) Proc. IEEE Conf. Multimedia Expo (ICME)
  • Derpanis, K.G., Sizintsev, M., Cannons, K., Wildes, R.P., Efficient action spotting based on a spacetime oriented structure representation (2010) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Lampert, C.H., Blaschko, M.B., Hofmann, T., Efficient subwindow search: A branch and bound framework for object localization (2009) IEEE Trans. Pattern Anal. Mach. Intell., 31 (12), pp. 2129-2142, Dec
  • Lampert, C.H., Detecting objects in large image collections and videos by efficient subimage retrieval (2009) Proc. IEEE Int. Conf. Comput. Vis.
  • Breiman, L., Random forests (2001) Machine Learning, 45 (1), pp. 5-32, DOI 10.1023/A:1010933404324
  • Bosch, A., Zisserman, A., Munoz, X., Image classification using random forests and ferns (2007) Proc. IEEE Int. Conf. Comput. Vis.
  • Lepetit, V., Lagger, P., Fua, P., Randomized trees for real-time keypoint recognition (2005) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Schroff, F., Criminisi, A., Zisserman, A., Object class segmentation using random forests (2008) Proc. Brit. Mach. Vis. Conf.
  • Wang, P., Abowd, G.D., Rehg, J.M., Quasi-periodic event analysis for social game retrieval (2009) Proc. IEEE Int. Conf. Comput. Vis.
  • Prabhakar, K., Oh, S., Wang, P., Abowd, G.D., Rehg, J.M., Temporal causality for the analysis of visual events (2010) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Liu, J., Luo, J., Shah, M., Recognizing realistic actions from videos "in the wild" (2009) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Messing, R., Pal, C., Kautz, H., Activity recognition using the velocity histories of tracked keypoints (2009) Proc. IEEE Int. Conf. Comput. Vis.
  • Duan, L., Xu, D., Tsang, I.W., Luo, J., Visual event recognition in videos by learning from web data (2010) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Kovashka, A., Grauman, K., Learning a hierarchy of discriminative space-time neighborhood features for human action recognition (2010) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Niebles, J., Chen, C.W., Li, F.-F., Modeling temporal structure of decomposable motion segments for activity classification (2010) Proc. Eur. Conf. Comput. Vis. (ECCV)
  • Seo, H.J., Milanfar, P., Detection of human actions from a single example (2009) Proc. IEEE Int. Conf. Comput. Vis. (ICCV)
  • Mikolajczyk, K., Uemura, H., Action recognition with motion-appearance vocabulary forest (2008) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Yacoob, Y., Black, M.J., Parameterized modeling and recognition of activities (1999) Proc. Comput. Vis. Image Understanding Conf., 73, pp. 232-247
  • Ramanan, D., Forsyth, D.A., Automatic annotation of everyday movements (2003) Proc. Neural Inf. Process. Syst. Conf.
  • Kim, T.K., Cipolla, R., Canonical correlation analysis of video volume tensors for action categorization and detection (2008) IEEE Trans. Pattern Anal. Mach. Intell., 30 (8), pp. 1415-1428, Aug
  • Cao, L., Tian, Y.L., Liu, Z., Yao, B., Zhang, Z., Huang, T.S., Action detection using multiple spatio-temporal interest point features (2010) Proc. IEEE Conf. Multimedia Expo
  • Seo, H.J., Milanfar, P., Action recognition from one example (2010) IEEE Trans. Pattern Anal. Mach. Intell., 32 (5), pp. 867-882, May
  • Li, Z., Fu, Y., Yan, S., Huang, T.S., Real-time human action recognition by luminance field trajectory analysis (2008) Proc. ACM Int. Conf. Multimedia
  • Zhu, G., Yang, M., Yu, K., Xu, W., Gong, Y., Detecting video events based on action recognition in complex scenes using spatio-temporal descriptor (2009) Proc. ACM Int. Conf. Multimedia, pp. 165-174, Oct. 19-24
  • Breitenbach, M., Nielsen, R., Grudic, G.Z., (2003) Probabilistic Random Forests: Predicting Data Point Specific Misclassification Probabilities, Univ. of Colorado at Boulder, Tech. Rep. CU-CS-954-03
  • Pointwise mutual information [Online]. Available: http://en.wikipedia.org/wiki/Pointwise_mutual_information
  • Yao, A., Gall, J., Van Gool, L., A Hough transform-based voting framework for action recognition (2010) Proc. Comput. Vis. Pattern Recognit. (CVPR)
  • Yu, T.H., Kim, T.K., Cipolla, R., Real-time action recognition by spatiotemporal semantic and structural forest (2010) Proc. BMVC
  • Yuan, J., Liu, Z., Wu, Y., Discriminative video pattern search for efficient action detection IEEE Trans. Pattern Anal. Mach. Intell
  • Wang, H., Ullah, M.M., Klaser, A., Laptev, I., Schmid, C., Evaluation of local spatio-temporal features for action recognition (2009) Proc. Brit. Mach. Vis. Conf.

Citations:

---------- APA ----------
Yu, G., Goussies, N.A., Yuan, J. & Liu, Z. (2011). Fast action detection via discriminative random forest voting and top-K subvolume search. IEEE Transactions on Multimedia, 13(3), 507-517.
http://dx.doi.org/10.1109/TMM.2011.2128301
---------- CHICAGO ----------
Yu, G., Goussies, N.A., Yuan, J., Liu, Z. "Fast action detection via discriminative random forest voting and top-K subvolume search." IEEE Transactions on Multimedia 13, no. 3 (2011): 507-517.
http://dx.doi.org/10.1109/TMM.2011.2128301
---------- MLA ----------
Yu, G., Goussies, N.A., Yuan, J., Liu, Z. "Fast action detection via discriminative random forest voting and top-K subvolume search." IEEE Transactions on Multimedia, vol. 13, no. 3, 2011, pp. 507-517.
http://dx.doi.org/10.1109/TMM.2011.2128301
---------- VANCOUVER ----------
Yu, G., Goussies, N.A., Yuan, J., Liu, Z. Fast action detection via discriminative random forest voting and top-K subvolume search. IEEE Trans Multimedia. 2011;13(3):507-517.
http://dx.doi.org/10.1109/TMM.2011.2128301