Video Classification Using Deep Learning

This project involved classifying short videos into five categories using a dataset provided by the University of Central Florida. I developed several helper functions to prepare and analyze the data, using OpenCV for video loading and manipulation, and creating visualizations such as training/validation accuracy versus epoch and confusion matrices.

I first attempted a hybrid CNN-RNN architecture, using the pre-trained InceptionV3 model for the CNN component and training the RNN on the CNN output. Despite tuning efforts, this model's performance on the test data was unsatisfactory, with a weighted average precision of 0.32, recall of 0.5, and an F1 score of 0.38.

I then utilized the MobileNetV3 Small model to classify the videos frame by frame, which greatly improved the classification performance to a weighted average precision, recall, and F1-score of 0.94 each. This model was successful in identifying key objects within frames, aiding accurate classification.

Despite the success of frame-by-frame classification in this project, I discussed the potential necessity of sequence models for video classification in instances where the videos have similar backgrounds and objects, using the example of surgical videos.

This project displayed my video classification skills, particularly in comparing and leveraging different deep learning architectures for optimal results. It also showcased my problem-solving ability to adapt and experiment with various techniques according to the dataset's characteristics. All work was performed using Python and its deep learning libraries.

Please check out my project on GitHub.

Previous
Previous

Soft Robotic Inchworm

Next
Next

Malaria Detection