I looked at popular datasets like UCF101, which provide a separate labeled video clip for each activity. In my case, however, a single video contains multiple mouse activities. Do I need to cut the video into one clip per activity (based on time), label the clips, and then train the model? Also, which framework is best suited for this: Keras, OpenCV, etc.?
The only related coding example I found was this: https://hackernoon.com/five-video-classification-methods-implemented-in-keras-and-tensorflow-99cad29cc0b5
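To make the time-based cutting concrete, here is a rough sketch of what I have in mind. It only turns time annotations into labeled frame ranges and does not read any video. The annotation format `(start_sec, end_sec, label)`, the activity names, the frame rate, and the helper `segments_to_clips` are all my own assumptions for illustration, not taken from UCF101 or the linked article.

```python
from typing import List, NamedTuple, Tuple


class Clip(NamedTuple):
    label: str
    start_frame: int  # inclusive
    end_frame: int    # exclusive


def segments_to_clips(
    segments: List[Tuple[float, float, str]],
    fps: float,
    clip_len: int,
) -> List[Clip]:
    """Split annotated time segments into fixed-length labeled clips.

    Each segment is (start_sec, end_sec, label). A segment shorter than
    clip_len frames yields no clip, and leftover frames at the end of a
    segment are dropped.
    """
    clips = []
    for start_sec, end_sec, label in segments:
        first = int(round(start_sec * fps))
        last = int(round(end_sec * fps))
        # Step through the segment in non-overlapping windows of clip_len frames.
        for start in range(first, last - clip_len + 1, clip_len):
            clips.append(Clip(label, start, start + clip_len))
    return clips


# Hypothetical annotations for one long video (times in seconds).
annotations = [
    (0.0, 4.0, "grooming"),
    (4.0, 5.0, "rearing"),   # only 25 frames at 25 fps, too short for one clip
    (5.0, 9.0, "walking"),
]

clips = segments_to_clips(annotations, fps=25.0, clip_len=50)
for clip in clips:
    print(clip)
```

Each resulting frame range could then be read from the original video (for example with OpenCV's `cv2.VideoCapture`) and saved or fed to the model as one labeled sample, so the long video would not have to be cut by hand.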