Relational data : Postgres, MySQL, SQLite

::

Text, HTML : Lucene/Solr, Elasticsearch

::

Videos & Images : Deep Video Analytics

Upload

Upload videos or sets of images. Automatically download videos from YouTube URLs. Browse and annotate uploaded videos. Import pre-indexed datasets.

Process

Perform scene detection and frame extraction on videos. Annotate frames and detections with bounding boxes, labels, and metadata.

Search

Extracted objects, along with entire frames and crops, are indexed using deep features. Feature vectors are used for visual search retrieval.
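
As a rough illustration of how feature vectors can drive retrieval, the sketch below ranks indexed items by cosine similarity to a query vector. It is a minimal sketch under stated assumptions, not the project's implementation: the function names, the item keys, and the 3-dimensional toy vectors are all hypothetical, and the real system may use a different distance metric or an approximate index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index: Dict[str, Sequence[float]],
           query: Sequence[float],
           k: int = 3) -> List[Tuple[str, float]]:
    """Return the k indexed items most similar to the query, best match first."""
    scored = [(name, cosine_similarity(vec, query)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: real deep features have far more than 3 dimensions.
toy_index = {
    "video1/frame_10": [0.9, 0.1, 0.0],
    "video1/frame_42": [0.0, 1.0, 0.0],
    "video2/crop_7": [0.7, 0.3, 0.1],
}
results = search(toy_index, query=[1.0, 0.0, 0.0], k=2)
```

A brute-force scan like this is only practical for small collections; large indexes typically rely on a dedicated similarity search library.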

Deploy

Deploy on a variety of machines, with or without GPUs, locally or in the cloud. Docker Compose automates the setup of Postgres and RabbitMQ.

Features & Models

We make a significant effort to ensure that the following models (code and weights included) work without requiring you to write any code.

Features

  • Visual search as the primary interface.

  • Upload videos or multiple images.

  • Provide a YouTube URL to have the video downloaded automatically.

  • Pre-trained recognition, detection, and face recognition models.

  • Metadata is stored in Postgres; all operations are performed asynchronously.

  • Celery allows video and query flows to be easily modified.

  • Videos, frames, indexes, etc. are stored in the media directory and served through nginx.

  • Run code and tasks manually, without the UI, using a Jupyter notebook.

  • Customize by specifying environment variables.

Models

Import external datasets using VDN

  • MSCOCO

  • Labeled Faces in the Wild

Deep Video Analytics + Visual Data Network

Part of Visual Intelligence Platform

Seamless integration with Visual Data Network

Quickly import pre-processed datasets

Data & processing model

Installation

Pre-built docker images for both CPU & GPU versions are available on Docker Hub.

Machines without an Nvidia GPU

Deep Video Analytics is implemented using Docker and works on Mac, Windows, and Linux. Make sure you have the latest version of Docker installed.

git clone https://github.com/AKSHAYUBHAT/DeepVideoAnalytics
cd DeepVideoAnalytics/docker && docker-compose up

Machines with Nvidia GPU

You need the latest versions of Docker and nvidia-docker installed. The GPU Dockerfile differs slightly from the CPU Dockerfile.

pip install --upgrade nvidia-docker-compose
git clone https://github.com/AKSHAYUBHAT/DeepVideoAnalytics
cd DeepVideoAnalytics/docker && ./rebuild_gpu.sh
nvidia-docker-compose -f docker-compose-gpu.yml up

On AWS using a p2.xlarge instance

We provide an AMI with all dependencies, such as Docker and the Nvidia drivers, pre-installed. To use it, start a p2.xlarge instance with AMI ID: (Please email us) and open ports 8000, 6006, and 8888 (preferably to your IP only). After logging into the machine via SSH, run the following commands. After a few minutes, the user interface will appear on port 8000 of the instance IP. AMI creation is documented here.


rm -rf deepvideoanalytics
git clone https://github.com/akshayubhat/deepvideoanalytics
cd deepvideoanalytics/docker
./rebuild_gpu.sh
nvidia-docker-compose -f docker-compose-gpu.yml up
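
Because the user interface takes a while to come up on port 8000, it can be convenient to poll the port instead of refreshing the browser. The following is a minimal sketch using only the Python standard library; the function name, its default timings, and the example address are assumptions and not part of the project.

```python
import socket
import time


def wait_for_port(host: str,
                  port: int,
                  timeout: float = 600.0,
                  interval: float = 5.0,
                  connect_timeout: float = 3.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass.

    Returns True as soon as the port accepts a connection, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (203.0.113.10 is a placeholder from the documentation address range):
# wait_for_port("203.0.113.10", 8000)
```

A successful TCP connection only shows that something is listening; the web application behind it may still need a few more seconds before pages load.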

Security warning

We recommend that you allow inbound traffic only from your own IP addresses; you can change these using AWS security group rules even after the instance has started. When deploying on remote Ubuntu machines on VPS services such as Linode, beware of the Docker/UFW firewall issue: Docker bypasses the UFW firewall and opens port 8000 to the internet. You can change this behavior by binding to the loopback interface (127.0.0.1:8000:80) and then forwarding port 8000 over an SSH tunnel; an example of this is shown here.
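
To make the port mapping syntax concrete, the sketch below parses a Docker short-form port mapping and reports whether it is reachable only from the host itself. The helper names are hypothetical, and the parser deliberately handles only the IPv4 forms HOST_IP:HOST_PORT:CONTAINER_PORT and HOST_PORT:CONTAINER_PORT; it is an illustration, not a full implementation of Docker's syntax.

```python
import ipaddress
from typing import NamedTuple


class PortMapping(NamedTuple):
    host_ip: str
    host_port: int
    container_port: int


def parse_port_mapping(spec: str) -> PortMapping:
    """Parse 'HOST_IP:HOST_PORT:CONTAINER_PORT' or 'HOST_PORT:CONTAINER_PORT' (IPv4 only)."""
    parts = spec.split(":")
    if len(parts) == 3:
        host_ip, host_port, container_port = parts
    elif len(parts) == 2:
        # With no host IP given, Docker publishes the port on all interfaces.
        host_ip = "0.0.0.0"
        host_port, container_port = parts
    else:
        raise ValueError(f"unsupported port mapping: {spec!r}")
    return PortMapping(host_ip, int(host_port), int(container_port))


def is_loopback_only(mapping: PortMapping) -> bool:
    """True if the published port is bound to a loopback address such as 127.0.0.1."""
    return ipaddress.ip_address(mapping.host_ip).is_loopback


safe = parse_port_mapping("127.0.0.1:8000:80")
exposed = parse_port_mapping("8000:80")
```

With the loopback form, the interface can still be reached from your workstation through an SSH local forward, for example `ssh -L 8000:127.0.0.1:8000 user@your-server`, where `user` and `your-server` are placeholders.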

Demo & Tutorials

Coming Soon!

Architecture

Docker containers, networking and volumes

Video & Query processing

Documentation & Presentation

Some documentation is available here along with a board for planned future tasks.

For a quick overview of the design choices and vision behind this project, we strongly recommend going through the following presentation.

Paper & Citation

Coming Soon!

References

  1. Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "FaceNet: A unified embedding for face recognition and clustering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.

  2. Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.

  3. Zhang, Kaipeng, et al. "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks." IEEE Signal Processing Letters 23.10 (2016): 1499-1503.

  4. Liu, Wei, et al. "SSD: Single shot multibox detector." European Conference on Computer Vision. Springer International Publishing, 2016.

  5. Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.

  6. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems. 2012.

  7. Johnson, Jeff, Matthijs Douze, and Hervé Jégou. "Billion-scale similarity search with GPUs." arXiv preprint arXiv:1702.08734 (2017).

Issues, Questions & Contact

Please submit all software-related bugs and questions using GitHub issues. For other questions, you can contact me at akshayubhat@gmail.com.

© 2017 Akshay Bhat, Cornell University.
All rights reserved.