Show HN: Benchmark Deep CV Training Pipelines in Less Than 3 Minutes

1 point

2 years ago

Hi!

We've built a small open-source library that might be useful to deep learning practitioners around here.

TLDR: It's a benchmark tool for computer vision training pipelines that runs in less than 3 minutes, with no external datasets or complex setup.

Not TLDR

A few weeks ago we had to buy new training GPUs. After much research, we narrowed it down to our top 2 candidates: the A6000 and the 6000 Ada. The 6000 Ada is a newer GPU, roughly 2x the price of the A6000. The million-dollar question was: does it also deliver 2x the performance when training computer vision models?

We searched all over the web, and although we found some benchmarks, we weren't 100% sure about the results. And when you're shelling out $4000+ on a GPU, you'd better be 100% sure the GPU is worth it :)

The main issues with the benchmarks we found were:

    * No measurements for the more modern CV architectures (ViT, SWIN...)

    * No measurements for the non-classification architectures such as UNet

    * No Mixed-precision measurements

    * Missing measurements on specific GPUs

    * No details about the measurement process: whether data loading is part of the benchmark, how many benchmark iterations were run, and whether warmup steps were used (and how many)

    * No information about the computer configuration besides the GPU

    * No data on how the performance scales with more GPUs

    * Complex setup for benchmarks that can be run locally

    * Benchmark repo out of date

    * Messy output formats

    * Unknown software dependencies: CUDA, cuDNN, PyTorch...

With that in mind, we decided to create our own benchmark tool for CV models that is easy to set up and use. Plus, it's open source, so you can check how it works for yourself!

Key features:

    * Easy to run because everything is dockerized

    * Only measures the pure training loop performance -> no CPU and disk bottlenecks or any other overhead code

    * All software dependencies are known

    * Supports all major training features: mixed-precision, multi-GPU, DDP

    * Supports major CV architectures: from VGG to vision transformers

    * Lots of parameters to configure: batch size, input width and height, precision, number of benchmark iterations, warmup steps...

    * Logging to a CSV file
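
To make the "warmup steps + timed iterations + CSV" idea above concrete, here is a minimal sketch of that measurement pattern. It is not the library's actual code: `benchmark`, `write_csv`, `dummy_step` and the `sync_fn` hook are hypothetical names made up for illustration, and the dummy step is a stand-in for a real forward/backward/optimizer step on a synthetic batch. It only uses the Python standard library, so it runs anywhere.

```python
import csv
import io
import statistics
import time


def benchmark(step_fn, warmup_steps=5, benchmark_steps=20, sync_fn=None):
    """Time step_fn per iteration, discarding the warmup iterations.

    sync_fn is a hypothetical hook: on a GPU it would block until all
    queued work has finished, so the clock measures completed work
    rather than just kernel launches.
    """
    for _ in range(warmup_steps):
        step_fn()  # not timed: lets caches and allocators settle
    if sync_fn is not None:
        sync_fn()

    timings = []
    for _ in range(benchmark_steps):
        start = time.perf_counter()
        step_fn()
        if sync_fn is not None:
            sync_fn()
        timings.append(time.perf_counter() - start)
    return timings


def write_csv(stream, name, batch_size, timings):
    """Write one header row and one result row to a file-like object."""
    mean_s = statistics.mean(timings)
    writer = csv.writer(stream)
    writer.writerow(["name", "batch_size", "steps", "mean_step_s", "samples_per_s"])
    writer.writerow(
        [name, batch_size, len(timings), f"{mean_s:.6f}", f"{batch_size / mean_s:.1f}"]
    )


def dummy_step():
    # Stand-in for forward + backward + optimizer step on a synthetic batch.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    timings = benchmark(dummy_step, warmup_steps=3, benchmark_steps=10)
    out = io.StringIO()
    write_csv(out, "dummy", 32, timings)
    print(out.getvalue())
```

Because the input is generated in memory and only the step itself sits between the two clock reads, disk and data loading never enter the measurement. With PyTorch on a GPU, `sync_fn` would typically be `torch.cuda.synchronize`, since CUDA kernels are launched asynchronously.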

This is the first release so feedback and pull requests are more than welcome!
