Monday, December 4, 2017

Four ways to speed up ConvNet inference



  • Play with your model and training hyperparameters. You may be able to use a lighter model without a significant degradation in performance, for example by decreasing the network’s depth, width, number of filters, or floating-point precision. Together with methods #2 and #4, this will let you increase your batch size and raise your inference throughput (see the fp16 sketch after this list).
  • Explore network architectures that are optimized for ‘lighter’ hardware, such as SqueezeNet (a short example follows the list).
  • NVIDIA offers a network inference optimizer called TensorRT that is designed for exactly this need: optimizing a trained network for deployment.
  • This is a more novel way to increase inference speed, but it should work: a well-known paper by Geoffrey Hinton et al., “Distilling the Knowledge in a Neural Network”, shows how to distill the knowledge of a large network into a smaller one. This degrades accuracy somewhat, but the paper showed that the penalty is not very significant compared to the savings in model complexity (see the distillation loss sketch below).
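
To illustrate the precision knob from method #1, here is a minimal PyTorch sketch of half-precision (fp16) inference. The resnet18 stand-in, the batch shape, and the random weights are placeholders for whatever network you actually deploy, not part of any specific recipe.

    import torch
    from torchvision.models import resnet18

    # Stand-in model; substitute your own trained ConvNet.
    model = resnet18().eval().cuda()

    # Casting weights and inputs to fp16 halves memory traffic, which usually
    # allows a larger batch size and higher inference throughput on GPUs with
    # fast half-precision paths.
    model_fp16 = model.half()
    batch = torch.randn(32, 3, 224, 224, device="cuda").half()

    with torch.no_grad():
        logits = model_fp16(batch)
    print(logits.shape)  # torch.Size([32, 1000])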
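For method #2, torchvision ships a SqueezeNet implementation, so trying the lighter architecture can be a near drop-in swap. The 1000-class ImageNet head is assumed here; replace the classifier for other tasks.

    from torchvision.models import squeezenet1_1

    # The SqueezeNet paper reports roughly AlexNet-level ImageNet accuracy
    # with about 1.2M parameters, versus roughly 61M for AlexNet.
    model = squeezenet1_1().eval()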
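For method #4, the core of the distillation recipe is a temperature-softened cross-entropy between teacher and student. The sketch below is one common formulation rather than code from the paper; the temperature T and mixing weight alpha are hypothetical defaults to be tuned per task.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        # Soften both distributions with temperature T and push the student
        # toward the teacher's soft targets via KL divergence.
        soft_targets = F.softmax(teacher_logits / T, dim=1)
        log_student = F.log_softmax(student_logits / T, dim=1)
        # The T**2 factor keeps the soft-target gradients on the same scale
        # as the hard-label term, as noted in the paper.
        soft = F.kl_div(log_student, soft_targets, reduction="batchmean") * T * T
        # Ordinary cross-entropy on the true labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard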

gpustat command

    # install gpustat, then show a colored GPU status view refreshing every 0.1 s
    sudo apt install gpustat
    watch --color -n0.1 gpustat --color