4 Ways to Speed Up ConvNet Inference

- December 04, 2017

Experiment with your model and training hyperparameters. You may be able to use a lighter model without a significant drop in accuracy, for example by reducing the network's depth, width, number of filters, or floating-point precision. This, together with the second and fourth methods below, also shrinks the model's memory footprint, which lets you increase your batch size and therefore your inference throughput.
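To get a feel for how much width and precision matter, here is a rough back-of-the-envelope sketch. The layer sizes are made up for illustration and the helper `conv_layer_cost` is hypothetical, not part of any library; it just counts weights and multiply-accumulates (MACs) for a single convolution layer, ignoring biases.

```python
def conv_layer_cost(in_ch, out_ch, kernel, out_h, out_w, bytes_per_weight=4):
    """Return (parameters, multiply-accumulates, weight bytes) for one conv layer.

    Bias terms are ignored to keep the arithmetic simple.
    """
    params = in_ch * out_ch * kernel * kernel
    macs = params * out_h * out_w
    return params, macs, params * bytes_per_weight


# Hypothetical baseline: a 3x3 layer with 256 input and 256 output channels
# producing a 28x28 feature map, with weights stored as 32-bit floats.
base = conv_layer_cost(256, 256, 3, 28, 28, bytes_per_weight=4)

# The same layer at half width (128 -> 128) with weights stored as 16-bit floats.
slim = conv_layer_cost(128, 128, 3, 28, 28, bytes_per_weight=2)

print("baseline params/MACs/bytes:", base)
print("slim     params/MACs/bytes:", slim)
print("MAC reduction:    %.1fx" % (base[1] / slim[1]))
print("memory reduction: %.1fx" % (base[2] / slim[2]))
```

Because an interior layer's cost scales with input channels times output channels, halving the width cuts its compute by roughly a factor of four, and dropping to half precision halves the weight memory on top of that. Whether accuracy survives the cut is something you have to measure on your own data.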

Explore network architectures designed for 'lighter' hardware, such as SqueezeNet.
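As a rough illustration of why such architectures are lighter, the sketch below counts the weights in a SqueezeNet-style Fire module (a 1x1 "squeeze" convolution followed by parallel 1x1 and 3x3 "expand" convolutions) and compares it with a plain 3x3 convolution that has the same input and output channel counts. The channel sizes are illustrative assumptions, and `conv_params` and `fire_params` are hypothetical helpers written for this example.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weight count of a conv layer (bias ignored)."""
    return in_ch * out_ch * kernel * kernel


def fire_params(in_ch, squeeze, expand1x1, expand3x3):
    """Weight count of a Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expands."""
    return (conv_params(in_ch, squeeze, 1)
            + conv_params(squeeze, expand1x1, 1)
            + conv_params(squeeze, expand3x3, 3))


# Illustrative sizes: 96 input channels, 128 output channels (64 + 64).
fire = fire_params(96, squeeze=16, expand1x1=64, expand3x3=64)
plain = conv_params(96, 128, 3)

print("Fire module weights:   ", fire)
print("plain 3x3 conv weights:", plain)
print("ratio: %.1fx" % (plain / fire))
```

The saving comes from squeezing the channel count before the expensive 3x3 filters are applied, and from replacing part of the 3x3 filters with 1x1 filters.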

NVIDIA offers an inference optimizer called TensorRT that is designed for exactly this need: optimizing a trained network for deployment.

This is a less conventional way to increase inference speed, but it should work: a well-known paper by Geoffrey Hinton and colleagues, "Distilling the Knowledge in a Neural Network", describes how to distill the knowledge of a large network into a smaller one. This will degrade accuracy somewhat, but the paper showed that the penalty is small compared to the savings in model complexity.
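The core of distillation can be sketched in a few lines. The small "student" network is trained to match the large "teacher" network's output probabilities after both have been softened with a temperature, in addition to the usual loss on the true label. The code below is a minimal pure-Python sketch of that loss for a single example, not the paper's implementation; the temperature, the weighting factor `alpha`, and the logits are all assumed values chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy H(target, predicted) between two probability lists."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term."""
    # Soft term: match the teacher's temperature-softened distribution.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft = cross_entropy(soft_teacher, soft_student)

    # Hard term: ordinary cross-entropy against the true label at temperature 1.
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    hard = cross_entropy(one_hot, softmax(student_logits))

    # The soft term is scaled by T^2 so both terms have comparable gradients.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Made-up logits for a 3-class problem where class 0 is correct.
teacher = [6.0, 2.0, -1.0]
close_student = [5.0, 1.5, -1.0]
far_student = [-1.0, 2.0, 5.0]

print("loss, student close to teacher:", distillation_loss(close_student, teacher, 0))
print("loss, student far from teacher:", distillation_loss(far_student, teacher, 0))
```

A higher temperature flattens the teacher's distribution, so the student also learns from the relative probabilities the teacher assigns to the wrong classes, which is where much of the "knowledge" lives.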
