I just tried some pre-trained models to classify images on a CPU, and it takes about 10 to 15 seconds per image.
I did not know it was so inefficient.
Would you share some details? This indeed sounds very slow, so I suppose there should be some easy ways to speed things up.
Just to give you an idea of what's possible: a couple of years ago I worked on live object recognition and classification (using Python and TensorFlow) and got to roughly 30 FPS on an Nvidia Jetson Nano (i.e. using the GPU) and still about 12 FPS on an average laptop (using only the CPU).
I'm already using cfg.apply_low_vram_defaults() and interrogate_fast().
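In case it helps, this is roughly how I'm calling it (a minimal sketch assuming the clip-interrogator package; the exact model tag, device setting, and image path are just placeholders on my side):

```python
from PIL import Image
from clip_interrogator import Config, Interrogator

# Assumed setup: clip-interrogator with a smaller OpenCLIP model, forced to CPU.
cfg = Config(clip_model_name="ViT-B-32/laion400m_e32", device="cpu")  # placeholder model tag
cfg.apply_low_vram_defaults()  # reduce memory use at some cost in quality

ci = Interrogator(cfg)

image = Image.open("example.jpg").convert("RGB")  # placeholder image path
print(ci.interrogate_fast(image))  # faster, lower-quality variant of interrogate()
```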
I also tried lighter models like ViT-B-32/laion400m and others, but they are all very slow to load and to run (model list: https://github.com/mlfoundations/open_clip).
I'm desperately looking for something more modest and light.
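For what it's worth, this is the kind of minimal zero-shot classification test I've been timing on CPU, to separate model loading from the per-image cost (a sketch assuming the open_clip_torch package; the model/pretrained tags, labels, and image path are only examples):

```python
import time

import torch
import open_clip
from PIL import Image

# Assumption: ViT-B-32 with the laion400m_e32 weights is one of the smaller
# pretrained combos listed in the open_clip README.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion400m_e32"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

labels = ["a photo of a cat", "a photo of a dog"]  # example labels
text = tokenizer(labels)
image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder path

with torch.no_grad():
    start = time.perf_counter()
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
    print(f"inference: {time.perf_counter() - start:.2f}s", probs)
```

Even with this stripped-down loop, most of the wall-clock time on my machine goes to loading the weights rather than the per-image encode, which is why I'm hoping there's a lighter option.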