MaixPy: Run Big Models Directly from Flash!
Why do we need to run models from flash?
The K210 has limited RAM: 6 MB general-purpose plus 2 MB KPU memory.
The MaixPy firmware takes 1~2 MB, and the image buffer takes another 0.5~1 MB,
so the remaining RAM is tight for AI models.
Sometimes we have to pick a minimum-feature MaixPy build just to fit a bigger model.
Now we have a new choice: run the model directly from flash!
This new MaixPy version supports running an AI model from flash without loading it into RAM, so you have more RAM for normal things, and the model size is limited only by the flash size.
task = kpu.load_flash(model_addr, is_dual_buf, batch_size, spi_speed)
- model_addr: the flash address where your model is stored. Note that you must flip the model's endianness first; use convert_le.py to convert a normal model. Only V3 models are supported for now.
- is_dual_buf: 0 = single buffer (less RAM, slower); 1 = dual buffer (more RAM, faster).
- batch_size: when dual buffer is chosen, you need to set the load batch_size; the suggested range is 0x4000~0x10000. Experiment to find the best value for your model.
- spi_speed: while the flash runner is active, the flash is temporarily switched to high-speed mode; set the SPI speed you want. The value must be <= 80000000 (80 MHz).
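As an illustration of the endianness flip required for model_addr, here is a minimal sketch that reverses the byte order within each 32-bit word. This is only a guess at what convert_le.py does internally; use the real script to prepare your model.

```python
def flip_endian32(data: bytes) -> bytes:
    """Reverse the byte order of every 32-bit word in data.
    Assumes len(data) is a multiple of 4 (pad the file if needed)."""
    out = bytearray()
    for i in range(0, len(data), 4):
        out += data[i:i + 4][::-1]  # reverse the bytes of this word
    return bytes(out)

# Example: each 4-byte word gets its bytes reversed.
print(flip_endian32(b'\x01\x02\x03\x04\xaa\xbb\xcc\xdd'))
# → b'\x04\x03\x02\x01\xdd\xcc\xbb\xaa'
```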
Then you can use the normal kpu.forward to run inference with your model.
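Putting it together, here is a minimal on-device sketch based on the standard MaixPy KPU API. The flash address 0x500000 and the batch size are placeholders; use the address you actually burned your converted model to.

```python
import KPU as kpu
import image

# Load the converted (little-endian) V3 kmodel directly from flash.
# 0x500000 is a placeholder; replace with your model's burn address.
task = kpu.load_flash(0x500000, 1, 0x8000, 80000000)  # dual buf, 80 MHz SPI

img = image.Image("/flash/tiger.jpg")  # must match the model's input size
fmap = kpu.forward(task, img)          # inference works as usual
plist = fmap[:]                        # raw output as a list
print(max(plist), plist.index(max(plist)))  # top score and its label index
kpu.deinit(task)
```

This only runs on a K210 board with the new firmware, so treat it as a template rather than a copy-paste script.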
The test script is in the attachment.
You need to burn the two kmodels first, put tiger.jpg into the SPIFFS filesystem,
and set the CPU frequency to 480 MHz. Running the script gives:
(481, 398) #CPU freq & KPU freq
#ram run model test
label idx=292
load 421 ms, forward 40 ms
flash run model test (single buf)
SPI freq 80166666 Hz
label idx=292
load 2 ms, forward 106 ms
flash run model test (dual buf)
SPI freq 80166666 Hz
label idx=292
load 2 ms, forward 83 ms
You can see that the normal RAM run takes a long time to load the model, but is very fast at inference.
The single-buffer flash run takes about 2.65X the inference time (106 ms vs 40 ms).
The dual-buffer flash run takes about 2X (83 ms vs 40 ms).
With QSPI PSRAM (133 MHz), the dual-buffer run takes 53 ms, about 1.3X.
And we will test OSPI PSRAM (133 MHz x2); it should take about 45 ms, roughly 1.1X.
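The multipliers above are simply the forward-time ratios against the 40 ms RAM baseline:

```python
# Forward (inference) times from the test run, in milliseconds.
ram_ms = 40          # model loaded into RAM (baseline)
single_buf_ms = 106  # flash runner, single buffer
dual_buf_ms = 83     # flash runner, dual buffer

# Slowdown relative to the RAM baseline.
print(single_buf_ms / ram_ms)  # → 2.65
print(dual_buf_ms / ram_ms)    # → 2.075, i.e. about 2X
```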
So it is possible to run models up to the flash size (16 MB) without losing too much speed.