TL;DR: You can flash 512K firmware in under 5 seconds. Grab a modified esptool and use the new --flash_baud option with a higher rate, or use the Mongoose Flashing Tool and change the rate in Settings -> Advanced. For the gory details, read on.
If you are developing for ESP8266, you may be familiar with esptool, the tool used to upload your code to the device. It does the job, but if your firmware is non-trivial in size, as Mongoose IoT Platform's is, you'll be waiting a non-trivial amount of time for it to upload:
$ time esptool.py --port /dev/ttyUSB0 write_flash 0 test.img
Connecting...
Erasing flash...
Took 2.59s to erase flash block
Wrote 524288 bytes at 0x00000000 in 50.5 seconds (83.0 kbit/s)...
Leaving...
real 0m53.427s
Almost a minute to write 512K. Not great. Of course, the next thing you do is dig around, find the --baud option and begin experimenting: how much faster can it get? In our experience, the answer varies with the type of serial adapter used; 230400 works all the time, 460800 works most of the time and 921600 works sometimes. So, flashing at 460800 we get:
Wrote 524288 bytes at 0x00000000 in 14.6 seconds (288.1 kbit/s)...
real 0m17.361s
Much better. But 17 seconds is still noticeable. Can we go even faster? There's the flash erase time, which is going to be the ultimate limit, and it looks like that is around 2.6 seconds, but how close can we get? Decent adapters like FT232R can go up to 3000000 baud and PL2303 can go even higher. So why doesn't it work? We don't know. The ESP boot loader performs baud autodetection, and it just can't recognise higher rates. I could never make it work with a rate higher than 921600, and even that is unreliable when actually flashing (it looks like it picks a divider that is not quite right). The code is in ROM, so even if we study the ROM dump, get to the bottom of it and find the bug, much like with the erase issue, we are unlikely to be able to fix it.
We know the ESP UART is definitely capable of higher rates: it is clocked from a 26 MHz source and can be configured with an arbitrary divider. At low divider values the data output is no longer a square wave, so 52 Mbaud is unachievable in practice, but 3 (or so) or 4 Mbaud it can certainly do. So the only problem is configuring it with a correct divider, and then we can all but eliminate the transfer delays.
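To see why a divider that is "not quite right" hurts, the relationship between clock, divider and baud rate can be sketched in a few lines of Python. This is only an illustration: `uart_divider` is a hypothetical helper, it assumes the rate is simply the clock divided by an integer divider, and the 52 MHz clock in the usage is an assumption taken from the 52 Mbaud ceiling mentioned above, not a value read from the hardware.

```python
def uart_divider(clock_hz, target_baud):
    """Return (divider, actual_baud, error_percent) for an integer clock divider.

    Hypothetical helper: assumes baud = clock / divider, divider >= 1.
    """
    # Round to the nearest integer divider.
    divider = max(1, (clock_hz + target_baud // 2) // target_baud)
    actual = clock_hz / divider
    error = (actual - target_baud) / target_baud * 100.0
    return divider, actual, error


if __name__ == "__main__":
    CLOCK_HZ = 52_000_000  # assumption, for illustration only
    for baud in (921600, 3000000, 4000000):
        div, actual, err = uart_divider(CLOCK_HZ, baud)
        print(f"{baud:>8} baud: divider {div:>3}, "
              f"actual {actual:,.0f}, error {err:+.2f}%")
```

Under these assumptions some targets divide evenly and others leave a residual error. The further the divider chosen on the device drifts from the adapter's real rate, the more likely bits are to be sampled at the wrong time.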
But how to get our code running on the device so we can control the UART? The answer is already contained in esptool - it uses a custom bit of code to read flash contents. Called SFLASH_STUB, it is a compiled chunk of (presumably) hand-written assembly code that will read and send the contents of a specified region of flash to the serial port. While the code itself is useless to us, from the implementation of the read_flash command we learn how to upload and execute custom code. Now the only problem is writing it.
The initial plan was to write just the bit that changes the UART baud rate and jump back into the stock flasher. But then we thought about other improvements to flashing that would also be nice to have. For example, checksums. The stock flasher uses a 1-byte XOR as a checksum, which we would not be inclined to trust, especially at the higher transfer rates we're looking at. Also, if the change to the firmware is small, it would be nice to avoid rewriting all of it and only rewrite the changed parts. But reading out the contents first is a non-starter: it'd be a pure waste of time if it turns out that we need to rewrite most of it anyway. However, reading flash without sending data out to the serial port is very fast, so we thought an rsync-like mechanism, where only a digest of the existing data is sent, would be ideal. This, however, means that we need to write a substantial amount of code.
Assembly is not a great language to be working in, however. We much prefer C. Can we develop SFLASH_STUB-like snippets of code in C? With some research and a custom linker script, it turns out, we can. You can find our stub development environment here.
To make an already rather long story short, we ended up writing a custom flasher stub, which can perform erase (minus the bug, so no more workarounds!), read, write and MD5 digest operations (CRC32 would probably be enough, but there is already an MD5 implementation in ROM, so we just use that). When it starts, it explicitly sets the speed to a value that is passed as a parameter, so no autodetection is required. It also continues to receive data while writing flash and does poor man's flow control (without the hardware pins) so as not to overflow the buffer.
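The rsync-like idea can be sketched on the host side as follows. This is a minimal illustration, not the flasher's actual protocol: `block_digests` and `blocks_to_write` are hypothetical names, the 4 KB block size is an assumption, and the "device" digests are computed locally here, whereas in practice they would come back over the serial port.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: compare images in 4 KB blocks


def block_digests(image, block_size=BLOCK_SIZE):
    """Return the MD5 digest of each block_size-sized block of image (bytes)."""
    return [
        hashlib.md5(image[off:off + block_size]).digest()
        for off in range(0, len(image), block_size)
    ]


def blocks_to_write(device_digests, new_image, block_size=BLOCK_SIZE):
    """Return offsets of blocks in new_image that differ from what is on the device."""
    changed = []
    for i, digest in enumerate(block_digests(new_image, block_size)):
        # Blocks past the end of the device's digest list always count as changed.
        if i >= len(device_digests) or device_digests[i] != digest:
            changed.append(i * block_size)
    return changed


if __name__ == "__main__":
    old = bytes(16 * BLOCK_SIZE)
    new = bytearray(old)
    new[BLOCK_SIZE + 10] = 0xFF   # touch one byte in block 1
    new[7 * BLOCK_SIZE] = 0x01    # and one byte in block 7
    on_device = block_digests(old)  # stand-in for digests reported by the device
    print([hex(off) for off in blocks_to_write(on_device, bytes(new))])
```

Only the digests travel over the wire in this scheme, so an unchanged image costs a few bytes per block instead of a full read-back.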
So, how fast can we write now? With our modified esptool, which uses the new flasher for the write_flash command, writing can take as little as 5 seconds:
$ time ./esptool.py --port /dev/ttyUSB0 write_flash --flash_baud=4000000 0 test.img
Connecting...
Running Cesanta flasher...
Switching to 4000000 baud...
Writing 524288 @ 0x0... 524288 (100 %)
Wrote 524288 bytes at 0x00000000 in 4.8 seconds (882.3 kbit/s)...
Leaving...
real 0m5.286s
In this example, I'm using a PL2303 USB-to-RS232 adapter, which can go up to 4 Mbaud. Not all drivers allow setting non-standard baud rates, even if the adapter supports them, so YMMV. In our experience, FTDI FT232R tops out at 3 Mbaud, CH341 can do at most 2 Mbaud (at 3 flashing becomes unreliable), and CP2102 can do 1500000 baud at most. Even if your driver does not allow setting non-standard baud rates, all of these adapters can do 921600 reliably. Flashing time is now dominated by write and erase times.
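A quick back-of-the-envelope calculation shows how little of the total is now spent on the wire. The sketch below assumes plain 8N1 framing (10 bits on the wire per byte) and ignores any protocol overhead, so the numbers are lower bounds; `wire_time_seconds` is a hypothetical helper name.

```python
def wire_time_seconds(num_bytes, baud, bits_per_byte=10):
    """Lower bound on transfer time, assuming 8N1 framing (start + 8 data + stop)."""
    return num_bytes * bits_per_byte / baud


if __name__ == "__main__":
    IMAGE_SIZE = 524288  # the 512K image used throughout this post
    for baud in (230400, 460800, 921600, 4000000):
        print(f"{baud:>8} baud: {wire_time_seconds(IMAGE_SIZE, baud):5.2f} s")
```

At 460800 baud this gives roughly 11.4 seconds, which accounts for most of the 14.6 seconds measured earlier. At 4000000 baud it is about 1.3 seconds, a small part of the 4.8 seconds measured, with the rest going to erasing and writing flash.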
One other thing the flasher does is compute an MD5 digest of the data it receives and output it at the end of writing. This way, verification happens at the time of writing, and any corruption during transmission can be identified on the spot by the sender, which computes a digest of the data it sent. esptool write_flash verifies this digest, so now you can be sure all the bytes were received correctly. It's not a guarantee that they landed in flash: implementing verification using the CMD_FLASH_DIGEST command is left as an exercise for the reader.
Our modified esptool does not implement the write optimization; as before, it always writes the entire image. Our own flashing tool, Mongoose Flashing Tool, however, goes the extra mile: it performs an on-device digest operation before writing and only sends the blocks that differ. Depending on how different the old and new images are, this can reduce flashing time to just 2 seconds, all included:
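The sender-side check described above amounts to comparing two digests. Here is a minimal sketch, assuming the device's digest is available as a hex string; how the flasher actually encodes its reply is not covered here, and `digest_matches` is a hypothetical helper.

```python
import hashlib


def digest_matches(sent_data, reported_hex):
    """Compare the MD5 of what we sent with the hex digest reported by the device."""
    expected = hashlib.md5(sent_data).hexdigest()
    return expected == reported_hex.strip().lower()


if __name__ == "__main__":
    image = b"\xff" * 524288

    # Stand-in for the device: it digests whatever it actually received.
    received_ok = image
    received_bad = b"\x00" + image[1:]  # one byte corrupted in transit

    print(digest_matches(image, hashlib.md5(received_ok).hexdigest()))
    print(digest_matches(image, hashlib.md5(received_bad).hexdigest()))
```

Unlike a 1-byte XOR, a digest over the whole transfer makes it very unlikely that a corrupted byte goes unnoticed.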
$ time FNC --platform=esp8266 --port=/dev/ttyUSB0 --flash-baud-rate=4000000 --flash smartjs-esp8266-last.zip
Connecting to ROM...
Running flasher...
Setting flash params to 0x240
Checking existing contents...
Checksumming 2512 @ 0x0...
Checksumming 4096 @ 0x1000...
Checksumming 564528 @ 0x11000...
Checksumming 131072 @ 0xe0000...
Writing 4096 @ 0x1000...
Writing 4096 @ 0xe7000...
Verifying image at 0x0...
Verifying image at 0x1000...
Verifying image at 0x11000...
Verifying image at 0xe0000...
Flashing successful, booting firmware...
All done!
Success.
(Yes, Mongoose Flashing Tool can be used in CLI mode; it will also auto-detect flash chip size, set the right params and perform post-flash verification).