We're working on a product that uses the OMAP L138 (ARM side only). A script upgrades the filesystem (located on an external SD card); it pretty much consists of a bunch of "cp -R ..." commands that overwrite files in the root directory with ones previously unzipped into a temp directory. The script is run from "/etc/init.d/rcS", and when all files have been written, the system issues a reboot to apply the update.
Under the hood, the reboot actually resets an ATX power supply for about 5 seconds, so power is removed from the entire board. The reset is done through a secondary power-management chip over I2C: the functions "pm_power_off" and "arm_pm_restart" were reassigned so that they issue the I2C commands. These hooks are part of the kernel and are called from "kernel_restart() ==> machine_restart()". This is also something we cannot (easily) change.
We realize this may not be the preferred way of doing it, and we would rather have a "dual boot" type of setup for safety, but we're stuck with this inherited solution for now.
That brings me to the actual problem. Sometimes after a reboot we find that files are only partially written or, in the worst case, the system no longer boots up (= very unhappy clients). There are several variations on the problem, but they all point to data not actually having been written to the SD card. Sometimes I find files that end with the pattern "^@^@^@^@^@^@^@" (NUL bytes), and sometimes files contain "old" data, presumably because a sector got allocated but was not yet erased/written with the actual content. When the system fails to start, I get messages of this type:
kjournald starting. Commit interval 5 seconds
EXT3-fs (mmcblk0p1): using internal journal
EXT3-fs (mmcblk0p1): mounted filesystem with writeback data mode
VFS: Mounted root (ext3 filesystem) on device 179:1.
Freeing init memory: 136K
request_module: runaway loop modprobe binfmt-ca00
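To find files showing the trailing "^@" pattern mentioned above, a check along these lines can be run over a directory on the card after a failed update. This is only a rough sketch: it is written in Python for brevity, the function names are made up for illustration, and a file that legitimately ends in NUL bytes (some binaries do) will also be reported.

```python
import os

def count_trailing_nuls(path, chunk_size=4096):
    """Return the number of consecutive NUL (0x00) bytes at the end of a file."""
    count = 0
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        # Walk backwards through the file one chunk at a time.
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step)
            stripped = chunk.rstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # Hit a non-NUL byte, so the trailing run ends here.
                break
    return count

def find_suspect_files(root, min_nuls=1):
    """List (path, nul_count) for regular files under root ending in NUL bytes."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            nuls = count_trailing_nuls(path)
            if nuls >= min_nuls:
                suspects.append((path, nuls))
    return suspects
```

Pointing find_suspect_files() at the directories touched by the update (rather than at "/", which would also walk /proc and /sys) should give a quick list of candidates to compare against the unzipped originals.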
The kernel seems to think it did its job though. Here's the output we consistently get from it:
//================ This is the message we get in the console:
/ # umount: can't remount /dev/mmcblk0p3 read-only
umount: can't remount mdev read-only
umount: mdev busy - remounted read-only
umount: can't remount /dev/root read-only
umount: can't remount rootfs read-only
The system is going down NOW!
Sent SIGTERM to all processes
Requesting system reboot
Restarting system.
//=========== And when we hit the power-button (kernel_power_off() called):
/ # umount: can't remount /dev/mmcblk0p3 read-only
umount: can't remount mdev read-only
umount: mdev busy - remounted read-only
umount: can't remount /dev/root read-only
umount: can't remount rootfs read-only
The system is going down NOW!
Sent SIGTERM to all processes
Requesting system poweroff
Power down.
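Since the root filesystem apparently cannot be remounted read-only before power is cut, one thing we are considering is forcing the data out explicitly during the copy itself. The sketch below shows the idea. It is written in Python for brevity (our actual script is shell and uses "cp -R"), the function names are invented for illustration, and we have not verified that this is sufficient: in particular, whether the SD card has really committed the data internally once fsync() returns is an open question for us.

```python
import os
import shutil

def fsync_path(path):
    """Flush a file or directory to the underlying device (Linux semantics)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

def copy_tree_durably(src_root, dst_root):
    """Copy src_root over dst_root, flushing every file and directory written."""
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            shutil.copy2(src, dst)   # overwrite the old file, keep mode/mtime
            fsync_path(dst)          # push the file's data and metadata out
            copied.append(dst)
        fsync_path(dst_dir)          # push the directory entries out
    os.sync()                        # final system-wide flush before rebooting
    return copied
```

In the shell script, the rough equivalent would be to call "sync" after the "cp -R ..." commands and before the reboot is issued.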
I'm currently going through the kernel to find out what *exactly* is being done, but if anyone could give me some insight or has had a similar experience, a reply would be HUGELY appreciated.
Thanks
Dirk