Why a Hard-Boot is often Overkill.

To understand this posting, the reader first needs to understand something about how the (volatile) memory in a computer is organized. Through the use of virtual addresses, it is divided into user-space and kernel-space. User-space includes programs that run with elevated privileges, but also the GUI programs that display widgets and the like on the screen. Kernel-space is reserved for the kernel of the O/S, as well as for kernel-modules, which in practice act as device drivers.

But in addition to that, modern architectures also have I/O chips – which the kernel-modules are responsible for – that run “firmware”. These more-complex I/O chips will often perform complex tasks, such as the encryption built in to Bluetooth, without requiring that these tasks run on the CPU or in RAM. To do that, such an I/O chip needs a smaller program of instructions that actually runs on the I/O chip. This program is typically no longer than 1-2 KB.

(Edit 06/09/2017 :

I suppose the fact should also be acknowledged, that in order for the firmware actually to do anything, the I/O chip should also have a somewhat larger region of memory – call it a Buffer – which stores data and not code. In the case of an intelligent Bluetooth chip, that buffer would logically store whatever encryption keys are currently being applied to data, which is being streamed to and from the chip… )

So in practice, a situation which the user can run into is that by unpairing all his (external) Bluetooth devices, then turning Bluetooth Off from the settings GUI, and next turning BT back On, he can fail to reset the Bluetooth system to its original state. The user only gets to see what the GUI is showing him – which is controlled by programs running in user-space.

What the user may not realize is that, the way kernel-space is often organized, turning Bluetooth Off in user-space will fail to unload the kernel-module itself, which is responsible for operating the Bluetooth chip and which, for the sake of argument, has firmware running on it.
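On a Linux-based system, one way to see this for oneself is to compare the list of loaded kernel-modules before and after toggling Bluetooth in the GUI. What follows is only a minimal sketch, assuming a kernel that exposes /proc/modules to the user, and assuming a Bluetooth driver named btusb. That module name is just an example and varies by hardware. On many tablets the driver may be built in to the kernel, or the file may not be readable without root.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first field of each line is the module name
    return names


def is_module_loaded(name, proc_modules_text):
    """True if the named kernel-module appears in the listing."""
    return name in loaded_modules(proc_modules_text)


# Hypothetical sample in the /proc/modules format, for illustration only.
SAMPLE = """btusb 65536 0 - Live 0x0000000000000000
bluetooth 745472 5 btusb, Live 0x0000000000000000
"""

if __name__ == "__main__":
    path = Path("/proc/modules")
    text = path.read_text() if path.exists() else SAMPLE
    print("btusb loaded:", is_module_loaded("btusb", text))
```

If the module is still listed after Bluetooth has been switched Off in the settings GUI, then the GUI toggle did not unload it.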

The actual kernel-modules first load when the computer boots up, and the first thing they do, if they are of the sort that use firmware, is to load that firmware onto the I/O chip. Often it is only patches to the firmware that get loaded, because the default firmware is burned in to the I/O chip.

If a tech-support person receives a request for help from a user, there is some possibility that the tech-support person fears the worst, that some sort of firmware, or other bits of data, might get stuck in the I/O chip somehow, and that the only way to reset the I/O chip, is to recommend a Hard Boot.

But what may sometimes be all that is necessary, is that all the kernel-modules be reloaded (into RAM), so that each of them initializes its respective I/O chip as well. The most gentle way to make that happen is a Soft Boot, which tells the tablet to perform an Orderly Shutdown and then a normal Boot-Up, without the actual hardware being powered down.

Now, if I found that doing so did not solve my own problem, and feared that corrupted bits of firmware might be hanging in my I/O devices, the next thing I might try is just to perform an Orderly Shutdown – assuming the device is still working well enough to let me do so – and then to let the powered-down device sit on the table for a few minutes, before using the power button normally to make it boot up again. Sitting for a few minutes powered down will allow any charge to drain from the supply-capacitors – even on a tablet – so that the last volatile bits expire.

My device would recently have allowed me to do so, after simply failing to re-pair with its BT Keyboard, but working fine in every other way.

The only additional things a Hard-Boot really does are to force a File-System Check when the device boots up again, and to force a reboot on a device which is no longer working well enough to follow the user’s command to do a Soft-Boot.

Dirk

(Edit 06/09/2017 : )

I suppose the fact should also be acknowledged, that a shift in the naming of what ‘Firmware’ means did take place, during the introduction of smart-phones and tablets.

A smart-phone or tablet does have volatile memory – RAM – as described above, but like older forms of computing technology, also needs to have non-volatile, mass-storage, which I did not describe above. This is where we get to store 16GB, 32GB, 64GB of data more-or-less permanently…

Like older forms of computing technology, our smart-phones and tablets store their System Software in non-volatile mass-storage. This includes their kernel-modules, their kernel-images, and their whole O/S (under Android, the Dalvik VM is part of an O/S version).

But, as these devices were introduced, the way in which non-volatile memory was implemented was also undergoing a shift, from magnetically-based hard drives to SSDs (Solid-State Drives). SSDs today are built from flash memory, which is essentially a form of EEPROM – Electrically Erasable, Programmable Read-Only Memory.

So all your photos are theoretically being stored in a ROM-chip, that can be selectively erased electronically.

What this also meant, was that every time a System Update was being distributed to a smart-phone or a tablet, effectively, it was being distributed as a ROM-image, thus earning it the dubious distinction of being a Firmware Update.

Just as it was possible long ago, for the kernel to mount more than one logical drive, this is possible today, when all the logical hard-drives are physically being stored on a Partitioned EEPROM. Older technology tended to store the System Software on such a separate volume, and then changes to that volume could in fact be distributed as ROM-images.

It remains perfectly feasible for one Micro-SD-Card, or for one USB-Flash Drive, to have more than one partition, which might confuse people who carry them in their pocket as practical storage, but which is essential when they are being used as The SSD of a smart-phone or tablet. This is also why a phone advertised as having 64GB of mass-storage only shows up as having “54.92GB” of available storage, as viewed with file-management apps. The rest belongs to (fixed) system partitions.
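As a quick worked example of that arithmetic, the short sketch below computes how much of the advertised capacity is unaccounted for. It assumes both figures are expressed in the same unit. If the file-management app counts in binary units while the advertisement counts in decimal ones, then part of the gap would be a unit effect and not system partitions.

```python
ADVERTISED_GB = 64.0   # capacity as advertised
AVAILABLE_GB = 54.92   # capacity shown by a file-management app

# Storage not visible to the user, attributed here to system partitions.
reserved_gb = round(ADVERTISED_GB - AVAILABLE_GB, 2)
reserved_percent = round(100 * reserved_gb / ADVERTISED_GB, 1)

print(f"Reserved: {reserved_gb} GB ({reserved_percent}% of advertised)")
```

Under that assumption, roughly 9 GB, or about one seventh of the advertised capacity, is not available to the user.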

What tends to happen in more-recent smart-phones and tablets, is that the mounting of logical volumes is more amalgamated, and also harder for novices to follow. There may no longer be one physical volume, which corresponds to the System Software directly. And, these devices like to manage all their symlinks, to make the storage locations behave more-completely like a single storage-location. Android doesn’t like it, when its users try to manage the symlinks.

The problem with this naming convention is that System Software specialists still need to distinguish between Images that merely define a device’s mass-storage, Images that define the RAM one program takes up when it’s running, and actual Firmware Blobs, which need to be loaded onto the I/O chips themselves – if they are the kind that use firmware – and which first need to be stored as such in non-volatile mass-storage, so that the kernel-module has a source to load them from.

For that purpose, continuing to refer to any type of System Update as a Firmware Update, just because the device’s non-volatile storage happens to be a derivative of an ancient ROM technology, becomes impractical from the point of view of analyzing the System Software.


A reasonable question to ask might be why a firmware-program can be so short.

This would be because I/O chips that have firmware, still rely on their computations mainly being defined by complex logic-circuits, and by concurrency. Their operation has been sequentialized because it had to be, but sequentialized only so far. They just rely on some sort of microprogram to control numerous circuit-blocks.

This would also mean that each firmware-instruction triggers a whole circuit-block in one shot, and that if there were in fact 1000 firmware-instructions, this would potentially slow down the circuit by a factor of 1000. I don’t think that the firmware-instructions have addresses or branching as such. At most, instructions would execute conditionally, without branching conditionally, and therefore the firmware would not contain loops. The entire microprogram would constitute one loop, or one subroutine.

A generic, microprogrammed circuit has some basic differences, that also make its firmware-instructions different in nature from general-purpose CPU instructions. It does not power circuit-blocks on and off per instruction, but relies on such logic circuits as ‘a latch’, to time the reading of bits from an input bus to one circuit-block, and ‘a tri-state output’, to time the writing of bits from the circuit-block to an output bus.

There could even be a latch, to feed the data on the output bus, back to an input bus.
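To make the roles of these two circuits concrete, here is a small behavioural sketch in Python. It is only a software model under simplifying assumptions, not a description of any real chip. The names Latch, read_bus, increment_block and run_feedback are hypothetical, and the circuit-block is arbitrarily chosen to be an 8-bit incrementer.

```python
class Latch:
    """Keeps the last value that was on its input while its control-gate was high."""

    def __init__(self):
        self.value = 0

    def tick(self, input_value, gate):
        if gate:
            self.value = input_value
        return self.value


def read_bus(drivers):
    """drivers: list of (enabled, value) pairs, one per tri-state output on a bus.

    Returns the value being driven, or None if no output is enabled (floating bus).
    """
    active = [value for enabled, value in drivers if enabled]
    if len(active) > 1:
        raise ValueError("bus contention: more than one output enabled")
    return active[0] if active else None


def increment_block(x):
    """Hypothetical circuit-block: an 8-bit incrementer."""
    return (x + 1) & 0xFF


def run_feedback(steps):
    """Pass a value around the loop: input latch, block, output bus, feedback latch."""
    input_latch = Latch()
    feedback_latch = Latch()
    for _ in range(steps):
        # The input latch times the reading of bits into the circuit-block.
        block_input = input_latch.tick(feedback_latch.value, gate=True)
        # One tri-state output is enabled, the other stays off the bus.
        bus = read_bus([(True, increment_block(block_input)), (False, 0)])
        # A further latch feeds the output bus back toward the input bus.
        feedback_latch.tick(bus, gate=True)
    return feedback_latch.value


print(run_feedback(3))
```

The point of the model is that nothing is switched on or off as a whole. The microprogram only decides when each latch captures its input, and which tri-state output is allowed to drive the shared bus.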

A detail about microprogramming which I can conjecture, is that the part of the chip that interprets the microprogram needs a decoder. I.e., the chip could have 64 latches + tri-state outputs to control, each of which is receiving a logical high or a logical low value from the decoder at any point in time. This could mean that each micro-instruction consists of a 6-bit integer that identifies which line of its decoder’s output to change the logical state of, plus 1 bit to state, whether that output-line should go low or high…
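The conjectured 7-bit format can be sketched as follows. The bit layout used here (the 6-bit line number in the upper bits, the level in the lowest bit) is purely an assumption made for illustration, as are the function names encode, decode and run.

```python
NUM_LINES = 64  # control lines for latches + tri-state outputs


def encode(line, level):
    """Pack a 6-bit line number and a 1-bit level into one 7-bit micro-instruction."""
    if not 0 <= line < NUM_LINES:
        raise ValueError("line number must fit in 6 bits")
    if level not in (0, 1):
        raise ValueError("level must be 0 or 1")
    return (line << 1) | level


def decode(word):
    """Unpack a 7-bit micro-instruction into (line, level)."""
    return (word >> 1) & 0x3F, word & 1


def run(microprogram):
    """Apply each micro-instruction in turn and return the final state of all lines."""
    lines = [0] * NUM_LINES
    for word in microprogram:
        line, level = decode(word)
        lines[line] = level
    return lines


# Raise line 5, raise line 63, then lower line 5 again.
state = run([encode(5, 1), encode(63, 1), encode(5, 0)])
print(state[5], state[63])
```

Each micro-instruction changes exactly one decoder output, so raising and then lowering the same line costs two instructions in this scheme.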

Also, the control-gate of a latch only needs to go high briefly, for one clock-cycle, unlike the control-gate of a tri-state output, which needs to stay high until explicitly lowered again. But, tri-state outputs are only useful, if a group of several are potentially writing to the same output-bus, so that raising one also implies lowering the others, that would write to the same output-bus. These observations would mean that an 8-bit micro-instruction could be optimized, to control up to 256 latches + tri-state outputs.
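One way this optimization could look is sketched below. All 8 bits select a control line, and no bit is spent on the level. The split of the 256 lines into 128 latches and 128 tri-state outputs, and the grouping of 8 outputs per bus, are assumptions made only for this example.

```python
NUM_LINES = 256
FIRST_TRISTATE = 128  # assumption: lines 0..127 are latches, 128..255 tri-state outputs
GROUP_SIZE = 8        # assumption: 8 tri-state outputs share each output-bus


def step(enabled_outputs, code):
    """Apply one 8-bit micro-instruction.

    enabled_outputs: set of tri-state lines currently held high.
    Returns (pulsed_latch, new_enabled_outputs). pulsed_latch is None
    when the instruction addressed a tri-state output.
    """
    if not 0 <= code < NUM_LINES:
        raise ValueError("micro-instruction must fit in 8 bits")
    if code < FIRST_TRISTATE:
        # A latch gate goes high for this cycle only, so nothing is remembered.
        return code, set(enabled_outputs)
    group = (code - FIRST_TRISTATE) // GROUP_SIZE
    # Raising one output implies lowering the others on the same bus.
    kept = {
        line for line in enabled_outputs
        if (line - FIRST_TRISTATE) // GROUP_SIZE != group
    }
    kept.add(code)
    return None, kept


outputs = set()
_, outputs = step(outputs, 128)        # enable output 128 on bus 0
_, outputs = step(outputs, 129)        # output 129 replaces 128 on bus 0
_, outputs = step(outputs, 136)        # output 136 is on bus 1, so 129 stays up
pulsed, outputs = step(outputs, 3)     # pulse latch 3, outputs unchanged
print(pulsed, sorted(outputs))
```

A scheme like this has no instruction to leave a bus undriven, so in practice one line per group might need to be reserved for that purpose. That detail is a guess on my part.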

(Edit 06/10/2017 : )

Another reason why the firmware can be kept short is this underlying assumption.

The fact occurs to me as an afterthought, that if an I/O chip had the capacity to execute let’s say 37 instructions, because those instructions could be sent to it by the CPU, over the I/O bus during normal operation, its firmware should also be organized into 37 subroutines.

The most efficient way I could think to implement this, is if each firmware-instruction did after all possess an address, and if a vector-table resided at address zero, from which the microprogram-interpreter could first fetch the starting address of the firmware routine, that implements any one of the hypothetical 37 instructions. In that case, one of the firmware-codes would also need to signal, that the given routine has finished executing…
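The arrangement just described can be sketched as a tiny interpreter. This is only a model under stated assumptions: the reserved END code, the function names build_firmware and execute, and the use of three routines in place of the hypothetical 37 are all choices made for the example.

```python
END = 0xFF  # hypothetical firmware-code that signals "this routine has finished"


def build_firmware(routines):
    """Lay out firmware with a vector-table at address zero.

    routines: list of lists of microcodes (none of which may equal END).
    Entry i of the table holds the starting address of routine i.
    """
    memory = [0] * len(routines)  # reserve the vector-table
    for index, routine in enumerate(routines):
        memory[index] = len(memory)  # routine starts at the current end of memory
        memory.extend(routine)
        memory.append(END)
    return memory


def execute(memory, instruction):
    """Run the routine for one instruction and return the microcodes it issued."""
    address = memory[instruction]  # fetch the starting address from the vector-table
    issued = []
    while memory[address] != END:
        issued.append(memory[address])
        address += 1
    return issued


firmware = build_firmware([[1, 2], [3], []])
print(firmware)
print(execute(firmware, 0), execute(firmware, 1), execute(firmware, 2))
```

Note that the loop lives in the interpreter, not in the microprogram. Each routine is still a straight run of microcodes that ends at its END code.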

But I’m assuming that many of the instructions the I/O chip would routinely need to execute, are simple and short, also in the number of microcodes that would tell it how to do so.

And, if we’d like the microprogram-interpreter to start executing a routine within one clock-cycle of receiving its identifier, we’d probably need to make it even more complex in itself, such as to extract the entire vector-table into separate registers when the microprogram loads, so that a register-number can lead to a microcode-address, and then to a fetched microcode, all in one tick.

On the other hand, we might find that to lose one clock-cycle, just to find the starting-address of a microprogram-routine, and then to start executing the microprogram on the second clock-cycle, is adequate. In that case, we can keep the decoder slightly simpler.
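The trade-off between the two designs comes down to simple arithmetic, sketched here under the assumption that one microcode is fetched and executed per clock-cycle. The function name and its parameters are hypothetical.

```python
def cycles_until_done(routine_length, vector_table_in_registers):
    """Clock-cycles from receiving an instruction identifier to its last microcode.

    Assumes one microcode per clock-cycle. If the vector-table has to be read
    from microprogram memory, one extra cycle is spent finding the start address.
    """
    lookup_cycles = 0 if vector_table_in_registers else 1
    return lookup_cycles + routine_length


for length in (1, 3, 10):
    fast = cycles_until_done(length, True)
    simple = cycles_until_done(length, False)
    print(f"{length} microcodes: {fast} cycles with registers, {simple} without")
```

The lost cycle is a fixed cost, so it weighs most heavily on the shortest routines. A one-microcode routine takes twice as long in the simpler design, while a ten-microcode routine takes only a tenth longer.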

 
