File System Corruption !

I use a home-computer as my Web-host, which I named ‘Phoenix’. This was actually a computer built in 2008, and is one of the oldest still working for me. So far it has continued to work reliably.

Until several years ago, this computer was actually named ‘Thunderbox’, and was running Kanotix / Thorhammer. But when I wiped it completely and installed Kanotix / Spitfire on it, I not only renamed it, but rebuilt its software from zero.

What just happened to me today on ‘Phoenix’ is that I fired up a routine K3b session, in order to back up some music from Audio CDs which I had actually purchased in the 1980s, and the GUI application showed me a message saying that it could not locate the utility named ‘dvd+rw-tools’; I should try installing the package (even though it’s a dependency and was installed). After I reinstalled that package, K3b told me that it could not locate ‘cdrdao’ (even though it’s a dependency and was installed). I needed to reinstall that as well, after which I explicitly needed to tell K3b to rescan the hard drive, to find all its back-end utilities again.

Because on this computer I had never customized the back-end utilities which K3b uses – unlike what I had done on ‘Klystron’ – and, because I had never changed the variables in question, this actually points to file-system corruption. I had last used K3b on ‘Phoenix’ several months ago, and with no error messages or warnings. Further, those utilities were among the first packages I ever installed, when I resurrected this computer as ‘Phoenix’, and the idea that the oldest data on the hard drive, would be the first data forgotten, even though never deleted, is also consistent with file-system corruption.

Usually, I’d expect FS corruption to stem from power outages. But we haven’t had a power-outage in a long time, and the last time we did, the Extension 4 File-System on the machine just seemed to repair itself correctly. Because Ext4 is such a tightly-meshed file-system, the report that it has been repaired, can be believed. If it was not repaired on boot, the computer would refuse to boot.

And so, even though this is not typical, I’d say that in this case, the FS corruption is actually secondary, to the fact that the actual Hard-Drive is aging, and has caused some I/O errors. We only get to see I/O error messages, if we’re running something from the command-line; I believe that when we’re running a GUI application, those just stay buried in a system log somewhere.
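As a rough illustration of digging those buried messages out of a log, here is a minimal Python sketch. It assumes that the kernel's messages contain the phrase ‘I/O error’, which is common but not guaranteed for every driver. The function name `find_io_errors` and the sample log lines are made up for this example.

```python
import re

# Assumption: kernel messages about failed reads or writes contain the
# phrase "I/O error". The exact wording can differ between kernel versions.
IO_ERROR_PATTERN = re.compile(r"I/O error", re.IGNORECASE)


def find_io_errors(log_lines):
    """Return the log lines that look like disk I/O errors."""
    return [line.rstrip("\n") for line in log_lines
            if IO_ERROR_PATTERN.search(line)]


# Invented sample lines, only to show the shape of the input.
sample_log = [
    "Jan 16 10:00:01 phoenix kernel: Buffer I/O error on device sda1, logical block 12345\n",
    "Jan 16 10:00:02 phoenix kernel: usb 1-1: new high-speed USB device\n",
    "Jan 16 10:00:03 phoenix kernel: end_request: I/O error, dev sda, sector 98765\n",
]

for hit in find_io_errors(sample_log):
    print(hit)
```

In practice one would pass in the lines of whichever log file the distribution keeps kernel messages in. On many Debian-based systems that would be ‘/var/log/kern.log’, but that path is an assumption here and may differ on a given machine.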

But what this seems to spell is that eventually, sooner rather than later, ‘Phoenix’ will die completely, and this time, there will be no resurrecting her, because the problem will be in the hardware and not in the software.

(Update 1/16/2018 : But, there’s another possible explanation… )


New Potential in my Galaxy Tab S.

As I wrote in this earlier posting, I believe that my ancient Galaxy Tab S First Generation (Android) tablet has finally bitten the dust, due to File System Corruption.

This is not just due to the wonky behavior following the latest hard-boot (during a routine software-update no less), but also because over the years, the amount of free memory on it has become very small, even though I have uninstalled most of the apps that were once installed on it. Typically, failure to reclaim unused space, is one of the earliest signs of FS corruption.

Depending on how one looks at it, visualizing this as a terminally-crashed tablet can have its upside, especially since my needs for a stock Android tablet are met in my ownership of a Pixel C. What this actually means is that I can regard the present installation on the Tab S as expendable, which means that eventually, I’d be able to experiment with it as I wish. For example, I might eventually want to root the Tab S, or install Linux on it…

But one fact which I seem to gather after some reading, is that even to install Linux on Android-capable H/W, does not take place as it does with PCs, where the Linux system is the only running O/S. Instead, Android-capable H/W seems to require that one way or another, we install Linux alongside Android. Thus, unless I get Android working again on the tablet first, I also wouldn’t be able to get Linux working on it.

Further, rooting an Android device does not (necessarily) cause the firmware to be flashed. Instead, if we want to flash the firmware, this is a separate operation which can also be done.

Long story short, whether I’ll ever be able to resurrect that tablet, depends on how it’s partitioned, and in which partitions I suspect the FS corruption has hit.

For example, if I just root, then the data and apps on the ‘/sdcard’ will not even be affected, and if that’s where the FS corruption was, the FS corruption will just stay there. A similar effect would take place if I were to flash a custom ROM Image on the device, which would affect the ‘/system’ partition and/or the ‘/boot’ partition, but not affect the ‘/sdcard’.

I would strongly suspect that the FS corruption is on the ‘/sdcard’, especially since I wasn’t updating the O/S, when the latest crash took place. And what that would seem to suggest, is that my next step should actually be, just to perform a factory reset: The simplest advice sometimes given! That should reset the ‘/sdcard’, and the ‘/data’ partition, mainly.

If that causes the tablet to become stable again, then I could proceed next to root, to install Linux, etc. Or, I could just recommence with Android, with more reclaimed memory, and with a stable tablet…



My Galaxy Tab S has just Succumbed to File System Corruption.

One fact which had remained a reality in my life, was that I still owned a Samsung Galaxy Tab S, First Generation (Android Tablet). This mere fact was actually what prompted me to acquire a Pixel C Tablet some time ago, since we must eventually plan for the failure of old technologies.

At the same time I’ve mentioned the subject often, that when a computer is interrupted from running, so that it needs to reboot, but without having been given proper shutdown instructions, or without otherwise having carried out an orderly shutdown, this is known as a Hard Boot, and it can lead to File System Corruption. Actually, every Hard Boot is a File-System Event, which the drivers for the mass-storage device try to resolve to the best of their ability.

The way in which the mass-storage drivers typically repair FS corruption is just to delete whatever file fragments have inconsistent meta-data, until the File System is consistent again. In some cases, critical files can end up just missing.

Well it just happened to me today, that my Samsung Tab S underwent a Hard Boot, and that unlike the previous times, this time it led to severe File System Corruption. This has left my Tab S in a state where it cannot be used anymore, and since other Computer Experts do not know about the existence of File System Corruption, I see no way to make that old tablet operational again.

That old tablet, sadly, must now go into my Tablet Graveyard. It has served me well until this morning, but is essentially a lost cause now, just as my Toshiba Thrive was, years ago. May the tablet be remembered for good deeds.



One Important Task of the File System is, to Manage Unallocated Blocks.

When we visualize deleting large files, it is tempting to visualize that as only having to ‘unlink’ blocks of data, which used to belong to that file, from the File System.

But in reality, there is a major additional task which an FS must manage. Any File System possesses ‘a pool of unallocated blocks’, which I would refer to colloquially as ‘the Free Pool’. This pool needs to exist, because every time a new file is created, or an existing one appended to, such that a new block needs to be allocated to it, that new block needs to have some origin, with the foreknowledge that it does not already belong to some existing, allocated file.

Such a block is taken rapidly from the Free Pool.

Therefore, when we delete an existing file, all its allocated blocks additionally need to be added back to the Free Pool.

If FS corruption has taken place, then one of the earliest, most common practical signs of this will be, a failure to count de-allocated blocks, as blocks of hard drive capacity, which are free again, for the new formation and extension of files. This happens, because the unlinking of the deallocated blocks, always takes place before their addition back to the Free Pool. The way a File System Driver behaves would become even more precarious, if it was generally to add deallocated blocks to the Free Pool, before unlinking them from existing data stores.
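The ordering described above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions, not how any real File System Driver is implemented: the class `ToyFileSystem`, its methods and the `crash_before_free` flag are all hypothetical names. The point is to show that if the two steps of a deletion are interrupted in between, the blocks belong neither to a file nor to the Free Pool.

```python
class ToyFileSystem:
    """Toy model: each file owns a list of block numbers, and every
    unowned block is supposed to sit in the Free Pool."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.free_pool = set(range(total_blocks))
        self.files = {}  # file name -> list of block numbers

    def create(self, name, n_blocks):
        if n_blocks > len(self.free_pool):
            raise OSError("no space left in the toy file system")
        # New blocks always come out of the Free Pool.
        self.files[name] = [self.free_pool.pop() for _ in range(n_blocks)]

    def delete(self, name, crash_before_free=False):
        # Step 1: unlink the blocks from the file.
        blocks = self.files.pop(name)
        if crash_before_free:
            # Simulated Hard Boot: step 2 never happens.
            return
        # Step 2: add the blocks back to the Free Pool.
        self.free_pool.update(blocks)

    def leaked_blocks(self):
        """Blocks that belong neither to a file nor to the Free Pool."""
        owned = {b for blocks in self.files.values() for b in blocks}
        return set(range(self.total_blocks)) - owned - self.free_pool


fs = ToyFileSystem(total_blocks=10)
fs.create("music.wav", 4)                        # 6 blocks remain free
fs.delete("music.wav", crash_before_free=True)   # interrupted deletion
print(len(fs.free_pool), len(fs.leaked_blocks()))  # prints "6 4"
```

In this toy, the file is gone but the free space did not grow, which is the symptom described above. A consistency check would have to find the leaked blocks by elimination, as `leaked_blocks()` does here.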

More specific explanations of how a File System works are hampered by the fact that many different File Systems exist, including ‘FAT32’, ‘ExFAT’, ‘NTFS’, ‘ext3’, ‘ext4’, ‘reiserfs’, etc.

When I studied System Software, the File System that was used as an example was ‘UNIX System V’. That predates ‘ext2’. My studies did not include other practical examples. But what I have read about ‘FAT32’ is that it has conceptual simplicity, which may reduce the consequences of FS corruption slightly. OTOH, ‘FAT32’ is not a “Journaling File System”, and neither was ‘System V’.

By comparison, ‘ext3’, ‘ext4’ and ‘NTFS’ are all examples of Journaling File Systems. What happens therein is that data written to the HD by user-space programs is not written directly to the FS, but is first written to an incremental Journal kept by the FS Driver (residing of course in kernel-space). At a non-specific point in time, the Journal is Committed to the FS. In this operation, a subset of Journal Entries is read from the Journal and applied to the FS as Atomic operations, after which the actual File System is consistent again, even though not all the Journal Entries have been Committed yet. Those remaining Journal Entries are played back only when the next attempt is made to Commit the Journal as it then stands.

This can reduce the risk of FS Corruption, because the interruption of the kernel would need to take place, exactly during an interval during which the Journal is being Committed, in order for real Corruption to occur.
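The general idea can be sketched as a toy write-ahead Journal in Python. This is a generic illustration and not the actual on-disk format or algorithm of ‘ext4’ or ‘NTFS’. The class `ToyJournal`, the record names and the `crash_before_commit` flag are hypothetical. In this sketch, a group of writes only reaches the ‘disk’ if its commit record made it into the Journal, so an interruption leaves the old, consistent state in place.

```python
class ToyJournal:
    """Toy write-ahead journal: writes are recorded first, and are
    applied to the 'disk' only as complete, committed groups."""

    def __init__(self):
        self.disk = {}     # block number -> data (the actual File System)
        self.journal = []  # ordered list of journal records

    def log_transaction(self, writes, crash_before_commit=False):
        self.journal.append(("begin",))
        for block, data in writes.items():
            self.journal.append(("write", block, data))
        if crash_before_commit:
            # Simulated Hard Boot: the commit record is never written.
            return
        self.journal.append(("commit",))

    def replay(self):
        """Apply every committed group to the disk; discard the rest."""
        pending = {}
        for record in self.journal:
            if record[0] == "begin":
                pending = {}
            elif record[0] == "write":
                pending[record[1]] = record[2]
            elif record[0] == "commit":
                # All writes of this group become visible together.
                self.disk.update(pending)
                pending = {}
        # Anything still pending had no commit record and is dropped.
        self.journal.clear()


j = ToyJournal()
j.log_transaction({1: "old-a", 2: "old-b"})
j.log_transaction({2: "new-b", 3: "new-c"}, crash_before_commit=True)
j.replay()
print(j.disk)  # prints {1: 'old-a', 2: 'old-b'}
```

The second group of writes is lost, but block 2 was not half-updated, which is the sense in which the File System stays consistent. This toy treats `replay()` itself as uninterruptible.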

But, Why then Is ‘NTFS’ a Journaling File System, if the issue of FS Corruption did not exist under Windows?