On March 1, 2022, I had to put HiveSQL in maintenance mode to make a major change to the database structure. You can read more about it here.
This unexpected maintenance, which lasted almost 3 days, is now over and HiveSQL is back alive and kicking!
Why did it take so long?
As explained in the announcement post, the purpose of this intervention was to modify the type of the field which stores the transaction reference in about fifty tables.
Easier said than done when you're managing a 3TB database with tables that store millions of records, sometimes even billions.
The indexes nightmare
HiveSQL contains many indexes that greatly speed up the information retrieval and query execution processes.
When you make changes to a field that is indexed, the server must update this index with each change and sometimes rebalance it. This process is extremely expensive in terms of resource usage and can significantly slow down operations performed on a large number of records.
It is therefore often more efficient to delete an index, even at the expense of performance, then to make your changes, and finally to recreate this index.
But sometimes it can be counterproductive. The relevance of these actions must therefore be assessed with the greatest care.
Add to this that these operations have to be done manually and often take a long time to complete.
The transactions log nightmare
You should know that to guarantee the consistency and integrity of the data contained in a database, the SQL server records in a separate log file each modification that you make to them.
This generates a huge amount of disk activity and also consumes a lot of additional storage space, sometimes more than the volume of data handled.
Although the available storage space is currently sufficient for the usual activity of HiveSQL, it turned out to be a bit short for logging billions of changes at one time.
Therefore, I had to adapt and cut these modifications into smaller batches, which had to be executed sequentially with even more care in order not to make errors.
Data security and recovery
Finally, before carrying out all these operations, I took all the necessary precautions to be able to roll back everything in the event of a problem. This means taking a backup of the database before any intervention, which for a 3TB database can easily take several hours.
A great idea because I almost needed it. Indeed, fatigue helping, I made a mistake at the very last moment which almost compromised all the work already done.
I was really scared at that time, not because part of the database would have been fucked up, but because hours of work would have been ruined and it would have been necessary to start all over again after restoring the latest backup.
Fortunately, I found a workaround to fix my error that allowed me to complete the maintenance with only a few extra hours of work.
All this to explain to you that maintenance consists of an uninterrupted series of successive action and verification steps, of variable duration, carried out manually and during which I remained permanently in front of my screen to check that everything was going well, ready to react quickly in case of a problem.
Nothing could be more silly than going to sleep for a bit and, during that time, the operation you just launched ends and your infrastructure is waiting idly while everyone outside has only one question in mind: when back?
I need to sleep!
Those who know me know that I'm a fan of Power Naps. These are naps of a maximum of 20 minutes so as not to fall into a deep sleep. I have used and abused it. That's what kept me going. In the last three days, I think I slept a maximum of 8 hours in all.
But you should also know that this cannot replace a full night's sleep. The periods of deep sleep contain cleaning cycles for your brain necessary for its proper functioning, and these cycles are not present during short periods of sleep.
HiveSQL is back and all apps and processes relying on it running are fully operational again. That's been my only concern lately and I did not allow myself any respite, or very little. I am relieved but exhausted.
Time to start new brain cleaning cycles.
Meanwhile, I wish you all a good day!