
Greetings Hive minds! I'm a long-time PostgreSQL contributor and author specializing in performance tuning. Now that the PostgreSQL-based HafSQL is in beta, I noticed index building is becoming a serious drag. I thought I'd share my most recent conference video in case it helps: Speedrunning the Open Street Map osm2pgsql Loader
That talk took the roughly terabyte-sized Open Street Map Planet data set and re-tuned everything for NVMe to cut loading time, which on current hardware I now have down to just over 4 hours. All the config changes to PG and Linux are documented. HafSQL's starter postgresql.conf probably needs less shared_buffers and more maintenance_work_mem to speed up all its index builds; giving max_wal_size a few GB of disk first would help too. I'm hoping to get the full Hive data set running here so I can test this myself.
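As a rough sketch of the direction described above, the three settings could be adjusted in postgresql.conf along these lines. The parameter names are standard PostgreSQL settings, but every value here is an illustrative placeholder: none of them come from the talk or from HafSQL's actual starter config, and none have been tested against the Hive data set. Size them to the RAM and disk of the machine in question.

```
# Illustrative placeholders only; not HafSQL's shipped values and not
# benchmarked. Adjust to the server's RAM and available disk.

# Smaller buffer cache, leaving more RAM for index builds and the OS cache.
# Changing this requires a server restart.
shared_buffers = 8GB

# Memory available to each maintenance operation such as CREATE INDEX.
# Raising it lets index builds sort more data in memory.
maintenance_work_mem = 2GB

# Allow more WAL to accumulate between checkpoints during bulk loading.
# The PostgreSQL default is 1GB; this uses that much more disk space.
max_wal_size = 4GB
```

Of the three, only shared_buffers needs a restart; maintenance_work_mem and max_wal_size take effect after a configuration reload.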
My day job at Crunchy Data is 100% open source work like multi-cloud PostgreSQL. I run a benchmark lab, and the Hive blockchain makes a nicely sized data set for my upcoming work.