Lessons learned: DeroGold to cope with 65k+ transactions in the tx pool


15/07/2019
DeroGold performance tuning

There is no such thing as an attack in a decentralized privacy network. There are only regular transactions and fusion transactions.

Of course, there may be individuals who deliberately generate such activity to cause trouble for a young, distributed, decentralized network that consists mostly of unmanaged, low-powered nodes.

And that is exactly what happened to DeroGold a few months ago. One night we woke up to the network and node monitors: hash rate down, most pools not operating, alternate chains on the network, and only about 4-5 nodes responding and keeping up.

In a situation like this, the human factor kicks in. People split into roughly two dominant groups: the first claims that the only way out of a bad scenario like the one above is more control and centralization; the second says, let's cope with this.

The author of this post is a biased believer in decentralization, geo-distribution and the least possible hand-holding and control. So we said: well, let's take this as a free stress test. What if our network grows so much over the next few years that we have to cope with 60k+ transactions on a daily basis?

We had to investigate. Why was the software largely sitting and waiting on powerful hardware (we run Xeon and i7 servers with 32+ GB of RAM on SSDs to support a few networks / nodes)?

The answer was in how the daemon uses the database. DeroGold is a fork of TurtleCoin, and TurtleCoin is historically a fork of Bytecoin, and it seems that nobody had ever optimized the read / write buffers for RocksDB (a high-performance, scalable database developed by Facebook), which both DeroGold and TurtleCoin use.

At some point during the events, we talked with @sniperviperman#2620, whose nodes were keeping up, and learned that he had increased the buffers significantly. That sounded like part of the key to optimization. The defaults in the original code were tens and low hundreds of megabytes for the read and write buffers respectively, with the number of DB threads set to 4. That could work for a network with a few transactions per minute here and there.

For the DeroGold network, with a 20-second block time and, at the time of the events, 400 – 500 transactions per block, this cannot be enough. The symptoms? The node first stops responding to RPC calls, then fails to keep up with the chain (high I/O) while the CPU sits largely idle.
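To put those numbers in perspective, here is a small back-of-the-envelope calculation in Python. It uses only the block time and the transactions-per-block range quoted above; the function names are made up for this illustration and are not part of any DeroGold or TurtleCoin tooling.

```python
# Rough throughput estimate from the figures in the text.
BLOCK_TIME_SECONDS = 20          # DeroGold block time
TX_PER_BLOCK_RANGE = (400, 500)  # observed range during the events


def blocks_per(window_seconds: int, block_time: int = BLOCK_TIME_SECONDS) -> int:
    """Number of whole blocks produced in a time window."""
    return window_seconds // block_time


def tx_load(tx_per_block: int, window_seconds: int) -> int:
    """Transactions a node has to process in the given window."""
    return tx_per_block * blocks_per(window_seconds)


if __name__ == "__main__":
    for tx in TX_PER_BLOCK_RANGE:
        per_minute = tx_load(tx, 60)
        per_day = tx_load(tx, 24 * 60 * 60)
        print(f"{tx} tx/block -> {per_minute} tx/min, {per_day} tx/day")
```

At three blocks per minute, that is 1200 to 1500 transactions per minute, far from "a few transactions per minute here and there".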

We tested various settings, and for our 32 GB server we set both the DB read and write buffers to 8092 MB. We also tried different numbers of DB threads and finally found one that worked well: 10 DB threads.
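As a rough sanity check before scaling these values up or down for a different machine, one can compare the configured buffers with the installed RAM. The sketch below is only an illustration: the helper name and dictionary keys are hypothetical, and it assumes the worst case where both buffers are fully used. Actual RocksDB memory use depends on more than these two settings.

```python
# Hypothetical helper: how much RAM do the two DB buffers claim at most?
def buffer_budget(total_ram_mb: int, read_buffer_mb: int, write_buffer_mb: int) -> dict:
    """Return the combined buffer size, what is left, and the share of RAM."""
    buffers_mb = read_buffer_mb + write_buffer_mb
    return {
        "buffers_mb": buffers_mb,
        "remaining_mb": total_ram_mb - buffers_mb,
        "share": buffers_mb / total_ram_mb,
    }


if __name__ == "__main__":
    # Values from the text: a 32 GB server with 8092 MB read and write buffers.
    budget = buffer_budget(total_ram_mb=32 * 1024, read_buffer_mb=8092, write_buffer_mb=8092)
    print(f"buffers: {budget['buffers_mb']} MB, "
          f"left for OS and daemon: {budget['remaining_mb']} MB, "
          f"share of RAM: {budget['share']:.0%}")
```

On the 32 GB server, the two buffers together claim about half of the memory, which gives a feel for how far a smaller node would have to scale them back.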

With this, it was also necessary to lift the Linux limits on the number of open files.
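The post does not say how the limit was lifted or to what value. As a minimal sketch, the per-process open-file limit on Linux can be inspected and raised with Python's standard `resource` module. The target of 65536 is an arbitrary example, not the value we used.

```python
import resource
from typing import Tuple


def raise_open_file_limit(target: int) -> Tuple[int, int]:
    """Raise the soft open-file limit towards `target`, capped by the hard limit.

    Returns the (soft, hard) limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited:
        return soft, hard  # already unlimited, nothing to raise
    # An unprivileged process may only raise the soft limit up to the hard limit.
    new_soft = target if hard == unlimited else min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("after: ", raise_open_file_limit(65536))  # example target (assumption)
```

A limit changed this way applies only to the current process and its children, so the daemon would have to be started from that process. For a node running as a service, the same limit is usually set with `ulimit -n` in the launching shell or in the service manager's configuration.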

These settings unclogged the daemon so that it could start processing transactions and resynchronize with the chain.

After these events, we added the increased DB buffers and threads to our default config file, and also submitted pull requests to both the DeroGold and TurtleCoin code that changed the default settings to values that better correspond to today's hardware and match the requirements of operating a 24x7x365 privacy coin.

Bottom line: the new defaults may not work for small home node or unmanaged node operators. Some of them did not like the change, seeing the requirements as too high. That is fine. Everyone can scale those default settings back. However, we have to set the default values to the minimum requirements of each network to ensure stable and continuous 24×7 operation. When a peak hits the network, it is the strongest nodes that will have to keep up and ensure the network's availability.

