Ethereum: Syncing a Bitcoin Node with Multiple Cores
I am syncing my node on 32 CPUs, and it is very slow.
Clearly the syncing is not being done in parallel, and I wonder whether there is a fundamental barrier to a “divide and conquer” approach.
So let’s say I have a large chunk of blockchain to sync, 100 MB or more. If I sync it the traditional way, one block at a time in a single process, it can take a ridiculously long time, maybe even days or weeks.
But what if I could break the task into smaller chunks? For example, I could spawn multiple processes that each quickly sync their own part of the chain, while the others keep working on theirs.
This is exactly the “divide and conquer” method some people have experimented with. Here’s a glimpse of what it might look like (a short code sketch follows the list):
- Get a block of data: Take a contiguous piece of the blockchain and treat it as one unit of work to sync.
- Spawn a new process: Launch a new process that gets to work syncing this smaller block of data.
- Use multiple cores: Spread the work across as many CPU cores as possible so each chunk makes progress independently.
- Move blocks: Periodically move chunks of data from a heavily loaded process to one with spare capacity, so no core sits idle.
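To make the list concrete, here is a minimal sketch in Go (many Ethereum clients are written in Go). It uses goroutines and a shared work queue rather than separate OS processes, and fetchAndVerifyRange, chainHeight, and chunkSize are placeholders invented for illustration; a real client would be downloading and verifying headers and bodies from peers at that point.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// blockRange is a contiguous slice of the chain assigned to one worker.
type blockRange struct {
	start, end uint64 // inclusive block heights
}

// fetchAndVerifyRange is a stand-in for the real work: downloading the
// headers/bodies in the range from peers and verifying them. Here it just
// returns a label so the sketch runs end to end.
func fetchAndVerifyRange(r blockRange) string {
	return fmt.Sprintf("synced blocks %d..%d", r.start, r.end)
}

func main() {
	const chainHeight = 1_000_000
	const chunkSize = 50_000

	// Split the chain into fixed-size ranges ("get a block of data").
	var ranges []blockRange
	for start := uint64(0); start <= chainHeight; start += chunkSize {
		end := start + chunkSize - 1
		if end > chainHeight {
			end = chainHeight
		}
		ranges = append(ranges, blockRange{start, end})
	}

	// Feed the ranges to a pool of workers, one per CPU core
	// ("spawn a new process" / "use multiple cores"). Idle workers pull the
	// next pending range off the shared channel, which plays the role of the
	// "move blocks" rebalancing step: no core sits idle while work remains.
	jobs := make(chan blockRange, len(ranges))
	results := make(chan string, len(ranges))
	var wg sync.WaitGroup

	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range jobs {
				results <- fetchAndVerifyRange(r)
			}
		}()
	}

	for _, r := range ranges {
		jobs <- r
	}
	close(jobs)
	wg.Wait()
	close(results)

	for msg := range results {
		fmt.Println(msg)
	}
}
```

Because idle workers simply pull the next pending range from the shared channel, the rebalancing step from the list falls out naturally instead of needing an explicit scheduler.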
This approach lets each piece of the chain be processed at its own pace, without waiting for everything else to catch up. However, there are some fundamental obstacles that make this method difficult:
- Interprocess communication: Communication between processes can be tricky, especially when dealing with large amounts of data.
- Resource management: Coordinating the allocation and use of resources (e.g., CPU cores) across the workers is essential, but difficult to do effectively.
- Global consensus: Ensuring that all workers agree on the most recent data, and that chunks are committed in the right order, requires a level of coordination that is difficult to achieve (see the sketch after this list).
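The last point is usually the real barrier. Ranges can be downloaded and verified in parallel, but each block’s state depends on its parent, so results still have to be committed in chain order. Below is a rough sketch of one way a coordinator could buffer out-of-order chunks and commit them sequentially; completedRange and commitInOrder are hypothetical names, not taken from any real client.

```go
package main

import "fmt"

// completedRange is what a worker hands back: which chunk it is (seq) plus a
// placeholder for the verified data.
type completedRange struct {
	seq  int
	data string
}

// commitInOrder is the single coordination point: chunks may finish in any
// order, but nothing is written to the local chain until every earlier chunk
// has been committed first.
func commitInOrder(done <-chan completedRange, total int) {
	pending := make(map[int]completedRange) // out-of-order chunks parked here
	next := 0                               // the chunk we are allowed to commit
	for c := range done {
		pending[c.seq] = c
		for {
			ready, ok := pending[next]
			if !ok {
				break // still waiting on an earlier chunk
			}
			fmt.Printf("committing chunk %d: %s\n", next, ready.data)
			delete(pending, next)
			next++
		}
	}
	if next != total {
		fmt.Printf("stopped early: %d of %d chunks committed\n", next, total)
	}
}

func main() {
	done := make(chan completedRange)
	go func() {
		// Simulate chunks finishing out of order, as parallel workers would.
		for _, seq := range []int{2, 0, 3, 1} {
			done <- completedRange{seq: seq, data: fmt.Sprintf("blocks of chunk %d", seq)}
		}
		close(done)
	}()
	commitInOrder(done, 4)
}
```

Running it, the chunks arrive as 2, 0, 3, 1 but are committed as 0, 1, 2, 3. That serial commit step is exactly the bottleneck that limits how much raw parallelism can help.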
Despite these challenges, some experienced developers are experimenting with this “divide and conquer” approach. For example:
- Binance Smart Chain: Binance used a similar method, splitting the blockchain into smaller chunks and processing them in parallel.
- Polygon: Polygon has also implemented a “divide and conquer” synchronization strategy, using multiple processes to help its nodes sync quickly.
While this approach may not be suitable for all use cases, it is an interesting example of how people are exploring other ways to make syncing more efficient and scalable.