Accepting payments at scale on Solana

Solana and the quirks

Accepting payments seems like a trivial task: fetch the recent transactions of an address, identify the new ones, and process them accordingly.
There are complications, however.

Suppose we support a platform with thousands of users, give each one a unique deposit address, choose a rich ecosystem such as Node.js, and add a blockchain to the mix, say Solana. What might go wrong?

Solana produces a block every 400 milliseconds on average, which significantly constrains the possible design choices.
Retrieving the recent transactions for each deposit address, and repeating that every second or so, does not scale. It is much better to scan every new block on the chain, go through its transactions, identify those relevant to the platform, and process them.
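
As a rough sketch of the matching step, assume the block arrives in the raw JSON-RPC `getBlock` shape with JSON encoding: base58 strings in `transaction.message.accountKeys`, and balances in `meta.preBalances` and `meta.postBalances`. The scan then reduces to a lookup against a `Set` of deposit addresses. The names `findDeposits` and `depositAddresses` are made up for this illustration. `@solana/web3.js` returns `PublicKey` objects rather than strings, so keys would need converting first, and both SPL token transfers and addresses loaded through lookup tables are ignored here.

```javascript
// Hypothetical helper: returns native SOL deposits found in one block.
// `block` is assumed to follow the raw JSON-RPC getBlock response (encoding "json").
function findDeposits(block, depositAddresses) {
    const deposits = [];
    for (const { transaction, meta } of block.transactions) {
        // Skip transactions without metadata and transactions that failed.
        if (!meta || meta.err !== null) continue;

        const keys = transaction.message.accountKeys;
        for (let i = 0; i < keys.length; ++i) {
            if (!depositAddresses.has(keys[i])) continue;

            // Balances are listed in the same order as the account keys.
            const lamports = meta.postBalances[i] - meta.preBalances[i];
            if (lamports > 0) {
                deposits.push({
                    signature: transaction.signatures[0],
                    address: keys[i],
                    lamports
                });
            }
        }
    }
    return deposits;
}
```

A `Set` keeps the per-account check at constant time, so the cost per block grows with the number of transactions in the block rather than with the number of users.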

Getting started

Before even starting, the first choice must be made: the RPC provider.
The public Solana RPC endpoint is free, but it is slow and heavily rate-limited, so it is not suitable for this problem.
It is better to go with a service such as quicknode.com.

Retrieving every block is an intensive task. This will cost roughly $150 each month.
Self-hosting a Solana node is not a cheaper alternative, since it requires a dedicated host with 256 GiB to 1 TiB of RAM, depending on configuration details.

Obvious attempt

This is what fetching every block sequentially looks like.

const web3 = require("@solana/web3.js");

async function get(connection, index) {
    const start = Date.now();
    let res = null;
    try {
        res = await connection.getBlock(index, {
            maxSupportedTransactionVersion: 0, // accept versioned (v0) transactions
            commitment: "finalized",
            rewards: false // reward data is not needed here
        });
        console.log(`${index}:${res.blockhash} - ${Date.now() - start} ms`);
    } catch (err) {
        console.log(`${index}:emptyslot - ${Date.now() - start} ms`); // any failure is reported as an empty slot
    }

    return res;
}

async function main() {
    const start = Date.now();
    const url = "https://key-part-1.quiknode.pro/key-part-2";
    const connection = new web3.Connection(url, "confirmed");

    for (let index = 0; index < 24; ++index) {
        const block = await get(connection, 306397256 + index);
    }

    console.log(`total: ${Date.now() - start} ms`);
}

main();

The output.

306397256:38dyuUps2AG6iqW6T8wWjegnrCAtw3kG5JfN2SNCV3YK - 1759 ms
306397257:AusWP4vH3VkczWHz4C6K4S8pBZPG7apbyEy1oCr7YFWy - 1039 ms
306397258:74GUsvi3uuL6ophpknf4K1BpEBQEjEqWSFyTGctiADJC - 875 ms
306397259:HSXwVhnyzHg5j4LbmWXJdjTqvSZj1P7tUHJFhgeYcb2x - 1126 ms
306397260:6NossGi6JdFiQZbiREuh9F3SzesJrFcLYqrtwG56MQPb - 1002 ms
306397261:F9CxWpv8Us5CGV3foA7H47EFEVzyrGXr4LwHMNft4seP - 1025 ms
306397262:8N1UCXW4bRPZ2CyqoX7bMo1oNbn3TgqLe15UhyifeFGb - 857 ms
306397263:8NUKaiGuCH7qJgvqAX8XaM7ti3ADhzNAzGjrxfjj61TG - 954 ms
306397264:FaSJyMr4i3Uj5ZmFsjoTqHKb1L1df8L62nLZS9C6Q37E - 1045 ms
306397265:537NpzEVaVs98m2iU9j3DyWPV48cQZZUmpiY2Bp6wBxd - 886 ms
306397266:7NbVdPC5XGETJmKXnLYZeHBPHn4f6xwZwckzrG1UNzBH - 1200 ms
306397267:E675HHi7cczR2sd7eLKxgvEmvMiPCFQMdhVu2CKNERtK - 649 ms
306397268:2pfVswukPpsgQ74bphciMcNGBUAeVvmWeKBwrB1WRY3P - 804 ms
306397269:Fk8FbAkAeLUpxRQBvSMMo8cZfkT4XQ4WwjudQBYK72na - 911 ms
306397270:HPudvUyYiDYQWSXc6qyyjodhBYYnNnT3f7HtiWiwPx8q - 317 ms
306397271:912LhRCHxJp84BFdeC8QuBX97xuV57VddzCgm9ZhuXQv - 754 ms
306397272:7wbGUbbN6RoAbTfRZ5wAeMfjhVDLBfeG3o2Hf8aaWuZA - 1178 ms
306397273:41qSJAb4L6vrVTB1uBQ6eUqXS3cMMsZNwCgRTaHGGcuv - 941 ms
306397274:AxbLT7Y1P5AaP5Lm9JVYAA4iz21SdvcXzXWPGGU16unk - 875 ms
306397275:C8C8HQq3JEwJ6wiuRuqPmpy1Q4ch3KF7B72DbUahajdQ - 769 ms
306397276:URddsoYTYvu6znBYQL5xNFfUz13rokKcX37bJY4XqJb - 923 ms
306397277:9wf3uy7fbUinUUgqJgkSop1bkyYQpeu4p4tdMth9L1iV - 904 ms
306397278:Etbq6BwPeZSnFW7x9sYkDThwA5ejChfqcPV5JHb7gRNL - 954 ms
306397279:GiEpin5bNr3zDAng51DXDAuooVZTyHb5DNhg2bc94tYP - 1049 ms
total: 22799 ms

A run over 240 blocks took 229,558 ms in total, with CPU usage at around 60%.
This does not even go through the transactions, and on average it takes about 950 ms per block. At that rate it will never keep up with the pace at which new blocks appear on the network.
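
To put numbers on that gap, here is a small back-of-the-envelope calculation. It assumes a block in every 400 ms slot (in practice some slots are skipped, so the real backlog grows somewhat more slowly) and a constant fetch time. `backlogAfter` is a name made up for this sketch.

```javascript
// Hypothetical helper: how many blocks a sequential scanner is behind
// after `elapsedMs`, if the chain produces one block per `slotMs`
// and fetching one block takes `fetchMs`.
function backlogAfter(elapsedMs, slotMs, fetchMs) {
    const produced = Math.floor(elapsedMs / slotMs);
    const fetched = Math.floor(elapsedMs / fetchMs);
    return Math.max(0, produced - fetched);
}

const hour = 60 * 60 * 1000;
console.log(backlogAfter(hour, 400, 950)); // 5211 blocks behind after one hour
```

Under these assumptions, the sequential scanner falls more than 5,000 blocks behind for every hour it runs.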

Naive parallel pages

Let’s change it so that the requests are sent in parallel, page by page, by replacing the sequential for loop in main() (lines 25-27 of the code block above) with the following.

const page = 4;
let promises = [];
for (let index = 0; index < 24; ++index) {
    if (promises.length === page) {
        await Promise.all(promises);
        promises = [];
    }
    promises.push(get(connection, 306397256 + index));
}
await Promise.all(promises);

The output.

306397257:AusWP4vH3VkczWHz4C6K4S8pBZPG7apbyEy1oCr7YFWy - 2132 ms
306397256:38dyuUps2AG6iqW6T8wWjegnrCAtw3kG5JfN2SNCV3YK - 2637 ms
306397258:74GUsvi3uuL6ophpknf4K1BpEBQEjEqWSFyTGctiADJC - 3015 ms
306397259:HSXwVhnyzHg5j4LbmWXJdjTqvSZj1P7tUHJFhgeYcb2x - 3601 ms
306397262:8N1UCXW4bRPZ2CyqoX7bMo1oNbn3TgqLe15UhyifeFGb - 1485 ms
306397263:8NUKaiGuCH7qJgvqAX8XaM7ti3ADhzNAzGjrxfjj61TG - 2030 ms
306397261:F9CxWpv8Us5CGV3foA7H47EFEVzyrGXr4LwHMNft4seP - 2575 ms
306397260:6NossGi6JdFiQZbiREuh9F3SzesJrFcLYqrtwG56MQPb - 3105 ms
306397267:E675HHi7cczR2sd7eLKxgvEmvMiPCFQMdhVu2CKNERtK - 1226 ms
306397266:7NbVdPC5XGETJmKXnLYZeHBPHn4f6xwZwckzrG1UNzBH - 1735 ms
306397264:FaSJyMr4i3Uj5ZmFsjoTqHKb1L1df8L62nLZS9C6Q37E - 2339 ms
306397265:537NpzEVaVs98m2iU9j3DyWPV48cQZZUmpiY2Bp6wBxd - 2823 ms
306397270:HPudvUyYiDYQWSXc6qyyjodhBYYnNnT3f7HtiWiwPx8q - 619 ms
306397268:2pfVswukPpsgQ74bphciMcNGBUAeVvmWeKBwrB1WRY3P - 1309 ms
306397269:Fk8FbAkAeLUpxRQBvSMMo8cZfkT4XQ4WwjudQBYK72na - 1762 ms
306397271:912LhRCHxJp84BFdeC8QuBX97xuV57VddzCgm9ZhuXQv - 2072 ms
306397275:C8C8HQq3JEwJ6wiuRuqPmpy1Q4ch3KF7B72DbUahajdQ - 1358 ms
306397274:AxbLT7Y1P5AaP5Lm9JVYAA4iz21SdvcXzXWPGGU16unk - 1844 ms
306397273:41qSJAb4L6vrVTB1uBQ6eUqXS3cMMsZNwCgRTaHGGcuv - 2327 ms
306397272:7wbGUbbN6RoAbTfRZ5wAeMfjhVDLBfeG3o2Hf8aaWuZA - 2970 ms
306397277:9wf3uy7fbUinUUgqJgkSop1bkyYQpeu4p4tdMth9L1iV - 1662 ms
306397276:URddsoYTYvu6znBYQL5xNFfUz13rokKcX37bJY4XqJb - 2207 ms
306397279:GiEpin5bNr3zDAng51DXDAuooVZTyHb5DNhg2bc94tYP - 2714 ms
306397278:Etbq6BwPeZSnFW7x9sYkDThwA5ejChfqcPV5JHb7gRNL - 3211 ms
total: 17803 ms

Page  Blocks  Total ms  Average ms
4     24      17,803    740
24    24      15,838    660
16    240     165,464   690
32    240     158,440   660
48    240     154,446   640

CPU usage fluctuated between 80% and 130%.

So, a little better at 660 ms per block, but larger pages reveal a problem. At the beginning of a page, the burst of requests hits the rate limits of the service. At the end of the page, most of the promises have already settled and sit idle, waiting for the last requests instead of moving on to request more blocks.
Let us fix this below.
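
The cost of waiting for the slowest request in every page can be illustrated without touching the network. The sketch below is only a simulation under assumed, made-up latencies: `pagedMakespan` and `pooledMakespan` are hypothetical helpers, and rate limiting is ignored entirely.

```javascript
// Total time when tasks run in fixed pages:
// each page lasts as long as its slowest task.
function pagedMakespan(durations, page) {
    let total = 0;
    for (let i = 0; i < durations.length; i += page) {
        total += Math.max(...durations.slice(i, i + page));
    }
    return total;
}

// Total time when `workers` consumers each pull the next task
// from a shared queue as soon as they become free.
function pooledMakespan(durations, workers) {
    const freeAt = new Array(Math.min(workers, durations.length)).fill(0);
    for (const duration of durations) {
        // The next task goes to the consumer that becomes free first.
        let first = 0;
        for (let i = 1; i < freeAt.length; ++i) {
            if (freeAt[i] < freeAt[first]) first = i;
        }
        freeAt[first] += duration;
    }
    return freeAt.length ? Math.max(...freeAt) : 0;
}

// Made-up latencies in milliseconds, with two slow requests among fast ones.
const latencies = [100, 900, 100, 100, 900, 100, 100, 100];
console.log(pagedMakespan(latencies, 2));  // 2000
console.log(pooledMakespan(latencies, 2)); // 1200
```

With the same parallelism of two, the paged variant pays for the slow request in each page, while the shared queue keeps both consumers busy.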

Consumer threads

Without using real OS threads, this demonstrates the concept: constantly running "threads" each pick a block index from a common task list, process it on their own, and advance to the next one.

const blocks = [];
for (let index = 0; index < 24; ++index) {
    blocks.push(306397256 + index);
}

const thread = async (connection, blocks) => {
    while (blocks.length) {
        const block = blocks.shift();
        await get(connection, block);
    }
}

const threads = [];
for (let index = 0; index < 24; ++index) {
    threads.push(thread(connection, blocks));
}
await Promise.all(threads);

The output.

306397270:HPudvUyYiDYQWSXc6qyyjodhBYYnNnT3f7HtiWiwPx8q - 2055 ms
306397267:E675HHi7cczR2sd7eLKxgvEmvMiPCFQMdhVu2CKNERtK - 4324 ms
306397268:2pfVswukPpsgQ74bphciMcNGBUAeVvmWeKBwrB1WRY3P - 4964 ms
306397269:Fk8FbAkAeLUpxRQBvSMMo8cZfkT4XQ4WwjudQBYK72na - 5415 ms
306397262:8N1UCXW4bRPZ2CyqoX7bMo1oNbn3TgqLe15UhyifeFGb - 5878 ms
306397266:7NbVdPC5XGETJmKXnLYZeHBPHn4f6xwZwckzrG1UNzBH - 6381 ms
306397256:38dyuUps2AG6iqW6T8wWjegnrCAtw3kG5JfN2SNCV3YK - 6922 ms
306397257:AusWP4vH3VkczWHz4C6K4S8pBZPG7apbyEy1oCr7YFWy - 7386 ms
306397263:8NUKaiGuCH7qJgvqAX8XaM7ti3ADhzNAzGjrxfjj61TG - 7947 ms
306397259:HSXwVhnyzHg5j4LbmWXJdjTqvSZj1P7tUHJFhgeYcb2x - 8551 ms
306397265:537NpzEVaVs98m2iU9j3DyWPV48cQZZUmpiY2Bp6wBxd - 9039 ms
306397258:74GUsvi3uuL6ophpknf4K1BpEBQEjEqWSFyTGctiADJC - 9463 ms
306397260:6NossGi6JdFiQZbiREuh9F3SzesJrFcLYqrtwG56MQPb - 9988 ms
306397264:FaSJyMr4i3Uj5ZmFsjoTqHKb1L1df8L62nLZS9C6Q37E - 10599 ms
306397261:F9CxWpv8Us5CGV3foA7H47EFEVzyrGXr4LwHMNft4seP - 11093 ms
306397271:912LhRCHxJp84BFdeC8QuBX97xuV57VddzCgm9ZhuXQv - 11413 ms
306397275:C8C8HQq3JEwJ6wiuRuqPmpy1Q4ch3KF7B72DbUahajdQ - 11821 ms
306397273:41qSJAb4L6vrVTB1uBQ6eUqXS3cMMsZNwCgRTaHGGcuv - 12312 ms
306397277:9wf3uy7fbUinUUgqJgkSop1bkyYQpeu4p4tdMth9L1iV - 12807 ms
306397279:GiEpin5bNr3zDAng51DXDAuooVZTyHb5DNhg2bc94tYP - 13327 ms
306397274:AxbLT7Y1P5AaP5Lm9JVYAA4iz21SdvcXzXWPGGU16unk - 13799 ms
306397278:Etbq6BwPeZSnFW7x9sYkDThwA5ejChfqcPV5JHb7gRNL - 14313 ms
306397276:URddsoYTYvu6znBYQL5xNFfUz13rokKcX37bJY4XqJb - 14861 ms
306397272:7wbGUbbN6RoAbTfRZ5wAeMfjhVDLBfeG3o2Hf8aaWuZA - 15512 ms
total: 15533 ms

“Threads”  Blocks  Total ms  Average ms
24         24      15,533    647
8          240     121,105   505
16         240     123,711   515
32         240     125,031   521
48         240     127,525   530

This gives a better result, and CPU usage is more stable at around 110%, with much smaller deviation.

Real consumer threads

Ultimately, the 110% CPU usage shows that real OS threads are needed to solve this problem.
My laptop CPU, an i7-1165G7, is unable to keep up with the expected rate while limited to a single core.

For production it is necessary to use OS threads. Node.js supports them through worker threads, and even though multithreading is usually advised against in this ecosystem, it gets the job done and leaves enough headroom to go through each transaction individually.
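
A minimal sketch of that idea with Node's built-in `worker_threads` module follows. It is not the production code: the worker source is inlined through the `eval` option only to keep the example in one file, and the loop inside the worker is a stand-in for fetching and parsing a block. `runPool` and the checksum are names invented for this illustration.

```javascript
const { Worker } = require("worker_threads");

// Worker source, inlined so that the sketch fits in one file.
// A real worker would live in its own file and fetch the block over RPC.
const workerSource = `
const { parentPort } = require("worker_threads");
parentPort.on("message", (slot) => {
    // Stand-in for CPU-heavy block parsing.
    let checksum = 0;
    for (let i = 0; i < 1e5; ++i) checksum = (checksum + slot * i) % 1000003;
    parentPort.postMessage({ slot, checksum });
});
`;

// Hypothetical pool: `size` OS threads consume slots from one shared queue.
function runPool(slots, size) {
    return new Promise((resolve, reject) => {
        const queue = slots.slice();
        const results = [];
        let running = 0;

        // Give the worker its next slot, or shut it down when the queue is empty.
        const feed = (worker) => {
            if (queue.length) {
                worker.postMessage(queue.shift());
            } else {
                worker.terminate();
                if (--running === 0) resolve(results);
            }
        };

        for (let i = 0; i < Math.max(1, size); ++i) {
            const worker = new Worker(workerSource, { eval: true });
            ++running;
            worker.on("message", (result) => {
                results.push(result);
                feed(worker);
            });
            worker.on("error", reject);
            feed(worker);
        }
    });
}

const slots = Array.from({ length: 24 }, (_, index) => 306397256 + index);
runPool(slots, 4).then((results) => console.log(`processed ${results.length} slots`));
```

The queue lives on the main thread, and each worker asks for more work only after it has returned a result. This is the same consumer pattern as above, now spread across cores.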