getProgramAccounts is dead, long live the getProgramAccounts!

ChainCrunch Labs
5 min read · Apr 8, 2022

There has been a lot of discussion recently about getProgramAccounts: why it’s “toxic” and what developers should do about it. In this article, we dig into the issue, how it can be fixed, and how the index can help.

What is getProgramAccounts and why do we need it?

Solana RPC provides multiple methods to fetch the current state of the network. These methods are the building blocks of a dApp on Solana. Some are very simple, like getAccountInfo, which fetches the data for a single account, while others are more complex, like getProgramAccounts. In summary, the getProgramAccounts method fetches a subset of the accounts owned by a specific program, based on some filtering criteria.

Ok, that was very technical! Let’s explain using a simple example. The Token standard on Solana defines a straightforward mechanism to create new tokens (like DevCoin!) and send them to different wallets. Of course, this information needs to be stored in accounts. The program responsible for managing these accounts is the Token program, and this is the structure of a token account:

Structure of token accounts in Solana

For example, if you have 10 USDC on Solana, you have a token account with the following information:

  • Mint: EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v (USDC Mint)
  • Owner: Your wallet address
  • Amount: 10.000000 (stored on-chain as the integer 10000000, because the USDC mint uses 6 decimal places)
  • … and the rest of the fields
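
To make the layout a bit more concrete, here is a minimal Python sketch that decodes the first three fields of a token account. It assumes the standard 165-byte layout (Mint at byte 0, Owner at byte 32, Amount as a little-endian u64 at byte 64). The function names and the placeholder keys are made up for this illustration, and the remaining fields are left undecoded.

```python
import struct

TOKEN_ACCOUNT_SIZE = 165  # bytes per token account


def decode_token_account_head(data: bytes) -> dict:
    """Decode the first three fields of a token account: mint, owner, amount."""
    if len(data) != TOKEN_ACCOUNT_SIZE:
        raise ValueError(f"expected {TOKEN_ACCOUNT_SIZE} bytes, got {len(data)}")
    mint = data[0:32]    # bytes 0..31: mint public key
    owner = data[32:64]  # bytes 32..63: owner (wallet) public key
    (raw_amount,) = struct.unpack_from("<Q", data, 64)  # bytes 64..71: u64, little-endian
    return {"mint": mint, "owner": owner, "raw_amount": raw_amount}


def ui_amount(raw_amount: int, decimals: int) -> float:
    """Convert raw base units to a display amount using the mint's decimals."""
    return raw_amount / 10 ** decimals


# Synthetic account with placeholder keys (not real addresses):
# 10 USDC is stored as 10_000_000 base units because USDC has 6 decimals.
fake_mint = bytes([1]) * 32
fake_owner = bytes([2]) * 32
data = fake_mint + fake_owner + struct.pack("<Q", 10_000_000) + bytes(165 - 72)

decoded = decode_token_account_head(data)
print(ui_amount(decoded["raw_amount"], decimals=6))  # 10.0
```

The offsets 0 and 32 are the same ones that show up in the memcmp filters of the RPC requests in this article.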

So far, there are more than 107 million of these accounts on Solana! At 165 bytes each, that is more than 17 GB on the blockchain! Now let’s say you want to find all the token accounts that belong to you. You have to look for all the accounts that store your wallet address in the Owner field, and that is exactly what you ask for in the RPC request! Here is an example:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "getProgramAccounts",
  "params": [
    "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA",
    {
      "encoding": "jsonParsed",
      "filters": [
        {
          "dataSize": 165
        },
        {
          "memcmp": {
            "offset": 32,
            "bytes": "vines1vzrYbzLMRdu58ou5XTby4qAqVRLmqo36NKPTg"
          }
        }
      ]
    }
  ]
}

Earlier, we said that the getProgramAccounts method fetches a subset of the accounts owned by a specific program, based on some filtering criteria. Let’s find this information in our RPC request. Here we want to search for accounts that are owned by the TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA program (the Solana Token standard program), and we have two filtering criteria:

  • The size of the account should be 165, which is the size of token accounts. There are other accounts with different sizes for storing different kinds of information which we don’t need here.
  • We only need the accounts with a specific address stored at offset 32 (the start of the Owner field, right after the 32-byte Mint field). This is accomplished by the memcmp (short for memory compare) filter.
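
The two filters above can be read as a simple yes/no check on an account’s raw bytes. The Python sketch below illustrates those semantics only; it is not the RPC server’s actual code. The helper names `b58decode` and `matches_filters` are made up for this example, and the account is synthetic.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
WALLET = "vines1vzrYbzLMRdu58ou5XTby4qAqVRLmqo36NKPTg"  # address from the request above


def b58decode(text: str) -> bytes:
    """Decode a base58 string (the encoding used for Solana addresses)."""
    number = 0
    for char in text:
        number = number * 58 + B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    leading_zeros = len(text) - len(text.lstrip("1"))  # each leading '1' is a zero byte
    return bytes(leading_zeros) + body


def matches_filters(data: bytes, data_size: int, offset: int, b58_bytes: str) -> bool:
    """Return True if the account passes both the dataSize and the memcmp filter."""
    if len(data) != data_size:
        return False
    wanted = b58decode(b58_bytes)
    # memcmp: the bytes starting at `offset` must equal the decoded filter bytes.
    return data[offset:offset + len(wanted)] == wanted


# Synthetic token account: zeroed Mint, our wallet as Owner, zero padding after it.
owner = b58decode(WALLET)
account = bytes(32) + owner + bytes(165 - 32 - len(owner))

print(matches_filters(account, 165, 32, WALLET))  # True: wallet sits in the Owner field
print(matches_filters(account, 165, 0, WALLET))   # False: the Mint field holds other bytes
```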

That’s it! That’s how you can get all your token accounts in Solana. Now suppose you want to find all of the token accounts for a specific mint. All you have to do is to change the filtering criteria:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "getProgramAccounts",
  "params": [
    "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA",
    {
      "encoding": "jsonParsed",
      "filters": [
        {
          "dataSize": 165
        },
        {
          "memcmp": {
            "offset": 0,
            "bytes": "TESTpKgj42ya3st2SQTKiANjTBmncQSCqLAZGcSPLGM"
          }
        }
      ]
    }
  ]
}
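
Since the two requests differ only in the memcmp filter, they are easy to generate from code. Here is a small Python sketch using only the standard library. The function names are hypothetical, and `rpc_url` stands for whichever RPC endpoint you use; the sketch builds both payloads but does not send anything on its own.

```python
import json
import urllib.request

TOKEN_PROGRAM_ID = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"
TOKEN_ACCOUNT_SIZE = 165
MINT_OFFSET = 0    # Mint is the first field of a token account
OWNER_OFFSET = 32  # Owner starts right after the 32-byte Mint


def build_token_accounts_request(offset: int, address: str, request_id: int = 1) -> dict:
    """Build a getProgramAccounts payload that filters token accounts by one field."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "getProgramAccounts",
        "params": [
            TOKEN_PROGRAM_ID,
            {
                "encoding": "jsonParsed",
                "filters": [
                    {"dataSize": TOKEN_ACCOUNT_SIZE},
                    {"memcmp": {"offset": offset, "bytes": address}},
                ],
            },
        ],
    }


def send_request(rpc_url: str, payload: dict) -> dict:
    """POST the payload to an RPC endpoint and return the parsed JSON response."""
    request = urllib.request.Request(
        rpc_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


by_owner = build_token_accounts_request(
    OWNER_OFFSET, "vines1vzrYbzLMRdu58ou5XTby4qAqVRLmqo36NKPTg"
)
by_mint = build_token_accounts_request(
    MINT_OFFSET, "TESTpKgj42ya3st2SQTKiANjTBmncQSCqLAZGcSPLGM"
)
print(json.dumps(by_mint, indent=2))
```

Calling `send_request(rpc_url, by_owner)` against a real node would run the first query from this article. Keep in mind that a query by mint on a popular token can return a very large response.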

As you can see, getProgramAccounts is a very versatile method: by changing only the filters, we can answer very different questions about the same program’s accounts.

Why is it slow?

It’s not slow! It’s as fast as it can be! RPC servers don’t know about the programs being deployed or their getProgramAccounts (GPA) access patterns. So the best a server can do is check each account owned by the program and see whether it meets the filtering criteria. The cost of this procedure is linear in the number of accounts that have to be scanned. For some specific and popular access patterns, like the ones shown above, RPC servers have configuration options that enable faster retrieval by maintaining an extra indexing layer on top of the raw account data, but that is the exception (and it requires a lot of extra memory and processing power), not the rule. If your program has a few thousand accounts, you won’t notice anything because the scan is super fast. But for programs like Metaplex Metadata, which has 60+ million accounts, all hell will break loose. That’s the main reason this call is so heavy, and so notorious among RPC providers!
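
The difference between scanning and indexing is easy to see in a toy model. The Python sketch below is only an illustration with synthetic data, not how any real RPC node is implemented: the scan touches every account on every query, while the index pays for one pass up front (plus the extra memory) and then answers each query with a single dictionary lookup.

```python
from collections import defaultdict


def scan_by_owner(accounts: list, owner: bytes) -> list:
    """Linear scan: check every account against the dataSize and memcmp filters."""
    return [
        address
        for address, data in accounts
        if len(data) == 165 and data[32:64] == owner
    ]


def build_owner_index(accounts: list) -> dict:
    """One-time pass that groups token account addresses by their Owner bytes."""
    index = defaultdict(list)
    for address, data in accounts:
        if len(data) == 165:
            index[data[32:64]].append(address)
    return index


def fake_account(i: int) -> tuple:
    """Synthetic token account: zeroed Mint, one of 10 fake owners, zero padding."""
    owner = (i % 10).to_bytes(32, "big")
    return (f"account-{i}", bytes(32) + owner + bytes(165 - 64))


accounts = [fake_account(i) for i in range(1000)]
target = (3).to_bytes(32, "big")

index = build_owner_index(accounts)                      # paid once
assert scan_by_owner(accounts, target) == index[target]  # same answer either way
print(len(index[target]))  # 100
```

With 1,000 accounts both approaches feel instant. With tens of millions, the scan is what hurts.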

What can you do?

The replacement for this is a Geyser plugin. Basically, a Geyser plugin listens to all account changes and stores whatever is relevant to your deployed programs; for example, a plugin can store and index the accounts in a PostgreSQL database. You then also have to write an API that serves this data to your dApp frontend.
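
To give a feel for the storage side, here is a rough Python sketch of the idea. It makes several assumptions: it uses the standard library’s sqlite3 as a stand-in for PostgreSQL, and `on_account_update` is a hypothetical callback, not the real Geyser plugin interface (real plugins are written in Rust). The table and column names are made up too.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory table of token accounts with an index on the owner."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token_accounts ("
        " address TEXT PRIMARY KEY,"
        " mint TEXT NOT NULL,"
        " owner TEXT NOT NULL,"
        " amount INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_token_accounts_owner ON token_accounts(owner)")
    return conn


def on_account_update(conn, address: str, mint: str, owner: str, amount: int) -> None:
    """Hypothetical callback: keep only the latest state of each account."""
    conn.execute(
        "INSERT OR REPLACE INTO token_accounts (address, mint, owner, amount)"
        " VALUES (?, ?, ?, ?)",
        (address, mint, owner, amount),
    )


def accounts_by_owner(conn, owner: str) -> list:
    """What the API layer would run instead of a full getProgramAccounts scan."""
    rows = conn.execute(
        "SELECT address, mint, amount FROM token_accounts"
        " WHERE owner = ? ORDER BY address",
        (owner,),
    )
    return rows.fetchall()


conn = open_store()
on_account_update(conn, "acct-1", "mint-A", "wallet-1", 5)
on_account_update(conn, "acct-2", "mint-B", "wallet-1", 7)
on_account_update(conn, "acct-1", "mint-A", "wallet-1", 9)  # a later update wins
print(accounts_by_owner(conn, "wallet-1"))
# [('acct-1', 'mint-A', 9), ('acct-2', 'mint-B', 7)]
```

The database index on the owner column plays the same role as the memcmp filter at offset 32, except that the database does not have to scan every row to answer.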

Here is a short summary of what you have to do:

  • Rent a beefy server to run a validator on top of it (the cheapest servers are around $300/month, a decent one around $1k/month)
  • Learn how to run a validator, how to tune your server, how to monitor it, etc.
  • Write a Geyser plugin in Rust, and attach it to your validator
  • Write an API service that exposes the indexed data to your dApp
  • Carry all the extra maintenance burden as your app grows

Devs after finding out that getProgramAccounts is super slow, and about all the extra work!

But this is too much extra work. It can easily delay your product launch by a month or two, and you may need to hire DevOps and backend developers. That’s where the index can help!

A better alternative

We have been building on top of the Geyser plugin for 6+ months now. We saw the issues with getProgramAccounts from day one, and we did a lot of glass chewing to fix them!

The first message the Solana co-founder sent us! Way before the plugin was renamed to Geyser!

The index’s mission is to help developers build cool dApps without worrying about the infrastructure. As a first step, we are providing a hyper-fast getProgramAccounts RPC. We did all the work described above so you don’t have to. We created a scalable system in which we can add arbitrary indexes to speed up getProgramAccounts queries. Currently, we support multiple scenarios (fetching staking accounts, NFT collection hashlists, ...) and we are continuously adding new indexes. One of the most popular use cases of getProgramAccounts at the moment is finding an NFT collection based on its creators. On a normal RPC service this takes about a minute, but on our nodes it takes less than 5 seconds! Check out theindex.io for more information. #solanaspeed
