dTON LiteServers High Availability transition, part 2

Our company was the first to sell shared access to TON LiteServers, which our custom changes to the TON node made possible.

Now that more than a year has passed and we have migrated to HA Kubernetes, we would like to share our experience of managing TON servers.

What our infrastructure looks like:

  • 1 entry point network balancer
  • TON Node Balancers:
    • Archive
    • Testnet
    • Mainnet
    • Private (for dedicated clients)
  • TON Nodes

To ensure that our clients always stay up-to-date and that we can roll out updates to the TON Node Balancers, we use a single external balancer that checks the availability of TCP connections and automatically fails over to live TON Node Balancers.
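The TCP check and failover described above can be sketched roughly as follows. This is only an illustration, not our production balancer: the function names, the timeout, and the "first live backend in order" policy are assumptions made for the example.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Backend = Tuple[str, int]  # (host, port) of a TON Node Balancer


def tcp_alive(backend: Backend, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the backend can be opened."""
    try:
        with socket.create_connection(backend, timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the backend as down.
        return False


def pick_backend(
    backends: Iterable[Backend],
    probe: Callable[[Backend], bool] = tcp_alive,
) -> Optional[Backend]:
    """Return the first backend that passes the probe, or None if all are down."""
    for backend in backends:
        if probe(backend):
            return backend
    return None
```

While one TON Node Balancer is restarted during an update, its probe fails and `pick_backend` simply moves on to the next live one, so clients keep a working endpoint.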

Each TON Node Balancer exposes Prometheus metrics that give Grafana information about balancer usage, and we receive critical alerts in Telegram if something goes wrong.

Each TON Node Balancer also has its own balancer-tag, which is used to discover TON Nodes inside the Kubernetes cluster automatically. This lets us add and remove TON Nodes behind a balancer smoothly: each node sends its credentials to its balancer-tag, and the balancer picks up these credentials and connects to the node.
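To make the balancer-tag idea more concrete, here is a minimal in-memory sketch. The real transport between nodes and balancers is not shown; `TagRegistry`, `NodeCredentials`, and their fields are hypothetical names used only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NodeCredentials:
    host: str
    port: int
    public_key: str  # placeholder for whatever the balancer needs to connect


class TagRegistry:
    """Maps a balancer-tag to the nodes that announced themselves under it."""

    def __init__(self) -> None:
        self._by_tag: Dict[str, Dict[str, NodeCredentials]] = defaultdict(dict)

    def announce(self, tag: str, node_id: str, creds: NodeCredentials) -> None:
        # Called by a node on startup: publish credentials under its tag.
        self._by_tag[tag][node_id] = creds

    def withdraw(self, tag: str, node_id: str) -> None:
        # Called when a node is removed from behind the balancer.
        self._by_tag.get(tag, {}).pop(node_id, None)

    def nodes_for(self, tag: str) -> List[NodeCredentials]:
        # Called by a balancer: everything announced under its own tag.
        return list(self._by_tag.get(tag, {}).values())
```

A balancer tagged `mainnet` would periodically call `nodes_for("mainnet")` and open connections to any node it has not seen before, which is what makes scaling up or down possible without editing balancer configuration by hand.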

The main task of a TON Node Balancer is to execute clients' LiteQuery requests quickly. For this purpose we use caching, and we also find the node that received blocks before all others and proxy requests to it.
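The two ideas in this paragraph, caching and routing to the node that saw the newest block first, might look like this in a simplified form. It is a sketch under assumptions: `BalancerSketch`, the `send` callable, and the `cacheable` flag are invented for the example, and the cache here never expires, which is only safe for responses that cannot change.

```python
from typing import Callable, Dict, Optional


class BalancerSketch:
    def __init__(self, send: Callable[[str, bytes], bytes]) -> None:
        self._send = send                    # forwards a request to a node
        self._best_seqno = -1                # highest block seqno seen so far
        self.leader: Optional[str] = None    # node that reported it first
        self._cache: Dict[bytes, bytes] = {}

    def report_block(self, node: str, seqno: int) -> None:
        # Only a strictly newer block changes the leader, so the node that
        # received the block before all others keeps the role on ties.
        if seqno > self._best_seqno:
            self._best_seqno = seqno
            self.leader = node

    def query(self, request: bytes, cacheable: bool = False) -> bytes:
        if cacheable and request in self._cache:
            return self._cache[request]
        if self.leader is None:
            raise LookupError("no node has reported a block yet")
        response = self._send(self.leader, request)
        if cacheable:
            self._cache[request] = response
        return response
```

Requests about already finalized data can be marked `cacheable`, while anything that depends on the latest state goes to the current leader.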

When a client needs to send an external message to the network, we retransmit it to all TON Nodes available to that balancer and also send it to Kafka, from where it is retransmitted to all third-party services.
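The fan-out of an external message can be sketched as below. To keep the example self-contained, both the node transport and the Kafka producer are passed in as plain callables; `broadcast_external_message`, its parameters, and the topic name `external-messages` are assumptions for illustration, not our actual code or configuration.

```python
from typing import Callable, Iterable, List


def broadcast_external_message(
    message: bytes,
    nodes: Iterable[str],
    send_to_node: Callable[[str, bytes], None],
    publish: Callable[[str, bytes], None],
    topic: str = "external-messages",  # hypothetical topic name
) -> List[str]:
    """Send the message to every node and to the message bus.

    Returns the nodes that accepted the message.
    """
    delivered: List[str] = []
    for node in nodes:
        try:
            send_to_node(node, message)
            delivered.append(node)
        except Exception:
            # One failing node must not stop delivery to the others.
            continue
    # In a real deployment this callable would wrap a Kafka producer.
    publish(topic, message)
    return delivered
```

Sending to every node instead of a single one raises the chance that the message reaches validators quickly, and the copy published to the bus lets downstream services consume it without talking to the nodes directly.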

Each TON Node also has a Prometheus metrics endpoint for Grafana and alerts, liveness probes, and a batch of other C++ patches, e.g. reduced external-message checking and retransmission under high load.
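As one possible shape for a liveness check, the sketch below treats a node as live while its last applied block is recent enough. This criterion, the function name, and the 30-second threshold are assumptions chosen for the example; the actual probes may check different things.

```python
import time
from typing import Optional


def is_node_live(
    last_block_unixtime: float,
    max_lag_seconds: float = 30.0,  # hypothetical threshold
    now: Optional[float] = None,
) -> bool:
    """Return True if the node's last block is at most max_lag_seconds old."""
    current = time.time() if now is None else now
    return (current - last_block_unixtime) <= max_lag_seconds
```

A probe handler could call this function and return a failing status when it is False, so that Kubernetes restarts a node that has stopped syncing.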

In short, we have built a stable, highly available infrastructure for our clients. You can try it with the HA_LITESERVER promo code for API usage inside @dtontech_bot.

And if you want a private, dedicated, stable solution, contact us.

@dtontech
