Tag Archive for distributed-system

How to rebalance data across nodes?

I am implementing a message queue where messages are distributed across the nodes of a cluster. The goal is to design the system so that it can auto-scale without keeping a global map of each message and its location.
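One common approach to this (not something the question states) is consistent hashing: each node owns segments of a hash ring, so a message's location is computed from its key rather than stored in a global map, and adding or removing a node only moves the keys on adjacent segments. A minimal sketch, assuming messages are routed by a string key; the `HashRing` class and its parameters are illustrative:

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring: maps keys to nodes without a global lookup table."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per physical node; smooths the distribution
        self._ring = []        # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key: str) -> str:
        """The first ring point clockwise from the key's hash owns the key."""
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, ""))
        return self._ring[idx % len(self._ring)][1]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("message-42"))   # routing is computed, never stored
ring.add_node("node-d")              # scale-out moves only ~1/N of the keys
```

Because placement is a pure function of the key, auto-scaling only requires rehanding off the affected ring segments, which is exactly what avoids the global message map.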

Subscribing to a range of topics per instance of a service

I have a system where my distributed service sends live scores of thousands of football games from some hypothetical event to millions of clients. The service subscribes to the games from a source service that publishes the scores. The source publishes on the id of the game, i.e. each game is published on a different topic that my service subscribes to.
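One way to split the subscription load (an assumption, not something the question specifies) is to hash each game id onto a fixed instance, so every instance subscribes to a disjoint subset of the game topics. A minimal sketch; the fleet size, instance index, and `scores.<game_id>` topic naming are all hypothetical:

```python
import hashlib

NUM_INSTANCES = 4     # assumed fleet size; in practice discovered from the orchestrator
INSTANCE_INDEX = 2    # this instance's slot, e.g. a StatefulSet ordinal

def owns(game_id: str) -> bool:
    """Deterministically assign each game topic to exactly one instance."""
    h = int(hashlib.sha1(game_id.encode()).hexdigest(), 16)
    return h % NUM_INSTANCES == INSTANCE_INDEX

def topics_to_subscribe(all_game_ids):
    # Each instance ends up with a disjoint ~1/N share of the game topics.
    return [f"scores.{gid}" for gid in all_game_ids if owns(gid)]

print(topics_to_subscribe([f"game-{i}" for i in range(10)]))
```

The trade-off of modulo assignment is that changing NUM_INSTANCES reshuffles most topics; a consistent-hash ring (as in the previous sketch) limits that churn.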

How do I find frameworks/libraries to explore for my distributed measurement system?

I have been writing code professionally for years, but I have always worked in existing (legacy) code bases. I'm currently building a new system from scratch, and I'm discovering that I have a massive blind spot when it comes to existing frameworks/libraries. There is so much out there that I'm having a hard time finding things that are relevant to me. I guess I'm looking for some pointers, or even just the right terminology to google.

Deduplication, Grouping for events table at scale

I’m working with an events table into which different source tables trigger writes; it has the columns entity_id and payload. These events are then published to a Kafka topic by a message-relay service. The table is partitioned hourly on event_time and handles a high volume (~5M+ rows per hour). After a row is processed and published, we mark it processed=true, and we drop partitions after 24 hours to avoid the performance cost of deleting individual rows.
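A sketch of one possible dedup/group step applied to a batch before publishing. The entity_id, event_time, and payload columns come from the question; the collapse-to-latest-per-entity rule is an assumption about the desired semantics:

```python
from collections import OrderedDict

def dedupe_latest(rows):
    """Collapse a batch to one event per entity_id, keeping the newest payload.

    `rows` are (entity_id, event_time, payload) tuples read from one hourly
    partition; output order follows each entity's first appearance.
    """
    latest = OrderedDict()
    for entity_id, event_time, payload in rows:
        current = latest.get(entity_id)
        if current is None or event_time > current[0]:
            latest[entity_id] = (event_time, payload)
    return [(eid, t, p) for eid, (t, p) in latest.items()]

batch = [
    ("order-1", 1, {"status": "created"}),
    ("order-2", 2, {"status": "created"}),
    ("order-1", 3, {"status": "paid"}),   # newer event for the same entity wins
]
for event in dedupe_latest(batch):
    print(event)
```

Grouping per entity like this also preserves per-key ordering if the relay publishes to Kafka with entity_id as the message key.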