Should I keep using ULID or switch to CUID2 for enterprise software? [closed]
Closed 4 days ago.
How to rebalance data across nodes?
I am implementing a message queue where messages are distributed across nodes in a cluster. The goal is to design a system to be able to auto-scale without needing to keep a global map of each message and its location.
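One common way to place messages without a global message-to-node map is consistent hashing. The following is a minimal sketch of that technique, not the asker's design: the `HashRing` class, the `vnodes` parameter, and the node and message names are all hypothetical. Each node only needs the list of cluster members to compute where a message lives.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is salted per process,
    # so it cannot be used to agree on placement across nodes.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node) tuples
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node owns several points on the ring to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def node_for(self, message_id: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point at or after the message's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (_hash(message_id),))
        return self._ring[idx % len(self._ring)][1]
```

With this layout, adding a node moves only the messages that now hash to the new node's points; messages owned by the other nodes stay where they are, which keeps rebalancing traffic bounded when the cluster scales.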
Subscribing to a range of topics per instance of a service
I have a system where my distributed service sends live scores of thousands of football games from some hypothetical event to millions of clients. The service subscribes to the games from a source service that publishes the scores. The source publishes by game ID, i.e. each game is published on a different topic that my service subscribes to.
How can a distributed system satisfy CP in CAP theorem?
If a system is partition tolerant, it’s impossible for it to be consistent, since there’s no way for one node to update another. How is it possible to be both consistent and partition tolerant?
How do I find frameworks/libraries to explore for my distributed measurement system?
I have been professionally writing code for years, but I have always been working in existing (legacy) code bases. I’m currently building a new system from scratch and I’m finding out that I have a massive blind spot when it comes to existing frameworks/libraries. There is so much out there, that I’m having a hard time finding things that are relevant for me. I guess I’m looking for some pointers or even just the right terminology to google.
How about using a Kubernetes StatefulSet to map the Snowflake datacenter ID and worker ID?
I am developing a distributed ID project, currently using the Twitter Snowflake ID as the foundation of the distributed ID. In a Kubernetes cluster, obtaining a unique, non-conflicting datacenter ID and worker ID is hard. Currently I am using podop % 32, and as the number of pods increases, the number of worker ID conflicts increases.
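The StatefulSet idea in the title can be sketched as follows. Pods in a StatefulSet get stable hostnames of the form `<statefulset-name>-<ordinal>`, so the ordinal can serve as a collision-free source for both IDs. This is only an illustration under stated assumptions: the function name `ids_from_hostname` is hypothetical, the 5-bit/5-bit split follows the `% 32` in the question, and packing the ordinal across both fields is one possible choice rather than a prescribed one.

```python
import re
from typing import Tuple

WORKER_BITS = 5      # assumption: 5-bit worker ID, matching "% 32"
DATACENTER_BITS = 5  # assumption: 5-bit datacenter ID


def ids_from_hostname(hostname: str) -> Tuple[int, int]:
    """Derive (datacenter_id, worker_id) from a StatefulSet pod hostname."""
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"no StatefulSet ordinal in hostname: {hostname!r}")
    ordinal = int(match.group(1))
    # Fail loudly instead of wrapping around, which is what causes conflicts.
    if ordinal >= 1 << (WORKER_BITS + DATACENTER_BITS):
        raise ValueError(f"ordinal {ordinal} does not fit in 10 bits")
    datacenter_id = ordinal >> WORKER_BITS
    worker_id = ordinal & ((1 << WORKER_BITS) - 1)
    return datacenter_id, worker_id
```

A pod would call this with its own hostname, for example the value of `socket.gethostname()`. Under these assumptions a single StatefulSet can run up to 1024 replicas without two pods sharing an ID pair; beyond that the function raises rather than silently reusing an ID.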
Deduplication, Grouping for events table at scale
I’m working with an events table where different source tables trigger writes into this table with columns: entity_id and payload. These events are then published to a Kafka topic using a message relay service. The table is partitioned hourly based on event_time, handling a high scale of ~5M+ rows per hour. After a row is processed and published, we mark it as processed=true and drop partitions after 24 hours to avoid performance issues from deleting individual rows.
Designing a Distributed System for Indigenous Data Sovereignty Across Nations
I’m looking for some quick “back-of-the-napkin” thoughts from systems engineers on the following scenario:
Which node type (A,B,C,D,E), and how many of them, would you choose to run this distributed job, and based on what reasoning? [closed]
Closed 1 min ago.