I have an application that consists of a master application + DB and a number of edge servers. Each edge server syncs a subset of the master data via custom API calls. I would like to simplify this process by adopting an existing solution for sharding/replication. Some considerations:

  1. The edge servers have bad/unstable connectivity to master and no interconnection is possible between the edge servers.
  2. Can't 100% trust the edge servers, so I don't want to give them full access to master for replicating whatever they want.
  3. Fancy features like multi-master writes or distributed execution of queries between shards are not needed. Each shard should be a fully redundant read-only subset of master.

I've thought about using built-in PostgreSQL logical replication, but I'm not sure how much work it will be to create publications for all the tables with appropriate filters. It's also unfortunate that logical replication doesn't copy table structures, so I would have to keep the schema in sync manually somehow.
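
To give an idea of what "creating publications for all the tables" could look like if it were generated rather than written by hand, here is a minimal sketch. It assumes PostgreSQL 15 or later, where `CREATE PUBLICATION ... FOR TABLE ... WHERE (...)` row filters are available. The table names, the `edge_id` and `region_id` columns, the `edge_<id>_pub` naming scheme, and the `TableFilter` / `publication_sql` helpers are all hypothetical and not part of my actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TableFilter:
    """One published table plus its row filter (hypothetical helper)."""
    table: str      # table name, optionally schema-qualified
    predicate: str  # row filter containing an {edge_id} placeholder


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def publication_sql(edge_id: int, filters: list[TableFilter]) -> str:
    """Build one CREATE PUBLICATION statement for a single edge server."""
    pub_name = quote_ident(f"edge_{edge_id}_pub")  # assumed naming scheme
    parts = []
    for f in filters:
        schema, _, table = f.table.rpartition(".")
        qualified = ".".join(quote_ident(p) for p in (schema or "public", table))
        # int() keeps the substituted value a plain integer literal
        where = f.predicate.format(edge_id=int(edge_id))
        parts.append(f"    {qualified} WHERE ({where})")
    return f"CREATE PUBLICATION {pub_name} FOR TABLE\n" + ",\n".join(parts) + ";"


if __name__ == "__main__":
    # Hypothetical tables and filter columns, for illustration only
    filters = [
        TableFilter("public.orders", "edge_id = {edge_id}"),
        TableFilter("customers", "region_id = {edge_id}"),
    ]
    print(publication_sql(7, filters))
```

The generated statement would still have to be run on master once per edge server, and the matching table definitions would still need to exist on the edge beforehand (for example from a `pg_dump --schema-only` dump), which is the part I'd like to avoid maintaining by hand.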

The third-party solutions I've looked at (Citus, repmgr) don't seem quite right either, considering the three points above.

Is there something I'm missing?

  • The bad/unstable connectivity to dozens/hundreds of nodes rules out pretty much everything "standard". You could do logical replication to a local bunch of separate instances and then WAL shipping from those to the nodes when they are available... Commented May 15, 2024 at 9:13
  • Thanks for the suggestion @RichardHuxton. The connectivity isn't THAT bad. We just have to assume some nodes will disconnect randomly and come back after 1-24h. In that case they would certainly have to do a fresh sync, which is OK but needs to be automated. Commented May 15, 2024 at 9:21
