Newest 'sharding' Questions

-4 votes

0 answers

18 views

Did not find local replica set configuration document at startup shard pod kubernetes replicaset is not initialized [closed]

{"t":{"$date":"2025-11-13T16:42:03.966+00:00"},"s":"I", "c":"STORAGE", "id":4795901, "ctx":"...

mister Robot

1

asked Nov 13 at 17:26

3 votes

0 answers

101 views

Azure Cosmos: MongoDB sharded cluster: $in query performance degrades with fewer values

I'm facing a counter-intuitive performance issue with my MongoDB sharded cluster where queries with fewer values in an $in clause are significantly slower than queries with more values. The Issue: ...

9308

31

asked Oct 22 at 15:21

1 vote

1 answer

116 views

Sharding a large parquet file in Polars makes the system crash

I have been trying to make shards from a 20 GB .parquet file, and my code tends to crash the system by the time it reaches the last shard, though it sometimes works. I am quite new to working with ...

Manu Srivastava

11

asked Sep 9 at 16:18

0 votes

0 answers

42 views

Why does the first Gremlin query take significantly more time in distributed JanusGraph setup on Kubernetes?

Environment Setup I'm working with a distributed JanusGraph architecture deployed on Azure Kubernetes Service (AKS): Infrastructure: AKS Cluster: 2 nodes (16 vCPU, 64 GB RAM each) Cassandra: 2 ...

Ravindra Gupta

642

asked Jun 2 at 15:23

0 votes

1 answer

88 views

Why does my MongoDB sharded cluster have only 10 chunks after importing 716 million documents?

I have a MongoDB (v8.0.9) sharded cluster running in Kubernetes with the following setup: 3 shards, each with 2 replicas An empty collection was created with a hashed shard key (over a UUID field) ...

bigbit

1

asked May 21 at 21:45

1 vote

0 answers

52 views

Application not selecting rows from sharded db

I'm working on a very "high loaded" project (Java, Spring, PostgreSQL, K8s), and we needed to split our DB into partitions, and then into shards. There is also a scheduler on each shard, and a ...

denstran

366

asked May 3 at 16:35

-1 votes

1 answer

45 views

How to manage shard mechanism into the tables?

I have a database with 100 million records and 10 tables. Out of these, I’ve applied sharding (based on geo-location) to only 3 tables. The remaining tables are not sharded. Now, if I want to run a ...

Krupesh Patel

49

asked Apr 18 at 11:02

0 votes

1 answer

82 views

Elasticsearch Shard Limit Reached in Magento 2

While running reindexing in our Magento 2 environment, we encountered the following error: {"error":{"root_cause":[{"type":"validation_exception","reason...

Rakesh parsoya

21

asked Jan 23 at 17:59

1 vote

0 answers

40 views

Best Approach to Handle Unsharded Collections in a MongoDB Global Write Cluster?

Thanks for taking the time to look into my issue. I currently have a MongoDB Atlas global write cluster with three region-specific shards in UK, US, and HK. The unsharded collections are located ...

Shailendra Garg

33

asked Jan 7 at 3:24

3 votes

1 answer

140 views

Sharded Discord.js bot resets to initial presence status

I wrote a Discord.js bot in NodeJS that uses discord-hybrid-sharding to spawn my bot.js. In the bot.js code below, you can see that I have an initial status of "Starting..." and then in ...

NullDev

7,378

asked Oct 2, 2024 at 15:16

0 votes

1 answer

54 views

Per EntityType Active Entity Limit in Cluster Sharding Passivation

I have a clustered system that has a number of different entity types with different memory and computation complexity characteristics and would like to use active-entity-limit in my Passivation ...

Arne Claassen

14.5k

asked Aug 25, 2024 at 18:45

0 votes

1 answer

111 views

ClickHouse: copy existing data to a new cluster with a different layout

I have a ClickHouse setup running version 21.8 with 3 shards, none of them are replicated. This setup holds 92 tables occupying approximately 60G of data. SELECT cluster, shard_num, shard_weight, ...

zzzzzz

15

asked Aug 22, 2024 at 10:36

0 votes

1 answer

343 views

Replication vs Sharding (for read scalability vs write scalability)

Can someone explain why they say db replication is more ideal for read scalability while sharding is more ideal for write scalability? From my current understanding: replication allows read traffic ...

Joshua Choe

1

asked Aug 4, 2024 at 3:58

0 votes

0 answers

94 views

Efficient method for sharding BigQuery table collection

I would like to ask for advice regarding the following task: assume a collection of BQ tables bearing names with structure name_YYYYMM and containing each a DATETIME type column called date_time whose ...

ΑΘΩ

131

asked Jun 4, 2024 at 17:08

0 votes

0 answers

148 views

Create read-only shards from a postgres DB

I have an application that consists of a master application+DB and a bunch of edge servers. Each edge server syncs a subset of the master data via custom API calls. I would like to simplify this ...

Philon

142

asked May 15, 2024 at 7:28

0 votes

1 answer

177 views

Index based routing allocation not working - Elasticsearch 5.x

As a dependent question to an existing open thread, where I was doubting the strategy or implementation details of settings when modified for a cluster as well as an index. While I had tried to ...

Naman

32.7k

asked May 13, 2024 at 19:34

0 votes

1 answer

178 views

Cron job scheduled in spring app to distribute data with all instances

I have a design question. A cron job is scheduled in a Spring app using @Scheduled. I have 4 instances and I want the job to run on all the instances by distributing the data. Say I need to process 1000 data, ...

Suhashini Lokesh

1

asked May 7, 2024 at 5:45

1 vote

1 answer

1k views

Elastic maximum shards open best practices

Does anyone know how I can solve Error: Validation Failed: 1: this action would add [2] shards, but this cluster currently has [3000]/[3000] maximum normal shards open? I know Elasticsearch suggests ...

Hadii Varposhti

424

asked Apr 28, 2024 at 8:16

0 votes

1 answer

178 views

How are failures and restore operations handled in sharding (consistent hashing)?

In consistent hashing, suppose we are using username as the hashing key: hashFunction(username) = nodeA. Now from what I understand, if there is any failure or a node is removed, requests will be directed to ...

Disha Gupta

47

asked Apr 26, 2024 at 7:59

0 votes

0 answers

95 views

Unable to connect to mongod instance started with --configsvr option

I am trying to create a mongo sharded cluster. First of all, I want to create a config server replica set with docker compose. My docker-compose.yml file version: "3.8" services: mongo1: ...

biryukvy

1

asked Apr 15, 2024 at 17:17

-3 votes

2 answers

123 views

Horizontal scaling strategy with 10,000 shards [closed]

My app has a User collection. Each document in the collection averages about .04 MB. At worst case, a document may slightly exceed .1 MB. Needless to say, these are small documents. However, each ...

Bear Bile Farming is Torture

5,433

asked Mar 23, 2024 at 7:44

0 votes

1 answer

56 views

Assigning a dedicated Primary node for write operations in MongoDB replica set

Distribution of my MongoDB cluster consisting of 15 nodes across 3 different data centers is as listed below: DataCenter 1: Router-1 ConfigServer-1 Shard1Node1 [Primary] Shard2Node2 [Secondary] ...

Ahmet Burak

11

asked Mar 21, 2024 at 15:33

3 votes

1 answer

3k views

Why does Elasticsearch limit the maximum shard number to 1k per node?

"The cluster shard limits prevent creation of more than 1000 non-frozen shards per node, and 3000 frozen shards per dedicated frozen node. Make sure you have enough nodes of each type in your ...

Bingfeng

317

asked Mar 21, 2024 at 1:08

0 votes

1 answer

621 views

citus add node --> "fe_sendauth: no password supplied" error

I am trying to set up multi-node schema-based sharding for a PostgreSQL database using the Citus extension. I have two Azure virtual machines, one is working as worker node('20.40.43.246') and other as ...

srinivast6

347

asked Mar 20, 2024 at 9:42

0 votes

0 answers

95 views

MongoDB Shard Cluster

Good day, people. I am having some trouble initiating replicas using rs.initiate(). Below is my docker-compose file version: '3' services: configs1: container_name: configs1 image: ...

joshua

28

asked Mar 11, 2024 at 18:02

0 votes

0 answers

76 views

When querying a sharded collection, can I filter on shard keys using an operator?

I need to update many documents in a Mongo collection sharded on the _id field. I already have the IDs available in a list. Can I update the documents using something like .update_many({"_id"...

James Kelleher

2,177

asked Feb 15, 2024 at 15:14

0 votes

1 answer

88 views

Make shard processes use the same pool

I want to create a pool in my shard manager (server.js) and pass it to shard processes (bot.js). Here is my sharding manager (server.js): var clientMysqlEvent = require('./database/botpool.js')....

Hasan Kayra

41

asked Jan 30, 2024 at 13:09

0 votes

0 answers

53 views

Total number of docs on a shard in solr

I am facing some issues when trying to understand how solr is generating score for a document on a shard. When I query the shard using q=*:* params the numFound param returns 25151. Next I give a ...

shshnk

1,691

asked Jan 9, 2024 at 18:17

0 votes

1 answer

181 views

Sharing large partition key in Cassandra: how to keep a fixed shard size?

I read this post on how to deal with large partitions and partitioning hotspots, their solution is to add a sharding key as part of the partition key, and keep the shard size at a fixed size, say 1000....

Jinsong Li

7,578

asked Jan 7, 2024 at 12:24

0 votes

1 answer

643 views

What are the risks of large shards in Elasticsearch?

At my workplace each of our ES indices is configured to have exactly 5 shards and we make no use of the Rollover API or ILM. Most of our indices are quite small, but we have one large index where each ...

Nick

184

asked Dec 15, 2023 at 19:29

0 votes

1 answer

1k views

How to fix TransientTransactionError in Shard Mongo DB with spring boot?

Scenario: I am using Mongo DB 6+ version with spring boot. @Bean MongoTransactionManager transactionManager(MongoDatabaseFactory dbFactory) { return new MongoTransactionManager(...

sub

709

asked Dec 5, 2023 at 23:47

-3 votes

2 answers

200 views

How to create a PostgreSQL database that can store data from a CSV table with more than 10000 columns?

I'm an SQL novice. I have a large data table with more than 10000 columns, stored as CSV, and those columns come from multiple sites. I tried to import them into a database to manage ...

Kenneth

5

asked Dec 1, 2023 at 3:06

-1 votes

1 answer

300 views

Sharding multiple tables with no common column

I wanted to understand sharding in case of multiple tables which might be used for QnA websites like Quora/SO. Let's assume that users can ask questions, give answers and comment on both questions and ...

MikeRob

1

asked Nov 18, 2023 at 20:15

0 votes

1 answer

668 views

Sharding key in clickhouse replication

I'm using ClickHouse replication and plan to shard data across shards/nodes. For the local replica I want to use the AggregatingMergeTree engine, so the question is whether I should use some specific sharding key for ...

Alexandr

101

asked Nov 6, 2023 at 23:17

0 votes

1 answer

912 views

Upgraded Kibana and Elasticsearch to v8.9.2, Kibana is not starting, Error: Not enough active copies to meet shard count of [ALL] (have 1, needed 2)

I recently upgraded Elasticsearch and Kibana to v8.9.2 using Bitnami Helm chart on my Kubernetes AWS EKS cluster. Elasticsearch is running fine with 3 nodes but Kibana is restarting again and again as ...

Abdullah Khawer

5,920

asked Oct 9, 2023 at 17:48

-1 votes

2 answers

67 views

How to retrieve data from a MySQL table with millions of rows without indexing the time_stamp column?

I faced this scenario in one of my interviews. There is one table with millions of records, and the table has only two columns: id, which is the primary key, and time_stamp, which is of type ...

Bhuvaneshkumar J

529

asked Sep 28, 2023 at 6:50

0 votes

0 answers

15 views

Unable to add clustered node IP in mongos

I'm a beginner at MongoDB, and I'm trying to configure sharding. I have 2 separate servers running CentOS Linux, with MongoDB 4.2.24 installed on both. Initially there is ...

Aravind rajamani

5

asked Sep 25, 2023 at 12:21

0 votes

1 answer

212 views

Will the shard database increase the number of databases in an Azure Elastic pool?

We are preparing to move our SaaS product (single-tenant-per-database model) to Azure SQL with database sharding. From what I learned, a shard is a database and each elastic pool contains at most 100 ...

Leon

354

asked Sep 14, 2023 at 13:42

1 vote

0 answers

109 views

Question on sharding using planetscale (keeping data within a country using sharding)

I'm building a niche social media DB on planetscale that spans users living in multiple countries. Is there a way I can shard my social media user data per country and have that data physically ...

Rick David

23

asked Sep 11, 2023 at 7:45

1 vote

0 answers

182 views

Appropriate method to shard a BigQuery table via DBT

I am working with a rather large BigQuery table - let us call it lt - and am considering the possibility of creating a sharded version of it by using DBT. More specifically, I would like to be able to ...

ΑΘΩ

131

asked Aug 20, 2023 at 20:01

0 votes

0 answers

31 views

MongoDB sharding cluster show wrong size

I created a mongodb cluster according to digitalocean's instructions. I sharded my database with the following command: sh.shardCollection("database.Collection", {"userId": 1}) ...

poroster8

1

asked Aug 2, 2023 at 10:23

0 votes

2 answers

194 views

MongoDB sharding creates only one chunk

I created a mongodb cluster according to digitalocean's instructions. I sharded my database with the following command: sh.shardCollection("database.Collection", {"userId": 1}) ...

poroster8

1

asked Aug 1, 2023 at 11:02

1 vote

0 answers

86 views

How many shards is facebook's user table partitioned across?

"Shard Manager manages tens of millions of shards hosted on hundreds of thousands of servers across hundreds of applications in production." https://engineering.fb.com/2020/08/24/production-...

Bear Bile Farming is Torture

5,433

asked Jul 26, 2023 at 18:16

0 votes

1 answer

2k views

Index failure cause in Elasticsearch

I am working with Elasticsearch (v7.10) and see that the statistic metric "indexing.index_failed" has increased. But I want to know the reasons why it failed. In my application, I used ...

phuc16102001

431

asked Jul 19, 2023 at 3:49

1 vote

0 answers

340 views

Why does elasticsearch not use index.routing_partition_size as specified? (custom routing)

I'm using custom routing on elasticsearch 8.7.1, and it seems like the routing_partition_size is not being implemented as expected unless I manually specify number_of_routing_shards. For example, if I ...

WorkingMeasurement

11

asked Jul 10, 2023 at 10:53

-1 votes

1 answer

226 views

Citus-Postgres Custom Distribution Logic

I am working with Citus-Postgres to set up a cluster of coordinator and worker nodes and distribute table data across these nodes. By default, Citus uses its own logic to automatically distribute ...

Gagandeep Singh

937

asked Jul 6, 2023 at 20:16

0 votes

0 answers

146 views

Why doesn't relocation of shards happen when unassigned shards are present in a cluster?

I was reading about relocation of shards in Elasticsearch and allocation of unassigned shards, and came upon this issue - https://github.com/elastic/elasticsearch/issues/12273. Here it is mentioned that ...

SHASHANK AGRAWAL

19

asked Jun 30, 2023 at 11:20

0 votes

0 answers

38 views

Database sharding with replication - delay

We are thinking of sharding our database with replication. Our use cases include reads and writes to parts of shards. We have questions such as: How long would the replication delays be? Will there be ...

Krishna Santosh Nidri

440

asked Jun 29, 2023 at 18:12

0 votes

0 answers

395 views

Apache SOLR from Version 8.6.0 - Joining between Multiple Collections and Multiple Shards in each collection

Please consider SOLR version greater than 8.6.0 for this query. There are many questions regarding this issue but all are before version 8.6.0 and at that time SOLR was not supporting Join between ...

Chirag Shah

363

asked Jun 29, 2023 at 6:40

1 vote

1 answer

205 views

How does NEAR's asynchronous design actually work?

Based on the documentation, articles and ... here is what I understand of NEAR's asynchronous design. Please correct me if I am wrong: Due to NEAR's asynchronous design and the Nightshade algorithm, transactions (or ...

mlibre

2,590

asked Jun 25, 2023 at 14:58
