1,665 questions
-4
votes
0
answers
18
views
Did not find local replica set configuration document at startup shard pod kubernetes replicaset is not initialized [closed]
{"t":{"$date":"2025-11-13T16:42:03.966+00:00"},"s":"I", "c":"STORAGE", "id":4795901, "ctx":"...
3
votes
0
answers
101
views
Azure Cosmos: MongoDB sharded cluster: $in query performance degrades with fewer values
I'm facing a counter-intuitive performance issue with my MongoDB sharded cluster where queries with fewer values in an $in clause are significantly slower than queries with more values.
The Issue:
...
1
vote
1
answer
116
views
Sharding a large parquet file in Polars makes the system crash
I have been trying to make shards from a 20 GB .parquet file, and my code tends to crash the system by the time it reaches the last shard, though sometimes it also works. I am quite new to working with ...
0
votes
0
answers
42
views
Why does the first Gremlin query take significantly more time in distributed JanusGraph setup on Kubernetes?
Environment Setup
I'm working with a distributed JanusGraph architecture deployed on Azure Kubernetes Service (AKS):
Infrastructure:
AKS Cluster: 2 nodes (16 vCPU, 64 GB RAM each)
Cassandra: 2 ...
0
votes
1
answer
88
views
Why does my MongoDB sharded cluster have only 10 chunks after importing 716 million documents?
I have a MongoDB (v8.0.9) sharded cluster running in Kubernetes with the following setup:
3 shards, each with 2 replicas
An empty collection was created with a hashed shard key (over a UUID field)
...
1
vote
0
answers
52
views
Application not selecting rows from sharded db
I'm working on a very high-load project (Java, Spring, PostgreSQL, K8s), and we needed to split our DB into partitions and then into shards. There is also a scheduler on each shard, and a ...
-1
votes
1
answer
45
views
How to manage the sharding mechanism across tables?
I have a database with 100 million records and 10 tables. Out of these, I’ve applied sharding (based on geo-location) to only 3 tables. The remaining tables are not sharded.
Now, if I want to run a ...
0
votes
1
answer
82
views
Elasticsearch Shard Limit Reached in Magento 2
While running reindexing in our Magento 2 environment, we encountered the following error:
{"error":{"root_cause":[{"type":"validation_exception","reason"...
1
vote
0
answers
40
views
Best Approach to Handle Unsharded Collections in a MongoDB Global Write Cluster?
Thanks for taking the time to look into my issue.
I currently have a MongoDB Atlas global write cluster with three region-specific shards in UK, US, and HK. The unsharded collections are located ...
3
votes
1
answer
140
views
Sharded Discord.js bot resets to initial presence status
I wrote a Discord.js bot in Node.js that uses discord-hybrid-sharding to spawn my bot.js.
In the bot.js code below, you can see that I have an initial status of "Starting..." and then in ...
0
votes
1
answer
54
views
Per EntityType Active Entity Limit in Cluster Sharding Passivation
I have a clustered system that has a number of different entity types with different memory and computation complexity characteristics and would like to use active-entity-limit in my Passivation ...
0
votes
1
answer
111
views
ClickHouse: copy existing data to a new cluster with a different layout
I have a ClickHouse setup running version 21.8 with 3 shards, none of them are replicated. This setup holds 92 tables occupying approximately 60G of data.
SELECT cluster, shard_num, shard_weight, ...
0
votes
1
answer
343
views
Replication vs Sharding (for read scalability vs write scalability)
Can someone explain why DB replication is said to be better suited for read scalability, while sharding is better suited for write scalability?
From my current understanding:
replication allows read traffic ...
0
votes
0
answers
94
views
Efficient method for sharding BigQuery table collection
I would like to ask for advice regarding the following task: assume a collection of BQ tables bearing names with structure name_YYYYMM and containing each a DATETIME type column called date_time whose ...
0
votes
0
answers
148
views
Create read-only shards from a postgres DB
I have an application that consists of a master application+DB and a bunch of edge servers. Each edge server syncs a subset of the master data via custom API calls. I would like to simplify this ...
0
votes
1
answer
177
views
Index based routing allocation not working - Elasticsearch 5.x
As a dependent question to an existing open thread, where I was doubting the strategy or implementation details of settings when modified for a cluster as well as an index. While I had tried to ...
0
votes
1
answer
178
views
Cron job scheduled in spring app to distribute data with all instances
I have a design question.
I have a cron job scheduled in a Spring app using @Scheduled. I have 4 instances, and I want the job to run on all the instances by distributing the data. Say I need to process 1000 data, ...
1
vote
1
answer
1k
views
Elastic maximum shards open best practices
Does anyone know how I can solve
Error: Validation Failed: 1: this action would add [2] shards, but this cluster currently has [3000]/[3000] maximum normal shards open
I know Elasticsearch suggests ...
0
votes
1
answer
178
views
How are failures and restore operations handled in sharding (consistent hashing)?
In consistent hashing, suppose we are using the username for hashing
hashFunction(username) = nodeA
Now from what I understand, if there is any failure or a node is removed, requests will be directed to ...
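A minimal sketch of the idea in this excerpt, assuming a hash ring with virtual nodes. The class name `HashRing`, the `vnodes` parameter, and the use of MD5 are illustrative choices and do not come from the question. It shows that when a node is removed, only the keys that node owned are redirected to the next node on the ring; every other key stays where it was.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable hash so that placements are reproducible across runs.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node is placed on the ring at `vnodes` positions.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        # Drop only this node's positions; all others stay untouched.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        # Walk clockwise to the first position after the key's hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["nodeA", "nodeB", "nodeC"])
owner_before = ring.node_for("some-username")
ring.remove_node("nodeA")
owner_after = ring.node_for("some-username")  # changes only if nodeA owned it
```

Restoring the node with `add_node("nodeA")` puts its positions back, so the same keys map to it again. Whether the data for those keys is still present on the restored node is a separate replication concern that this sketch does not cover.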
0
votes
0
answers
95
views
Unable to connect to mongod instance started with --configsvr option
I am trying to create a MongoDB sharded cluster. First of all, I want to create a config server replica set with Docker Compose.
My docker-compose.yml file
version: "3.8"
services:
mongo1:
...
-3
votes
2
answers
123
views
Horizontal scaling strategy with 10,000 shards [closed]
My app has a User collection. Each document in the collection averages about 0.04 MB. In the worst case, a document may slightly exceed 0.1 MB. Needless to say, these are small documents. However, each ...
0
votes
1
answer
56
views
Assigning a dedicated Primary node for write operations in MongoDB replica set
The distribution of my MongoDB cluster, consisting of 15 nodes across 3 different data centers, is listed below:
DataCenter 1:
Router-1
ConfigServer-1
Shard1Node1 [Primary]
Shard2Node2 [Secondary]
...
3
votes
1
answer
3k
views
Why does Elasticsearch limit the maximum shard number to 1k per node?
"The cluster shard limits prevent creation of more than 1000 non-frozen shards per node, and 3000 frozen shards per dedicated frozen node. Make sure you have enough nodes of each type in your ...
0
votes
1
answer
621
views
Citus add node --> "fe_sendauth: no password supplied" error
I am trying to set up multi-node schema-based sharding for a PostgreSQL database using the Citus extension.
I have two Azure virtual machines, one is working as worker node('20.40.43.246') and other as ...
0
votes
0
answers
95
views
MongoDB Shard Cluster
Good day, people. I am having some trouble initiating replicas using rs.initiate().
Below is my docker-compose file:
version: '3'
services:
configs1:
container_name: configs1
image: ...
0
votes
0
answers
76
views
When querying a sharded collection, can I filter on shard keys using an operator?
I need to update many documents in a Mongo collection sharded on the _id field. I already have the IDs available in a list. Can I update the documents using something like .update_many({"_id"...
0
votes
1
answer
88
views
Make shard processes use the same pool
I want to create a pool in my shard manager (server.js) and pass it to shard processes (bot.js). Here is my sharding manager (server.js):
var clientMysqlEvent = require('./database/botpool.js')....
0
votes
0
answers
53
views
Total number of docs on a shard in Solr
I am facing some issues when trying to understand how Solr generates the score for a document on a shard.
When I query the shard using the q=*:* param, the numFound param returns 25151. Next I give a ...
0
votes
1
answer
181
views
Sharding large partition key in Cassandra: how to keep a fixed shard size?
I read this post on how to deal with large partitions and partitioning hotspots, their solution is to add a sharding key as part of the partition key, and keep the shard size at a fixed size, say 1000....
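One way to picture the technique described in this excerpt is the sketch below. It assumes each row under a base partition key has a monotonically increasing sequence number, which the question does not state, and that the shard number is derived by integer division. The function names and `SHARD_SIZE` constant are hypothetical.

```python
SHARD_SIZE = 1000  # fixed number of rows per shard, as in the excerpt's example


def shard_for(row_index: int, shard_size: int = SHARD_SIZE) -> int:
    # Rows 0..999 land in shard 0, rows 1000..1999 in shard 1, and so on.
    return row_index // shard_size


def partition_key(base_key: str, row_index: int) -> tuple:
    # Composite partition key: (original key, shard number).
    return (base_key, shard_for(row_index))
```

With this layout a reader has to know, or look up, the highest shard number for a base key in order to scan all of its rows, which is the usual trade-off of bucketing a hot partition.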
0
votes
1
answer
643
views
What are the risks of large shards in Elasticsearch?
At my workplace each of our ES indices is configured to have exactly 5 shards and we make no use of the Rollover API or ILM. Most of our indices are quite small, but we have one large index where each ...
0
votes
1
answer
1k
views
How to fix TransientTransactionError in sharded MongoDB with Spring Boot?
Scenario:
I am using MongoDB version 6+ with Spring Boot.
@Bean
MongoTransactionManager transactionManager(MongoDatabaseFactory dbFactory) {
return new MongoTransactionManager(...
-3
votes
2
answers
200
views
How to create a PostgreSQL database that can store data from a CSV table with more than 10,000 columns?
I'm an SQL novice. I have a big data table with more than 10,000 columns, stored as a CSV file, and those columns come from multiple sites. I tried to import them into a database to manage ...
-1
votes
1
answer
300
views
Sharding multiple tables with no common column
I wanted to understand sharding in case of multiple tables which might be used for QnA websites like Quora/SO. Let's assume that users can ask questions, give answers and comment on both questions and ...
0
votes
1
answer
668
views
Sharding key in ClickHouse replication
I'm using ClickHouse replication and plan to shard data across shards/nodes. For the local replica I want to use the AggregatingMergeTree engine, so the question is: should I use some specific sharding key for ...
0
votes
1
answer
912
views
Upgraded Kibana and Elasticsearch to v8.9.2, Kibana is not starting, Error: Not enough active copies to meet shard count of [ALL] (have 1, needed 2)
I recently upgraded Elasticsearch and Kibana to v8.9.2 using Bitnami Helm chart on my Kubernetes AWS EKS cluster. Elasticsearch is running fine with 3 nodes but Kibana is restarting again and again as ...
-1
votes
2
answers
67
views
How to retrieve data from a MySQL database if a table has millions of rows and the time_stamp column is not indexed?
I faced this scenario in one of my interviews. There is one table with millions of records, and the table has only two columns: id, which is the primary key, and time_stamp, which is of type ...
0
votes
0
answers
15
views
Unable to add clustered node IP in mongos
I'm a beginner at MongoDB and I am trying to configure sharding. I have 2 separate servers running CentOS Linux, with MongoDB 4.2.24 installed on both. Initially there is ...
0
votes
1
answer
212
views
Will the shard database increase the number of database in an Azure Elastic pool?
We are preparing to move our SaaS product (single-tenant-per-database model) to Azure SQL with database sharding. From what I have learned, a shard is a database, and each elastic pool contains at most 100 ...
1
vote
0
answers
109
views
Question on sharding using PlanetScale (keeping data within a country using sharding)
I'm building a niche social media DB on PlanetScale that spans users living in multiple countries. Is there a way I can shard my social media user data per country and have that data physically ...
1
vote
0
answers
182
views
Appropriate method to shard a BigQuery table via DBT
I am working with a rather large BigQuery table - let us call it lt - and am considering the possibility of creating a sharded version of it by using DBT. More specifically, I would like to be able to ...
0
votes
0
answers
31
views
MongoDB sharding cluster shows wrong size
I created a MongoDB cluster according to DigitalOcean's instructions. I sharded my database with the following command:
sh.shardCollection("database.Collection", {"userId": 1})
...
0
votes
2
answers
194
views
MongoDB sharding creates only one chunk
I created a MongoDB cluster according to DigitalOcean's instructions. I sharded my database with the following command:
sh.shardCollection("database.Collection", {"userId": 1})
...
1
vote
0
answers
86
views
How many shards is Facebook's user table partitioned across?
"Shard Manager manages tens of millions of shards hosted on hundreds of thousands of servers across hundreds of applications in production."
https://engineering.fb.com/2020/08/24/production-...
0
votes
1
answer
2k
views
Index failure cause in Elasticsearch
I am working with Elasticsearch (v7.10) and see that the statistic metric "indexing.index_failed" has increased. But I want to know the reasons why it failed.
In my application, I used ...
1
vote
0
answers
340
views
Why does Elasticsearch not use index.routing_partition_size as specified? (custom routing)
I'm using custom routing on Elasticsearch 8.7.1, and it seems like the routing_partition_size is not being applied as expected unless I manually specify number_of_routing_shards.
For example, if I ...
-1
votes
1
answer
226
views
Citus-Postgres Custom Distribution Logic
I am working with Citus-Postgres to set up a cluster of coordinator and worker nodes and distribute table data across these nodes. By default, Citus uses its own logic to automatically distribute ...
0
votes
0
answers
146
views
Why doesn't relocation of shards happen when unassigned shards are present in a cluster?
I was reading about relocation of shards in Elasticsearch and allocation of unassigned shards. I came upon this issue - https://github.com/elastic/elasticsearch/issues/12273.
Here it is mentioned that ...
0
votes
0
answers
38
views
Database sharding with replication - delay
We are thinking of sharding our database with replication. Our use cases include reads and writes to parts of shards. We have questions like
How long would the replication delays be?
Will there be ...
0
votes
0
answers
395
views
Apache Solr from version 8.6.0 - Joining between multiple collections and multiple shards in each collection
Please consider Solr versions greater than 8.6.0 for this query. There are many questions regarding this issue, but all are from before version 8.6.0, and at that time Solr did not support joins between ...
1
vote
1
answer
205
views
How does NEAR's asynchronous design actually work?
Based on the documentation, articles and ... here is what I understand of NEAR's asynchronous design. Please correct me if I am wrong:
Due to NEAR's asynchronous design and the Nightshade algorithm, transactions (or ...