I'm working on an interesting question about answering range h-index queries.
Fyi, the h-index is the maximum integer h such that the author has published at least h papers that have each been cited at least h times.
You are given N papers, and the i-th paper has been cited p[i] times. There will be Q queries; each gives a range [l, r] and asks for the h-index of an author who only published the papers from index l to r inclusive. N, Q, and all p[i] are between 1 and 2*10^5.
Here is an example input:
4 2
5 7 4 1
2 3
3 4
The output should be
2
1
Here p = [5, 7, 4, 1]. A query on range [2, 3] should give 2 because both papers in that range have been cited at least twice (7 and 4 times, specifically). The query on [3, 4] should give 1.
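To double check the definition, this little helper (my own, just for illustration) computes the h-index of one list by sorting it in descending order and taking the largest h where the h-th largest count is still at least h; it prints 2 and 1 for the two subranges above.

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// h-index of one list: after sorting descending, h is the largest 1-based
// position such that the h-th largest citation count is still >= h.
int hIndex(vector<int> c) {
    sort(c.rbegin(), c.rend());
    int h = 0;
    while (h < (int)c.size() && c[h] >= h + 1) h++;
    return h;
}

int main() {
    cout << hIndex({7, 4}) << '\n';  // papers 2..3 -> 2
    cout << hIndex({4, 1}) << '\n';  // papers 3..4 -> 1
}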
Binary searching the answer for each query and scanning the subarray for each check is O(QN log N). The code below is my attempt at solving the problem. Unfortunately, it gives the right answers but exceeds the 3 second time limit. The time limit covers the execution of the entire program, so it includes reading the input. I also tried scanf and cin.tie(0);ios::sync_with_stdio(0); to speed up I/O, but it still exceeds the time limit.
#include <iostream>

int p[200005];

int main() {
    using namespace std;
    int n, q;
    cin >> n >> q;
    for (int i = 0; i < n; i++) {
        cin >> p[i + 1];  // 1-based so queries can use l..r directly
    }
    for (int i = 0; i < q; i++) {
        int l, r;
        cin >> l >> r;
        // binary search for the largest h such that at least h papers in
        // [l, r] are cited at least h times; each check scans the subarray
        int lo = 1, hi = n;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            int cnt = 0;
            for (int j = l; j <= r; j++) {
                if (p[j] >= mid) {
                    cnt++;
                }
            }
            if (cnt >= mid) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        cout << hi << '\n';  // hi ends up as the largest feasible h (0 if none)
    }
}
My other idea was using a segment tree, but I don't know how to merge two segment tree nodes faster than linear time. The main problem is still the time complexity: I'm looking for a faster algorithm, and based on the constraints I believe the cost per query should definitely be better than O(N).
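For what it's worth, the closest I got to making that segment tree idea concrete is a merge sort tree sketch, where every node stores its range of p in sorted order, so nothing has to be merged at query time; counting papers in [l, r] with at least x citations then takes O(log^2 N) via lower_bound, and binary searching h on top of that is roughly O(log^3 N) per query. I haven't verified it against the time limit, and the helper names (build, countAtLeast) are just placeholders.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
using namespace std;

int n;
vector<int> p;             // 1-based citation counts
vector<vector<int>> tree;  // tree[node] holds its range of p in sorted order

void build(int node, int lo, int hi) {
    if (lo == hi) { tree[node] = {p[lo]}; return; }
    int mid = (lo + hi) / 2;
    build(2 * node, lo, mid);
    build(2 * node + 1, mid + 1, hi);
    // merging only happens here, once, at build time: O(N log N) total
    merge(tree[2 * node].begin(), tree[2 * node].end(),
          tree[2 * node + 1].begin(), tree[2 * node + 1].end(),
          back_inserter(tree[node]));
}

// how many of p[l..r] are >= x
int countAtLeast(int node, int lo, int hi, int l, int r, int x) {
    if (r < lo || hi < l) return 0;
    if (l <= lo && hi <= r)
        return (int)(tree[node].end() -
                     lower_bound(tree[node].begin(), tree[node].end(), x));
    int mid = (lo + hi) / 2;
    return countAtLeast(2 * node, lo, mid, l, r, x) +
           countAtLeast(2 * node + 1, mid + 1, hi, l, r, x);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);
    int q;
    cin >> n >> q;
    p.assign(n + 1, 0);
    tree.resize(4 * n);
    for (int i = 1; i <= n; i++) cin >> p[i];
    build(1, 1, n);
    while (q--) {
        int l, r;
        cin >> l >> r;
        // largest h such that at least h papers in [l, r] have >= h citations
        int lo = 1, hi = r - l + 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            if (countAtLeast(1, 1, n, l, r, mid) >= mid) lo = mid + 1;
            else hi = mid - 1;
        }
        cout << hi << '\n';
    }
}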
I'm stumped, what is a faster way to solve this within the time limit?
Comments:
Have you tried std::binary_search from <algorithm>? std::vector is also likely a better option than your C-style global array. binary_search has a sibling lower_bound that returns a position instead of a yes/no.
The point of the binary search is that it finds the largest mid the check returns true for without actually computing the function for every integer in the range (which would make it much slower).