I've been trying to check whether a result dataset in Spark is empty or contains data. I have tried the following:

1.

dataset.rdd().isEmpty();

2.

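try {
    // head(1) returns an empty array for an empty dataset instead of throwing,
    // so check its length (the cast assumes a Dataset<Row>)
    if (((Row[]) dataset.head(1)).length == 0) {
        status = "No data";
    }
} catch (Exception e) {
    status = "No data";
}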

3.

try {
    // first() throws NoSuchElementException when the dataset is empty
    dataset.first();
} catch (Exception e) {
    status = "No data";
}

4.

boolean hasData = dataset.limit(1).count() > 0;

All of these take a long time to complete when the dataset is comparatively large. I need an efficient solution for this.

  • Yeah, @philantrovert, but that's what I said in the question description: I have tried all of those, and everything takes a lot of time when the dataset is not empty. Commented Jun 8, 2018 at 6:16
  • These are the options you get. If the dataset has complex wide dependencies, then taking even a single element will be expensive. Commented Jun 8, 2018 at 9:35
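
To illustrate the point in the last comment, here is a rough analogy using plain Java streams rather than Spark itself. It is only a sketch: the class and method names are made up for illustration, `sorted()` stands in for a wide dependency (such as a shuffle) and `map()` for a narrow one. It counts how many source elements have to be read before the first result element is available.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper, not part of Spark: a plain-Java analogy only.
class EmptinessCheckCost {

    // Returns how many source elements were read in order to answer
    // "is the pipeline empty?" by asking for just the first element.
    static int elementsTouched(int n, boolean wide) {
        AtomicInteger touched = new AtomicInteger();

        // Source of n elements; peek() counts every element that is actually read.
        Stream<Integer> source = IntStream.range(0, n)
                .boxed()
                .peek(x -> touched.incrementAndGet());

        // "Wide" stage: sorted() must see all input before emitting anything.
        // "Narrow" stage: map() can emit each element as soon as it arrives.
        Stream<Integer> pipeline = wide ? source.sorted() : source.map(x -> x + 1);

        // Emptiness check that asks for a single element only.
        boolean isEmpty = !pipeline.findFirst().isPresent();

        return touched.get();
    }

    public static void main(String[] args) {
        System.out.println("narrow: " + elementsTouched(1000, false));
        System.out.println("wide:   " + elementsTouched(1000, true));
    }
}
```

With the narrow stage only one source element is read, while with the sorting stage all 1000 are read before the first result appears. In the same way, none of the four checks in the question can avoid the upstream work if the dataset's plan contains such stages.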
