-3

I have a JavaRDD object and want to create a new JavaRDD object in which each element is a substring of the corresponding element in the original. How can I achieve that?

    // Read input_train data
    logger.info("start to read file");
    JavaRDD<String> inputDataRaw = sc.textFile(input_train);

Here, inputDataRaw.first() is something like "apple1; apple2;" (call this String s1).

I want a JavaRDD in which each line consists of "apple1" only, i.e.:

    String s2 = s1.substring(0, 6);
2
  • Use rdd.map(), and also check the docs. This is a basic operation in Spark: spark.apache.org/docs/latest Commented Apr 20, 2015 at 16:47
  • @maasg Thanks for the pointer. Can you provide more details? Commented Apr 20, 2015 at 16:54

3 Answers

0
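    // Requires: import org.apache.spark.api.java.function.Function;
    JavaRDD<String> inputDataRaw = sc.textFile(input_train);
    // map() applies the function to every line and returns a new RDD
    JavaRDD<String> result = inputDataRaw.map(new Function<String, String>() {
        @Override
        public String call(String arg0) throws Exception {
            return arg0.substring(0, 6);
        }
    });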

0

Below is the simplest option. I included the newer JDK 8 lambda syntax as well as the older JDK 6 compatible syntax:

    JavaRDD<String> inputDataRaw = sc.textFile("file.txt");

    JavaRDD<String> mapped_jdk8 = inputDataRaw.map(s -> s.substring(0, 6));

    JavaRDD<String> mapped_jdk6 = inputDataRaw.map(new Function<String, String>() {
        @Override
        public String call(String s) throws Exception {
            return s.substring(0, 6);
        }
    });
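One caveat with `substring(0, 6)`: Java throws a `StringIndexOutOfBoundsException` when a line is shorter than six characters, which would fail the Spark task. A minimal sketch of a guard, assuming that short lines should simply be kept whole (the class `SafePrefix` and the method `prefix` are hypothetical names, not part of Spark):

```java
// Hypothetical helper, not part of Spark.
final class SafePrefix {
    private SafePrefix() {}

    // Returns the first n characters of s, or all of s when it is shorter than n.
    // Assumes n >= 0 and s != null.
    static String prefix(String s, int n) {
        return s.substring(0, Math.min(n, s.length()));
    }
}
```

It could then be called from the lambda, for example `inputDataRaw.map(s -> SafePrefix.prefix(s, 6))`.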

0

I think substring is not a good way to grab the first item from a line.

substring(0, 6) only works when the first item has a fixed length.

Instead, split the line on ; (semicolon) and take the element at index 0:

    JavaRDD<String> inputDataRaw = sc.textFile("file.txt");

    JavaRDD<String> mapped_jdk8 = inputDataRaw.map(s -> s.split(";")).map(r -> r[0]);

Note that Java indexes arrays with r[0], not r(0) as in Scala. I mostly write Scala and have not tested this lambda in Java.
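A detail worth knowing about the split approach: in Java, `split(";")` drops trailing empty strings, so a line consisting only of `";"` yields an empty array and indexing element 0 throws `ArrayIndexOutOfBoundsException`. A minimal sketch that avoids this by passing a limit of 2, assuming the first field should also be trimmed of surrounding whitespace (the class `FirstField` and the method `of` are hypothetical names, not part of Spark):

```java
// Hypothetical helper, not part of Spark.
final class FirstField {
    private FirstField() {}

    // Returns the text before the first ';' (or the whole line if there is none),
    // with surrounding whitespace removed. With a positive limit, split() always
    // returns at least one element, so index 0 is safe.
    static String of(String line) {
        return line.split(";", 2)[0].trim();
    }
}
```

With Java 8 this could be passed to Spark as `inputDataRaw.map(s -> FirstField.of(s))`.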
