I have a file of tweets:

396124436845178880,"When's 12.4k gonna roll around",Matty_T_03
396124437168537600,"I really wish I didn't give up everything I did for you.     I'm so mad at my self for even letting it get as far as it did.",savava143
396124436958412800,"I really need to double check who I'm sending my     snapchats to before sending it 😩😭",juliannpham
396124437218885632,"@Darrin_myers30 I feel you man, gotta stay prayed up.     Year is important",Ful_of_Ambition
396124437558611968,"tell me what I did in my life to deserve this.",_ItsNotBragging
396124437499502592,"Too many fine men out here...see me drooling",LolaofLife
396124437722198016,"@jaiclynclausen will do",I_harley99

I am trying to replace all special characters after reading the file into an RDD:

    val fileReadRdd = sc.textFile(fileInput)
    val fileReadRdd2 = fileReadRdd.map(x => x.map(_.replace(","," ")))
    val fileFlat = fileReadRdd.flatMap(rec => rec.split(" "))

I am getting the following error:

Error:(41, 57) value replace is not a member of Char
    val fileReadRdd2 = fileReadRdd.map(x => x.map(_.replace(",","")))

2 Answers

I suspect:

x => x.map(_.replace(",",""))

is treating your string as a sequence of characters, so _ is a Char (which has no replace method), and you actually want

x => x.replace(",", "")

(i.e. you don't need to map over the 'sequence' of chars)
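
To make the difference concrete, here is a minimal sketch in plain Scala, with a local Seq standing in for the RDD so it runs without Spark. The object name CommaDemo and its field names are made up for illustration; the sample line is taken from the question's file.

```scala
object CommaDemo {
  // One sample line shaped like the tweet file (id, text, user)
  val line = "396124437722198016,\"@jaiclynclausen will do\",I_harley99"

  // String.map visits each Char, so the function must work on a Char.
  // Char has no replace method, which is what the compiler complains about.
  val viaCharMap: String = line.map(c => if (c == ',') ' ' else c)

  // String.replace works on the whole line at once.
  val viaReplace: String = line.replace(",", " ")

  // Stand-in for the RDD: the same lambdas applied to a local Seq.
  val lines: Seq[String] = Seq(line)
  val cleaned: Seq[String] = lines.map(x => x.replace(",", " "))

  // Split the cleaned lines (not the original ones) into words.
  val words: Seq[String] = cleaned.flatMap(rec => rec.split(" "))
}
```

Note that the question's flatMap is called on fileReadRdd rather than fileReadRdd2, so the result of the replacement would be ignored there; the sketch splits the cleaned lines instead.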

2 Comments

Thanks, Brian. I tried: val stripCurly = "[{~,!,@,#,$,%,^,&,*,(,),_,=,-,`,:,',?,/,<,>,.}]" val fileReadRdd2 = fileReadRdd.map(x => stripCurly.replaceAll(x,""))
But this worked for me on a file with multiple lines: val removeDots = file.map(x => x.replace(".", ""))
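
A note on the two snippets in these comments: replaceAll is called on the line and takes the regex as its first argument, so x.replaceAll(stripCurly, "") is presumably what was intended rather than stripCurly.replaceAll(x, ""). Also, replace matches its argument literally, while replaceAll treats "." as "any character". A small sketch in plain Scala follows; the object name StripDemo and the choice of \p{Punct} (ASCII punctuation) as the pattern are my own assumptions, not taken from the comments.

```scala
object StripDemo {
  // Assumed pattern: the POSIX character class for ASCII punctuation.
  val punct = "\\p{Punct}"

  // The line is the receiver; the pattern is the first argument.
  def strip(line: String): String = line.replaceAll(punct, "")

  // replace is literal: only the dot is removed.
  val literalDot: String = "12.4k".replace(".", "")

  // replaceAll is regex-based: an unescaped "." matches every character.
  val regexDot: String = "12.4k".replaceAll(".", "")

  // Escaping the dot restores the literal meaning.
  val escapedDot: String = "12.4k".replaceAll("\\.", "")
}
```

Inside Spark the same function could be passed to map, for example fileReadRdd.map(StripDemo.strip).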

The Perl one-liner perl -pi -e 's/\s+//' $file on a regular file system would look roughly as follows in Spark Scala on any Spark-supported file system (feel free to adjust the regex):

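import org.apache.spark.rdd.RDD

// read the file into an RDD of strings, one element per line
// (assumes `spark` is an existing SparkSession and `uri` is the input path)
val rdd: RDD[String] = spark.sparkContext.textFile(uri)

// strip leading whitespace from each line and save the result
// (saveAsTextFile writes a directory of part files, not a single file)
rdd
  .map(line => line.replaceAll("^\\s+", ""))
  .saveAsTextFile(uri + ".tmp")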
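
One thing to watch when adjusting the regex: the Perl substitution is unanchored, so it removes the first run of whitespace wherever it occurs on the line, whereas "^\\s+" only removes leading whitespace. A minimal sketch of the variants in plain Scala (the object name WhitespaceDemo and the sample strings are made up for illustration):

```scala
object WhitespaceDemo {
  val indented = "  tell me   what"
  val plain = "tell me   what"

  // Anchored: removes leading whitespace only.
  val leading: String = indented.replaceAll("^\\s+", "")

  // Anchored pattern on a line with no leading whitespace: unchanged.
  val untouched: String = plain.replaceAll("^\\s+", "")

  // Unanchored, first match only: closest to Perl's s/\s+//.
  val firstRun: String = plain.replaceFirst("\\s+", "")

  // Unanchored, every match: closest to Perl's s/\s+//g.
  val allRuns: String = indented.replaceAll("\\s+", "")
}
```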
