
Given:

"kuku": "kdfjsfgsljfddnlfdsf"
"bubu": "slfjsdjlkfndvsdl;nsdf;vlankvdfs;lkndkfv"
"title": "dflkbjvndjlvbdknbdlkbvjndlkfdnbdlkbjdnb"
"tutu": "svfrol"
"lala": "dbd4431"
"title": "dfvbdfv"

I want to extract all the rows starting with "title".

Please advise how I can do this (I am using R and want a regex).

I am trying this:

(["'])(title)\1: 

and trying to play with it.

2
  • Please post a reproducible example using dput. Commented May 5, 2019 at 9:07
  • The input is just a text file @RonakShah Commented May 5, 2019 at 9:09

2 Answers

2

One option is to read the text file into a data frame in R:

df <- read.table(text = '"kuku": "kdfjsfgsljfddnlfdsf"
       "bubu": "slfjsdjlkfndvsdl;nsdf;vlankvdfs;lkndkfv"
       "title": "dflkbjvndjlvbdknbdlkbvjndlkfdnbdlkbjdnb"
       "tutu": "svfrol"
       "lala": "dbd4431"
       "title": "dfvbdfv"', sep = ":", stringsAsFactors = FALSE, strip.white = TRUE)

and then select the rows whose first column starts with "title":

df[grepl("^title", df$V1), ]

#     V1                                      V2
#3 title dflkbjvndjlvbdknbdlkbvjndlkfdnbdlkbjdnb
#6 title                                 dfvbdfv

If you want the original strings rather than separate columns, you can paste the columns back together:

do.call(paste, c(df[grepl("^title", df$V1), ], sep = ":"))
#[1] "title:dflkbjvndjlvbdknbdlkbvjndlkfdnbdlkbjdnb" "title:dfvbdfv"    
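
The way `sep` reaches `paste` through `do.call` can be seen in isolation. The following is a minimal sketch, assuming a small stand-in data frame (`rows`, with made-up values) in place of the filtered rows above. It also shows that `collapse` gives a different result from `sep`.

```r
# Stand-in for the filtered rows; the values are made up for illustration.
rows <- data.frame(V1 = c("title", "title"),
                   V2 = c("abc", "def"),
                   stringsAsFactors = FALSE)

# c() treats the data frame as a list and appends a named element,
# giving list(V1 = ..., V2 = ..., sep = ":").
args <- c(rows, sep = ":")

# do.call() spreads that list into a call, so these two are equivalent.
via_do_call <- do.call(paste, args)
direct      <- paste(rows$V1, rows$V2, sep = ":")

# collapse works differently: the columns are joined with the default
# sep (" "), then all elements are collapsed into a single string.
collapsed <- do.call(paste, c(rows, collapse = ":"))

print(via_do_call)
print(collapsed)
```

Here `via_do_call` and `direct` are the same character vector, while `collapsed` is a single string.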

7 Comments

Very elegant solution!
Please advise where the sep came from. I didn't see it in the c() documentation. I understand what you are doing here except for this last point.
@SteveS that's an argument for paste so that we get two strings separated by :.
Yes. In do.call, any additional arguments to the function you are calling can be passed in c(). Similar to sep, you can also pass collapse, like do.call(paste, c(df[grepl("^title", df$V1), ], collapse = ":"))
It's documented under ?do.call. Check the Examples section: ## if we already have a list (e.g., a data frame) ## we need c() to add further arguments
2

You can use something like:

^"title":.*$

With your input using https://regex101.com it looks like this:

[Screenshot of the regex101 match result]

Explanation:

^ means start of line

"title": is just taken literally

. means an arbitrary character

* means it can happen zero or more times

$ means end of line
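
Since the question asks about R, here is a minimal sketch of applying this pattern with `grep`. It assumes the input has already been read into a character vector with one element per line; with a real file that would be something like `readLines("input.txt")`, where the file name is hypothetical.

```r
# Sample input, one element per line. With a real file, use
# readLines("input.txt") instead (hypothetical file name).
lines <- c('"kuku": "kdfjsfgsljfddnlfdsf"',
           '"bubu": "slfjsdjlkfndvsdl;nsdf;vlankvdfs;lkndkfv"',
           '"title": "dflkbjvndjlvbdknbdlkbvjndlkfdnbdlkbjdnb"',
           '"tutu": "svfrol"',
           '"lala": "dbd4431"',
           '"title": "dfvbdfv"')

# Single quotes around the pattern avoid escaping the double quotes.
pattern <- '^"title":.*$'

# value = TRUE returns the matching lines instead of their indices.
title_lines <- grep(pattern, lines, value = TRUE)
print(title_lines)
```

Because each element holds exactly one line, `^` and `$` anchor to the start and end of that element, so only the two "title" lines are kept.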

2 Comments

. means an arbitrary character and * that the arbitrary character can happen zero or more times
I tried it without ^ and it worked... maybe Atom doesn't recognize it
