I have a cache file that contains text mixed with paths to Linux files. I would like to extract these paths with a regex using standard Linux tools, but I'm not sure how to do it. Here is a sample of the cache file:

/usr/bin/mk_cmds (not prelinkable)
/usr/bin/gcov:
/lib/libc-2.5.so [0xfff88e55]
    /lib/ld-2.5.so [0x7e786fcc]
/usr/lib/rpm/rpmdeps:
    /usr/lib/librpmbuild-4.4.so [0xdb141354]
    /usr/lib/librpm-4.4.so [0x4d8c8840]

Now here is what I would like to extract:

/usr/bin/mk_cmds
/usr/bin/gcov
/lib/libc-2.5.so
/lib/ld-2.5.so
/usr/lib/rpm/rpmdeps
/usr/lib/librpmbuild-4.4.so
/usr/lib/librpm-4.4.so

I tried the following patterns with grep, but neither of them worked:

^(.*/)?(?:$|(.+?)(?:(\.[^.]*$)|$))

'(\/.+?) '

Do you have any idea how I could do it? Thank you very much.

  • I suggest you add more tags that are relevant to your question, such as regex, as well as the tool you are using to extract the paths (bash, Python, etc.). Commented Jul 3, 2020 at 8:17
  • @Armion: What flags did you pass to grep? grep can handle three different kinds of regular expressions. For instance, the + you are using would not work as a quantifier with grep's basic regular expressions. See grep's -E and -P options. Commented Jul 3, 2020 at 10:47
  • Do any of the file paths have spaces in them? Commented Jul 3, 2020 at 12:30
  • Assuming that your pathnames begin with a / character and contain no whitespace or : characters: grep -o '/[^[:space:]:]*' cachefile Commented Jul 4, 2020 at 2:09
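
The grep -o one-liner from the last comment can be tried against the sample from the question. This is only a minimal sketch: the temporary file and the variable names `cache` and `paths` are illustrative assumptions, and it relies on the same assumption as the comment (paths start with / and contain no whitespace or colon).

```shell
# Recreate the sample cache from the question in a temporary file.
cache=$(mktemp)
cat > "$cache" <<'EOF'
/usr/bin/mk_cmds (not prelinkable)
/usr/bin/gcov:
/lib/libc-2.5.so [0xfff88e55]
    /lib/ld-2.5.so [0x7e786fcc]
/usr/lib/rpm/rpmdeps:
    /usr/lib/librpmbuild-4.4.so [0xdb141354]
    /usr/lib/librpm-4.4.so [0x4d8c8840]
EOF

# -o prints only the matched text: a "/" followed by any run of
# characters that are neither whitespace nor ":".
paths=$(grep -o '/[^[:space:]:]*' "$cache")
printf '%s\n' "$paths"

rm -f "$cache"
```

Note that grep -o prints every match on a line, so a line containing a second / further along would produce an extra output line.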

2 Answers

1

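With sed:

sed -n 's/^[[:space:]]*\(\/[^[:space:]:]*\).*/\1/p' cachefile.txt

sed -n: Run sed without printing each line automatically.

  • s/: Substitute using the regex that follows:
  • ^[[:space:]]*: Match optional whitespace at the start of the line.
  • \(\/[^[:space:]:]*\): Capture the path: a / followed by any characters that are neither whitespace nor a colon.
  • .*: Match the rest of the line (the trailing colon, the bracketed hex value, or the parenthesized note) so that it is dropped.
  • /\1/p: Replace the line with captured group 1 and print it.

This assumes that the paths themselves contain no whitespace or colons.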

Test and play with this Regex in regex101.com:

https://regex101.com/r/lFzvYq/2
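
A sed expression of this shape can also be checked locally against the sample from the question. The following is a minimal sketch, not a definitive solution: the temporary file and the names `cache` and `paths` are assumptions for illustration, and the capture group deliberately stops at the first whitespace or colon so that nothing after the path ends up in the output.

```shell
# Recreate the sample cache from the question in a temporary file.
cache=$(mktemp)
cat > "$cache" <<'EOF'
/usr/bin/mk_cmds (not prelinkable)
/usr/bin/gcov:
/lib/libc-2.5.so [0xfff88e55]
    /lib/ld-2.5.so [0x7e786fcc]
/usr/lib/rpm/rpmdeps:
    /usr/lib/librpmbuild-4.4.so [0xdb141354]
    /usr/lib/librpm-4.4.so [0x4d8c8840]
EOF

# Skip leading whitespace, capture "/" plus everything up to the first
# whitespace or ":", then let ".*" consume (and drop) the rest of the line.
paths=$(sed -n 's/^[[:space:]]*\(\/[^[:space:]:]*\).*/\1/p' "$cache")
printf '%s\n' "$paths"

rm -f "$cache"
```

Lines that do not start with a / (after optional whitespace) are not printed at all, because the substitution does not match them.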

1

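Try

sed -n '/:$/{s/:$//;p}; /]$/{s/^ *\(.*\) \[0x[0-9a-f]*\]$/\1/;p}' cachefile.txt

This assumes that the cache contains only two kinds of relevant lines: those ending with : and those ending with ]. Any other line, such as /usr/bin/mk_cmds (not prelinkable) in the sample, matches neither pattern and is not printed.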
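
If lines ending in a parenthesized note should be extracted too, the same per-line-type approach can be extended with one more rule. This is a hedged sketch under assumptions: the third rule is an addition that is not part of the command above, it assumes such notes always have the form " (...)" at the end of the line, and the names `cache` and `paths` are illustrative.

```shell
# Recreate the sample cache from the question in a temporary file.
cache=$(mktemp)
cat > "$cache" <<'EOF'
/usr/bin/mk_cmds (not prelinkable)
/usr/bin/gcov:
/lib/libc-2.5.so [0xfff88e55]
    /lib/ld-2.5.so [0x7e786fcc]
/usr/lib/rpm/rpmdeps:
    /usr/lib/librpmbuild-4.4.so [0xdb141354]
    /usr/lib/librpm-4.4.so [0x4d8c8840]
EOF

# Rule 1: lines ending in ":"  -> strip the colon and print.
# Rule 2: lines ending in "]"  -> strip leading spaces and the " [0x...]" suffix.
# Rule 3 (added here as an assumption): lines ending in ")" -> strip " (...)".
paths=$(sed -n \
  -e '/:$/{s/:$//;p;}' \
  -e '/]$/{s/^ *\(.*\) \[0x[0-9a-f]*\]$/\1/;p;}' \
  -e '/)$/{s/ (.*)$//;p;}' \
  "$cache")
printf '%s\n' "$paths"

rm -f "$cache"
```

Because each rule is keyed on how the line ends, paths containing spaces survive rules 1 and 2, which is the main advantage of this approach over splitting at the first space.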
