I have strings that initially contain different directory paths, where both the 2nd and the 2nd-last sub-directories can vary in length, like so:

 /home/Leo/Work/CMI/ARCH/MWS/Disks
 /home/Cleo/Work/CMI/ARCH/BK/Disks

I want to trim the first 5 sub-directories and only show the last 2, like so

 echo "/MWS/Disks"
 echo "/BK/Disks"

One way to trim the first 5 sub-directories from the initial strings might be to left-shift each character until both strings start with the second last '/'.

The Bash Beginners Guide describes a shift built-in that left-shifts positional parameters in a command and throws away unused arguments. But it is not immediately obvious whether this could be used to trim the first 5 sub-directories from the strings described above.

In Bash, how do I reduce these strings, preferably without using loops?


CLARIFICATION

Judging from the comments, a bit more context is needed. My Bash script recovers historic Mdos and Qdos files from 8-inch floppy disk images and saves the files to directories on the hard drive.

For better or worse, I created a bespoke scheme that stores directory paths using 3-character variable names where each name is an acronym for the section of the path to the current directory.

For example, MWC is an acronym for $MY/Work/CMI in the following path:

MY="$USER"
MWC="C:/cygwin64/home/$MY/Work/CMI"
cd "$MWC"
pwd
C:/cygwin64/home/$MY/Work/CMI

Similarly 3-character variables point to the next sub-directory further up the tree

WCA="$MWC/ARCH"

i.e. C:/cygwin64/home/$MY/Work/CMI/ARCH, path to a gallery of archive owners.

As directory paths lengthen, the 3-character variables make paths easy to identify and conserve white space in the listing. Nevertheless, the full path appears whenever my script references a path. Hence the need to trim the parts of the string that are of no interest to the end user.

  • Do you have a possibility to use sed? echo /home/Leo/Work/CMI/ARCH/MWS/Disks | sed 's#^\(/[^/]*\)\{5\}##' Commented Nov 6, 2021 at 22:12
  • yes though this example didn't work Commented Nov 6, 2021 at 22:42
  • I tried it online Commented Nov 6, 2021 at 22:57
  • how are you processing this list of directories/files? are they coming from a file? a stream from a separate OS process? one-by-one via a variable (eg, in a loop)? Commented Nov 6, 2021 at 23:11

4 Answers

If the number of subdirectories is always the same, you can use parameter expansion to remove the first 5 subdirectories:

s=/home/Leo/Work/CMI/ARCH/MWS/Disks
s=/${s#/*/*/*/*/*/}
echo "$s"  # /MWS/Disks

Or, if you know you need the last two parts whatever the depth of the path is:

s=/home/Leo/Work/CMI/ARCH/MWS/Disks
last=/${s##*/}
last_but1=${s%"$last"}
last_but1=/${last_but1##*/}
echo "$last_but1$last"  # /MWS/Disks
  • ${s#PATTERN} removes PATTERN from the beginning of $s.
  • ${s%PATTERN} removes PATTERN from the end of $s.
  • With # or %, the shortest match of PATTERN is removed. Doubling them (## or %%) makes the match the longest possible.
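
To make the shortest/longest distinction concrete, here is a small sketch applying all four operators to the sample path. The variable names are only for illustration and are not part of the answer above.

```bash
#!/usr/bin/env bash
s=/home/Leo/Work/CMI/ARCH/MWS/Disks

short_prefix=${s#*/}    # shortest prefix matching '*/' is just the leading '/'
long_prefix=${s##*/}    # longest prefix matching '*/' is everything up to the last '/'
short_suffix=${s%/*}    # shortest suffix matching '/*' is the last component
long_suffix=${s%%/*}    # longest suffix matching '/*' is the whole string

printf '%s\n' "$short_prefix"   # home/Leo/Work/CMI/ARCH/MWS/Disks
printf '%s\n' "$long_prefix"    # Disks
printf '%s\n' "$short_suffix"   # /home/Leo/Work/CMI/ARCH/MWS
printf '%s\n' "[$long_suffix]"  # []
```

Since none of this starts an external process, it should behave the same under Cygwin as anywhere else Bash runs.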

7 Comments

the last two parts are needed so the user (named in the 2nd sub-directory) knows whose disk archive they are working on (i.e. the archive owner named in the 2nd last sub-directory) - if that makes sense.
@Greg: I rephrased the comments to explain what the difference between the two approaches is. Both of them would probably work for you.
This is good. I'm checking each. Do you have a reference on parameter expansion ?
man bash describes it a bit tersely, so experiment with it to understand the nuances :-)
You're dead right. It took a bit of experimentation to see how this answer addressed the question. There's no way I would have got there by reading man bash. The three bullet-points really helped.

As an alternative to the parameter expansion, you can use the =~ operator:

dir='/home/Leo/Work/CMI/ARCH/MWS/Disks'
[[ $dir =~ /[^/]*/[^/]*$ ]] && echo "${BASH_REMATCH[0]}"
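
If this is needed in more than one place, the match could be wrapped in a function. The following is only a sketch: the name trim_to_last_two and the choice to fall back to the unchanged input when there is no match are assumptions, not part of the answer above.

```bash
#!/usr/bin/env bash
# trim_to_last_two is a hypothetical helper name.
trim_to_last_two() {
    local dir=$1
    # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
    local re='/[^/]*/[^/]*$'
    if [[ $dir =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[0]}"
    else
        # Assumption: with fewer than two slashes, return the input unchanged.
        printf '%s\n' "$dir"
    fi
}

trim_to_last_two /home/Leo/Work/CMI/ARCH/MWS/Disks    # /MWS/Disks
trim_to_last_two /home/Cleo/Work/CMI/ARCH/BK/Disks    # /BK/Disks
trim_to_last_two Disks                                # Disks
```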

3 Comments

I'm not sure what you changed, but this worked both before and after the edit. Can you recommend a reference on the =~ operator?
@Greg Bash Reference Manual, Conditional constructs. Search for the =~ in the page.

Assuming the inputs are coming from a file (or streamed/piped from another OS process) ...

Sample input:

$ cat dir.file
 /home/Leo/Work/CMI/ARCH/MWS/Disks
 /home/Cleo/Work/CMI/ARCH/BK/Disks

One awk idea:

awk 'BEGIN {FS=OFS="/"} {print OFS $(NF-1),$(NF)}' dir.file

This generates:

/MWS/Disks
/BK/Disks

If the results need to be stored for later use it shouldn't be too hard to add some code as needed (eg, redirect to a file, pipe to another process, pass as input to a while/read loop, load into an array, etc).
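
As one example of the "load into an array" option, here is a rough sketch that assumes Bash 4 or later (for mapfile). The array name trimmed is made up for illustration, and printf stands in for dir.file so the snippet is self-contained.

```bash
#!/usr/bin/env bash
# printf replaces 'cat dir.file' here; swap in the real file or stream as needed.
mapfile -t trimmed < <(
    printf '%s\n' \
        /home/Leo/Work/CMI/ARCH/MWS/Disks \
        /home/Cleo/Work/CMI/ARCH/BK/Disks |
        awk 'BEGIN {FS=OFS="/"} {print OFS $(NF-1),$(NF)}'
)

# Each array element now holds one trimmed path.
printf '%s\n' "${trimmed[@]}"
```

The process substitution keeps mapfile running in the current shell, so the array is still available after the pipeline finishes.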

If OP is processing these strings one at a time (eg, as a variable in a loop), I'd probably stick with a parameter substitution solution (see choroba's answer) which doesn't require any overhead to spawn subprocesses.
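
For the one-at-a-time case, something along these lines might do. It is a sketch only: last_two is a hypothetical helper name, and stripping a trailing slash is an extra assumption about the input. Calling the function directly uses only shell built-ins, so nothing is forked (capturing its output with $(...) would create a subshell).

```bash
#!/usr/bin/env bash
# last_two is a hypothetical helper name, not taken from any answer here.
last_two() {
    local s=${1%/}               # assumption: drop one trailing slash, if present
    local last=/${s##*/}         # final component, e.g. /Disks
    local parent=${s%"$last"}    # everything before the final component
    printf '%s\n' "/${parent##*/}$last"
}

paths=(
    /home/Leo/Work/CMI/ARCH/MWS/Disks
    /home/Cleo/Work/CMI/ARCH/BK/Disks
)

for p in "${paths[@]}"; do
    last_two "$p"
done
```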

1 Comment

See my edits. They answer your questions. Your awk solution is working but I'll confirm when I have tried this in my script

Cygwin has find, right?

So can you do:

cd 'C:/cygwin64/' # I guess?

find "/home/$USER/Work/CMI/ARCH" -mindepth 2 -maxdepth 2 -type d -name Disks |
sed 's=/Disks$=='

Or this, to list all of them:

find /home/*/Work/CMI/ARCH/*/ -type d -name Disks |
sed 's=/Disks$=='

If you just want to edit the string, you can use prefix removal:

$ path=/home/Leo/Work/CMI/ARCH/MWS/Disks
$ path=/${path#/*/*/*/*/*/*}
$ echo "$path"
/MWS/Disks
