Matlab Extracting sub string from cell array

Question

I have a '3 x 1' cell array the contents of which appear like the following:

'ASDF_LE_NEWYORK Fixedafdfgd_ML'
'Majo_LE_WASHINGTON FixedMonuts_ML'
'Array_LE_dfgrt_fdhyuj_BERLIN Potato Price'

I want to be able to elegantly extract and create another '3x1' cell array with contents as:

'NEWYORK'
'WASHINGTON'
'BERLIN'

As you can see above, each name comes after the last underscore that precedes the first space (or '_ML'). How can I write this concisely?

Thanks

Edit:

Sorry guys I should have used a better example. I have it corrected now.

The names aren't after the last underscore, at least not in the first two entries.
– jedwards, Commented Sep 23, 2013 at 23:09
I updated my answer to get the output in the format you requested.
– chappjc, Commented Sep 23, 2013 at 23:43

Mohsen Nosratinia · Accepted Answer · 2013-09-24 06:57:30Z

2

You can use lookbehind for _ and lookahead for space:

names = regexp(A, '(?<=_)[^\s_]*(?=\s)', 'match', 'once');

Where A is the cell array containing the strings:

A = {...
'ASDF_LE_NEWYORK Fixedafdfgd_ML'
'Majo_LE_WASHINGTON FixedMonuts_ML'
'Array_LE_dfgrt_fdhyuj_BERLIN Potato Price'};

>> names = regexp(A, '(?<=_)[^\s_]*(?=\s)', 'match', 'once')
names = 
    'NEWYORK'
    'WASHINGTON'
    'BERLIN'
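
One case worth checking is an entry that has no underscore-delimited word followed by whitespace. As far as I know, with 'once' such entries come back as empty strings, so they can be flagged with isempty. A small sketch; the variable B and its 'NoDelimiterHere' entry are made up for illustration and are not part of the question:

```matlab
% B is a hypothetical input; the last entry is invented to show a non-match.
B = {'ASDF_LE_NEWYORK Fixedafdfgd_ML'
     'Array_LE_dfgrt_fdhyuj_BERLIN Potato Price'
     'NoDelimiterHere'};

names = regexp(B, '(?<=_)[^\s_]*(?=\s)', 'match', 'once');

% Entries without a match are returned as empty strings.
noMatch = cellfun(@isempty, names);

% Keep only the entries where a name was found.
found = names(~noMatch);
```

Filtering with the logical mask keeps the remaining names in their original order.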

answered Sep 24, 2013 at 6:57

Mohsen Nosratinia

2 Comments

Zanam Over a year ago

It works. But can you please explain how to read this: '(?<=_)[^\s_]*(?=\s)' ?

Mohsen Nosratinia Over a year ago

(?<=_) is a lookbehind: it requires a _ immediately before the match but doesn't include it in the match. (?=\s) is a lookahead: it requires a whitespace character immediately after the match and doesn't include it either. The match itself is [^\s_]*, meaning a sequence of non-whitespace, non-underscore characters. See regular-expressions.info/lookaround.html for more info
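
To make the effect of the lookarounds concrete, here is a small comparison on the first string from the question. The variable names s, withDelims and nameOnly are only for illustration. The first pattern consumes the delimiters, the second only asserts that they are present:

```matlab
s = 'ASDF_LE_NEWYORK Fixedafdfgd_ML';

% Plain delimiters are consumed, so they end up in the match.
withDelims = regexp(s, '_[^\s_]*\s', 'match', 'once');         % '_NEWYORK '

% Lookbehind/lookahead only assert that the delimiters are there.
nameOnly = regexp(s, '(?<=_)[^\s_]*(?=\s)', 'match', 'once');  % 'NEWYORK'
```

Both calls skip 'LE' because it is followed by an underscore rather than whitespace.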

chappjc · Accepted Answer · 2013-09-24 15:51:49Z

1

NOTE: The question was changed, so the answer is no longer complete, but hopefully the regexp example is still useful.

Try regexp like this:

names = regexp(fullNamesCell,'_(NAME\d?)\s','tokens');
names = cellfun(@(x)(x{1}),names)

In the pattern _(NAME\d?)\s, the parentheses define a subexpression, which is returned as a token (a portion of the matched text). The \d? matches zero or one digit; you could use \d{1} for exactly one digit, or \d{1,3} if you expect between 1 and 3 digits. The \s matches a whitespace character.

The reorganization of names is a little convoluted, but when you use regexp with a cell input and tokens you get a cell of cells that needs some reformatting for your purposes.
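
As a rough sketch of the same token approach applied to the edited example data: the pattern '_([^\s_]*)\s' is my own adaptation and not from the original answer, and it assumes each name is a run of non-whitespace, non-underscore characters between an underscore and whitespace. Adding 'once' removes one level of nesting, so only the inner token cell has to be unwrapped:

```matlab
A = {'ASDF_LE_NEWYORK Fixedafdfgd_ML'
     'Majo_LE_WASHINGTON FixedMonuts_ML'
     'Array_LE_dfgrt_fdhyuj_BERLIN Potato Price'};

% With 'tokens' and 'once', each entry of tok is a 1x1 cell holding the token.
tok = regexp(A, '_([^\s_]*)\s', 'tokens', 'once');

% Unwrap the inner cells to get a 3x1 cell array of strings.
names = cellfun(@(t) t{1}, tok, 'UniformOutput', false);
```

Note that t{1} would error on an entry with no match, since its token cell would be empty, so this assumes every string contains a name.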

edited Sep 24, 2013 at 15:51

answered Sep 23, 2013 at 23:09

chappjc
