1

I have a cell array consisting of numbers, strings, and empty arrays. I want to find the position (linear or indexed) of all cells containing a string in which a certain substring of interest appears.

mixedCellArray = {
   'adpo' 2134  []
   0 [] 'daesad'
   'xxxxx' 'dp' 'dpdpd'
}

If the substring of interest is 'dp', then I should get the indices for three cells.

The only solutions I can find work when the cell array contains only strings:

One work-around is to find all cells not containing strings, and fill them with '', as hinted by this posting. Unfortunately, my approach requires a variation of that solution, probably something like cellfun('ischar',mixedCellArray). This causes the error:

Error using cellfun
Unknown option. 

Thanks for any suggestions on how to figure out the error.

I've posted this to usenet

EDUCATIONAL AFTERNOTE: This is for those who don't have Matlab at home and end up bouncing back and forth between Matlab and Octave. I asked above why cellfun doesn't accept 'ischar' as its first argument. The answer turns out to be that, in Matlab, the argument must be a function handle, so you really need to pass @ischar. A few function names can still be passed as strings for backward compatibility, but ischar is not one of them.
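
As a rough illustration of that point, the sketch below contrasts the function-handle form with one of the legacy string names. It reuses mixedCellArray from the question; the variable names isStr and isEmpty are only illustrative.

```matlab
mixedCellArray = {'adpo' 2134 []; 0 [] 'daesad'; 'xxxxx' 'dp' 'dpdpd'};

% Function handle: works for any function, including ischar.
isStr = cellfun(@ischar, mixedCellArray);

% Legacy string name: accepted only for a short list of functions such as 'isempty'.
isEmpty = cellfun('isempty', mixedCellArray);

% cellfun('ischar', mixedCellArray) is not on that list, which is what
% triggers the "Unknown option" error quoted in the question.
```

Both calls return a 3×3 logical array, so either result can be used directly as a mask over mixedCellArray.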

1
  • 3
    See: timing results. Note the performance disadvantage of tweet-style one line coding. Commented Jan 25, 2017 at 17:59

4 Answers

4

How about this one-liner:

>> mixedCellArray = {'adpo' 2134  []; 0 [] 'daesad'; 'xxxxx' 'dp' 'dpdpd'};
>> index = cellfun(@(c) ischar(c) && ~isempty(strfind(c, 'dp')), mixedCellArray)

index =

  3×3 logical array

   1   0   0
   0   0   0
   0   1   1

You could get by without the ischar(c) && ..., but you will likely want to keep it there since strfind will implicitly convert any numeric values/arrays into their equivalent ASCII characters to do the comparison. That means you could get false positives, as in this example:

>> C = {65, 'A'; 'BAD' [66 65 68]}  % Note there's a vector in there

C =

  2×2 cell array

    [ 65]    'A'         
    'BAD'    [1×3 double]

>> index = cellfun(@(c) ~isempty(strfind(c, 'A')), C)  % Removed ischar(c) &&

index =

  2×2 logical array

   1   1                % They all match!
   1   1
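
The question asks for positions rather than a logical mask. A minimal sketch of that last step, assuming the mask is computed as in the one-liner at the top of this answer (the names mask, linearIdx, row, and col are illustrative):

```matlab
mixedCellArray = {'adpo' 2134 []; 0 [] 'daesad'; 'xxxxx' 'dp' 'dpdpd'};
mask = cellfun(@(c) ischar(c) && ~isempty(strfind(c, 'dp')), mixedCellArray);

linearIdx = find(mask);      % column-major linear indices of matching cells
[row, col] = find(mask);     % row and column subscripts of the same cells
```

Because find walks the array in column-major order, the linear indices and the subscript pairs list the matching cells in the same order.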

1 Comment

Wow. That's educational. I need to go back into my code and build in that error trapping. Thanks.
4

Just use a loop, testing with ischar and contains (added in R2016b). The various *funs are basically loops and, in general, do not offer any performance advantage over the explicit loop.

mixedCellArray = {'adpo' 2134  []; 0 [] 'daesad'; 'xxxxx' 'dp' 'dpdpd'};
querystr = 'dp';

test = false(size(mixedCellArray));
for ii = 1:numel(mixedCellArray)
    if ischar(mixedCellArray{ii})
        test(ii) = contains(mixedCellArray{ii}, querystr);
    end
end

Which returns:

test =

  3×3 logical array

   1   0   0
   0   0   0
   0   1   1

Edit:

If you don't have a MATLAB version with contains you can substitute a regex:

test(ii) = ~isempty(regexp(mixedCellArray{ii}, querystr, 'once'));
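
One caveat with the regexp substitute: regexp interprets querystr as a pattern, so characters such as '.' or '*' act as metacharacters. A hedged sketch of a literal match, escaping the query with regexptranslate first; the sample strings here are made up for illustration:

```matlab
querystr  = 'a.c';   % hypothetical query containing a regex metacharacter
candidate = 'abc';   % does not literally contain 'a.c'

% Unescaped: '.' matches any character, so this reports a match.
loose = ~isempty(regexp(candidate, querystr, 'once'));

% Escaped: the query is matched literally, so this reports no match.
strict = ~isempty(regexp(candidate, regexptranslate('escape', querystr), 'once'));
```

For a plain substring such as 'dp' the two forms behave the same; the escape only matters once the query can contain pattern characters.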

8 Comments

strfind instead of regexp is simpler. Don't encourage using regexp where simpler specializations exist!
@Naveh Do you have a real reason to recommend strfind over regexp? "Is simpler" is a really dumb reason not to recommend a more powerful function, and certainly not a profound, multiple-upvote-worthy one. The syntax isn't even any different: regexp(mixedCellArray{ii}, querystr) vs. strfind(mixedCellArray{ii}, querystr). Yippee... What exactly is the gain?
Perhaps my comment was too simple :) The reasoning behind using regexp specializations (strfind, strtrim, strrep etc) instead of regexp itself is readability. My rule of thumb is to only use regexp where no fitting specialization exists. What we want to do is find a substring, so we use strfind. It is simpler to understand, which is why I advocate this.
Naveh, excaza, I want to thank you both for sharing your knowledge of Matlab features for solving this problem. However, I exhort both of you to respect the different merits of your answers. I find them both very educational. Simplicity has a great deal of merit, and for that, I thank Naveh. However, I know that all too soon I will need to move to regular expressions, and I appreciate the gateway code for doing that, excaza.
@user36800 You should really check out the timing results mentioned in this comment. One liners are typically less legible, more error prone, and slower than explicitly using a loop which MATLAB is able to accelerate naturally.
2
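% Run strfind on every cell; output is non-uniform because match lists vary in length.
z = cellfun(@(x) strfind(x,'dp'), mixedCellArray, 'UniformOutput', false);
% Cells with no match stay empty after the comparison.
idx = cellfun(@(x) x>0, z, 'UniformOutput', false);
% Linear indices of the cells that contain a match.
find(~cellfun(@isempty, idx))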

1 Comment

Thanks, Purushottama. That seems to be similar to the approach proposed on usenet.
0

Here is a solution from the usenet link in my original post:

>> mixedCellArray = {
         'adpo' 2134  []
         0 [] 'daesad'
         'xxxxx' 'dp' 'dpdpd'
      }

mixedCellArray =
    'adpo'     [2134]          []
    [    0]        []    'daesad'
    'xxxxx'    'dp'      'dpdpd'

>> ~cellfun( @isempty , ...
             cellfun( @(x)strfind(x,'dp') , ...
                      mixedCellArray , ...
                      'uniform',0) ...
           )

ans =
     1     0     0
     0     0     0
     0     1     1

The inner cellfun is able to apply strfind to even numerical cells because, I presume, Matlab treats numerical arrays and strings the same way. A string is just an array of numbers representing the character codes. The outer cellfun flags every cell for which the inner cellfun returned an empty result, that is, found NO match, and the prefix tilde inverts that into all cells for which there was a match.

Thanks to dpb.
