2

Using MATLAB, say that we have a cell array of cell arrays. For example:

C = { {'hello' 'there' 'friend'}, {'do' 'say' 'hello'}, {'or' 'maybe' 'not'} }

I would like to find the indices of all the cell arrays in C that contain the string 'hello'. In this case, I would expect 1 and 2, because the 1st cell array has 'hello' in the first slot and the 2nd cell array has it in the third slot.

This would be quite a bit easier I imagine using a matrix (a simple find) but for educational purposes, I'd like to learn the process using a cell array of cell arrays as well.

Many thanks in advance.

3 Answers

5

Straightforward Approaches

With arrayfun -

out = find(arrayfun(@(n) any(strcmp(C{n},'hello')),1:numel(C)))

With cellfun -

out = find(cellfun(@(x) any(strcmp(x,'hello')),C))
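To see what the anonymous function is doing, here is a minimal sketch that splits the cellfun one-liner into steps. It assumes a ragged input (inner cell arrays of different lengths), and the variable names inner and mask are only illustrative:

```matlab
% Ragged input: the inner cell arrays have different lengths
C = { {'hello' 'there' 'friend'}, {'do' 'hello'}, {'or' 'maybe' 'not'} };

% strcmp against a cell array of strings returns a logical array of the same size
inner = strcmp(C{2}, 'hello');                      % [0 1]

% any() collapses that to a single logical per inner cell array
mask = cellfun(@(x) any(strcmp(x, 'hello')), C);    % [1 1 0]

% find() turns the logical mask into indices
out = find(mask);                                   % [1 2]
```

Because each inner cell array is handled on its own, this form does not care whether the inner cell arrays have equal lengths.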

Alternative Approach

An alternative is to convert the input from a cell array of cell arrays of strings into a plain 2D cell array of strings, thus removing one level of "cell hierarchy". strcmp can then be applied to the whole array at once, which avoids cellfun or arrayfun and might make it faster than the approaches listed above. Note that this only makes sense performance-wise if the number of cells in each inner cell array doesn't vary a lot, since the conversion produces a 2D cell array in which shorter columns are padded with empty cells.

Here's the implementation -

%// Convert cell array of cell arrays to a cell array of strings, i.e.
%// remove one level of "cell hierarchy"
lens = cellfun('length',C)
max_lens = max(lens) 
C1 = cell(max_lens,numel(C))
C1(bsxfun(@le,[1:max_lens]',lens)) = [C{:}]  %//'

%// Use strcmp without cellfun and this might speed it up
out = find(any(strcmp(C1,'hello'),1))

Explanation:

[1] Convert cell array of cell arrays of strings to cell array of strings:

C = { {'hello' 'there' 'friend'}, {'do' 'hello'}, {'or' 'maybe' 'not'} }

gets converted to

C1 = {
    'hello'     'do'       'or'   
    'there'     'hello'    'maybe'
    'friend'         []    'not'  }

[2] For each column, check whether any entry is the string 'hello', and return the indices of the matching columns as the final output.
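The two steps can be traced on the ragged example from the explanation. This is only a sketch of the intermediate values; the names mask and hits are introduced here for illustration and are not part of the answer's code:

```matlab
C = { {'hello' 'there' 'friend'}, {'do' 'hello'}, {'or' 'maybe' 'not'} };

lens = cellfun('length', C);                 % [3 2 3]
max_lens = max(lens);                        % 3

% mask(i,j) is true when inner cell array j has at least i elements
mask = bsxfun(@le, (1:max_lens).', lens);    % [1 1 1; 1 1 1; 1 0 1]

% Fill the true positions column by column; the rest stay as empty cells
C1 = cell(max_lens, numel(C));
C1(mask) = [C{:}];

% strcmp returns false for the empty padding cells
hits = strcmp(C1, 'hello');                  % [1 0 0; 0 1 0; 0 0 0]
out = find(any(hits, 1));                    % [1 2]
```

The padding cells contain [] rather than strings, so they never count as matches.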


4 Comments

Nice! I was tempted to post a regexp-based solution (same scenario as yesterday haha) but strcmp is the best choice here :)
@Benoit_11 Would be nice to have alternatives for OP to work with ! :)
Thanks, I went with this solution as it was faster than using regular expressions. Works as expected!
@Manbearpig Added one more approach, check that out on how it performs runtime-wise on your data!? Thanks.
5

Here is a way using regular expressions, which I think is far less efficient than @Divakar's strcmp solution, but which could be informative anyway.

regexp operates on cell arrays of strings, but since C is a cell array of cell arrays, we need cellfun to apply it to each inner cell array. This gives a cell array of cell arrays holding the match positions, after which we use cellfun once more to fetch the indices of the matches. I might be using unnecessary steps, but I figured it was more intuitive that way.

Code:

clear
clc

C = { {'hello' 'there' 'friend'}, {'do' 'say' 'hello'}, {'or' 'maybe' 'not'} }

CheckWord = cellfun(@(x) regexp(x,'hello'),C,'uni',false);

Here CheckWord is a cell array of cell arrays. Each inner cell holds the start index of the match (1 here, since 'hello' starts at the first character) or is empty if the string does not contain hello:

CheckWord = 

    {1x3 cell}    {1x3 cell}    {1x3 cell}

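To make things a bit clearer, let's reshape CheckWord so that each row corresponds to one inner cell array (this assumes all inner cell arrays have the same length):

CheckWord = reshape([CheckWord{:}],numel(C{1}),[]).'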

CheckWord = 

    [1]    []     []
     []    []    [1]
     []    []     []

Since CheckWord is a cell array, we can use cellfun and find to look for non-empty cells, i.e. those corresponding to matches:

[row col] = find(~cellfun('isempty',CheckWord))

row =

     1
     2

col =

     1
     3

Here row lists the inner cell arrays that contain the word "hello" (the 1st and 2nd), and col gives the position of the match within each of them.
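If the inner cell arrays have different lengths, the reshape step no longer applies. A possible variant, sketched here under that assumption, keeps regexp but reduces each inner cell array to a single logical. The helper name hasMatch is hypothetical, and the pattern is anchored with ^ and $ so that only exact matches count:

```matlab
C = { {'hello' 'there' 'friend'}, {'do' 'hello'}, {'or' 'maybe' 'not'} };

% regexp on a cell array of strings returns a cell array of start indices,
% with an empty entry wherever there is no match
hasMatch = @(x) any(~cellfun('isempty', regexp(x, '^hello$')));

% One logical per inner cell array, then convert to indices
out = find(cellfun(hasMatch, C));   % [1 2]
```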

Hope that helps!

1 Comment

You're welcome! That's what this site is made for :)
0

Assuming the inner cell arrays are horizontal and equal-sized (as in your example), and that you want to find exact matches of the string:

result = find(any(strcmp(vertcat(C{:}),'hello'), 2));

This works as follows:

  1. Convert your cell array of cell arrays of strings C into a 2D cell array of strings: vertcat(C{:})
  2. Compare each string with the sought string ('hello'): strcmp(...,'hello')
  3. Find indices of rows in which a match was found: find(any(..., 2))
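The three steps can be written out with their intermediate values. This is a sketch on the question's example, with C2 and hits introduced only as illustrative names:

```matlab
C = { {'hello' 'there' 'friend'}, {'do' 'say' 'hello'}, {'or' 'maybe' 'not'} };

% Step 1: stack the 1x3 inner cell arrays into a 3x3 cell array of strings
C2 = vertcat(C{:});

% Step 2: compare every string with 'hello'
hits = strcmp(C2, 'hello');       % [1 0 0; 0 0 1; 0 0 0]

% Step 3: rows with at least one match
result = find(any(hits, 2));      % [1; 2]
```

vertcat raises an error if the inner cell arrays have different lengths, which is why the equal-size assumption matters here.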

1 Comment

Unfortunately, the inner cell arrays aren't going to be equal sized. My input was just coincidentally as such. But thanks regardless!
