I have an array, e.g.

char[] myArr = {'a', 'b', '1', '2', 'c', 'd', '1', '2', 'e', 'f'};

In this case, the delimiting subsequence is {'1', '2'}.

I want to split the array on this sequence and get a list of arrays as the result:

{'a', 'b'}
{'c', 'd'}
{'e', 'f'}

What is the fastest way to do it?

  • Should you add the #homework tag? Commented Jan 24, 2012 at 23:33
  • ... sounds like a school question to test your logic, not your syntax. Commented Jan 24, 2012 at 23:40
  • Nope, it's not homework. I use this to parse sniffed data packages from a byte[] array. I posted this char[] question just as an example, and it seems the repliers didn't understand me correctly. I need a universal method for such parsing, not one that only parses char[] arrays. Commented Jan 25, 2012 at 12:05

4 Answers

new string(myArr)
    .Split(new[] { "12" }, StringSplitOptions.None)
    .Select(s => s.ToCharArray())
    .ToList();

Or, if you mean "array" when you say "list", then

new string(myArr)
    .Split(new[] { "12" }, StringSplitOptions.None)
    .Select(s => s.ToCharArray())
    .ToArray();

Also, you might prefer StringSplitOptions.RemoveEmptyEntries.

If this is homework, however, this solution is probably unacceptable.
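
To illustrate the StringSplitOptions choice, here is a minimal sketch. The SplitDemo class and SplitChars method are hypothetical names introduced only for this example, and the sample input is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitDemo
{
    // Hypothetical helper (not from the original answer): splits a char[]
    // on a string delimiter using the given StringSplitOptions.
    public static List<char[]> SplitChars(char[] source, string delimiter, StringSplitOptions options)
    {
        return new string(source)
            .Split(new[] { delimiter }, options)
            .Select(s => s.ToCharArray())
            .ToList();
    }
}
```

For the input "12ab1212cd" with the delimiter "12", StringSplitOptions.None yields four arrays, two of them empty (one before the leading delimiter and one between the doubled delimiters), while StringSplitOptions.RemoveEmptyEntries yields only {'a', 'b'} and {'c', 'd'}.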


As you ask about processing bytes, here's an adaptation for that purpose. First, a couple of methods to convert a byte array to a string and back:

string ByteArrayToString(byte[] arr)
{
    char[] charArray = arr.Select(b => (char)b).ToArray();
    return new string(charArray);
}

byte[] StringToByteArray(string s)
{
    //this method maps each char in the string to a single output byte;
    //all chars should be in the range 0 to 255.  The checked 
    //conversion will catch any data that violates this requirement.

    return s.Select(c => checked ( (byte)c )).ToArray();
}

Now, the example code fragment:

byte[] myArr = Whatever();
byte[] myDelim = WhateverElse();

string sourceData = ByteArrayToString(myArr);
string delimiter = ByteArrayToString(myDelim);

string[] splitData = sourceData.Split(new [] { delimiter }, StringSplitOptions.None);
byte[][] result = splitData.Select(StringToByteArray).ToArray();
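
The byte-to-char round trip above could also be sketched with the ISO-8859-1 (Latin-1) encoding, which maps each byte value 0 to 255 to the char with the same code point. This is only an alternative sketch, not the method from the answer; the Latin1Split class and its Split method are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Latin1Split
{
    // ISO-8859-1 maps every byte 0..255 to the char with the same code point,
    // so decoding and re-encoding preserves arbitrary byte values.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Hypothetical helper: splits data on delimiter by way of string.Split.
    public static byte[][] Split(byte[] data, byte[] delimiter)
    {
        string text = Latin1.GetString(data);
        string delim = Latin1.GetString(delimiter);

        // string.Split with string separators compares ordinally.
        return text.Split(new[] { delim }, StringSplitOptions.None)
                   .Select(part => Latin1.GetBytes(part))
                   .ToArray();
    }
}
```

Unlike the checked cast in StringToByteArray, Encoding.GetBytes does not throw on chars above 255. That cannot occur here, because every char in the string came from a single byte.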

2 Comments

Could you please give example of such parsing for byte[] array?
@Roman Hm, I am not sure why I never responded to your comment; sorry. For byte arrays, this approach won't work, because the Split method is defined in the string class. You could potentially convert the bytes to chars and then apply the method I describe, however; the edited answer shows one approach. However, it's a bit hokey, since chars are not bytes.

I would start by using a foreach loop and the Char.IsLetter method. Show us some code and we will help you.

1 Comment

I don't think the question as posed justifies the use of char.IsLetter -- the OP states that the sequences should be split by the subsequence '1', '2', not that all non-letter characters should be considered as separators, nor that only letters are valid return data.

Since you are working with characters, you could convert your myArr array into a string and use C#'s String.Split method. The result will be an array of strings, which you can split into individual characters afterwards. Here is an example:

  char[] myArr = {'a', 'b', '1', '2', 'c', 'd', '1', '2', 'e', 'f'};
  var myArrFlattened = new string(myArr); // build the string directly from the char array
  var separators = new string[] {"12"}; // put your sequences of characters here as a string
  var parts = myArrFlattened.Split(separators, StringSplitOptions.None);

The Split call returns the array {"ab", "cd", "ef"}. This code doesn't split the strings into character arrays; to do that, you could loop with foreach and call String.ToCharArray on each string.

string s = new String(myArr);
string[] parts = s.Split(new string[] {"12"}, StringSplitOptions.None);

You can then convert the results to char arrays with

var list = new List<char[]>();
foreach (string part in parts) {
    list.Add(part.ToCharArray());
}

EDIT: As you need a universal approach, here are two generic solutions.

If the length of the sub-arrays is always the same then you could do it like this:

public List<T[]> GetSubArrays<T>(T[] array)
{
    const int LengthOfSeparator = 2, LengthOfSubArray = 2;
    const int LengthOfPattern = LengthOfSeparator + LengthOfSubArray;

    var list = new List<T[]>();
    for (int i = 0; i <= array.Length - LengthOfSubArray; i += LengthOfPattern) {
        T[] subarray = new T[LengthOfSubArray];
        Array.Copy(array, i, subarray, 0, LengthOfSubArray);
        list.Add(subarray);
    }
    return list;
}

If the length of the sub-arrays is variable, then the algorithm becomes more complicated. We also have to constrain the generic parameter to IEquatable<T> so that elements can be compared.

public List<T[]> GetSubArrays<T>(T[] array, T[] separator)
    where T : IEquatable<T>
{
    int maxSepIndex = array.Length - separator.Length;
    var list = new List<T[]>();
    for (int i = 0; i <= array.Length; ) {
        // Get index of next separator or array.Length if none is found
        int sepIndex;
        for (sepIndex = i; sepIndex <= maxSepIndex; sepIndex++) {
            int k;
            for (k = 0; k < separator.Length; k++) {
                if (!array[sepIndex + k].Equals(separator[k])) {
                    break;
                }
            }
            if (k == separator.Length) { // Separator found at sepIndex
                break;
            }
        }
        if (sepIndex > maxSepIndex) { // No separator found, subarray goes until end.
            sepIndex = array.Length;
        }

        int lenSubarray = sepIndex - i;
        T[] subarray = new T[lenSubarray];
        Array.Copy(array, i, subarray, 0, lenSubarray);
        list.Add(subarray);
        i = sepIndex + separator.Length;
    }
    return list;
}
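
For comparison, the same variable-length split could be sketched more compactly with spans. This assumes a runtime where ReadOnlySpan<T> and MemoryExtensions.IndexOf are available (for example .NET Core 2.1 or later). The ArraySplitter class and its Split method are hypothetical names, and the empty-separator check is an addition that the code above does not have.

```csharp
using System;
using System.Collections.Generic;

public static class ArraySplitter
{
    // Hypothetical helper: splits array on every occurrence of separator.
    // Empty segments are kept, similar to StringSplitOptions.None.
    public static List<T[]> Split<T>(T[] array, T[] separator)
        where T : IEquatable<T>
    {
        if (separator.Length == 0)
            throw new ArgumentException("Separator must not be empty.", nameof(separator));

        var result = new List<T[]>();
        ReadOnlySpan<T> rest = array;
        ReadOnlySpan<T> sep = separator;

        while (true)
        {
            // Index of the next separator in the remaining data, or -1.
            int index = rest.IndexOf(sep);
            if (index < 0)
            {
                result.Add(rest.ToArray()); // last segment runs to the end
                return result;
            }
            result.Add(rest.Slice(0, index).ToArray());
            rest = rest.Slice(index + sep.Length);
        }
    }
}
```

Because the method is generic, the same call works for byte[] packet data, e.g. ArraySplitter.Split(packetBytes, delimiterBytes), where both argument names are placeholders.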

3 Comments

Thank you, but I need a universal way to do this. Actually, I have a byte[] array. I just used this char[] as an example.
Oh, that's a lot of code. I know I can do it like that. I thought there was some easier and faster way to do this, because I need to parse thousands of packages per minute.
If the delimiting sequence is relatively long, then the Boyer–Moore algorithm might speed up the search of the separator. However, it would make the algorithm even more complicated.
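
As a rough illustration of that idea, here is a sketch of the Horspool simplification of Boyer–Moore (bad-character shifts only, not the full algorithm) for byte arrays. The HorspoolSearch class and its IndexOf method are hypothetical names, and whether this beats a naive scan depends on the data and the separator length, so it would need measuring.

```csharp
using System;

public static class HorspoolSearch
{
    // Hypothetical helper: returns the index of the first occurrence of
    // pattern in data at or after start, or -1 if there is none.
    public static int IndexOf(byte[] data, byte[] pattern, int start = 0)
    {
        int m = pattern.Length;
        if (m == 0)
            throw new ArgumentException("Pattern must not be empty.", nameof(pattern));

        // Shift table: how far the window may move, keyed by the byte
        // aligned with the last pattern position.
        int[] shift = new int[256];
        for (int i = 0; i < shift.Length; i++) shift[i] = m;
        for (int i = 0; i < m - 1; i++) shift[pattern[i]] = m - 1 - i;

        int pos = start;
        while (pos <= data.Length - m)
        {
            // Compare the window against the pattern from right to left.
            int j = m - 1;
            while (j >= 0 && data[pos + j] == pattern[j]) j--;
            if (j < 0) return pos; // full match

            pos += shift[data[pos + m - 1]];
        }
        return -1;
    }
}
```

A splitter would call IndexOf repeatedly, copying the bytes between matches and restarting the search just past each separator.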
