5

I have an input string that looks like this: var input = "AB-PQ-EF=CD-IJ=XY-JK". Is there a way, using the string.Split() method and LINQ in C#, to get an array of strings that looks like this: var output = ["AB-PQ", "PQ-EF", "EF=CD", "CD-IJ", "IJ=XY", "XY-JK"]? Currently I do this conversion manually by iterating over the input string.

  • As there is no fixed delimiter in your string for splitting, you need to manually iterate and split the text. Commented Jul 13, 2018 at 11:25
  • I know that there will be only two delimiters, '-' and '='. Commented Jul 13, 2018 at 11:28
  • But those are also present where you do not need to split the string. Commented Jul 13, 2018 at 11:28
  • Yes, that is why I was wondering whether there is a way to combine the capability of LINQ with Split() to achieve this. Commented Jul 13, 2018 at 11:31
  • Have you considered using regular expressions? Commented Jul 13, 2018 at 11:35

7 Answers

7

Can you use a regex instead of split?

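// Requires: using System.Linq; using System.Text.RegularExpressions;
var input = "AB-PQ-EF=CD-IJ=XY-JK";
// Zero-width match at the start of each token; the lookahead captures that token, the separator, and the next token.
var pattern = new Regex(@"(?<![A-Z])(?=([A-Z]+[=-][A-Z]+))");
var output = pattern.Matches(input).Cast<Match>().Select(m => m.Groups[1].Value).ToArray();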

1 Comment

How can I use a regex if I also want to do the reverse conversion, output => input?
0

Here is a working script. If you had a constant, fixed delimiter, you would only need a single call to Regex.Split. Your original string doesn't have that, but we can easily duplicate parts of the input so that the string becomes splittable.

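// Requires: using System; using System.Text.RegularExpressions;
string input = "ABC-PQ-EF=CD-IJ=XYZ-JK";
// Duplicate every interior token, joining the two copies with '~'.
string s = Regex.Replace(input, @"((?<=[=-])[A-Z]+(?=[=-]))", "$1~$1");
Console.WriteLine(s);
// Split on each '~' that follows a complete pair; tokens may be any length.
var items = Regex.Split(s, @"(?<=[A-Z]+[=-][A-Z]+)~");
foreach (var item in items)
{
    Console.WriteLine(item);
}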

ABC-PQ~PQ-EF~EF=CD~CD-IJ~IJ=XYZ~XYZ-JK
ABC-PQ
PQ-EF
EF=CD
CD-IJ
IJ=XYZ
XYZ-JK

Demo

If you look closely at the very first line of the output above, you'll see the trick I used. I just connected the pairs you want via a different delimiter (ideally ~ does not appear anywhere else in your string). Then, we just have to split by that delimiter.
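The same duplicate-then-split idea can be wrapped in a small helper. This is only a sketch: the class name PairSplitter, the method SplitIntoPairs, and the marker parameter are hypothetical names introduced here, and it assumes tokens are uppercase letters and that the marker is not a letter, '-' or '='.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PairSplitter
{
    // Hypothetical helper: the name and the marker parameter are illustrative.
    public static string[] SplitIntoPairs(string input, char marker = '~')
    {
        // The trick only works if the marker never occurs in the input.
        if (input.IndexOf(marker) >= 0)
            throw new ArgumentException("Input must not contain the marker character.", nameof(input));

        // Duplicate every interior token, e.g. "-PQ-" becomes "-PQ~PQ-".
        string marked = Regex.Replace(
            input,
            @"(?<=[=-])[A-Z]+(?=[=-])",
            m => m.Value + marker + m.Value);

        // Each marker now sits between two complete pairs.
        return marked.Split(marker);
    }
}
```

Calling PairSplitter.SplitIntoPairs("ABC-PQ-EF=CD-IJ=XYZ-JK") should give ABC-PQ, PQ-EF, EF=CD, CD-IJ, IJ=XYZ, XYZ-JK. Rejecting inputs that already contain the marker avoids silently producing wrong splits.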

9 Comments

As the OP said, the number of chars is not fixed to 2.
@Haytam I generalized my answer to cover a variable number of characters.
It's not not-having-a-constant-fixed-delimiter that stops a single call to Regex.Split from working, it's having to duplicate the letters.
@Rawling I don't understand your comment, or what you are trying to say here. If my answer has a flaw, then point it out.
Test it with the string ABC-PQ-EFZ=CD-IJ=XYZ-JK; the last output is wrong: XYZ-J JK
0

For a solution using string.Split and LINQ, we just need to track the length of each part as we go so that the separator can be pulled from the original string, like so:

var input = "ABC-PQ-EF=CDED-IJ=XY-JKLM";

var split = input.Split('-', '=');

int offset = 0;

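var result = split
    .Take(split.Length - 1)
    .Select((part, index) =>
    {
        // After the update, offset + index is the position in input of the separator that follows this part.
        offset += part.Length;
        return $"{part}{input[index + offset]}{split[index + 1]}";
    })
    .ToArray();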
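A variation on the same idea, sketched here as an assumption rather than as part of the answer: instead of tracking offsets, Regex.Split with a capturing group keeps the separators in the result, so each pair can be rebuilt from three neighbouring elements. The names AdjacentPairs and FromInput are hypothetical.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class AdjacentPairs
{
    // Hypothetical helper; the names are illustrative, not from the answer.
    public static string[] FromInput(string input)
    {
        // A capturing group makes Regex.Split keep the separators, so
        // "AB-PQ=EF" becomes ["AB", "-", "PQ", "=", "EF"].
        string[] parts = Regex.Split(input, "([-=])");

        // Tokens sit at even indexes and separators at odd indexes.
        return Enumerable.Range(0, (parts.Length - 1) / 2)
            .Select(i => parts[2 * i] + parts[2 * i + 1] + parts[2 * i + 2])
            .ToArray();
    }
}
```

This avoids mutating a captured variable inside Select, at the cost of using a regex instead of string.Split.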

Comments

0

You can try the approach below. Here we split the string on the delimiter characters, then loop over the elements and take each one together with the group that follows it. For example: get AB and take the text up to and including PQ.

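        // Requires: using System; using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
        string valentry = "AB-PQ-EF=CD-IJ=XY-JK";
        List<string> filt = Regex.Split(valentry, @"[-=]").ToList();

        var listEle = new List<string>();
        filt.ForEach(x =>
            {
                // Skip the last group; this assumes every group is two characters long and occurs only once.
                if (valentry.IndexOf(x) != valentry.Length - 2)
                {
                    // Group + separator + next group = 5 characters.
                    string ele = valentry.Substring(valentry.IndexOf(x), 5);
                    if (!String.IsNullOrEmpty(ele))
                        listEle.Add(ele);
                }
            });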


Comments

0

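Could you adapt something like this? You would just need to change the factors used in the modulo checks. As written, it produces the non-overlapping groups AB-PQ, EF=CD, IJ=XY and JK.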

        List<string> lsOut = new List<string>() { };

        string sInput = "AB-PQ-EF=CD-IJ=XY-JK";
        string sTemp = "";


        for (int i = 0; i < sInput.Length; i++)
        {

            if ( (i + 1) % 6 == 0)
            {
                continue;
            }

            // add to temp
            sTemp += sInput[i];

            // multiple of 5, add all the temp to list
            if ( (i + 1 - lsOut.Count) % 5 == 0)
            {
                lsOut.Add(sTemp);
                sTemp = "";
            }

            if(sInput.Length == i + 1)
            {
                lsOut.Add(sTemp);
            }

        }

Comments

0

I have been learning Haskell recently, so here is a recursive solution.

static IEnumerable<string> SplitByPair(string input, char[] delimiter)
{
    var sep1 = input.IndexOfAny(delimiter);
    if (sep1 == -1)
    {
        yield break;
    }
    var sep2 = input.IndexOfAny(delimiter, sep1 + 1);
    if (sep2 == -1)
    {
        yield return input;
    }
    else
    {
        yield return input.Substring(0, sep2);
        foreach (var other in SplitByPair(input.Substring(sep1 + 1), delimiter))
        {
            yield return other;
        }
    }
}

The good points are:

  • It's lazy.
  • It's easy to extend to other conditions and other data types. However, that is a little harder in C# because C# lacks Haskell's List.span and pattern matching.
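To illustrate the laziness point, here is a minimal sketch of an iterative variant that walks the string by index instead of recursing on substrings. The names LazyPairs and Pairs are hypothetical, and this is an alternative formulation, not the recursive method above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LazyPairs
{
    // Hypothetical iterative variant; pairs are produced only as the caller enumerates them.
    public static IEnumerable<string> Pairs(string input, char[] delimiters)
    {
        int start = 0;                                   // start of the left-hand token
        int sep1 = input.IndexOfAny(delimiters);         // separator inside the current pair
        while (sep1 != -1)
        {
            int sep2 = input.IndexOfAny(delimiters, sep1 + 1);
            int end = sep2 == -1 ? input.Length : sep2;  // pair ends at the next separator or at the end of input
            yield return input.Substring(start, end - start);
            start = sep1 + 1;                            // the right-hand token becomes the next left-hand token
            sep1 = sep2;
        }
    }
}
```

For example, LazyPairs.Pairs("AB-PQ-EF=CD", new[] { '-', '=' }).Take(2) only scans as far as it needs to produce AB-PQ and PQ-EF; the rest of the string is never examined.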

Comments

0
        string input = "AB-PQ-EF=CD-IJ=XY-JK";
        var result = new Regex(@"(?<![A-Z])(?=([A-Z]+[=-][A-Z]+))").Matches(input)
            .Cast<Match>().Select(m => m.Groups[1].Value).ToArray();
        foreach (var item in result)
        {
            Console.WriteLine(item);
        }

Comments
