I have an input string that looks like this: var input = "AB-PQ-EF=CD-IJ=XY-JK".
Is there a way, using the string.Split() method and LINQ in C#, to get an array of strings that looks like this: var output = ["AB-PQ", "PQ-EF", "EF=CD", "CD-IJ", "IJ=XY", "XY-JK"]? Currently I do this conversion manually by iterating over the input string.
- As there is no fixed delimiter in your string for splitting, you need to manually iterate and split the text. – Ipsit Gaur, Jul 13, 2018 at 11:25
- I know that there will be only two delimiters: '-' and '='. – pango89, Jul 13, 2018 at 11:28
- But those are also present where you do not need to split the string. – Ipsit Gaur, Jul 13, 2018 at 11:28
- Yes, that is why I was wondering whether there is a way to combine the capability of LINQ with Split() to achieve this. – pango89, Jul 13, 2018 at 11:31
- Have you considered using regular expressions? – mnieto, Jul 13, 2018 at 11:35
7 Answers
Can you use a regex instead of split?
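// Requires: using System.Linq; using System.Text.RegularExpressions;
var input = "AB-PQ-EF=CD-IJ=XY-JK";
// Zero-width match at the start of each token; the lookahead captures "token, separator, next token" without consuming it.
var pattern = new Regex(@"(?<![A-Z])(?=([A-Z]+[=-][A-Z]+))");
var output = pattern.Matches(input).Cast<Match>().Select(m => m.Groups[1].Value).ToArray();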
Here is a working script. If you had a constant fixed delimiter, you'd only be looking at a single call to Regex.Split. Your original string doesn't have that, but we can duplicate the interior tokens of the input so that the string becomes splittable.
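string input = "ABC-PQ-EF=CD-IJ=XYZ-JK";
// Duplicate every interior token, joining the two copies with '~'.
string s = Regex.Replace(input, @"((?<=[=-])[A-Z]+(?=[=-]))", "$1~$1");
Console.WriteLine(s);
// '~' now appears only between pairs, so splitting on it alone works for tokens of any length.
var items = Regex.Split(s, "~");
foreach (var item in items)
{
    Console.WriteLine(item);
}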
ABC-PQ~PQ-EF~EF=CD~CD-IJ~IJ=XYZ~XYZ-JK
ABC-PQ
PQ-EF
EF=CD
CD-IJ
IJ=XYZ
XYZ-JK
If you look closely at the very first line of the output above, you'll see the trick I used. I just connected the pairs you want via a different delimiter (ideally ~ does not appear anywhere else in your string). Then, we just have to split by that delimiter.
- … Regex.Split from working, it's having to duplicate the letters.
- … ABC-PQ-EFZ=CD-IJ=XYZ-JK, the last output is wrong: XYZ-J JK
For a solution using string.Split and LINQ, we just need to track the length of each part as we go so that the separator can be pulled from the original string, like so:
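var input = "ABC-PQ-EF=CDED-IJ=XY-JKLM";
var split = input.Split('-', '=');
int offset = 0; // total length of the parts seen so far
var result = split
    .Take(split.Length - 1)
    .Select((part, index) =>
    {
        offset += part.Length;
        // index counts the separators already passed, so index + offset is the position of this part's separator.
        return $"{part}{input[index + offset]}{split[index + 1]}";
    })
    .ToArray();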
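The same pairing can also be sketched without mutating a captured variable inside Select. This version relies on the documented behavior that Regex.Split includes the text of capturing groups in its result, so the separators stay in the array next to the tokens. The names PairSplitter and SplitIntoPairs are made up for illustration, and the sketch assumes the only separators are '-' and '='.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PairSplitter
{
    // Hypothetical helper: returns each token joined to its successor
    // by the separator that originally stood between them.
    public static string[] SplitIntoPairs(string input)
    {
        // The capturing group keeps each separator as its own array element,
        // so tokens sit at even indices and separators at odd indices.
        string[] parts = Regex.Split(input, "([-=])");

        // n tokens produce 2n - 1 elements, so there are parts.Length / 2 pairs.
        return Enumerable.Range(0, parts.Length / 2)
            .Select(i => parts[2 * i] + parts[2 * i + 1] + parts[2 * i + 2])
            .ToArray();
    }
}
```

Calling PairSplitter.SplitIntoPairs("AB-PQ-EF=CD-IJ=XY-JK") gives AB-PQ, PQ-EF, EF=CD, CD-IJ, IJ=XY, XY-JK, and token length does not matter because no offsets are computed by hand.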
You can try the approach below.
Here we split the string on the special characters, then loop over the elements and take each one together with the next character group.
For example: get AB and take everything up to and including PQ.
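string valentry = "AB-PQ-EF=CD-IJ=XY-JK";
List<string> filt = Regex.Split(valentry, @"[-=]").ToList();
var listEle = new List<string>();
// Assumes every token is exactly two characters long and occurs only once in the input.
filt.ForEach(x =>
{
    // Skip the last token: it has no successor to pair with.
    if (valentry.IndexOf(x) != valentry.Length - 2)
    {
        // Token (2) + separator (1) + next token (2) = 5 characters.
        string ele = valentry.Substring(valentry.IndexOf(x), 5);
        if (!String.IsNullOrEmpty(ele))
            listEle.Add(ele);
    }
});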
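Could you adapt something like this? As written it produces non-overlapping chunks (AB-PQ, EF=CD, IJ=XY, JK), so you would need to change the modulus constants to get the overlapping pairs.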
List<string> lsOut = new List<string>() { };
string sInput = "AB-PQ-EF=CD-IJ=XY-JK";
string sTemp = "";
for (int i = 0; i < sInput.Length; i++)
{
    if ((i + 1) % 6 == 0)
    {
        continue;
    }
    // add to temp
    sTemp += sInput[i];
    // multiple of 5, add all the temp to list
    if ((i + 1 - lsOut.Count) % 5 == 0)
    {
        lsOut.Add(sTemp);
        sTemp = "";
    }
    if (sInput.Length == i + 1)
    {
        lsOut.Add(sTemp);
    }
}
I have been learning Haskell recently, so here is a recursive solution.
static IEnumerable<string> SplitByPair(string input, char[] delimiter)
{
    var sep1 = input.IndexOfAny(delimiter);
    if (sep1 == -1)
    {
        yield break;
    }
    var sep2 = input.IndexOfAny(delimiter, sep1 + 1);
    if (sep2 == -1)
    {
        yield return input;
    }
    else
    {
        yield return input.Substring(0, sep2);
        foreach (var other in SplitByPair(input.Substring(sep1 + 1), delimiter))
        {
            yield return other;
        }
    }
}
The good things about it are:
- It's lazy.
- It's easy to extend to other conditions and other data types. However, that is a little harder in C# because C# lacks Haskell's List.span and pattern matching.
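As a rough sketch of how the laziness could be kept without recursion, the same walk can be written as a loop that remembers the previous separator position instead of re-slicing the input for every pair. The names LazyPairs and SplitByPairIterative are hypothetical, and the sketch is meant to yield the same sequence as the recursive version rather than to replace it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyPairs
{
    // Hypothetical loop-based counterpart of the recursive SplitByPair.
    // Pairs are still produced one at a time, only when the caller asks for them.
    public static IEnumerable<string> SplitByPairIterative(string input, char[] delimiters)
    {
        int start = 0;                              // where the current pair begins
        int sep1 = input.IndexOfAny(delimiters);    // separator inside the current pair
        while (sep1 != -1)
        {
            // The next separator (or the end of the input) closes the current pair.
            int sep2 = input.IndexOfAny(delimiters, sep1 + 1);
            int end = sep2 == -1 ? input.Length : sep2;
            yield return input.Substring(start, end - start);

            // The next pair starts right after the separator we just used.
            start = sep1 + 1;
            sep1 = sep2;
        }
    }
}
```

For example, LazyPairs.SplitByPairIterative("AB-PQ-EF=CD-IJ=XY-JK", new[] { '-', '=' }).Take(2) only scans as far as it needs to produce AB-PQ and PQ-EF.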
