3

I have a list of string[].

List<string[]> cardDataBase;

I need to sort that list by each list-item's second string value (item[1]) in custom order.

The custom order is a bit complicated, order by those starting characters:

"MW1"
"FW"
"DN"
"MWSTX1CK"
"MWSTX2FF"

then order by these letters following above starting letters:

"A"
"Q"
"J"
"C"
"E"
"I"
"A"

and then by the numbers following above.

a sample, unordered list left, ordered right:

MW1E10              MW1Q04
MWSTX2FFI06         MW1Q05
FWQ02               MW1E10
MW1Q04              MW1I06
MW1Q05              FWQ02
FWI01               FWI01
MWSTX2FFA01         DNC03
DNC03               MWSTX1CKC02
MWSTX1CKC02         MWSTX2FFI03
MWSTX2FFI03         MWSTX2FFI06
MW1I06              MWSTX2FFA01

I tried Linq but I am not that good in it right now and cannot solve this on my own. Do I need a dictionary, regex or a dictionary with regex in it? What would be the best approach?

1
  • 1
    Your "These Letters" section repeats "A"; this makes any ordering ambiguous. Given your example, it appears that A follows I, and thus that the initial A is in error. Commented Sep 14, 2014 at 7:57

4 Answers 4

1

I think you're approaching this incorrectly. You're not sorting strings, you're sorting structured objects that are misrepresented as strings (somebody aptly named this antipattern "stringly typed"). Your requirements show that you know this structure, yet it's not represented in the datastructure List<string[]>, and that's making your life hard. You should parse that structure into a real type (struct or class), and then sort that.

enum PrefixCode { MW1, FW, DN, MWSTX1CK, MWSTX2FF, }
enum TheseLetters { Q, J, C, E, I, A, }
struct CardRecord : IComparable<CardRecord> {
    public readonly PrefixCode Code;
    public readonly TheseLetters Letter;
    public readonly uint Number;
    public CardRecord(string input) {
        Code = ParseEnum<PrefixCode>(ref input);
        Letter = ParseEnum<TheseLetters>(ref input);
        Number = uint.Parse(input);
    }
    static T ParseEnum<T>(ref string input) { //assumes non-overlapping prefixes
        foreach(T val in Enum.GetValues(typeof(T))) {
            if(input.StartsWith(val.ToString())) {
                input = input.Substring(val.ToString().Length);
                return val;
            }
        }
        throw new InvalidOperationException("Failed to parse: "+input);
    }
    public int CompareTo(CardRecord other) {
        var codeCmp = Code.CompareTo(other.Code);
        if (codeCmp!=0) return codeCmp;
        var letterCmp = Letter.CompareTo(other.Letter);
        if (letterCmp!=0) return letterCmp;
        return Number.CompareTo(other.Number);
    }
    public override string ToString() { 
        return Code.ToString() + Letter + Number.ToString("00");
    }
}

A program using the above to process your example might then be:

static class Program {
    static void Main() {
        var inputStrings = new []{ "MW1E10", "MWSTX2FFI06", "FWQ02", "MW1Q04", "MW1Q05", 
            "FWI01", "MWSTX2FFA01", "DNC03", "MWSTX1CKC02", "MWSTX2FFI03", "MW1I06" };
        var outputStrings = inputStrings
            .Select(s => new CardRecord(s))
            .OrderBy(c => c)
            .Select(c => c.ToString());
        Console.WriteLine(string.Join("\n", outputStrings));
    }
}

This generates the same ordering as in your example. In real code, I'd recommend you name the types according to what they represent, and not, for example, TheseLetters.

This solution - with a real parse step - is superior because it's almost certain that you'll want to do more with this data at some point, and this allows you to actually access the components of the data easily. Furthermore, it's comprehensible to a future maintainer since the reason behind the ordering is somewhat clear. By contrast, if you chose to do complex string-based processing it's often very hard to understand what's going on (especially if it's part of a larger program, and not a tiny example as here).

Making new types is cheap. If your method's return value doesn't quite "fit" in an existing type, just make a new one, even if that means 1000's of types.

Sign up to request clarification or add additional context in comments.

2 Comments

Wow, I didn't expect such detailed and fast answers, Thank you guys! Your approach seems like good practice and you are right, i need this data again at some point. To tell you more about my case, these names are a big growing List of card textures that are additionally the actual identifiers of the cards in real life (the card game is Mage Wars). The Prefix is the Expansion name, "TheseLetters" are the Card Type and the numbers are indices restarting at 01 for each type in an expansion. Thank You, i learned smth. today!
Yeah, I've just seen this mistake once too often. People make these hypercomplicated solutions to process their data - and it works - but it's really hard to change or understand later on, even if you're the person that wrote the original code :-). It's write-only code. Dont' be afraid of intermediate solutions: I'd argue that programming is all about encapsulating solutions to problems that are trivial, and then composing those solutions in every bigger chunks until you get something useful.
1

A bit spoonfeeding, but I found this question pretty interesting and perhaps it will be useful for others, also added some comments to explain:

void Main()
{
    var cardDatabase = new List<string>{
        "MW1E10",          
        "MWSTX2FFI06",         
        "FWQ02",               
        "MW1Q04",              
        "MW1Q05",              
        "FWI01",               
        "MWSTX2FFA01",         
        "DNC03",               
        "MWSTX1CKC02",         
        "MWSTX2FFI03",        
        "MW1I06",  
    };


    var orderTable = new List<string>[]{
        new List<string>
        {
            "MW1",
            "FW",
            "DN",
            "MWSTX1CK",
            "MWSTX2FF"
        },

        new List<string>
        {
            "Q",
            "J",
            "C",
            "E",
            "I",
            "A"
        }
    };


    var test = cardDatabase.Select(input => {
        var r = Regex.Match(input, "^(MW1|FW|DN|MWSTX1CK|MWSTX2FF)(A|Q|J|C|E|I|A)([0-9]+)$");
        if(!r.Success) throw new Exception("Invalid data!");

        // for each input string,
        // we are going to split it into "substrings",
        // eg: MWSTX1CKC02 will be
        // [MWSTX1CK, C, 02]
        // after that, we use IndexOf on each component
        // to calculate "real" order,

        // note that thirdComponent(aka number component)
        // does not need IndexOf because it is already representing the real order,
        // we still want to convert string to integer though, because we don't like
        // "string ordering" for numbers.

        return  new 
        {
            input = input,
            firstComponent = orderTable[0].IndexOf(r.Groups[1].Value), 
            secondComponent = orderTable[1].IndexOf(r.Groups[2].Value), 
            thirdComponent = int.Parse(r.Groups[3].Value)
        };

        // and after it's done,
        // we start using LINQ OrderBy and ThenBy functions
        // to have our custom sorting.
    })
    .OrderBy(calculatedInput => calculatedInput.firstComponent)
    .ThenBy(calculatedInput => calculatedInput.secondComponent)
    .ThenBy(calculatedInput => calculatedInput.thirdComponent)
    .Select(calculatedInput => calculatedInput.input)
    .ToList();


    Console.WriteLine(test);
}

Comments

0

You can use the Array.Sort() method. Where your first parameter is the string[] you're sorting and the second parameter contains the complicated logic of determining the order.

Comments

0

You can use the IEnumerable.OrderBy method provided by the System.Linq namespace.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.