I constructed a variable by parsing a text file with some addresses.

FileInfo fi = new FileInfo(@"C:\temp\Addresses.txt");
var ZipCodesAndCountryCodes = File.ReadLines(fi.FullName)
            .Select(l => new 
                         {
                           ZipCode = l.Substring(1395, 5),
                           CountryCode =  String.IsNullOrWhiteSpace(l.Substring(1405,30))
                                          ? "US"
                                          : l.Substring(1405,30)
                         });

In this code, I'm replacing any blank value for country with "US". However, I also want to normalize the country to "US" if it is "United States", "United States of America", or "USA". How can I do that in LINQ? Any other country should be kept as it is.

Speed is a consideration too, as the text files I'll be parsing will be 800 MB or so. Thank you for any help.

UPDATE1: I'm getting this error when I tried Mark's and Aush's answers:

System.ObjectDisposedException: Cannot read from a closed TextReader.
at System.IO.__Error.ReaderClosed()
at System.IO.StreamReader.ReadLine()
at System.IO.File.<InternalReadLines>d__0.MoveNext()
at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
at System.Linq.Lookup`2.Create[TSource](IEnumerable`1 source, Func`2 keySelector, Func`2 elementSelector, IEqualityComparer`1 comparer)
at System.Linq.GroupedEnumerable`3.GetEnumerator()
at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
at AnthemMDTS.Program.Main(String[] args) in  c:\Projects\CustomerA\CustomerATax\Program.cs:line 100

What is the TextReader in question here? I'm not closing anything, nor is there any looping going on in the code.

2
  • Do you have to use LINQ? With a file that big, you should be using a file reader to parse the data and then insert it into a construct. Commented Mar 3, 2015 at 20:01
  • LINQ is used mainly because of some grouping and aggregation that needs to be done and also because there is no database involved. Commented Mar 3, 2015 at 20:09

4 Answers

3

You can use the let clause in a query expression to store the result of Substring() for the country name.

var ZipCodesAndCountryCodes = from line in File.ReadLines(fi.FullName)
                              let country = line.Substring(1405,30)
                              select new                            
                              {
                                  ZipCode = line.Substring(1395, 5),
                                  CountryCode = (   string.IsNullOrWhiteSpace(country)
                                                 || country=="United States"
                                                 || country=="United States of America"
                                                 || country=="USA")
                                                 ? "US"
                                                 : country
                              };
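
One thing worth checking with fixed-width records: `Substring(1405, 30)` always returns exactly 30 characters, so if the country field is space-padded (an assumption about the file layout, which the question does not state), a comparison such as `country == "USA"` would never match. A minimal sketch that trims the field first and checks a case-insensitive set of aliases; the names `CountryNormalizer` and `Normalize` are hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class CountryNormalizer
{
    // Aliases treated as "US"; OrdinalIgnoreCase makes "usa" match as well.
    private static readonly HashSet<string> UsAliases =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "US", "USA", "United States", "United States of America"
        };

    // rawField is the 30-character slice taken from the line (assumed padded).
    public static string Normalize(string rawField)
    {
        string country = rawField.Trim();

        // Blank fields and known aliases both collapse to "US".
        if (country.Length == 0 || UsAliases.Contains(country))
            return "US";

        // Any other country is kept, minus the padding.
        return country;
    }
}
```

In the query above this would be called as `CountryCode = CountryNormalizer.Normalize(country)`.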
4 Comments

Mark, I tried your answer; getting System.ObjectDisposedException: Cannot read from a closed TextReader. at this line. Any idea why?
The enumerator returned by File.ReadLines() uses a TextReader which seems to be closed by a call to the GroupBy() operator. You can use File.ReadAllLines() instead, to dump it all into an array first.
With ReadAllLines() I get an OutOfMemoryException; the text file is so large (~800 MB to 1.2 GB). It worked on the test machines, though, with the sample files.
You're going to have to write your own line-by-line stream reader that accepts a lambda expression that you can use for the LINQ query.
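
A minimal sketch of the line-by-line reader suggested in the last comment, assuming the goal is to stream the file without loading it into memory while still feeding a LINQ query. The names `LineSource` and `ReadRecords` are hypothetical. Because this is an iterator method, each enumeration of the result opens its own `StreamReader`, so a second pass does not reuse a reader that an earlier pass already closed. Whether that resolves the exception in the asker's environment has not been tested here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineSource
{
    // Streams the file one line at a time and projects each line with 'parse'.
    // Nothing is read until the result is enumerated.
    public static IEnumerable<T> ReadRecords<T>(string path, Func<string, T> parse)
    {
        // A new reader is created for every enumeration and disposed
        // when that enumeration finishes or is abandoned.
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return parse(line);
            }
        }
    }
}
```

It could be used as `LineSource.ReadRecords(fi.FullName, l => new { ZipCode = l.Substring(1395, 5), Country = l.Substring(1405, 30) })`, with any grouping applied to the returned sequence. Note that each additional enumeration re-reads the whole file, which matters at 800 MB.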
1
FileInfo fi = new FileInfo(@"C:\temp\Addresses.txt");
var ZipCodesAndCountryCodes = File.ReadLines(fi.FullName).Select(l => 
{
    var countrySubstr = l.Substring(1405,30);
    return new 
    {
        ZipCode = l.Substring(1395, 5),
        CountryCode = string.IsNullOrWhiteSpace(countrySubstr)
                    || countrySubstr == "USA"
                    || countrySubstr == "United States"
                    || countrySubstr == "United States of America"
                        ? "US" : countrySubstr
    };
});

1 Comment

Getting System.ObjectDisposedException: Cannot read from a closed TextReader.
1

I'd probably use GroupJoin to essentially LEFT OUTER JOIN the values with predefined mappings.

Dictionary<string, string> mappings = new Dictionary<string, string>()
{
    { "United States", "US" },
    { "United States of America", "US" },
    { "USA", "US" }
};

return ZipCodesAndCountryCodes
           .GroupJoin(mappings,
                      a => a.CountryCode,
                      b => b.Key,
                      (a, b) => new { 
                                        a.ZipCode,
                                        CountryCode = b.Select(x => x.Value).FirstOrDefault() ?? a.CountryCode
                                    },
                      StringComparer.CurrentCultureIgnoreCase);

This allows you to easily add mappings, and it will default to the present one if no mapping exists.

The main advantage of this approach is the ability to modify mappings without extensive changes to code or the requirement to uphold any logic (ensuring proper parentheses around logical ORs, etc.) therein.

If you literally meant that those are the only ones you'll ever encounter, it's probably easiest to use another approach. And as someone who's dealt with similar types of files before, I would expect there to be other values that you'll want to normalize pretty quickly.
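
If a join feels heavy for this, the same mapping table can be consulted directly per record. The sketch below is an alternative to `GroupJoin`, not the method shown above: it uses `Dictionary.TryGetValue` and falls back to the original value when no mapping exists. It also swaps in `StringComparer.OrdinalIgnoreCase` instead of the culture-sensitive comparer used above, which is an assumption that the country names are plain ASCII. `CountryMappings` and `Map` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class CountryMappings
{
    // Same mapping table as above, with a case-insensitive key comparer.
    private static readonly Dictionary<string, string> Mappings =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "United States", "US" },
            { "United States of America", "US" },
            { "USA", "US" }
        };

    // Returns the mapped code, or the input unchanged if there is no mapping.
    public static string Map(string countryCode)
    {
        string mapped;
        return Mappings.TryGetValue(countryCode, out mapped) ? mapped : countryCode;
    }
}
```

It would be applied with something like `ZipCodesAndCountryCodes.Select(a => new { a.ZipCode, CountryCode = CountryMappings.Map(a.CountryCode) })`. Adding a mapping is still a one-line change to the table.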

0
string[] textToSearch = new[] { "US", "United States", "United States of America", "USA" };
FileInfo fi = new FileInfo(@"C:\temp\Addresses.txt");
var ZipCodesAndCountryCodes = File.ReadLines(fi.FullName).Select(l => new 
{
    ZipCode = l.Substring(1395, 5),
    CountryCode = string.IsNullOrWhiteSpace(l.Substring(1405, 30))
                  || textToSearch.Contains(l.Substring(1405, 30))
                      ? "US"
                      : l.Substring(1405, 30)
});
