0

I've written a C# application that reads JSON files line by line and writes CSV files from them. I've created a model class for each CSV format; objects of those models are instantiated while parsing and are then written to CSV at the end.

For example: if the input file name is abc.json, I create and instantiate an object for abc, store it in a data structure such as a List, and then write it to CSV at the end.

JSON file:

{
  "Computer ID": "1697343078",
  "Application Name": "Reporting Services Service",
  "Date": "8\/25\/2015",
  "Count": "1"
}

My code to parse is as follows:

using (System.IO.StreamReader sr = new System.IO.StreamReader(sFile, Encoding.UTF8))
{
    while ((line = sr.ReadLine()) != null)
    {
        if (line.Contains("Computer ID") && counter == 4)
        {
            string[] tokens = line.Split(':');
            if (tokens.Length >= 2)
            {
                resourceID = reg.Replace(tokens[1], "");
            }
            counter = counter - 1;
            line = sr.ReadLine();
        }
    }
}

The parsing fails because of inconsistently formatted data or unexpected fields in the input file. The code throws an exception, and parsing of that particular file fails completely. I want my code to reject the record for which parsing fails, continue parsing the other records in the file, and finally generate a CSV for it.

I want it to behave as follows:

  1. Read the file line by line.
  2. If any error occurs while parsing, don't instantiate that object, and continue parsing the other lines of that file.
  3. Write the objects to CSV.

Any help would be appreciated.

13
  • 1
    Please provide a small example of JSON with an error, and the corresponding CSV output you would like. Commented Sep 9, 2015 at 19:38
  • Also, explain what the error is (is it an exception, is it something else...?) and show your code that is related to the error... Commented Sep 9, 2015 at 19:40
  • Do this in the question post and not in the comment area. ;) Commented Sep 9, 2015 at 19:41
  • @Anky, please add your additional information to the question. Do not put it here in the comments. It is hard to follow and understand your problem if you spread the information about the problem "everywhere" ;-) Commented Sep 9, 2015 at 19:42
  • 5
    why don't you use a real json parser? Commented Sep 9, 2015 at 19:54

2 Answers

2

You can use Json.NET to parse your JSON data. To do this, you need to:

  1. Define classes corresponding to your JSON objects.
  2. Where a DateTime property appears, declare it as nullable (DateTime?). If a string like "No date found, so no data returned" is encountered, null can then be stored in the property.
  3. Create your own DateTimeConverter that, when parsing a nullable DateTime, tries the various date time formats that you might encounter. If an invalid format is encountered, return null rather than throwing an exception.
  4. Apply it to your DateTime properties using JsonConverterAttribute.

Thus, given the following converter:

using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Converters;
using Newtonsoft.Json.Linq;

public class DateTimeConverter : IsoDateTimeConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(DateTime?);
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null)
            return null;

        var token = JToken.Load(reader);

        // For various JSON date formats, see
        // http://www.newtonsoft.com/json/help/html/DatesInJSON.htm

        // Try in JavaScript constructor format: new Date(1234656000000)
        if (token.Type == JTokenType.Constructor)
        {
            try
            {
                var result = token.ToObject<DateTime?>(JsonSerializer.CreateDefault(new JsonSerializerSettings { Converters = new JsonConverter[] { new JavaScriptDateTimeConverter() } }));
                if (result != null)
                    return result;
            }
            catch (JsonException)
            {
            }
        }

        // Try ISO format: "2009-02-15T00:00:00Z"
        if (token.Type == JTokenType.String)
        {
            try
            {
                var result = token.ToObject<DateTime?>(JsonSerializer.CreateDefault(new JsonSerializerSettings { DateFormatHandling = DateFormatHandling.IsoDateFormat }));
                if (result != null)
                    return result;
            }
            catch (JsonException)
            {
            }
        }

        // Try Microsoft format: "\/Date(1234656000000)\/"
        if (token.Type == JTokenType.String)
        {
            try
            {
                var result = token.ToObject<DateTime?>(JsonSerializer.CreateDefault(new JsonSerializerSettings { DateFormatHandling = DateFormatHandling.MicrosoftDateFormat }));
                if (result != null)
                    return result;
            }
            catch (JsonException)
            {
            }
        }

        if (token.Type == JTokenType.String)
        {
            // Add other custom cases as required.
        }

        return null;
    }
}
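
As one possible "custom case", here is a minimal sketch that handles plain date strings such as the "8/25/2015" value from the question. The `LenientDates` class, its `TryParse` method, and the list of formats are assumptions made for illustration; adjust the formats to whatever actually appears in your files.

```csharp
using System;
using System.Globalization;

public static class LenientDates
{
    // Hypothetical list of accepted formats; extend it to match your data.
    private static readonly string[] Formats = { "M/d/yyyy", "yyyy-MM-dd" };

    // Returns the parsed date, or null when the text matches none of the formats.
    public static DateTime? TryParse(string text)
    {
        DateTime parsed;
        if (DateTime.TryParseExact(text, Formats, CultureInfo.InvariantCulture,
                                   DateTimeStyles.None, out parsed))
        {
            return parsed;
        }
        return null;
    }
}
```

Inside the last `if (token.Type == JTokenType.String)` branch of the converter you could then return `LenientDates.TryParse((string)token)`, which yields null for text like "No date found, so no data returned" instead of throwing.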

You would apply it to your class as follows:

public class ComputerData
{
    [JsonProperty("Computer ID")]
    public string ComputerID { get; set; }

    [JsonProperty("Application Name")]
    public string ApplicationName { get; set; }

    [JsonConverter(typeof(DateTimeConverter))]
    public DateTime? Date { get; set; }

    public int Count { get; set; }
}
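
To reject only the bad records and keep the rest, something along these lines might work. It is a sketch, not a drop-in solution: `RecordReader` and `ReadValidRecords` are hypothetical names, and it assumes the file contains JSON objects one after another (or inside an array). Each object is loaded first and converted second, so a conversion failure skips just that record.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class RecordReader
{
    // Reads every JSON object from the input and keeps those that convert to T.
    public static List<T> ReadValidRecords<T>(TextReader text)
    {
        var records = new List<T>();
        var serializer = JsonSerializer.CreateDefault();

        // SupportMultipleContent lets the reader continue past the first root value.
        using (var reader = new JsonTextReader(text) { SupportMultipleContent = true })
        {
            while (reader.Read())
            {
                if (reader.TokenType != JsonToken.StartObject)
                    continue;

                // Load the whole object so the reader ends up past it
                // even when the conversion below fails.
                var obj = JObject.Load(reader);
                try
                {
                    records.Add(obj.ToObject<T>(serializer));
                }
                catch (JsonException)
                {
                    // Reject this record (log it if you like) and move on.
                }
            }
        }
        return records;
    }
}
```

You would call it as `RecordReader.ReadValidRecords<ComputerData>(new StreamReader(sFile, Encoding.UTF8))` and write the returned list to CSV afterwards. Note that text which is not valid JSON at all still makes `JObject.Load` throw a `JsonReaderException`; this sketch only skips records whose values cannot be converted.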

Example fiddle.


-1

You do not need a JSON parser. All you need is a try/catch block that ignores the exception thrown while parsing a line whose data format is incorrect. By checking a few conditions, you should be able to handle that.

Here is a possible solution. I did something like this before, although I ended up writing a JSON parser myself in the end.

Entity class

 public class Info {
     public string ComputerID {get;set;}
     public string ApplicationName {get;set;}
     ...
 }

To parse the text and ignore errors on individual lines:

 string line;
 Info record = null;
 var recordSet = new List<Info>();
 using (System.IO.StreamReader sr = new System.IO.StreamReader(sFile, Encoding.UTF8))
 {
     while ((line = sr.ReadLine()) != null)
     {
         // Start a new record once the previous one is complete
         if (record == null)
         {
             record = new Info();
             recordSet.Add(record);
         }
         try
         {
             // Parse the current line and set the matching property on record here
         }
         catch (Exception e)
         {
             // Either log the bad line or ignore the exception here
         }
         // Check your properties here; replace with your actual implementation
         if (record.ComputerID != null && record.ApplicationName != null)
         {
             record = null;
         }
     }
 }

If things get more complicated, you might still need a parser, but that really depends on your needs.

4 Comments

This is bad advice. When parsing JSON data, you really should use a JSON parser if possible. Trying to take a shortcut with Regex or Split is a good way to introduce subtle bugs into your program.
@BrianRogers OK, the parser wins. But I have to say, when Anky asks for a way to read the file line by line, skip the object if any error occurs while parsing, continue parsing the other lines of that file, and write the objects to CSV, you are just telling him that using a JSON parser will do the trick. Does the parser actually do that?
Fair enough, you're answering the question exactly as asked. But I really think his whole approach is wrong. He is saying that he has formatting inconsistencies in his data and is looking for a way to handle them. It is much easier to do this using a parser. See @dbc's answer for a better approach.
@BrianRogers I don't see how that answer reduces the room for bugs. To write a stable program, you cannot just rely on a third-party library.
