Valid JSON Load in Python file

Question

Running into a problem with my JSON:

The first issue was SyntaxError: Non-ASCII character '\xe2' in file, so I added # -*- coding: utf-8 -*- at the top of my file.

Then loading my JSON with x = json.loads(x) raised ValueError: Expecting , delimiter: line 3 column 52 (char 57). I referenced this stackoverflow solution and so added an r in front of my JSON:

x = r"""[
  { my validated json... }
]"""

But then I get TypeError: sequence item 3: expected string or Unicode, NoneType found. I think the r is throwing it off somehow?

The JSON resembles the following:

[
  {
    "brief": "Brief 1",
    "description": "Description 1",
    "photos": [
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
    ],
    "price": "145",
    "tags": [
      "tag1",
      "tag2",
      "tag3"
    ],
    "title": "Title 1"
  },
  {
    "brief": "Brief 2",
    "description": "Description 2",
    "photos": [
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
    ],
    "price": "150",
    "tags": [
      "tag4",
      "tag5",
      "tag6",
      "tag7",
      "tag8"
    ],
    "title": "Title 2"
  },{
    "brief": "blah blah 5'0\" to 5'4\"",
    "buyerPickup": true,
    "condition": "Good",
    "coverShipping": false,    
    "description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n  \r\n\r\n",
    "photos": [
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
    ],
    "price": "240",
    "tags": [
      "tag2",
      "5'0\"-5'4\""
    ],
    "title": "blah blah 17\" Frame",
    "front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"    
  } 
]

CURRENT CODE

# -*- coding: utf-8 -*-

import csv
import json

x = """[
      {
        "brief": "Brief 1",
        "description": "Description 1",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
        ],
        "price": "145",
        "tags": [
          "tag1",
          "tag2",
          "tag3"
        ],
        "title": "Title 1"
      },
      {
        "brief": "Brief 2",
        "description": "Description 2",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
        ],
        "price": "150",
        "tags": [
          "tag4",
          "tag5",
          "tag6",
          "tag7",
          "tag8"
        ],
        "title": "Title 2"
      },{
        "brief": "blah blah 5'0\" to 5'4\"",
        "buyerPickup": true,
        "condition": "Good",
        "coverShipping": false,    
        "description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n  \r\n\r\n",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
        ],
        "price": "240",
        "tags": [
          "tag2",
          "5'0\"-5'4\""
        ],
        "title": "blah blah 17\" Frame",
        "front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"    
      } 
    ]"""

x = json.loads(x)

f = csv.writer(open("example.csv", "wb+"))

f.writerow(["Handle","Title","Body (HTML)", "Vendor","Type","Tags","Published","Option1 Name","Option1 Value","Variant Inventory Qty","Variant Inventory Policy","Variant Fulfillment Service","Variant Price","Variant Requires Shipping","Variant Taxable","Image Src"])

    for x in x:

        allTags = "\"" + ','.join(x["tags"]) + "\""
        images = x["photos"]
        f.writerow([x["title"],
                    x["title"],
                    x["description"],
                    "Vendor Name",
                    "Widget",
                    allTags,
                    "TRUE",
                    "Title",
                    "Default Title",
                    "1",
                    "deny",
                    "manual",
                    x["price"],
                    "TRUE",
                    "TRUE",
                    images.pop(0) if images else None])
        while images:
            f.writerow([x["title"],None,None,None,None,None,None,None,None,None,None,None,None,None,None,images.pop(0)])

ERROR MESSAGE: Full traceback that I see:

Traceback (most recent call last):
  File "runnit2.py", line 976, in <module>
    allTags = "\"" + ','.join(x["tags"]) + "\""
TypeError: sequence item 3: expected string or Unicode, NoneType found

UPDATE: I've identified that the data, specifically x["title"] and x["description"], has some characters that the code doesn't like: 'ascii' codec can't encode character u'\u201d' in position 9: ordinal not in range(128). I've done a quick fix with x["description"].encode('utf-8'), etc., but it pretty much eliminates everything that's in that cell. Is there a better way that doesn't delete everything after the offending character?

Can you also add your JSON? Errors suggest the problem might be there.
– hgazibara, Commented Jul 6, 2018 at 18:57
More generally, we need a minimal reproducible example before we can debug your problem. With what you've given us, all anyone can see is that there must be something wrong somewhere in some of your code or data, which isn't going to help you very much.
– abarnert, Commented Jul 6, 2018 at 19:01
@hgazibara I've included a close example of the JSON, which I think is representative of the most convoluted parts of the JSON with strange characters etc.
– maudulus, Commented Jul 6, 2018 at 19:23
Can't reproduce. That loads fine for me (python 3.6.2). What version of Python are you using?
– glibdud, Commented Jul 6, 2018 at 19:26

Marlon Abeykoon · Accepted Answer · 2018-07-10 08:56:14Z

3

+50

Judging from your posted sample data and the error message, I assume that in your real data the object at index 1 of the JSON array has a null at index 3 of its tags list, i.e. in the position where the sample shows tag7:

"tags": [
          "tag4",
          "tag5",
          "tag6",
          "tag7",
          "tag8"
        ],

To get rid of the TypeError raised because of the nulls, you can check for them and replace any that exist, as shown below.

x["tags"] = ["" if i is None else i for i in x["tags"]]
allTags = "\"" + ','.join(x["tags"]) + "\""

I have assigned an empty string to replace nulls.

Alternatively, you can remove all falsy elements (None, but also empty strings) by passing None as the first argument to filter().

allTags = "\"" + ','.join(filter(None, x["tags"])) + "\""
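To make the difference between the two approaches concrete, here is a small sketch. The tag list is made up for illustration (the posted sample itself contains no null), and join_tags_keep and join_tags_drop are hypothetical helper names, not part of the original code.

```python
def join_tags_keep(tags):
    """Replace None with an empty string, keeping its slot in the output."""
    return ",".join("" if t is None else t for t in tags)


def join_tags_drop(tags):
    """Drop None (and any other falsy value, such as "") before joining."""
    return ",".join(filter(None, tags))


# Hypothetical data: a null at index 3, as the TypeError message suggests.
tags = ["tag4", "tag5", "tag6", None, "tag8"]

print(join_tags_keep(tags))  # tag4,tag5,tag6,,tag8
print(join_tags_drop(tags))  # tag4,tag5,tag6,tag8
```

The first variant leaves an empty field where the null was, which may matter if the position of a tag is meaningful; the second simply produces a shorter list.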

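NOTE: Make the JSON literal a raw string (r"""[...]""") so the backslash escapes reach json.loads unchanged, and fix the indentation of the for loop, which must not be indented at module level.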


igrinis · Accepted Answer · 2018-07-11 19:21:11Z

1

Use a raw string, and open the output file in normal (non-binary) mode with its encoding set to utf-8. On Python 3.6 that is enough.

On Python 2.7, use codecs.open('example.csv', 'w', encoding='utf-8') instead of the regular open() when dealing with unicode content. Also, the csv module on Python 2.7 does not support unicode out of the box, so I suggest switching to unicodecsv or following the guidelines in this answer.
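As a rough sketch of the Python 3 route only, the snippet below writes a row containing the character from the error message (U+201D) and reads it back. The rows and the temporary file path are made up for illustration; they are not the asker's data.

```python
import csv
import os
import tempfile

# Hypothetical rows; the second one contains U+201D (right double quote).
rows = [["Title", "Body (HTML)"], ["Title 1", "Size L/20\u201d"]]

path = os.path.join(tempfile.mkdtemp(), "example.csv")

# Text mode with an explicit encoding; newline="" lets the csv module
# control line endings itself, as its documentation recommends.
with open(path, "w", encoding="utf-8", newline="") as fh:
    csv.writer(fh).writerows(rows)

# Read the file back to confirm the character survived the round trip.
with open(path, encoding="utf-8", newline="") as fh:
    read_back = list(csv.reader(fh))

print(read_back == rows)
```

Because the file object does the encoding, no per-cell .encode('utf-8') calls are needed on Python 3.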


huifer · Accepted Answer · 2018-07-09 09:26:34Z

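Open the file for writing in text mode ("w") instead of binary mode. If you must use "wb", convert the values with the helper functions below. You also need to put r in front of the JSON string literal so that the escaped special characters are handled correctly.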

import csv
import json

x = r"""[
      {
        "brief": "Brief 1",
        "description": "Description 1",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
        ],
        "price": "145",
        "tags": [
          "tag1",
          "tag2",
          "tag3"
        ],
        "title": "Title 1"
      },
      {
        "brief": "Brief 2",
        "description": "Description 2",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
        ],
        "price": "150",
        "tags": [
          "tag4",
          "tag5",
          "tag6",
          "tag7",
          "tag8"
        ],
        "title": "Title 2"
      },{
        "brief": "blah blah 5'0\" to 5'4\"",
        "buyerPickup": true,
        "condition": "Good",
        "coverShipping": false,    
        "description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n  \r\n\r\n",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
        ],
        "price": "240",
        "tags": [
          "tag2",
          "5'0\"-5'4\""
        ],
        "title": "blah blah 17\" Frame",
        "front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"    
      } 
    ]"""

x = json.loads(x)


def to_str(bytes_or_str):
    # Return text: decode bytes as UTF-8, pass str through unchanged.
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value


def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str

    return value


# Python 3: text mode with an explicit encoding; newline="" lets csv manage line endings.
f = csv.writer(open("example.csv", "w", encoding="utf-8", newline=""))
writeList = ["Handle", "Title", "Body (HTML)", "Vendor", "Type", "Tags", "Published", "Option1 Name", "Option1 Value",
             "Variant Inventory Qty", "Variant Inventory Policy", "Variant Fulfillment Service", "Variant Price",
             "Variant Requires Shipping", "Variant Taxable", "Image Src"]
newList = []
for item in writeList:
    # A text-mode writer needs str values; bytes would be written as b'...'.
    newList.append(to_str(item))

f.writerow(newList)

for x in x:

    # Plain quote characters: r"\"" would add a literal backslash before each quote.
    allTags = '"' + ','.join(x["tags"]) + '"'
    images = x["photos"]
    f.writerow([x["title"],
                x["title"],
                x["description"],
                "Vendor Name",
                "Widget",
                allTags,
                "TRUE",
                "Title",
                "Default Title",
                "1",
                "deny",
                "manual",
                x["price"],
                "TRUE",
                "TRUE",
                images.pop(0) if images else None])
    while images:
        f.writerow([x["title"], None, None, None, None, None, None, None, None, None, None, None, None, None, None,
                    images.pop(0)])

This does not address the issue pointed out in the question, which is the TypeError.

Souradeep Nanda · Accepted Answer · 2018-07-15 17:36:49Z

0

Possible duplicate of this question: how to convert characters like \x22 into string

After cleaning up the code, the error boils down to:

import json

x = '''
  {
    "brief": "\""
  }'''

x = json.loads(x)

Consider replacing \" with \u201d:

import json

x = '{"brief": "\u201d"}'

x = json.loads(x)
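For comparison, here is a minimal sketch of why the reduced example fails and how a raw string literal (the other fix mentioned in this thread) avoids it without changing the data. The variable names are illustrative only.

```python
import json

# Regular literal: Python turns \" into a bare " before json.loads runs,
# so the JSON text becomes {"brief": """} and is rejected.
regular = '{"brief": "\""}'
try:
    json.loads(regular)
    regular_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    regular_ok = False

# Raw literal: the backslash reaches the JSON parser, which reads \" as an
# escaped quote inside the string value.
raw = r'{"brief": "\""}'
parsed = json.loads(raw)

print(regular_ok)       # False
print(parsed["brief"])  # "
```

The same reasoning applies to \n and \r in the description field: with a raw literal they are decoded by the JSON parser rather than by Python.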
