0

Running into a problem with my JSON:

First issue was that SyntaxError: Non-ASCII character '\xe2' in file so I added # -*- coding: utf-8 -*- at the top of my file.

Then the problem became a problem where I load my JSON x = json.loads(x): ValueError: Expecting , delimiter: line 3 column 52 (char 57). I referenced this stackoverflow solution and so added an r in front of my JSON:

x = r"""[
  { my validated json... }
]"""

But then I get an error TypeError: sequence item 3: expected string or Unicode, NoneType found - I think it that the r is throwing it off somehow?

JSON Resembles the following:

[
  {
    "brief": "Brief 1",
    "description": "Description 1",
    "photos": [
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
    ],
    "price": "145",
    "tags": [
      "tag1",
      "tag2",
      "tag3"
    ],
    "title": "Title 1"
  },
  {
    "brief": "Brief 2",
    "description": "Description 2",
    "photos": [
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
    ],
    "price": "150",
    "tags": [
      "tag4",
      "tag5",
      "tag6",
      "tag7",
      "tag8"
    ],
    "title": "Title 2"
  },{
    "brief": "blah blah 5'0\" to 5'4\"",
    "buyerPickup": true,
    "condition": "Good",
    "coverShipping": false,    
    "description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n  \r\n\r\n",
    "photos": [
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
      "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
    ],
    "price": "240",
    "tags": [
      "tag2",
      "5'0\"-5'4\""
    ],
    "title": "blah blah 17\" Frame",
    "front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"    
  } 
]

CURRENT CODE

# -*- coding: utf-8 -*-

import csv
import json

x = """[
      {
        "brief": "Brief 1",
        "description": "Description 1",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
        ],
        "price": "145",
        "tags": [
          "tag1",
          "tag2",
          "tag3"
        ],
        "title": "Title 1"
      },
      {
        "brief": "Brief 2",
        "description": "Description 2",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
        ],
        "price": "150",
        "tags": [
          "tag4",
          "tag5",
          "tag6",
          "tag7",
          "tag8"
        ],
        "title": "Title 2"
      },{
        "brief": "blah blah 5'0\" to 5'4\"",
        "buyerPickup": true,
        "condition": "Good",
        "coverShipping": false,    
        "description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n  \r\n\r\n",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
        ],
        "price": "240",
        "tags": [
          "tag2",
          "5'0\"-5'4\""
        ],
        "title": "blah blah 17\" Frame",
        "front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"    
      } 
    ]"""

x = json.loads(x)

f = csv.writer(open("example.csv", "wb+"))

f.writerow(["Handle","Title","Body (HTML)", "Vendor","Type","Tags","Published","Option1 Name","Option1 Value","Variant Inventory Qty","Variant Inventory Policy","Variant Fulfillment Service","Variant Price","Variant Requires Shipping","Variant Taxable","Image Src"])

    for x in x:

        allTags = "\"" + ','.join(x["tags"]) + "\""
        images = x["photos"]
        f.writerow([x["title"],
                    x["title"],
                    x["description"],
                    "Vendor Name",
                    "Widget",
                    allTags,
                    "TRUE",
                    "Title",
                    "Default Title",
                    "1",
                    "deny",
                    "manual",
                    x["price"],
                    "TRUE",
                    "TRUE",
                    images.pop(0) if images else None])
        while images:
            f.writerow([x["title"],None,None,None,None,None,None,None,None,None,None,None,None,None,None,images.pop(0)])

ERROR MESSAGE: Full traceback that I see: Traceback (most recent call last):

Traceback (most recent call last): File "runnit2.py", line 976, in <module> allTags = "\"" + ','.join(x["tags"]) + "\"" TypeError: sequence item 3: expected string or Unicode, NoneType found

UPDATE: I've identified that the data, specifically [x["title"], x["title"],x["description"], has some characters that the code doesn't like. 'ascii' codec can't encode character u'\u201d' in position 9: ordinal not in range(128). I've done a quick fix with x["description"].encode('utf-8'), etc., but it pretty much eliminates everything that's in that cell. Is there is a better way which doesn't delete everything after offending character?

19
  • 4
    Can you also add your JSON? Errors suggest the problem might be there. Commented Jul 6, 2018 at 18:57
  • 4
    More generally, we need a minimal reproducible example before we can debug your problem. With what you've given us, all anyone can see is that there must be something wrong somewhere in some of your code or data, which isn't going to help you very much. Commented Jul 6, 2018 at 19:01
  • @hgazibara I've included a close example of the JSON, which I think is representative of the most convoluted parts of the JSON with strange characters etc. Commented Jul 6, 2018 at 19:23
  • Can't reproduce. That loads fine for me (python 3.6.2). What version of Python are you using? Commented Jul 6, 2018 at 19:26
  • @glibdud Python 2.7.10 Commented Jul 6, 2018 at 19:27

4 Answers 4

3
+50

From your posted sample data, I assume that the 1st index of the posted json has a null in the 3rd index of the values of tag key. i.e: tag7

"tags": [
          "tag4",
          "tag5",
          "tag6",
          "tag7",
          "tag8"
        ],

To get rid of the TypeError that raises due to nulls you can simply check and replace the nulls if they exist as shown below.

x["tags"] = ["" if i is None else i for i in x["tags"]]
allTags = "\"" + ','.join(x["tags"]) + "\""

I have assigned an empty string to replace nulls.

Alternatively you can remove all the false elements by using None in the filter() function.

allTags = "\"" + ','.join(filter(None, x["tags"])) + "\""

NOTE: Add r"[...]" and fix the indentation issue in the for loop.

Sign up to request clarification or add additional context in comments.

Comments

1

Use raw string and set file encoding to utf-8 in normal (non-binary mode) mode when opening. For Python 3.6 it will be enough.

On Python 2.7 you should use codecs.open('example.csv', 'w', encoding='utf-8') instead of regular open() when dealing with unicode content. Also, csv module on Python 2.7 does not support unicode out of the box, so I suggest switching to unicodecsv or following the guidelines in this answer.

Comments

0

Modify reading and writing using W If you must use WB, use the following functions. You need to add r in front of all texts to handle special symbols.

import csv
import json

x = r"""[
      {
        "brief": "Brief 1",
        "description": "Description 1",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example2.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example3.jpg?0101010101010"
        ],
        "price": "145",
        "tags": [
          "tag1",
          "tag2",
          "tag3"
        ],
        "title": "Title 1"
      },
      {
        "brief": "Brief 2",
        "description": "Description 2",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example4.jpg?0101010101010",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-example5.jpg?0101010101010"
        ],
        "price": "150",
        "tags": [
          "tag4",
          "tag5",
          "tag6",
          "tag7",
          "tag8"
        ],
        "title": "Title 2"
      },{
        "brief": "blah blah 5'0\" to 5'4\"",
        "buyerPickup": true,
        "condition": "Good",
        "coverShipping": false,    
        "description": "blah blah 5'0\" to 5'4\". blah blah.Size L/20”\n 5’8-5’11\n29lbs\n3x7 speed\n\n  \r\n\r\n",
        "photos": [
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-010101.jpeg?11111",
          "https://cdn.shopify.com/s/files/1/01/01/01/files/imgs-020202?111111"
        ],
        "price": "240",
        "tags": [
          "tag2",
          "5'0\"-5'4\""
        ],
        "title": "blah blah 17\" Frame",
        "front": "https://firebasestorage.googleapis.com/v0/b/example.appspot.com/o/Images%2F0007891113.jpg?alt=media&token=111-11-11-11-111"    
      } 
    ]"""

x = json.loads(x)


def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value


def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str

    return value


f = csv.writer(open("example.csv", "w+"))
writeList = ["Handle", "Title", "Body (HTML)", "Vendor", "Type", "Tags", "Published", "Option1 Name", "Option1 Value",
             "Variant Inventory Qty", "Variant Inventory Policy", "Variant Fulfillment Service", "Variant Price",
             "Variant Requires Shipping", "Variant Taxable", "Image Src"]
newList = []
for item in writeList:
    newList.append(to_bytes(item))

f.writerow(newList)

for x in x:

    allTags = r"\"" + ','.join(x["tags"]) + r"\""
    images = x["photos"]
    f.writerow([x["title"],
                x["title"],
                x["description"],
                "Vendor Name",
                "Widget",
                allTags,
                "TRUE",
                "Title",
                "Default Title",
                "1",
                "deny",
                "manual",
                x["price"],
                "TRUE",
                "TRUE",
                images.pop(0) if images else None])
    while images:
        f.writerow([x["title"], None, None, None, None, None, None, None, None, None, None, None, None, None, None,
                    images.pop(0)])

1 Comment

This does not address the issue pointed out in the question which is the TypeError
0

Possible duplicate of this question how to convert characters like \x22 into string

On cleaning the code the error boils down to

import json

x = '''
  {
    "brief": "\""
  }'''

x = json.loads(x)

Consider replacing \" with \u201d

import json

x = '{"brief": "\u201d"}'

x = json.loads(x)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.