
I wrote a Python program that calls Google's mobileReady API to check whether the target site is mobile friendly, and then extracts certain JSON elements from the response. It also saves the screenshot to a local folder.

The script itself works fine, but when I try to write those JSON values to a CSV file, it does not work.

Here is my code:

import requests, json, string, random, time
import csv
from base64 import decodestring
from random import randint

#links = open(r'D:\\Carlos\\Links.txt')
links = ['https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady?key=AIzaSyDkEX-f1JNLQLC164SZaobALqFv4PHV-kA&screenshot=true&snapshots=true&locale=en_US&url=https://www.economicalinsurance.com/en/&strategy=mobile&filter_third_party_resources=false',
         'https://www.googleapis.com/pagespeedonline/v3beta1/mobileReady?key=AIzaSyDkEX-f1JNLQLC164SZaobALqFv4PHV-kA&screenshot=true&snapshots=true&locale=en_US&url=http://www.volkswagen-me.com/en-vwme/service/protection/motor-insurance.html&strategy=mobile&filter_third_party_resources=false']

def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))

i = 12

def get_data(each):
    try:
        r = requests.get(each)
    except:
        pass
    #time.sleep(randint(1, 3))
    try:    
        json_data = json.loads(r.text)
    except:
        pass
    try:
        score = json_data['ruleGroups']['USABILITY']['score'];score=int(score)
    except:
        pass
    try:
        Pass = json_data['ruleGroups']['USABILITY']['pass'];Pass=str(Pass)
    except:
        pass
    try:
        ConfigureViewport = json_data['formattedResults']['ruleResults']['ConfigureViewport']['localizedRuleName'];ConfigureViewport=str(ConfigureViewport)
    except:
        pass
    try:
        UseLegibleFontSizes = json_data['formattedResults']['ruleResults']['UseLegibleFontSizes']['localizedRuleName'];UseLegibleFontSizes=str(UseLegibleFontSizes)
    except:
        pass
    try:
        AvoidPlugins = json_data['formattedResults']['ruleResults']['AvoidPlugins']['localizedRuleName'];AvoidPlugins=str(AvoidPlugins)
    except:
        pass
    try:
        SizeContentToViewport = json_data['formattedResults']['ruleResults']['SizeContentToViewport']['localizedRuleName'];SizeContentToViewport=str(SizeContentToViewport)
    except:
        pass
    try:
        SizeTapTargetsAppropriately = json_data['formattedResults']['ruleResults']['SizeTapTargetsAppropriately']['localizedRuleName'];SizeTapTargetsAppropriately=str(SizeTapTargetsAppropriately)
    except:
        pass
    try:
        AvoidInterstitials = json_data['formattedResults']['ruleResults']['AvoidInterstitials']['localizedRuleName'];AvoidInterstitials=str(AvoidInterstitials)
    except:
        pass
    try:
        image_link = json_data['screenshot']['data']; image_link = image_link.replace("_", "/").replace("-","+")
    except:
        pass
    #try:
    id_generator_name = "".join( [random.choice(string.letters) for i in xrange(15)] )+'.jpeg'
    #except:
       # pass
    #try:
    fh = open(id_generator_name, "wb")
    #except:
    #    pass
    try:
        fh.write(str(image_link).decode('base64'))
        time.sleep(1)
    except:
        pass
    try:
        fh.close()
    except:
        pass
    try:
        error_code = json_data['error']['message'];error_code=str(error_code)
    except:
        pass
    try:
        print each, score, Pass, ConfigureViewport, UseLegibleFontSizes, AvoidPlugins, SizeContentToViewport, SizeTapTargetsAppropriately, AvoidInterstitials, error_code
    except:
        pass
    try:
        writer.writerow({'each':each, 'score':score, 'Pass':Pass, 'ConfigureViewport':ConfigureViewport,
                         'UseLegibleFontSizes':UseLegibleFontSizes, 'AvoidPlugins':AvoidPlugins,
                         'SizeContentToViewport':SizeContentToViewport,'SizeTapTargetsAppropriately':SizeTapTargetsAppropriately,
                         'AvoidInterstitials':AvoidInterstitials, 'error_code':error_code,'imagename':id_generator_name})
    except:
        pass

#path to the csv file
with open("D:\Carlos\Data_file\output.csv", "ab")as export:
    fieldnames = ['each', 'score', 'Pass', 'ConfigureViewport', 'UseLegibleFontSizes', 'AvoidPlugins', 'SizeContentToViewport',
                  'SizeTapTargetsAppropriately', 'AvoidInterstitials', 'error_code','imagename']
    writer = csv.DictWriter(export, fieldnames=fieldnames)
    writer.writeheader()
    for each in links:
    #try:
        get_data(each)
    #except:
    #  pass

Please advise on how to write to the CSV file, or on where the code goes wrong.

  • I am more comfortable working with CSV. As you can see in the code above, I am able to assign a variable to each of the JSON elements I want to parse, but when I try to write those variables to a CSV file, nothing is written and the code exits with no errors. Commented Oct 23, 2015 at 14:22
  • Well, you catch every possible exception in each try: except: block, so it is understandable that no error is shown. You should catch explicit exceptions so that you do not hide the unexpected ones. Commented Oct 23, 2015 at 14:28
  • Please explain what is not working. It could also be useful to add a debugging print to your script to show the dict you are passing to writer.writerow (print ({'each':each, 'score':score, ...})). Commented Oct 23, 2015 at 14:30
  • "where things are wrong in the code ?" - your error handling is, not to put too fine a point on it, completely useless. Please read blog.codekills.net/2011/09/29/the-evils-of--except-- and understand that the code as written is impossible to debug. Commented Oct 23, 2015 at 14:32
  • Please try your code with no try: except: blocks, and you will understand what's going on. Commented Oct 23, 2015 at 14:35
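To illustrate what the comments are pointing at: in the question's code, a name such as error_code is only assigned when its key exists in the response, so building the dict for writer.writerow most likely raises a NameError that the bare except: then swallows. Below is a minimal sketch of one way around that, written for Python 3 (the question uses Python 2). The dig and build_row helpers, the sample response, and the example.com URL are made up for illustration, and the imagename column is left out to keep it short.

```python
import csv
import io

def dig(data, *keys, default=""):
    """Walk nested dicts key by key; return default if any key is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data

RULES = ["ConfigureViewport", "UseLegibleFontSizes", "AvoidPlugins",
         "SizeContentToViewport", "SizeTapTargetsAppropriately",
         "AvoidInterstitials"]
FIELDNAMES = ["each", "score", "Pass"] + RULES + ["error_code"]

def build_row(url, json_data):
    """Build one CSV row; every field always has a value, possibly empty."""
    row = {
        "each": url,
        "score": dig(json_data, "ruleGroups", "USABILITY", "score"),
        "Pass": dig(json_data, "ruleGroups", "USABILITY", "pass"),
        "error_code": dig(json_data, "error", "message"),
    }
    for rule in RULES:
        row[rule] = dig(json_data, "formattedResults", "ruleResults",
                        rule, "localizedRuleName")
    return row

# Made-up sample response (assumption): a successful result has no 'error' key.
sample = {
    "ruleGroups": {"USABILITY": {"score": 99, "pass": True}},
    "formattedResults": {"ruleResults": {
        "ConfigureViewport": {"localizedRuleName": "Configure the viewport"},
    }},
}

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
writer.writeheader()
writer.writerow(build_row("https://example.com/", sample))
csv_text = buffer.getvalue()
print(csv_text)
```

Because every column gets a default, a missing key produces an empty cell instead of an undefined variable, and any exception that still occurs is no longer hidden.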

1 Answer


I like to use pandas DataFrames for this, though it may be overkill if you wouldn't use pandas otherwise. DataFrames are also great for analysis and comparison.

You would load the JSON into a DataFrame and then write the DataFrame out to a CSV file.

import pandas as pd

df = pd.read_json('path/to/json/file')
df.to_csv('filename.csv')

Note that it's this simple only when your JSON is flat (one level deep) and might as well be a CSV. Otherwise, you would need to read the JSON into a dict, navigate to the appropriate level, and then load that level into a DataFrame.
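For the nested case, here is a rough sketch of the "navigate to the appropriate level" step. The JSON string is a made-up fragment shaped like the ruleResults part of the question's response, and treating each rule as one row is just one possible layout, not the only one.

```python
import json

import pandas as pd

# Made-up fragment (assumption), shaped like the response in the question.
raw = """
{"formattedResults": {"ruleResults": {
    "ConfigureViewport": {"localizedRuleName": "Configure the viewport"},
    "AvoidPlugins": {"localizedRuleName": "Avoid plugins"}
}}}
"""

data = json.loads(raw)

# Navigate down to the level that holds one dict per rule.
rule_results = data["formattedResults"]["ruleResults"]

# orient="index" turns each outer key (the rule id) into a row label.
df = pd.DataFrame.from_dict(rule_results, orient="index")
df.index.name = "rule"

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv()
print(csv_text)
```

Passing a filename to to_csv, as in the snippet above, writes the same table to disk.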

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html

http://pandas.pydata.org/
