
I'm trying to download a JSON file, save it, and iterate through it to extract all of the information and save it in variables. I'm then going to format a message in CSV format to send the data to another system. My problem is the JSON data. It appears to be a dictionary within a list, and I'm not sure how to process it.

Here's the json:

[ {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 642903,
  "firstDiscoveredTs" : "2013-07-21 19:07:20.627+0000",
  "lastDiscoveredTs" : "2013-08-01 00:34:41.052+0000",
  "threatType" : "BN",
  "confidence" : 82,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 88
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
}, {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 28,
  "firstDiscoveredTs" : "2013-07-19 03:19:08.622+0000",
  "lastDiscoveredTs" : "2013-08-01 01:44:04.009+0000",
  "threatType" : "BN",
  "confidence" : 40,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 9
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
}, {
  "ipAddress" : "",
  "feedDescription" : "Botted Node Feed",
  "bnFeedVersion" : "1.1.4",
  "generatedTs" : "2013-08-01 12:00:10.360+0000",
  "count" : 160949,
  "firstDiscoveredTs" : "2013-07-16 18:52:33.881+0000",
  "lastDiscoveredTs" : "2013-08-01 03:14:59.452+0000",
  "threatType" : "BN",
  "confidence" : 82,
  "discoveryMethod" : "spamtrap",
  "indicator" : true,
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : {
      "uniqueSubjectCount" : 3
    },
    "p2pData" : {
      "connect" : null,
      "port" : null
    }
  }
} ]

My code:

download = 'https:URL.BNfeed20130801.json'

request = requests.get(download, verify=False)
out  = open(fileName, 'w')
for row in request:
    if row.strip():
         for column in row:
                 out.write(column)
    else:
        continue
out.close()
time.sleep(4)
jsonRequest = request.json()

for item in jsonRequest:
     print jsonRequest[0]['ipAddress']
     print jsonRequest[item]['ipAddress'] --I also tried this

When I do the above, it just prints the same IP over and over again. I've put in the print statement for testing purposes only. Once I figure out how to access the different elements of the JSON, I will store them in variables and use those variables accordingly. Any help is greatly appreciated.

Thanks in advance for any help. I'm using Python 2.6 on Linux.

  • item isn't an index; it's a dict from the list of dicts. Commented Aug 19, 2013 at 19:35

2 Answers


You are iterating over a list of dicts, so inside the loop just use item['ipAddress'].
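
A minimal sketch of that, assuming the feed has already been parsed into a list of dicts. Here json.loads on a trimmed, made-up sample stands in for request.json(), and the addresses are placeholder values rather than anything from the real feed. The single-argument print(...) form is used so the lines behave the same on Python 2.6 and on newer versions:

```python
import json

# Trimmed stand-in for the feed; the addresses are placeholder values.
raw = '''[
  {"ipAddress": "192.0.2.1", "confidence": 82},
  {"ipAddress": "192.0.2.2", "confidence": 40},
  {"ipAddress": "192.0.2.3", "confidence": 82}
]'''

records = json.loads(raw)  # a list of dicts, like request.json()

addresses = []
for item in records:
    # item is the dict itself, so index it by key directly
    addresses.append(item['ipAddress'])
    print(item['ipAddress'])
```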

4 Comments

I tried that as you suggested and I get: Traceback (most recent call last): File "cyveillance.py", line 53, in <module> print jsonRequest['ipAddress'] TypeError: list indices must be integers, not str
Use print item['ipAddress'] instead of print jsonRequest['ipAddress'].
@user2697611 why are you using print jsonRequest['ipAddress']? I've suggested to try print item['ipAddress'].
I misunderstood, using for item in jsonRequest: print item['ipAddress'] appears to be working

alecxe's answer tells you how to fix this, but let me try to explain what's wrong with the original code.


It may be easier to understand with a simpler example, one you can run through an interactive visualizer:

a = ['a', 'b', 'c']

When you do this:

for item in a:

item will be 'a' the first time through, then 'b', then 'c'.

But if you do this:

for item in a:
    print a[0]

… you're completely ignoring item. It's just going to print 'a' three times, because each time you go through the loop you're asking for a[0], that is, the first thing in a.

And if you do this:

for item in a:
    print a[item]

… it's going to raise an exception, because you're asking for the 'a'th thing in the list, which is nonsense.
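
To see that failure without ending the script, this sketch catches the exception and keeps its text. The name message is only for illustration:

```python
a = ['a', 'b', 'c']

message = None
try:
    for item in a:
        a[item]  # item is the string 'a', not a position in the list
except TypeError as exc:
    # Lists can only be indexed by integers (or slices).
    message = str(exc)

print(message)
```

The exact wording of the message differs between Python versions, but it is the same kind of "list indices must be integers" TypeError.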

But in this code:

for item in a:
    print item

… you'll print 'a', then 'b', then 'c', which is exactly what you want.

You could also do this:

for index, item in enumerate(a):
    print a[index]

… but that's silly. If you need the index, use enumerate, but if you just need the item itself… you've already got it.


So, back to your real code:

for item in jsonRequest:
     print jsonRequest[0]['ipAddress']

Again, you're ignoring item and asking for jsonRequest[0] each time.

And in this code:

for item in jsonRequest:
    print jsonRequest[item]['ipAddress'] --I also tried this

… you're asking for the {complicated dictionary}th thing in jsonRequest, which is again nonsense.

But in this code:

for item in jsonRequest:
    print item['ipAddress']

You're using each item, just as in the simple example.
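
Since the goal in the question is a CSV message, here is one hedged way to get from the parsed records to CSV text with the same loop. It is written against Python 3 and the standard csv module. The sample records, the placeholder addresses, and the choice of columns are illustrative assumptions, not taken from the real feed:

```python
import csv
import io

# Stand-in for the parsed feed (placeholder addresses, trimmed fields).
records = [
    {"ipAddress": "192.0.2.1", "threatType": "BN", "confidence": 82},
    {"ipAddress": "192.0.2.2", "threatType": "BN", "confidence": 40},
]

columns = ["ipAddress", "threatType", "confidence"]  # illustrative choice

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(columns)  # header row
for item in records:
    # Pull each wanted key out of the current dict, in column order.
    writer.writerow([item[name] for name in columns])

csv_text = buffer.getvalue()
print(csv_text)
```

Letting csv.writer build each row means quoting is handled if a field ever contains a comma.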

3 Comments

Awesome, thanks for the explanation. That makes a lot of sense. How then do I go about accessing the dictionaries within the dictionaries? I was trying to do this: behavior = item['supportingData']['behavior'], but that doesn't appear to work.
@user2697611: That should work. When I change the line to print item['supportingData']['behavior'] and use your sample JSON instead of pulling stuff down with requests, it prints spamming three times in a row, just as it should. If it's not working for you, there's something wrong in code we can't see. You should probably accept alecxe's answer, then create a new question, with an SSCCE for the new problem.
I actually tried it again and it seems to be working now. I'm not sure what went wrong the first time. Thanks for your comments. I appreciate everyone's help.
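
The nested access discussed in these comments can be sketched as follows, using one record trimmed from the sample JSON in the question (with the empty ipAddress as posted). JSON null becomes None after parsing, so those fields may need a check before use:

```python
import json

raw = '''[ {
  "ipAddress" : "",
  "supportingData" : {
    "behavior" : "spamming",
    "botnetName" : null,
    "spamtrapData" : { "uniqueSubjectCount" : 88 },
    "p2pData" : { "connect" : null, "port" : null }
  }
} ]'''

for item in json.loads(raw):
    supporting = item['supportingData']   # the inner dict
    behavior = supporting['behavior']     # one level down
    subjects = supporting['spamtrapData']['uniqueSubjectCount']  # two levels
    botnet = supporting['botnetName']     # JSON null is parsed as None
    print(behavior)
```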
