I'm using pandas in Python to read a CSV file, apply some minor transformations, and write two columns out as a JSON file. I want two values, timestamp and value. I want only those two new columns, with the rest of the file dropped, so that the output looks like:

{"timestamp[0]":value[0],"timestamp[1]":value[1],"timestamp[2]":value[2],..}

But right now my code still outputs all the old CSV data, with the part I want appended to each record, in this format: {stuff I don't want, "timestamp":"timestamp[0]", "value":value[0]},{...}{...}

Here's the code I'm using currently:

import csv
import pandas as pd
import delorean as dl

def doThings(infile, outfile):
    f = pd.read_csv(infile)
    hmCols = {"timestamp": [], "value": []}

    for i, row in f.iterrows():
        total = row["Playspace_1"] + row["Playspace_2"] + row["Playspace_3"] + row["Playspace_4"]
        hmCols["timestamp"].append(row["Timestamp"])
        hmCols["value"].append(total)

    f["timestamp"] = hmCols["timestamp"] #old code
    f["value"] = hmCols["value"] #old code
    f.to_json(outfile, orient="records") #old code

    pd.DataFrame(hmCols).to_json(outfile, orient="records") #From user Turn

doThings("test.csv", "heatmapData.json")

Any help would be appreciated

Based on Turn's suggestion, I changed the code. Now I get the output:

[{"timestamp":1417982808063,"value":1},{"timestamp":1417982808063,"value":1},{"timestamp":1417982808753,"value":1},{"timestamp":1417982811944,"value":1} ...

Now I need to transform that to:

[{"1417982808063":1},{"1417982808063":1},{"1417982808753":1},{"1417982811944":1}...]

1 Answer

What if you change this:

        f["timestamp"] = hmCols["timestamp"]
        f["value"] = hmCols["value"]
        f.to_json(outfile, orient="records")

to:

        pd.DataFrame(hmCols).to_json(outfile, orient="records")
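The old columns kept showing up because assigning f["timestamp"] and f["value"] adds two columns to the existing frame and leaves the rest in place. A minimal sketch of building the two-column frame without the row loop follows. It assumes the column names from the question (Timestamp, Playspace_1 to Playspace_4); the helper name two_column_frame and the sample data are hypothetical. Note that sum(axis=1) skips missing values, whereas adding the columns with + would propagate them.

```python
import json
import pandas as pd

# Column names taken from the question; adjust to match the real CSV.
PLAYSPACE_COLS = ["Playspace_1", "Playspace_2", "Playspace_3", "Playspace_4"]

def two_column_frame(f):
    """Return a new frame holding only the timestamp and the per-row total."""
    return pd.DataFrame({
        "timestamp": f["Timestamp"],
        "value": f[PLAYSPACE_COLS].sum(axis=1),
    })

# Hypothetical sample standing in for pd.read_csv("test.csv").
sample = pd.DataFrame({
    "Timestamp": [1417982808063, 1417982808753],
    "Playspace_1": [1, 0],
    "Playspace_2": [0, 1],
    "Playspace_3": [0, 0],
    "Playspace_4": [0, 0],
    "Other": ["x", "y"],  # an unwanted column that should not reach the JSON
})

out = two_column_frame(sample)
# With no path argument, to_json returns the JSON text instead of writing a file.
records_json = out.to_json(orient="records")
parsed = json.loads(records_json)
```

Because the new frame is built from scratch, the "Other" column never reaches the JSON output.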

Edit to add:

I misunderstood the output you were looking for. What if you changed the whole function to this (with import json added at the top):

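    import json
    import pandas as pd

    def doThings(infile, outfile):
        f = pd.read_csv(infile)
        result = []
        for i, row in f.iterrows():
            total = row["Playspace_1"] + row["Playspace_2"] + row["Playspace_3"] + row["Playspace_4"]
            # The json module rejects NumPy scalars as keys or values, so convert
            # both to plain Python types (assumes integer timestamps and counts).
            result.append({str(int(row["Timestamp"])): int(total)})

        # Write one single-key object per row: [{"<timestamp>": <total>}, ...]
        with open(outfile, 'w') as fp:
            json.dump(result, fp)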
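The same list of single-key objects can also be built without iterrows. This is only a sketch under the question's column names; the helper name timestamp_value_pairs and the sample data are hypothetical, and it assumes the Timestamp column was read as integers (a float column would produce keys such as "1417982808063.0").

```python
import json
import pandas as pd

# Column names taken from the question; adjust to match the real CSV.
PLAYSPACE_COLS = ["Playspace_1", "Playspace_2", "Playspace_3", "Playspace_4"]

def timestamp_value_pairs(f):
    """Return [{"<timestamp>": total}, ...] with one single-key dict per row."""
    totals = f[PLAYSPACE_COLS].sum(axis=1)
    # tolist() converts NumPy scalars to plain Python numbers, which the
    # json module can serialize; str() makes each timestamp a JSON key.
    return [
        {str(ts): total}
        for ts, total in zip(f["Timestamp"].tolist(), totals.tolist())
    ]

# Hypothetical sample standing in for pd.read_csv("test.csv").
sample = pd.DataFrame({
    "Timestamp": [1417982808063, 1417982808753],
    "Playspace_1": [1, 0],
    "Playspace_2": [0, 1],
    "Playspace_3": [0, 0],
    "Playspace_4": [0, 0],
})

pairs = timestamp_value_pairs(sample)
text = json.dumps(pairs)
```

To write a file instead of a string, pass an open file handle to json.dump(pairs, fp) as in the function above.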

2 Comments

So that gets me very close! I now have [{"timestamp":1417982808063,"value":1},{"timestamp":1417982808063,"value":1},{"timestamp":1417982808753,"value":1},{"timestamp":1417982811944,"value":1}
I misunderstood the output format you were looking for. Answer updated.
