
I have a matching algorithm which links students to projects. It's working, but I have trouble exporting the data to a csv file. Only the last value is exported, when there are 200 values to be exported.

The exported data also treats each character as a separate value: I would like to get the whole 's' in one column rather than the three numbers which make up 's' split into three columns. I've attached the images below. Any help would be appreciated.

What it looks like

What it should look like

#Imports for Pandas

import pandas as pd
from pandas import DataFrame 

SPA()
for m in M:
   s = m['student']
   l = m['lecturer']
   Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
   id = m['projectid']
   p = Project[id]['title']
   c = Project[id]['sourceid']
   r = str(getRank("Single_Projects1copy.csv",s,c))


   print(s+","+l+","+p+","+c+","+r)

   dataPack = (s+","+l+","+p+","+c+","+r)

   df = pd.DataFrame.from_records([dataPack])
   df.to_csv('try.csv')

1 Answer

You overwrite the file on every iteration of the loop, so you only end up with the last row. Either append to the csv with df.to_csv('try.csv', mode="a", header=False), or collect one frame per row in a list, combine them with pd.concat and write once outside the loop, something like:

frames = []  # one single-row frame per match
for m in M:
   s = m['student']
   l = m['lecturer']
   Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
   id = m['projectid']
   p = Project[id]['title']
   c = Project[id]['sourceid']
   r = str(getRank("Single_Projects1copy.csv",s,c))

   print(s+","+l+","+p+","+c+","+r)

   dataPack = (s+","+l+","+p+","+c+","+r)

   frames.append(pd.DataFrame.from_records([dataPack]))
df = pd.concat(frames, ignore_index=True)  # combine all rows into one frame
df.to_csv('try.csv') # write all data once outside the loop

A better option would be to open a file and pass that file object to to_csv:

with open('try.csv', 'w') as f:
    for m in M:
       s = m['student']
       l = m['lecturer']
       Lecturer[l]['limit'] = Lecturer[l]['limit'] - 1
       id = m['projectid']
       p = Project[id]['title']
       c = Project[id]['sourceid']
       r = str(getRank("Single_Projects1copy.csv",s,c))
       print(s+","+l+","+p+","+c+","+r)

       dataPack = (s+","+l+","+p+","+c+","+r)
       pd.DataFrame.from_records([dataPack]).to_csv(f, header=False)

You get individual characters because from_records is passed a single string, dataPack, as the record, so it iterates over the characters of that string:

In [18]: df = pd.DataFrame.from_records(["foobar,"+"bar"])

In [19]: df
Out[19]: 
   0  1  2  3  4  5  6  7  8  9
0  f  o  o  b  a  r  ,  b  a  r

In [20]: df = pd.DataFrame(["foobar,"+"bar"])

In [21]: df
Out[21]: 
            0
0  foobar,bar

I think you basically want to keep the values as a tuple, dataPack = (s, l, p, c, r), and use pd.DataFrame([dataPack]) so that the tuple becomes one row with five columns. You don't really need pandas at all; the csv lib would do all this for you without needing to create DataFrames.
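The tuple suggestion above can be sketched as follows. This is only an illustration under assumptions: the `matches` list stands in for the values computed inside the `for m in M` loop, and the column names are made up for the example. Passing `index=False` is one way to drop the extra leading index column that `to_csv` writes by default.

```python
import io

import pandas as pd

# Hypothetical stand-in for the (s, l, p, c, r) values built inside the loop.
matches = [
    ("s1", "l1", "Project A", "c1", "1"),
    ("s2", "l2", "Project B", "c2", "3"),
]

# Column names are assumed for illustration; they are not from the question.
columns = ["student", "lecturer", "project", "sourceid", "rank"]

rows = []
for s, l, p, c, r in matches:
    # Keep the values as a tuple so each one lands in its own column.
    rows.append((s, l, p, c, r))

# Build the frame once, after the loop, from the list of tuples.
df = pd.DataFrame(rows, columns=columns)

# Written to an in-memory buffer here so the sketch leaves no file behind.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False omits the row-number column
csv_text = buffer.getvalue()
```

In the real script the last step would presumably be `df.to_csv('try.csv', index=False)`.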


5 Comments

Opening a file worked; the csv now contains the data of all the students. Thanks for your input, appreciate it. The csv has no header, but the first column consists of a 0, so I will have to make changes to get the column structure right.
I've been instructed to use Pandas, so that if the data needs to be exported to MySQL in the future it will be easier.
Do you want to use the csv header from the file or create your own?
Cheers! I got it exactly the way I wanted it to be! Would you recommend sticking with Pandas if the data needs to be exported to MySQL in the future?
It really depends on what you are doing, but it is pretty easy to dump a csv into a new table/db, or add to an existing one, without needing pandas.
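A rough sketch of the pandas-free route mentioned in the last comment, using only the standard library. The sample rows, the column names and the `matches` table name are assumptions for illustration, and `sqlite3` is used purely as a stand-in because it needs no server; loading into MySQL would need a MySQL driver and connection instead.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows standing in for the (s, l, p, c, r) values.
rows = [
    ("s1", "l1", "Project A", "c1", "1"),
    ("s2", "l2", "Project B", "c2", "3"),
]
columns = ["student", "lecturer", "project", "sourceid", "ranking"]

# Write the csv with the csv module (an in-memory buffer replaces the file).
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(columns)   # header row
writer.writerows(rows)     # one line per match

# Read the csv back and load it into a table.
buffer.seek(0)
reader = csv.reader(buffer)
header = next(reader)      # skip the header row

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches "
    "(student TEXT, lecturer TEXT, project TEXT, sourceid TEXT, ranking TEXT)"
)
conn.executemany("INSERT INTO matches VALUES (?, ?, ?, ?, ?)", reader)
conn.commit()

loaded = conn.execute(
    "SELECT student, project FROM matches ORDER BY student"
).fetchall()
conn.close()
```

With a real file, `open('try.csv', 'w', newline='')` would take the place of the buffer when writing.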
