I have a list of values that I want to write into multiple columns. This works fine for a single row, but when I try to do it over multiple rows, each column is overwritten entirely with the last value.

The list for each row looks like this (note: the list length varies):

['2016-03-16T09:53:05',
 '2016-03-16T16:13:33',
 '2016-03-17T13:30:31',
 '2016-03-17T13:39:09',
 '2016-03-17T16:59:01',
 '2016-03-23T12:20:47',
 '2016-03-23T13:22:58',
 '2016-03-29T17:26:26',
 '2016-03-30T09:08:17']

I can store this in empty columns by using:

for i in range(len(trans_dates)):
    df[('T' + str(i + 1) + ' - Date')] = trans_dates[i]

However, this sets the whole column to the single trans_dates[i] value.

I thought looping over each row with the above code would work, but it still overwrites the whole column:

for issues in all_issues:
    for i in range(len(trans_dates)):
        df[('T' + str(i + 1) + ' - Date')] = trans_dates[i]
  • How do I only update my current row in the loop?
  • Am I even going about this the right way? Or is there a faster vectorised way of doing it?

Full code snippet below:

for issues in all_issues:
    print(issues)
    changelog = issues.changelog
    trans_dates = []
    from_status = []
    to_status = []
    for history in changelog.histories:
        for item in history.items:
            if item.field == 'status':
                trans_dates.append(history.created[:19])
                from_status.append(item.fromString)
                to_status.append(item.toString)
    trans_dates = list(reversed(trans_dates))
    from_status = list(reversed(from_status))
    to_status = list(reversed(to_status))
    print(trans_dates)

    # Store raw data in created columns and convert dates to pd.to_datetime
    for i in range(len(trans_dates)):
        df[('T' + str(i + 1) + ' - Date')] = trans_dates[i]
    for i in range(len(to_status)):
        df[('T' + str(i + 1) + ' - To')] = to_status[i]
    for i in range(len(from_status)):
        df[('T' + str(i + 1) + ' - From')] = from_status[i]
    for i in range(len(trans_dates)):
        df['T' + str(i + 1) + ' - Date'] = pd.to_datetime(df['T' + str(i + 1) + ' - Date'])
  • EDIT: Sample input and output added.

input: issue/row #1 list (note year changes):

    ['2016-03-16T09:53:05',
     '2016-03-16T16:13:33',
     '2016-03-17T13:30:31',
     '2016-03-17T13:39:09']

issue #2

['2017-03-16T09:53:05',
 '2017-03-16T16:13:33',
 '2017-03-17T13:30:31']

issue #3

['2018-03-16T09:53:05',
 '2018-03-16T16:13:33',
 '2018-03-17T13:30:31']

issue #4

['2015-03-16T09:53:05',
 '2015-03-16T16:13:33']

output:

        col  T1                     T2                     T3                     T4
        17   '2016-03-16T09:53:05'  '2016-03-16T16:13:33'  '2016-03-17T13:30:31'  '2016-03-17T13:39:09'
        18   '2017-03-16T09:53:05'  '2017-03-16T16:13:33'  '2017-03-17T13:30:31'  np.nan
        19   '2018-03-16T09:53:05'  '2018-03-16T16:13:33'  '2018-03-17T13:30:31'  np.nan
        20   '2015-03-16T09:53:05'  '2015-03-16T16:13:33'  np.nan                 np.nan
  • Can you post a sample input and expected output? Commented Jun 26, 2018 at 2:21
  • Added to main question. Commented Jun 26, 2018 at 4:17

1 Answer

Instead of this:

for i in range(len(trans_dates)):
    df[('T' + str(i + 1) + ' - Date')] = trans_dates[i]

Try this:

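for i in range(len(trans_dates)):
    # `row` is assumed to be the index label of the current issue's row;
    # using i here would write along the diagonal instead of into one row.
    df.loc[row, 'T' + str(i + 1) + ' - Date'] = trans_dates[i]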
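
To see the difference between the two kinds of assignment, here is a minimal sketch. The frame, its `key` column and the index labels 17 and 18 are hypothetical stand-ins, not the asker's real data.

```python
import pandas as pd

# Hypothetical stand-in frame: one row per issue, index labels 17 and 18.
df = pd.DataFrame({'key': ['ISSUE-1', 'ISSUE-2']}, index=[17, 18])

# Assigning a scalar to df[col] broadcasts it to every row of the column.
df['T1 - Date'] = '2016-03-16T09:53:05'

# Assigning through .loc with a row label writes a single cell only.
df.loc[18, 'T1 - Date'] = '2017-03-16T09:53:05'

# If the column does not exist yet, .loc creates it and leaves
# the other rows as missing values.
df.loc[17, 'T2 - Date'] = '2016-03-16T16:13:33'

print(df)
```

The row label has to identify the issue being processed, so it should come from the outer loop over issues rather than from the inner counter.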

There are probably better ways to do this; df.merge or df.replace come to mind. It would help if you posted what the input dataframe looks like and what the expected result is.
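
One possible loop-free alternative, sketched under assumptions: collect each issue's list in a dict keyed by the row label, build a wide frame from it in one step, and join it onto the existing frame. The names `dates_by_row`, `wide` and `result`, and the `key` column, are hypothetical; the dates are the sample input from the question.

```python
import pandas as pd

# Hypothetical existing frame, indexed by the row labels from the question.
df = pd.DataFrame({'key': ['A-1', 'A-2', 'A-3', 'A-4']}, index=[17, 18, 19, 20])

# One variable-length list per issue, keyed by the row it belongs to.
dates_by_row = {
    17: ['2016-03-16T09:53:05', '2016-03-16T16:13:33',
         '2016-03-17T13:30:31', '2016-03-17T13:39:09'],
    18: ['2017-03-16T09:53:05', '2017-03-16T16:13:33', '2017-03-17T13:30:31'],
    19: ['2018-03-16T09:53:05', '2018-03-16T16:13:33', '2018-03-17T13:30:31'],
    20: ['2015-03-16T09:53:05', '2015-03-16T16:13:33'],
}

# orient='index' makes each list a row; shorter lists are padded with
# missing values.
wide = pd.DataFrame.from_dict(dates_by_row, orient='index')
wide.columns = [f'T{i + 1} - Date' for i in range(wide.shape[1])]

# Convert every column to datetime; the padded entries become NaT.
wide = wide.apply(pd.to_datetime)

# Attach the new columns to the existing rows by index label.
result = df.join(wide)
print(result)
```

The same pattern could be repeated for the from-status and to-status lists, with a different column suffix for each.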
