
My objective is as follows:

  • if the DataFrame is empty, I need to insert a row whose index is the value of the variable URL and whose columns are the value of URL followed by the entries of sorted_list
  • if it is non-empty, I need to insert a row whose index is the value of the variable URL and whose columns are the entries of sorted_list

What I did: I initialized a DataFrame self.df, and then for each row with the values described above I created a local DataFrame variable df1 and appended it to self.df.

My code:

import pandas as pd

class Reward_Matrix:
    def __init__(self):
        self.df = pd.DataFrame()

    def add(self, URL, webpage_list):
        sorted_list = []
        check_list = list(self.df.columns.values)
        print('check_list: ',check_list)
        for i in webpage_list:     #to ensure no duplication columns
            if i not in check_list:
                sorted_list.append(i)
        if self.df.empty:
            sorted_list.insert(0, URL)
            df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
        else:
            df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
        print(df1)
        print('sorted_list: ',sorted_list)
        print("length: ",len(df1.columns))
        self.df.append(df1)

But I get the following error:

Traceback (most recent call last):
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4294, in create_block_manager_from_blocks
placement=slice(0, len(axes[0])))]
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 2719, in make_block
return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 115, in __init__
len(self.mgr_locs)))
ValueError: Wrong number of items passed 1, placement implies 450

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "...eclipse-workspace\Crawler\crawl_core\src_main\run.py", line 23, in test_start
test.crawl_run(self.URL)
  File "...eclipse-workspace\Crawler\crawl_core\src_main\test_crawl.py", line 42, in crawl_run
self.reward.add(URL, webpage_list)
  File "...eclipse-workspace\Crawler\crawl_core\src_main\dynamic_matrix.py", line 21, in add
df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 352, in __init__
copy=False)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 483, in _init_ndarray
return create_block_manager_from_blocks([values], [columns, index])
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4303, in create_block_manager_from_blocks
construction_error(tot_items, blocks[0].shape[1:], axes, e)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4280, in construction_error
passed, implied))
ValueError: Shape of passed values is (1, 1), indices imply (450, 1)

I am not well versed in pandas and DataFrames. I have been getting this error for quite some time, and going through similar questions on Stack Overflow only confuses me, because I can't see where I went wrong.

Can someone help me out?

  • @jezrael Yes, I think that's correct: a DataFrame df1 with index URL and columns as specified. Shouldn't index and columns be specified as a collection? As for the 0, I wanted all the values of df1 to be 0. Is there something wrong, like the syntax? Commented Nov 3, 2017 at 6:53
  • I'll add a sample, give me a sec. Commented Nov 3, 2017 at 6:54

1 Answer


I think you need to remove the [], because otherwise you pass a nested list:

df1 = pd.DataFrame(0,index=[URL], columns=sorted_list)

Sample:

sorted_list = ['a','b','c']
URL = 'url1'
df1 = pd.DataFrame(0,index=[URL], columns=sorted_list)
print (df1)
      a  b  c
url1  0  0  0

df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
print (df1)

>ValueError: Shape of passed values is (1, 1), indices imply (3, 1)
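
To see why the extra brackets matter, here is a minimal sketch that reuses the sample names above (sorted_list, URL, df1); the name nested is introduced only for illustration. It only inspects the plain Python lists and builds the working frame, since the exact failure of the nested form may differ between pandas versions.

```python
import pandas as pd

sorted_list = ["a", "b", "c"]
URL = "url1"

# Wrapping the list again gives an outer list with ONE element
# (the inner list), not three column labels.
nested = [sorted_list]          # hypothetical name, for illustration only
print(len(sorted_list), len(nested))   # prints: 3 1

# Passing the flat list yields one row with three zero-filled columns.
df1 = pd.DataFrame(0, index=[URL], columns=sorted_list)
print(df1.shape)                # prints: (1, 3)
```

The index stays wrapped as [URL] because URL is a single string, while sorted_list is already a list of labels.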

10 Comments

Yes! You were right! That did remove the error. But now when I print the DataFrame after self.df.append(df1), I get an empty one: Empty DataFrame Columns: [] Index: []. Won't appending work here?
Not 100% sure, but return self.df.append(df1) should work
because this is DataFrame.append, not Python's list append
Maybe you need self.df = self.df.append(df1), i.e. assign the result back
Urghhh! That was it! Thanks, that did the job! Jeez, I missed it!
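
Putting both fixes from this thread together, a rough sketch of the class might look like the following. It is not the asker's final code: the names RewardMatrix, url, new_cols and row are chosen here for illustration, and pd.concat stands in for DataFrame.append, which was removed in pandas 2.0. The key point is the same as in the comments above: the combined frame is a new object and has to be assigned back to self.df.

```python
import pandas as pd


class RewardMatrix:
    def __init__(self):
        self.df = pd.DataFrame()

    def add(self, url, webpage_list):
        # Keep only pages that are not already columns, skipping repeats.
        new_cols = []
        for page in webpage_list:
            if page not in self.df.columns and page not in new_cols:
                new_cols.append(page)
        # For the very first row, the URL itself also becomes a column.
        if self.df.empty and url not in new_cols:
            new_cols.insert(0, url)
        # Flat list of labels, not [new_cols].
        row = pd.DataFrame(0, index=[url], columns=new_cols)
        # concat returns a NEW frame, so assign it back.
        self.df = pd.concat([self.df, row])
        return self.df


matrix = RewardMatrix()
matrix.add("url1", ["a", "b"])
matrix.add("url2", ["b", "c"])
print(matrix.df)
```

Cells that a given row does not define come out as NaN after the concatenation; if zeros are wanted everywhere, calling fillna(0) on the result is one option.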