I'm trying to iterate through a two-dimensional array in Python and compare items in the array to ints, but I run into a variety of errors whenever I attempt to do so. I'm using numpy and pandas.

My dataset is created as follows:

import pandas

filename = "C:/Users/User/My Documents/JoeTest.csv"
datas = pandas.read_csv(filename)
dataset = datas.values

Then, I attempt to go through the data, grabbing certain elements of it.

def model_building(data):
    global blackKings
    flag = 0
    blackKings.append(data[0][1])
    for i in data:
        if data[i][39] == 1:
            if data[i][40] == 1:
                values.append(1)
            else:
                values.append(-1)
        else:
            if data[i][40] == 1:
                values.append(-1)
            else:
                values.append(1)
        for j in blackKings:
            if blackKings[j] != data[i][1]:
                flag = 1
        if flag == 1:
            blackKings.append(data[i][1])
            flag = 0

However, doing so leaves me with: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). I don't want to use either of these, as I want to compare the actual value at that one specific position. Is there another way around this problem?

  • You can help us by posting the exact error and the full stack trace. Commented Apr 20, 2017 at 5:19
  • If you create a Minimal, Complete, and Verifiable example it makes it easier for us to help you. Commented Apr 20, 2017 at 5:21
  • This error usually arises in if statements. You should check the `if data[i][39] == 1:` statements. My guess is that data[i][39] returns an index, value pair. Just print it out and check. Also always try to use data.loc[39,i] instead. Commented Apr 20, 2017 at 5:36
  • Why do you need to use dataframe.values? Try iterating through the dataframe rows: `for row in dataframe.iterrows()` Commented Apr 20, 2017 at 5:55
  • Never use global, unless you really really really need to. Commented Apr 20, 2017 at 9:41

2 Answers

2

You need to tell us something about this: dataset = datas.values

It's probably a 2d array, since it derives from a load of a csv. But what shape and dtype? Maybe even a sample of the array.

Is that the data argument in the function?

What are blackKings and values? You treat them like lists (with append).

for i in data:
    if data[i][39] == 1:

This doesn't make sense. With for i in data, if data is 2d, i is the first row, then the second row, etc. If you want i to be an index, you use something like

for i in range(data.shape[0]):

2d array indexing is normally done with data[i,39].

But in your case data[i][39] is probably an array.

Anytime you use an array with more than one element in an if statement, you'll get this ValueError, because there are multiple truth values.

If i were a proper index, then data[i,39] would be a single value.
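A minimal sketch of the difference, using a small made-up array (the shape and values are assumptions, not your CSV data): comparing one element gives one boolean, while comparing a whole row gives a boolean array that if cannot reduce to a single answer.

```python
import numpy as np

# Hypothetical 3x4 array standing in for the real dataset
data = np.arange(12).reshape(3, 4)

# One element compared to an int gives one boolean, which `if` accepts.
single = data[1, 2] == 6
print(single)

# A whole row compared to an int gives a boolean array, one entry per element.
row_cmp = data[1] == 6
print(row_cmp)

# Using that array as a condition raises the ValueError from the question.
message = None
try:
    if row_cmp:
        pass
except ValueError as err:
    message = str(err)
    print(message)
```

So the error is a sign that the thing being tested is an array, not that a.any() or a.all() is the right fix.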

To illustrate:

In [41]: data=np.random.randint(0,4,(4,4))
In [42]: data
Out[42]: 
array([[0, 3, 3, 2],
       [2, 1, 0, 2],
       [3, 2, 3, 1],
       [1, 3, 3, 3]])
In [43]: for i in data:
    ...:     print('i',i)
    ...:     print('data[i]',data[i].shape)
    ...:     
i [0 3 3 2]            # 1st row
data[i] (4, 4)         # a 2d array, not a single row
i [2 1 0 2]            # 2nd row
data[i] (4, 4)
...

Here i is a 4-element array; using it as an index in data[i] selects four whole rows and produces a (4, 4) array. It isn't selecting one value, but rather many values.

Instead you need to iterate in one of these ways:

In [46]: for row in data:
    ...:     if row[3]==1:
    ...:         print(row)
[3 2 3 1]
In [47]: for i in range(data.shape[0]):
    ...:     if data[i,3]==1:
    ...:         print(data[i])
[3 2 3 1]

To debug a problem like this you need to look at intermediate values, and especially their shapes. Don't just assume. Check!
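As a rough sketch of that kind of check (again on a random 4x4 array, not your data), printing the type and shape of the loop variable shows right away whether it is a row or an integer index:

```python
import numpy as np

data = np.random.randint(0, 4, (4, 4))
print(data.shape, data.dtype)

# Iterating over the array directly yields rows, so i is an array here.
for i in data:
    print(type(i).__name__, i.shape, data[i].shape)

# Iterating over range(...) yields ints, so data[i, 3] is one element.
for i in range(data.shape[0]):
    print(type(i).__name__, np.ndim(data[i, 3]))

# The same facts collected into lists, which is handy for a quick assert.
row_index_shapes = [data[i].shape for i in data]
scalar_ndims = [np.ndim(data[i, 3]) for i in range(data.shape[0])]
```

A 0-dimensional result is what you want before comparing with == in an if statement.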

0

I'm going to attempt to rewrite your function:

def model_building(data):
    global blackKings
    blackKings.append(data[0, 1])

    # Your nested if statements were performing an xor
    # This is vectorized version of the same thing
    values = np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1

    # not sure where `values` is defined.  If you really wanted to
    # append to it, you can do
    # values = np.append(values, np.logical_xor(*(data.T[[39, 40]] == 1)) * -2 + 1)

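    # Your blackKings / flag logic can be reduced.
    # blackKings is a list, so convert it before broadcasting;
    # all(0) then gives one True/False per row of data
    kings = np.asarray(blackKings)
    mask = (kings[:, None] != data[:, 1]).all(0)
    blackKings = np.append(kings, data[:, 1][mask])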

This may not be perfect because it is difficult to parse your logic considering you are missing some pieces. But hopefully you can adopt some of what I've included here and improve your code.
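If you want to convince yourself that the xor expression matches the nested if statements, here is a small check. The random 6x41 array of 0/1/2 values is an assumption standing in for your CSV; only columns 39 and 40 matter.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the CSV data: 6 rows, 41 columns of 0/1/2
data = rng.integers(0, 3, size=(6, 41))

# Loop version, following the question's nested if/else row by row
loop_values = []
for row in data:
    if row[39] == 1:
        loop_values.append(1 if row[40] == 1 else -1)
    else:
        loop_values.append(-1 if row[40] == 1 else 1)

# Vectorized version: xor is True exactly where the loop appends -1,
# and True * -2 + 1 == -1 while False * -2 + 1 == 1
vector_values = np.logical_xor(data[:, 39] == 1, data[:, 40] == 1) * -2 + 1

print(loop_values)
print(vector_values.tolist())
```

Both prints should show the same sequence of 1 and -1.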

4 Comments

Stay away from np.append. It is evil.
@hpaulj no, seriously?! difficult to deal with? or broken?
np.append works, but people misuse it in many ways. You, for example, appear to think it works in-place like the list append. It doesn't. It's just a cover function for np.concatenate.
@hpaulj dangit! I never use it like that... that was a typo/bug. But I get your point.
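To illustrate the point about np.append made in the comments above, a short sketch with made-up values: the call returns a new array and leaves the original untouched, so for values gathered in a loop it is usually simpler to append to a list and convert once at the end.

```python
import numpy as np

values = np.array([1, 2, 3])
result = np.append(values, 4)   # returns a new array; values is unchanged

print(values.tolist())
print(result.tolist())

# Same result as calling np.concatenate directly
same = np.array_equal(result, np.concatenate([values, [4]]))

# For values gathered in a loop, collect in a list and convert once
collected = []
for x in (1, -1, 1):
    collected.append(x)         # list.append does work in place
as_array = np.array(collected)
```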
