I have a matrix of probabilities and I'm trying to index it to get one probability from each row so I can take its log.

p_matrix = 
[[0.5        0.5      ]
 [0.45384845 0.5461515 ]
 [0.45384845 0.5461515 ]
 [0.45384845 0.5461515 ]
 [0.48519668 0.51480335]
 [0.48257706 0.517423  ]
 [0.48257706 0.517423  ]
 [0.48257706 0.517423  ]
 [0.4807878  0.5192122 ]
 [0.45384845 0.5461515 ]
 [0.48257703 0.517423  ]]

The indices are stored in a placeholder: a = tf.placeholder(shape=None, dtype=tf.int32)

Normally I would simply do p_matrix[np.arange(a.shape[0], dtype=np.int32), a]

in order to grab the corresponding results but this gives me an error

IndexError: arrays used as indices must be of integer (or boolean) type

Using a standard numpy array in place of a gives me the desired result. I thought it might be something specific about using dtype=tf.int32 but I get the same result if I change the dtype of the placeholder to np.int32.

Also when I get the type of a it returns <class 'numpy.ndarray'> and for a[0] it returns <class 'numpy.int32'>.

Any ideas?

To summarize:

x = np.arange(a.shape[0])
y = np.array(list(a))

print(action_prob[x,y])  # This works.
print(action_prob[x,a])  # This does not work.

type(a) = <class 'numpy.ndarray'>
type(y) = <class 'numpy.ndarray'>

I can only assume it's because one is a tf.placeholder and as a result I can't specify this in the graph initialization?

EDIT:

Sample code:

class Model():
    def __init__(self, sess, s_size, game, lr=0.001):
        self.sess = sess  # keep the session so learn() can use it
        f_size = 12
        self.input = tf.placeholder(shape=[None, f_size], dtype=tf.float32)
        self.action = tf.placeholder(shape=None, dtype=tf.int32)

        self.p_matrix = tf.contrib.layers.fully_connected(self.input,
            20, activation_fn=tf.nn.softmax, biases_initializer=None)

        # Here I need to select the correct p_values
        self.log_prob = tf.log(self.action_prob[p_selected])

        self.train = tf.train.AdamOptimizer(lr).minimize(loss=-self.log_prob)

    def learn(self, s, a, td):
        # a = a.reshape(a.shape[0], 1)  # necessary for the episodes
        feed_dict = {self.input: s, self.action: a}
        p_matrix = self.sess.run(self.p_matrix, feed_dict)

        log_prob, p_matrix = self.sess.run([self.log_prob, self.p_matrix], feed_dict)

        _ = self.sess.run(self.train, feed_dict)

1 Answer

You can do that with tf.gather_nd:

idx = tf.stack([tf.range(tf.shape(a)[0], dtype=a.dtype), a], axis=1)
p_selected = tf.gather_nd(p_matrix, idx)

Each row in idx contains the "coordinates" of each element to retrieve, like [[0, a[0]], [1, a[1]], ...].
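
To make the coordinate layout concrete, here is a small NumPy sketch of what idx looks like and which elements it picks out. It is only an illustration: the values of p_matrix and a are made up, and the lookup uses NumPy fancy indexing as a stand-in for tf.gather_nd rather than TensorFlow itself.

```python
import numpy as np

# Made-up probabilities (3 rows, 2 actions) and one chosen action per row.
p_matrix = np.array([[0.5, 0.5],
                     [0.45, 0.55],
                     [0.48, 0.52]])
a = np.array([1, 0, 1], dtype=np.int32)

# Pair each row number with its chosen column: [[0, a[0]], [1, a[1]], ...].
idx = np.stack([np.arange(a.shape[0], dtype=a.dtype), a], axis=1)

# Look up one element per coordinate pair, which is what tf.gather_nd
# does for a 2-D params tensor and indices of shape [N, 2].
p_selected = p_matrix[idx[:, 0], idx[:, 1]]

print(idx.tolist())         # [[0, 1], [1, 0], [2, 1]]
print(p_selected.tolist())  # [0.5, 0.45, 0.52]
```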

Alternatively, the batch_dims argument lets you omit the leading row indices from idx:

idx = tf.expand_dims(a, axis=1)
p_selected = tf.gather_nd(params=p_matrix, indices=idx, batch_dims=1)
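
With batch_dims=1, each row of p_matrix is indexed only by its own column index, so no explicit row numbers are needed. A rough NumPy analogy is np.take_along_axis; the sketch below uses made-up values and is meant as an analogy for the behaviour, not as TensorFlow code.

```python
import numpy as np

# Made-up probabilities (3 rows, 2 actions) and one chosen action per row.
p_matrix = np.array([[0.5, 0.5],
                     [0.45, 0.55],
                     [0.48, 0.52]])
a = np.array([1, 0, 1], dtype=np.int32)

# One column index per row, shaped (3, 1) like tf.expand_dims(a, axis=1).
idx = np.expand_dims(a, axis=1)

# Row i is indexed only by idx[i]; the trailing [:, 0] drops the
# length-1 axis so the result has one value per row.
p_selected = np.take_along_axis(p_matrix, idx, axis=1)[:, 0]

print(p_selected.tolist())  # [0.5, 0.45, 0.52]
```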

Comments

I think I see. But this doesn't seem like something I can put in my init. It seems like I have to do this outside, but how then do I feed these values into my optimizer?
@tryingtolearn Not sure I follow... What do you mean by "my init"? And how is your optimizer related?
Well, I initialize the model beforehand and calculate a p_matrix via a few placeholders. Once I have the matrix I want to just grab the key entries and backpropagate. From what I gather, you're suggesting that I make a new element called p_selected and then insert that in via feed_dict? Doesn't this affect how the network updates?
@tryingtolearn Hmm, sorry, I'm still not completely clear on how you want to use these values. Maybe you could post some more code showing where you need these numbers and where you would like to put them?
@tryingtolearn Mmm, as far as I can see now (I'm on my phone so I may be misreading), it looks to me like it should work with tf.gather_nd as it is (without changes to feed_dict)... Do you get any error if you give it a try?