I am testing log-linear learning on the class of potential games and I want to run it with more than two players, but I am having trouble optimising the code.
Currently I exploit the fact that log-linear learning induces a Markov chain over the joint action profiles, so the dynamics can be described by a transition matrix. However, as the number of players increases, the dimensions of the transition matrix grow rapidly, making it impossible to store the matrix as a dense numpy array. I have tried scipy.sparse matrices instead, but the computation is very slow; the matrix is used by repeatedly multiplying the distribution (a row vector) by the transition matrix.
I would like to make it feasible to test with multiple players and to reduce the computation time of the code.
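To make the scale of the problem concrete, here is a rough back-of-the-envelope sketch (4 actions and 10 players are arbitrary example numbers): the number of action profiles grows as no_actions**no_players, so the dense transition matrix quickly stops fitting in memory, while each of its rows has at most no_players * no_actions non-zero entries, since only one player revises its action per step.
# Back-of-the-envelope storage estimate; the sizes are illustrative only.
no_actions, no_players = 4, 10
no_action_profiles = no_actions ** no_players               # 4**10 = 1_048_576 profiles
dense_bytes = no_action_profiles ** 2 * 8                   # dense float64 transition matrix
nnz_per_row = no_players * no_actions                       # one revising player, all of its actions
sparse_bytes = no_action_profiles * nnz_per_row * 8         # rough upper bound on CSR data size
print(f"dense P : ~{dense_bytes / 1e12:.1f} TB")
print(f"sparse P: ~{sparse_bytes / 1e9:.2f} GB of stored values (upper bound)")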
Here is the formulation of the transition matrix with numpy.
from itertools import product
import numpy as np

# enumerate all joint action profiles (note: enumerate() is an iterator, so it is consumed by the loop below)
self.action_profiles = enumerate(np.array(list(product(np.arange(self.no_actions), repeat=self.no_players))))
self.potential = np.zeros((self.no_action_profiles, 1))
P = np.zeros((self.no_action_profiles, self.no_action_profiles))
for idx, profile in self.action_profiles:
    self.potential[idx] = self.potential_function(profile)
    for player_id in range(self.no_players):
        mask = np.arange(len(profile)) != player_id
        opponents_actions = profile[mask]  # extract the opponents' actions from the action profile
        utilities = np.array([self.utility_functions[player_id](a, opponents_actions) for a in range(self.no_actions)])
        exp_values = np.exp(beta * utilities)
        p = exp_values / np.sum(exp_values)  # log-linear (softmax) response of the revising player
        stride = self.no_actions ** (self.no_players - 1 - player_id)
        i = idx - profile[player_id] * stride  # flat index of the profile with player_id's action set to 0
        # columns of the profiles that differ only in player_id's action, spaced by `stride`
        P[idx, i : i + self.no_actions * stride : stride] += p / self.no_players
self.P = P
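The strided slice in the last assignment relies on the action profiles being enumerated in lexicographic order, so changing only player_id's action moves the flat index in steps of no_actions**(no_players - 1 - player_id). A small standalone check of that index arithmetic (the sizes, 2 players and 3 actions, are made up for illustration):
from itertools import product

# 2 players, 3 actions: the flat index of profile (a0, a1) is a0*3 + a1.
no_players, no_actions = 2, 3
profiles = list(product(range(no_actions), repeat=no_players))
idx = profiles.index((2, 1))                              # current profile (2, 1) -> flat index 7
player_id = 0
stride = no_actions ** (no_players - 1 - player_id)       # 3
i = idx - profiles[idx][player_id] * stride               # flat index of (0, 1) -> 1
cols = list(range(i, i + no_actions * stride, stride))    # [1, 4, 7]
print([profiles[c] for c in cols])                        # [(0, 1), (1, 1), (2, 1)]: only player 0's action varies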
And here is the same construction with scipy.sparse.
from itertools import product
import numpy as np
from scipy.sparse import coo_matrix, lil_matrix

self.action_profiles = enumerate(np.array(list(product(np.arange(self.no_actions), repeat=self.no_players))))
self.potential = lil_matrix((self.no_action_profiles, 1))
P_row, P_col, P_data = [], [], []
for idx, profile in self.action_profiles:
    self.potential[idx] = self.potential_function(profile)
    for player_id in range(self.no_players):
        mask = np.arange(len(profile)) != player_id
        opponents_actions = profile[mask]  # extract the opponents' actions from the action profile
        utilities = np.array([self.utility_functions[player_id](a, opponents_actions) for a in range(self.no_actions)])
        exp_values = np.exp(beta * utilities)
        p = exp_values / np.sum(exp_values)
        stride = self.no_actions ** (self.no_players - 1 - player_id)
        i = idx - profile[player_id] * stride
        # same strided entries as in the dense version, collected as COO triplets
        for j, prob in enumerate(p):
            P_row.append(idx)
            P_col.append(i + j * stride)
            P_data.append(prob / self.no_players)
P = coo_matrix((P_data, (P_row, P_col)), shape=(self.no_action_profiles, self.no_action_profiles))
self.P = P.tocsr()  # duplicate (row, col) pairs are summed during conversion
return self.P
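Since P is meant to be row-stochastic, a quick sanity check on a small instance is that every row sums to 1. A minimal sketch of such a check, assuming self.P has been built as above:
import numpy as np

# sanity check: each row of the transition matrix should sum to 1
row_sums = np.asarray(self.P.sum(axis=1)).ravel()
assert np.allclose(row_sums, 1.0), "transition matrix rows do not sum to 1"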
And this is how the transition matrix is used.
P = self.gameSetup.formulate_transition_matrix(beta)
mu0 = self.mu_matrix.copy()  # initial distribution over the action profiles (row vector)
self.expected_value = np.zeros((int(self.max_iter), 1))
# note: np.linalg.matrix_power only accepts dense arrays, so for the sparse P this line
# needs to be replaced (e.g. by repeated multiplication)
P = np.linalg.matrix_power(P, scale_factor)
for i in range(self.max_iter):
    mu = mu0 @ P  # advance the distribution by scale_factor steps of the chain
    mu0 = mu
    self.expected_value[i] = mu @ self.gameSetup.potential  # expected potential at this iteration
self.stationary = mu
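For comparison, a minimal sketch of the same loop written against the sparse P (assuming mu0 is a dense 1 x N row vector): instead of forming the matrix power explicitly, the scale_factor intermediate steps are folded into the loop as sparse matrix-vector products, so P stays in CSR form throughout.
mu0 = self.mu_matrix.copy()
self.expected_value = np.zeros((int(self.max_iter), 1))
for i in range(self.max_iter):
    for _ in range(scale_factor):
        mu0 = mu0 @ P  # dense row vector times sparse CSR matrix
    self.expected_value[i] = mu0 @ self.gameSetup.potential
self.stationary = mu0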