I am having a problem with a genetic feature optimization algorithm that I am building. The idea is that a specific combination of features is tested, and if the model accuracy using those features is higher than the previous maximum, that combination replaces the previous best combination. By running through the remaining candidate features in this way, the final combination should be the optimal combination of features for the given feature vector size. Currently, the code that attempts this looks like:

def mutate_features(features, feature):
    new_features = features
    index = random.randint(0,len(features)-1)
    new_features[index] = feature
    return new_features

def run_series(n, f_list, df):
    features_list = []
    results_list = []
    max_results_list = [[0,0,0,0,0]]
    max_feature_list = []
    features = [0,0,0,0,1]
    for i in range(0,5):  # 5 has just been chosen as the range for testing purposes
        results = run_algorithm(df, f_list, features)
        features_list.append(features)
        results_list.append(results)
        if (check_result_vector(max_results_list, results)):
            max_results_list.append(results)
            max_feature_list.append(features)
        else:
            print("Revert to previous :" +str(max_feature_list[-1]))
            features = max_feature_list[-1]
        features = mutate_features(features, f_list[i])
        print("Feature List = " +str(features_list))
        print("Results List = " +str(results_list))
        print("Max Results List = " +str(max_results_list))
        print("Max Feature List = "+str(max_feature_list))

The output from this code is included below:

[Image: Output]

The part I do not understand is the output of max_feature_list and features_list.

Whenever anything is added with .append() to max_feature_list or features_list inside the for loop, every item already in the list changes to match the latest addition. I may not fully understand the syntax or logic here, and would really appreciate any feedback on why the program does this.

3
  • You sure you don't modify max_feature_list inside your (not shown) mutate_features function? Commented Mar 28, 2020 at 13:47
  • I don't believe it is modified, I will edit the original post to show the mutate features function. Commented Mar 28, 2020 at 13:51
  • Easiest way to debug this is to include a temporary print(max_feature_list) statement just after max_feature_list.append(features). Commented Mar 28, 2020 at 13:52

1 Answer

3

This happens because mutate_features changes features in place: the line new_features = features does not copy the list, it only binds a second name to the same list object. Since append stores a reference to that object in max_feature_list rather than a copy, every entry that refers to it changes when the underlying list changes.
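A minimal sketch of the aliasing on its own, using made-up values rather than the asker's data (history is a hypothetical stand-in for max_feature_list). Plain assignment gives the same list a second name, so an in-place change shows up through every reference, including one that was already appended to another list:

```python
features = [0, 0, 0, 0, 1]
history = []                  # hypothetical stand-in for max_feature_list
history.append(features)      # stores a reference to the list, not a copy

alias = features              # second name for the same list object
alias[0] = 99                 # in-place change made through the alias

same_object = alias is features
print(same_object)            # True
print(history)                # [[99, 0, 0, 0, 1]]
```

The entry in history changed even though history was never touched after the append, which matches the behaviour described in the question.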

One way to prevent this behaviour is to deepcopy features inside mutate_features, mutate the copy as you want, and then return it.

For example:

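import random
from copy import deepcopy

def mutate_features(features, feature):
    # Copy first so the caller's list (and any list holding it) is untouched.
    new_features = deepcopy(features)
    index = random.randint(0, len(features) - 1)
    new_features[index] = feature
    return new_features


features = [1, 2, 3]
feature = 9  # example replacement value
res = []
res.append(features)
features = mutate_features(features, feature)
res.append(features)
print(res)  # the first entry is still [1, 2, 3]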
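Since features here is a flat list of numbers, a shallow copy should be enough; deepcopy only matters when the list holds nested mutable objects. A sketch of that variant, where mutate_features_shallow is a hypothetical name and the seed is an assumption added only to make the run repeatable:

```python
import random

def mutate_features_shallow(features, feature):
    # list.copy() builds a new outer list; the elements are shared,
    # which is safe here because they are immutable ints.
    new_features = features.copy()
    index = random.randrange(len(features))
    new_features[index] = feature
    return new_features

random.seed(0)  # assumption: seeded only so the demo is repeatable
original = [1, 2, 3]
mutated = mutate_features_shallow(original, 9)
print(original)  # [1, 2, 3]
print(mutated)   # same as original, with one position replaced by 9
```

If the feature vector ever holds nested lists or other mutable objects, the deepcopy version is the safer choice.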

4 Comments

features is not changed inside mutate_features; this was my first thought, too, but after OP posting the code this is evidently not the case.
new_features and features are two references to the same list (assigning features to new_features binds a reference, it does not copy the value), so changing new_features is the same as changing features.
This simple change fixed the issue! I was not aware that append() to max_feature_list is by reference, thank you for your help.
@AdamWhitrow happy to help :)
