I have a matrix of floats with shape (3000, 9). Each row is one "simulation", and the columns of that row hold the simulation's contents.

For each simulation, I want the first 8 columns normalized by their sum. That is, each of the first 8 entries in a row should become its previous value divided by the sum of the first 8 columns of that same row.

It is a trivial task, but I go from a nice, correct (non-normalized) graph to something totally unphysical when plotting with plt.scatter.

The last column of each row is the x-axis value against which the first 8 columns (the y values) are plotted, so one row gives 8 data points for one fixed value of x.

The non-normalized graph: https://ibb.co/Msr8RVB

The normalized graph: https://ibb.co/tJp7bZn

The dataset (non-normalized): https://easyupload.io/oat9kq

My code:

import numpy as np
from matplotlib import pyplot as plt


non_norm = np.loadtxt("integration_results_3000samples_10_20_10_25_Wcm2_BenSimulationFromSlack.txt")

plt.figure()
for i in range(non_norm.shape[1]-1):
    plt.scatter(non_norm[:, -1], non_norm[:, i], label="c_{}".format(i+47))
plt.xscale("log")
plt.savefig("non-norm_Ben3000samples.pdf", bbox_inches='tight')

norm = np.empty( (non_norm.shape[0], non_norm.shape[1]) )
norm[:, -1] = non_norm[:, -1]

for i in range(norm.shape[1]-1):
    for j in range(norm.shape[0]):
        norm[j, i] = np.true_divide(non_norm[j, i] , np.sum(non_norm[j, :-1]))

plt.figure()
for i in range(norm.shape[1]-1):
    plt.scatter(norm[:, -1], norm[:, i], label="c_{}".format(i+47))
plt.xscale("log")
plt.savefig("norm_Ben3000samples.pdf", bbox_inches='tight')

Do you see what went wrong? Thank you

  • Can you post a sample of non_norm? Commented Jul 7, 2021 at 15:45
  • @not_speshal, what do you mean by a sample of non_norm? Thanks. non_norm is extracted from the .txt I uploaded at: file.io/deleted. Edit: the file has been deleted for unknown reasons. Commented Jul 7, 2021 at 15:51
  • What about this link: easyupload.io/oat9kq. Does it work? Commented Jul 7, 2021 at 15:52
  • can you check the output of print(non_norm[:10]) before plotting? When I run your code, I get a whole lot of np.nan values. Commented Jul 7, 2021 at 16:12
  • You realise when you're normalising a row that has just one value and 7 zeroes, the value becomes 1 and the rest of the row is 0? This is likely why your plot is messing up. Plot each column one by one (normalized and non-normalized) and you'll see what I mean. Commented Jul 7, 2021 at 16:38

1 Answer

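When you normalise a row that has just one nonzero value and 7 zeroes, that value becomes 1 and the rest of the row stays 0, no matter how small the value was. This is likely why your plot is messing up. A row whose first 8 columns are all zero divides 0 by 0 and gives np.nan, which would explain the NaN values reported in the comments.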

For example, the plot for the first column looks like this before and after normalization:

[Image: first column before normalization]

[Image: first column after normalization]
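
A minimal sketch of the same effect on a tiny made-up array, in case it helps to see the numbers. The `demo` values and the helper name `normalize_rows` are hypothetical stand-ins, not taken from your file. It assumes the same layout as `non_norm`: value columns first, x values in the last column.

```python
import numpy as np

# Hypothetical stand-in for non_norm: 8 value columns plus the x column.
demo = np.array([
    [2.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1e10],   # several nonzero entries
    [0.0, 0.0, 3e-7, 0.0, 0.0, 0.0, 0.0, 0.0, 1e15],  # a single tiny nonzero entry
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1e20],   # all zeros
])


def normalize_rows(data):
    """Divide every column except the last by the row sum of those columns.

    Rows whose sum is zero are filled with NaN explicitly instead of
    triggering a 0/0 runtime warning.
    """
    values = data[:, :-1]
    row_sums = values.sum(axis=1, keepdims=True)  # shape (n_rows, 1)
    out = data.astype(float).copy()               # keeps the x column as-is
    out[:, :-1] = np.divide(
        values,
        row_sums,
        out=np.full_like(values, np.nan, dtype=float),
        where=row_sums != 0,                      # skip rows that sum to zero
    )
    return out


normalized = normalize_rows(demo)
print(normalized[:, :-1])
```

The second row shows the problem: a value of 3e-7 that was negligible before normalization is pushed to exactly 1, so it lands at the top of the normalized plot. The third row comes out as NaN. Whether such rows should be dropped, masked, or kept is a decision about your data that this sketch does not make for you.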
