I was looking at the docs (the same page that @OliverRadini referred to), and it states the following:
header : int, list of int, default ‘infer’
Row number(s) to use as the
column names, and the start of the data. Default behavior is to infer
the column names: if no names are passed the behavior is identical to
header=0 and column names are inferred from the first line of the
file, if column names are passed explicitly then the behavior is
identical to header=None. Explicitly pass header=0 to be able to
replace existing names. The header can be a list of integers that
specify row locations for a multi-index on the columns e.g. [0,1,3].
Intervening rows that are not specified will be skipped (e.g. 2 in
this example is skipped). Note that this parameter ignores commented
lines and empty lines if skip_blank_lines=True, so header=0 denotes
the first line of data rather than the first line of the file
You're defining the names in code, so you shouldn't also include the header in the file. Either do one (write the headers in the CSV data) or the other (write the column names in code). Don't do both.
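To make the either/or concrete, here is a minimal sketch. It uses two tiny in-memory CSV strings that I made up (they are not your file), one with a header row and one without, and the variable names are just for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, not the file from the question.
with_header = "a,b,c\n1,2,3\n4,5,6\n"
without_header = "1,2,3\n4,5,6\n"

# Option 1: the header lives in the file, so pass no names.
df_file_names = pd.read_csv(StringIO(with_header))

# Option 2: the file has no header row, so supply the names in code.
df_code_names = pd.read_csv(StringIO(without_header), names=["a", "b", "c"])

# Doing both: the header row is read as an ordinary data row.
df_both = pd.read_csv(StringIO(with_header), names=["a", "b", "c"])

print(df_file_names.shape)       # 2 data rows
print(df_code_names.shape)       # 2 data rows
print(df_both.shape)             # 3 rows, the header came along as data
print(df_both.iloc[0].tolist())  # the header text, now sitting in row 0
```

The first two reads give the same frame. Only the third one, which mixes both approaches, ends up with the header text inside the data.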
EDIT: My answer remains the same, but here's one way you could have discovered this yourself:
With the following csv data (what you showed in the picture):
BULAN,rt,nigak,niagab,sosum,soskhus,p,tni,ik,ib,TARGET
13-Jan,84876,902,1192,2098,3623,169,39,133,1063,94095
13-Feb,79194,902,1050,2109,3606,153,39,133,806,87992
13-Mar,75836,902,1060,1905,3166,161,39,133,785,83987
13-Apr,75571,902,112,1878,3190,158,39,133,635,82618
13-May,83797,1156,134,1900,3518,218,39,133,709,91604
13-Jun,91648,1291,127,2220,3596,249,39,133,659,99967
13-Jul,79063,1346,107,1844,3428,247,39,133,951,86798
Running this code...
from pandas import read_csv
from numpy import set_printoptions
namaFile = 'dataset.csv'
nama = ['rt', 'niagak', 'niagab', 'sosum', 'soskhus', 'p', 'tni', 'ik', 'ib', 'TARGET']
dataFrame = read_csv(namaFile, names=nama)
array = dataFrame.values
print("with names=nama...")
print(array)
dataFrame = read_csv(namaFile)
array = dataFrame.values
print("with no names...")
print(array)
dataFrame = read_csv(namaFile, names=nama, header=0)
array = dataFrame.values
print("with names=nama and header=0...")
print(array)
You get this output:
with names=nama...
[['rt' 'nigak' 'niagab' 'sosum' 'soskhus' 'p' 'tni' 'ik' 'ib' 'TARGET']
['84876' '902' '1192' '2098' '3623' '169' '39' '133' '1063' '94095']
['79194' '902' '1050' '2109' '3606' '153' '39' '133' '806' '87992']
['75836' '902' '1060' '1905' '3166' '161' '39' '133' '785' '83987']
['75571' '902' '112' '1878' '3190' '158' '39' '133' '635' '82618']
['83797' '1156' '134' '1900' '3518' '218' '39' '133' '709' '91604']
['91648' '1291' '127' '2220' '3596' '249' '39' '133' '659' '99967']
['79063' '1346' '107' '1844' '3428' '247' '39' '133' '951' '86798']]
with no names...
[['13-Jan' 84876 902 1192 2098 3623 169 39 133 1063 94095]
['13-Feb' 79194 902 1050 2109 3606 153 39 133 806 87992]
['13-Mar' 75836 902 1060 1905 3166 161 39 133 785 83987]
['13-Apr' 75571 902 112 1878 3190 158 39 133 635 82618]
['13-May' 83797 1156 134 1900 3518 218 39 133 709 91604]
['13-Jun' 91648 1291 127 2220 3596 249 39 133 659 99967]
['13-Jul' 79063 1346 107 1844 3428 247 39 133 951 86798]]
with names=nama and header=0...
[[84876 902 1192 2098 3623 169 39 133 1063 94095]
[79194 902 1050 2109 3606 153 39 133 806 87992]
[75836 902 1060 1905 3166 161 39 133 785 83987]
[75571 902 112 1878 3190 158 39 133 635 82618]
[83797 1156 134 1900 3518 218 39 133 709 91604]
[91648 1291 127 2220 3596 249 39 133 659 99967]
[79063 1346 107 1844 3428 247 39 133 951 86798]]
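We can see clearly here that when you pass names=nama and the file also has a header row, the header row comes back as the first row of data (as strings), which is not what we want. When you remove names=nama, the header row is used for the column names and you get all of the data from the file. When you explicitly overwrite the names with names=nama, header=0, you also get the desired result. HOWEVER, I would also like to note that the names in your code are missing the BULAN column: there are 10 names for 11 columns in the file, so pandas treats the extra first column as the index, which is why the dates are missing from the first and third outputs. Be careful with that.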
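If you want to keep the names in code and still have the BULAN column available, one option is to give one name per column. This is only a sketch: it inlines the first rows of the CSV above so it runs on its own, and nama_lengkap is a name I made up for the extended list.

```python
from io import StringIO

import pandas as pd

# First rows of the CSV shown above, inlined so the sketch is self-contained.
csv_text = (
    "BULAN,rt,nigak,niagab,sosum,soskhus,p,tni,ik,ib,TARGET\n"
    "13-Jan,84876,902,1192,2098,3623,169,39,133,1063,94095\n"
    "13-Feb,79194,902,1050,2109,3606,153,39,133,806,87992\n"
)

# Hypothetical list: the names from the question with 'BULAN' added in front,
# so there is exactly one name per column in the file.
nama_lengkap = ['BULAN', 'rt', 'niagak', 'niagab', 'sosum', 'soskhus',
                'p', 'tni', 'ik', 'ib', 'TARGET']

# header=0 tells pandas to discard the file's own header row and use our names.
dataFrame = pd.read_csv(StringIO(csv_text), names=nama_lengkap, header=0)

# BULAN is now a regular column; drop it to get a purely numeric array.
array = dataFrame.drop(columns='BULAN').values

print(dataFrame.columns.tolist())
print(array.shape)
```

With one name per column, nothing is silently moved into the index, and you decide explicitly whether the dates go into the array or not.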
print() is your friend. Use it. It will tell you what your problems are.
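Beyond printing the raw array, a few other things are worth printing when a read looks wrong. The sketch below uses made-up two-column data (not your file) and deliberately repeats the mistake of passing names for a file that already has a header row.

```python
from io import StringIO

import pandas as pd

# Hypothetical data with a header row.
csv_text = "x,y\n1,2\n3,4\n"

# Deliberately wrong: names are passed although the file has a header.
df = pd.read_csv(StringIO(csv_text), names=["x", "y"])

print(df.head())   # the first row shows the header text instead of numbers
print(df.dtypes)   # columns that should be numeric are not
print(df.shape)    # one row more than the number of data lines
```

A non-numeric dtype on a column you expected to hold numbers is usually the quickest hint that a header row (or some other stray text) ended up in the data.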
read_csv is attempting to parse the data from the first row, which is used for headings in the data you have. The documentation for this function gives details on how to specify header row(s): pandas.pydata.org/pandas-docs/stable/reference/api/… header: ... if no names are passed the behavior is identical to header=0 and column names are inferred from the first line of the file ... You're defining the names in code, so you shouldn't include the header in the file. Either do one (write headers in the CSV data) or the other (write column names in code). Don't do both.