I am trying to import a .dat file produced by my experiments. It starts with header lines containing metadata, followed by the experimental data itself (everything after the line of dashes). My idea is to split it so that I end up with one variable holding the metadata as a list of strings and another holding the results (the part below the dashes) as a data frame. My problem is that when I read the file, the metadata at the top is read as a list of strings, so the whole file, data included, stays in that format. Is there a way to get the data as a data frame rather than as a list of strings?

Learned-Helplesness-Experiment  (TriplePlatform)  from      05.04.2017         13:41:24

software version:   DoublePlatform_1.3 04-Jun-2014

Setup of Experiment:    

Platform 1: 
ExpType:    M   M   M   M   M   M   M   M   M   M   

heated side:    right   right   right   right   right   right   right       right   right   right   

PIs:     n. def.     0   0   0   0   0   0   0   0   0  

Platform 2: 
ExpType:    Te  Te  Te  Y   Te  Y   Y   Y   Y   Y   

heated side:    right   right   right   ->M right   ->M ->M ->M ->M ->M 

PIs:     n. def.     0   0   0   0   0   0   0   0   0  

Platform 3: 
ExpType:    Y   Y   Y   Y   M_S Y   Y   Y   Y   Y   

heated side:    ->M ->M ->M ->M right   ->M ->M ->M ->M ->M 

PIs:     n. def.     0   0   0   0   0   0   0   0   0  


------------------------------------    ------------------------------------

 0   0   0   0   0
 1   47 -0.3759766   0.1123047   0.3710938
 2   97  0.01953125 -0.1318359   0.1123047
 3   157    -0.4150391   0.2246094   0.3369141
 4   207    -0.01953125 -0.2539063   0.1318359
 5   257    -0.3515625   0.3027344   0.3222656

1 Answer

I assume you are using pandas? I don't think there is a general, built-in way of doing this. You could open and parse the file manually up to the line of dashes and keep that part as a list of strings. Then you tell pandas to import the rest, skipping the first x lines (where x is the line number at which you found the dashes). The read_csv option for this is called skiprows.
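As a rough illustration of that idea, here is a minimal sketch for the case where the number of header lines is known in advance. The sample text and the names sample, n_skip, metadata and data are made up for this example; header=None and sep=r"\s+" are assumptions about the layout of the data part (no column names, fields separated by blanks).

```python
import io

import pandas as pd

# Hypothetical miniature file: 3 metadata lines, a dash separator, then data.
sample = (
    "software version:   DoublePlatform_1.3\n"
    "Platform 1:\n"
    "ExpType:    M   M   M\n"
    "----------    ----------\n"
    "\n"
    " 0   0   0\n"
    " 1   47 -0.3759766\n"
    " 2   97  0.01953125\n"
)

n_skip = 4  # assumed known: 3 metadata lines + 1 separator line

# Keep the metadata (everything before the separator) as a list of strings.
metadata = sample.splitlines()[:n_skip - 1]

# Let pandas parse everything after the separator.
# header=None: the data rows carry no column names.
# sep=r"\s+": split fields on runs of whitespace.
data = pd.read_csv(io.StringIO(sample), sep=r"\s+", skiprows=n_skip, header=None)

print(metadata)
print(data)
```

With a real file you would pass the file name to read_csv instead of the io.StringIO object.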

Edit1 (in response to the comment):

Whether a fixed offset works depends on whether your header always has the same number of rows. If not, you can read through the file line by line, looking for the dashes:

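with open('filename', 'r') as file:
    header = []    # metadata lines found above the dashes
    line_no = 0    # number of lines read so far
    # iterate over the file object itself to get whole lines
    # (iterating over file.read() would yield single characters)
    for line in file:
        line_no += 1
        # a short run of dashes is enough to recognise the separator
        if line.startswith('-' * 10):
            # line_no is now the number of rows pandas has to skip
            break
        header.append(line.rstrip('\n'))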

Edit2

To import the data part, you could use

pandas.read_csv(..., sep='\t', skiprows=line_no)

in case tab is the field delimiter, or

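pandas.read_csv(..., sep=r'\s+', skiprows=line_no)

if the fields are delimited by one or more blanks (delim_whitespace=True does the same, but is deprecated as of pandas 2.2). Since the data part has no column names, you will probably also want to pass header=None; otherwise the first data row is used as the column header.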
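Putting Edit1 and Edit2 together, one possible end-to-end sketch could look like the following. The function name read_experiment and its min_dashes parameter are hypothetical, the sample is a shortened excerpt of the file from the question, and the whole file is read into memory, which I am assuming is acceptable for files of this size.

```python
import io

import pandas as pd


def read_experiment(handle, min_dashes=10):
    """Split an open text file into (metadata lines, data frame).

    Hypothetical helper: everything above the first line starting with
    `min_dashes` dashes is returned as a list of strings, everything
    below it is parsed by pandas as whitespace-delimited data.
    """
    text = handle.read()
    header = []
    line_no = 0
    for line in text.splitlines():
        line_no += 1
        if line.startswith("-" * min_dashes):
            break  # line_no now counts the header lines plus the separator
        header.append(line)
    else:
        # The loop finished without hitting a separator line.
        raise ValueError("no dash separator found")

    # Blank lines after the separator are skipped by pandas by default.
    data = pd.read_csv(
        io.StringIO(text), sep=r"\s+", skiprows=line_no, header=None
    )
    return header, data


sample = """Learned-Helplesness-Experiment  (TriplePlatform)
software version:   DoublePlatform_1.3 04-Jun-2014
Platform 1:
------------------------------------    ------------------------------------

 0   0   0   0   0
 1   47 -0.3759766   0.1123047   0.3710938
 2   97  0.01953125 -0.1318359   0.1123047
"""

header, data = read_experiment(io.StringIO(sample))
print(header)
print(data)
```

For a file on disk, the call would be wrapped in `with open(filename) as f: header, data = read_experiment(f)`.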

3 Comments

If I use something like f = open(filename, 'r'); fly_data = f.readlines()[36:]; f.close(), the result is still a list of strings. I have tried several numpy and pandas functions, but I have not found one that works so far. I am just starting with Python, which is why I was hoping someone would know a function that works for this.
What is the delimiter of the data part? Is it tab or blank?
It is delimited by blanks, so the last one worked nicely. Thanks, man! I have not tried the code in Edit1 yet; I guess I have to write something like file.readlines()[lines_no] after the if condition, and then it should work. I still need to get familiar with the Python nomenclature :)