Is there a simple way with gnuplot to import data with fixed widths of the columns?

The problem is that columns can be empty, so importing with set datafile separator whitespace does not work correctly: an empty field shifts all following values into the wrong columns. I'm aware that the data can be preprocessed with external tools such as awk or sed, but I'm wondering whether there is a simple, platform-independent, gnuplot-only solution.

The solution I've come up with is a bit lengthy, but it seems to work. If there is a simpler gnuplot way, please let me know.

Code:

### data with fixed column widths
reset session

$DataRaw <<EOD
# Data with fixed but also empty columns
#0000000011111111112222222222333333333344444
#2345678901234567890123456789012345678901234
#--6-||----11---||--7--||-----12---||--8---|
   1.1     1.2222   1.03   some Text   1.555
   2.1    -2.2222                     -2.555
  -3.1     3.2222  -3.03   more text   3.555
   4.1             -4.03              -4.555
                             no data   0.000
   6.1    -6.2222   6.03  no comment  -6.555
EOD

# define the widths of the columns
array FixCols[5] = [6,11,7,12,8]
set datafile separator "\n"
CommentChar = "#"
Separator = ','

# define strip() function workaround to remove spaces at beginning and end of a string
strip(s) = (STRP_a=1, STRP_b=1, \
           sum [STRP_i=1:strlen(s)] ((s[STRP_i:STRP_i] eq " ") ? \
               (STRP_a>0  ? STRP_a=STRP_a+1 : 0) : (STRP_a=-abs(STRP_a), STRP_b=STRP_i) \
           ), s[abs(STRP_a):STRP_b] )

set print $Data
    do for [i=1:|$DataRaw|] {
        if ($DataRaw[i][1:1] ne CommentChar) {
            Line = ''
            Start = 1
            do for [j=1:|FixCols|] {
                End = Start + FixCols[j]-1
                Line = Line.strip($DataRaw[i][Start:End]).(j<|FixCols| ? Separator : "")
                Start = Start + FixCols[j]
            }
            print Line
        }
        else { print $DataRaw[i]}  # print the unchanged commented line
    }
set print

print $Data
### end of code

Result:

# Data with fixed but also empty columns
#0000000011111111112222222222333333333344444
#2345678901234567890123456789012345678901234
#--6-||----11---||--7--||-----12---||--8---|
1.1,1.2222,1.03,some Text,1.555
2.1,-2.2222,,,-2.555
-3.1,3.2222,-3.03,more text,3.555
4.1,,-4.03,,-4.555
,,,no data,0.000
6.1,-6.2222,6.03,no comment,-6.555
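
As a possible shortcut for the strip() workaround above: newer gnuplot releases provide a built-in trim() function that removes leading and trailing whitespace. The following is only a sketch under that assumption (the minimum version is not verified here, and trim() is not available in older releases). The sample line is the first data row, written as five concatenated pieces so that the field widths are visible.

```gnuplot
# Assumption: this gnuplot version has the built-in trim() function
colWidths = "6 11 7 12 8"
s = "   1.1" . "     1.2222" . "   1.03" . "   some Text" . "   1.555"

out = ""
p   = 1
do for [k=1:words(colWidths)] {
    w   = int(word(colWidths, k))                 # width of field k
    out = out . trim(s[p:p+w-1]) . (k < words(colWidths) ? "," : "")
    p   = p + w                                   # start of the next field
}
print out    # 1.1,1.2222,1.03,some Text,1.555
```

If trim() is missing, the user-defined strip() from the code above serves the same purpose.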

2 Answers

gnuplot has always had the option to specify a format following the using spec, but the implementation predates the introduction of string variables. So you can read numbers from a line with fixed width fields, but I can't immediately see how to read the field content as a string in the same command. Input scanning uses the C language routine sscanf(). Numbers always require format spec %lf. To skip an N character field use format spec %*Nc.

$DataRaw <<EOD
# Data with fixed but also empty columns
#0000000011111111112222222222333333333344444
#2345678901234567890123456789012345678901234
#--6-||----11---||--7--||-----12---||--8---|
   1.1     1.2222   1.03   some Text   1.555
   2.1    -2.2222                     -2.555
  -3.1     3.2222  -3.03   more text   3.555
   4.1             -4.03              -4.555
                             no data   0.000
   6.1    -6.2222   6.03  no comment  -6.555
EOD

set table 
splot $DataRaw skip 4 using 1:2:3:(sprintf("%g",$4)) "%6lf%11lf%7lf%*12c%8lf" with labels

produces

# Surface 0 of 1 surfaces

# Curve title: "$DataRaw skip 4 using 1:2:3:(sprintf("%g",$4)) "%6lf%11lf%7lf%*12c%8lf""
 1.1  1.2222  1.03 "1.555"
-3.1  3.2222 -3.03 "3.555"
 6.1 -6.2222  6.03 "-6.555"
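
The format string used above follows directly from the column widths, so it can also be assembled from them instead of being typed by hand. A minimal sketch, where colWidths and textCol are names introduced only for this illustration (textCol marks the one non-numeric column that is skipped with %*Nc):

```gnuplot
colWidths = "6 11 7 12 8"
textCol   = 4          # hypothetical: index of the text column to skip

fmt = ""
do for [k=1:words(colWidths)] {
    w   = word(colWidths, k)
    # %*Nc skips N characters, %Nlf reads a number from an N-character field
    fmt = fmt . ((k == textCol) ? ("%*" . w . "c") : ("%" . w . "lf"))
}
print fmt    # %6lf%11lf%7lf%*12c%8lf
```

The printed string is the one that appears after the using spec in the splot command above. It does not change the limitation that rows with empty numeric fields are skipped.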
1 Comment

Thank you! I didn't know about this. Unfortunately, it doesn't help because it can't read text and it skips the whole row if there is an empty column instead of inserting, e.g., an empty string or NaN.
0

Here is a simpler solution (compared to the one in the question). It also requires gnuplot>=5.2.0 because it uses indexing of datablocks. It simply inserts separators (here: commas) at the appropriate positions and prints the result to a new datablock, which can then easily be plotted with set datafile separator ','.

Other attempts for earlier gnuplot versions that parse the original table using, e.g., set datafile separator "\t" and strcol(1) will fail because gnuplot always strips leading and trailing spaces; in the example below, line 9 would therefore never be read correctly.

I am aware that I could use external tools to preprocess the data, but if possible I prefer a gnuplot-only solution, which ensures platform independence.

Script:

### data with fixed column widths
reset session

$DataInput <<EOD
# Data with fixed but also empty columns
#0000000011111111112222222222333333333344444
#2345678901234567890123456789012345678901234
#--6-||----11---||--7--||-----12---||--8---|
  -1.1     1.2222   1.03   some Text   1.555
   2.1    -2.2222                     -2.555
   3.1     3.2222  -3.03   more text   3.555
   4.1             -4.03              -4.555
                             no data   0.000
   6.1    -6.2222   6.03  no comment  -6.555
EOD

colWidths    = "6 11 7 12 8"  # specify the fixed column widths
commentChar  = "#"
colSeparator = ','
set datafile separator "\n"

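# append colSeparator after each fixed-width field, then drop the trailing separator
insertSeparators(s) = (s0='', p0=1, sum [_i=1:words(colWidths)] (p1=p0+word(colWidths,_i), \
    s0 = s0.s[p0:p1-1].colSeparator, p0=p1, 0), s0[1:strlen(s0)-1])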

set print $Data
    do for [i=1:|$DataInput|] {
        if ($DataInput[i][1:1] eq commentChar) { print ($DataInput[i]) }
        else { print insertSeparators($DataInput[i]) }
    }
set print
print $Data

set offset 1,1,1,1
set datafile separator colSeparator

plot $Data u 1:5 w lp pt 7 lc "red", \
        '' u 1:5:4 w labels offset 0,0.8
### end of script

Result:

# Data with fixed but also empty columns
#0000000011111111112222222222333333333344444
#2345678901234567890123456789012345678901234
#--6-||----11---||--7--||-----12---||--8---|
  -1.1,     1.2222,   1.03,   some Text,   1.555
   2.1,    -2.2222,       ,            ,  -2.555
   3.1,     3.2222,  -3.03,   more text,   3.555
   4.1,           ,  -4.03,            ,  -4.555
      ,           ,       ,     no data,   0.000
   6.1,    -6.2222,   6.03,  no comment,  -6.555
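
To see which characters insertSeparators() assigns to each column, the same cumulative slicing can be written out as a loop on a toy line. This is only an illustration: the widths "3 4 2" and the sample string are made up, and the middle field is deliberately empty.

```gnuplot
colWidths = "3 4 2"                  # hypothetical widths for a toy line
s = "AAA" . "    " . "CC"            # the 4-character middle field is empty

p0 = 1
do for [k=1:words(colWidths)] {
    p1 = p0 + int(word(colWidths, k))        # first character of the next field
    print sprintf("column %d: chars %d-%d -> [%s]", k, p0, p1-1, s[p0:p1-1])
    p0 = p1
}
# column 1: chars 1-3 -> [AAA]
# column 2: chars 4-7 -> [    ]
# column 3: chars 8-9 -> [CC]
```

The empty field keeps its position because the slices depend only on the widths, not on the content of the line.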

[Figure: column 5 plotted against column 1 as red line-points, with the text of column 4 as labels above the points]

Addition:

For completeness, here is an alternative approach:

  • Typically, data is provided in a file, but the solutions above and the one below require the data in a datablock. The FileToDatablock() snippet in the script below takes care of this.
  • The script defines the functions fcol() and fstrcol() as replacements for column() and strcol() for fixed-width columns.
  • Since real('') returns an error and real(' ') returns 0.0 (at least for gnuplot<=5.5), the string 'NaN' is appended to each substring in fcol(), so that empty fields yield NaN.
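
Both functions need the start position of each column, which is the running sum of the widths. The script below computes this in a single expression (colStarts); as a more readable sketch of the same calculation, with starts and pos being names used only here:

```gnuplot
colWidths = "6 11 7 12 8"

starts = "1"                         # the first column starts at character 1
pos    = 1
do for [k=1:words(colWidths)] {
    pos    = pos + int(word(colWidths, k))   # start of the column after column k
    starts = sprintf("%s %d", starts, pos)
}
print starts    # 1 7 18 25 37 45
```

The last entry (45) is one position past the end of the 44-character line; it is not used as a column start.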

Data: SO58245920.dat

# Data with fixed but also empty columns
#0000000011111111112222222222333333333344444
#2345678901234567890123456789012345678901234
#--6-||----11---||--7--||-----12---||--8---|
  -1.1     1.2222   1.03   some Text   1.555
   2.1    -2.2222                     -2.555
   3.1     3.2222  -3.03   more text   3.555
   4.1             -4.03              -4.555
                             no data   0.000
   6.1    -6.2222   6.03  no comment  -6.555

Script: (requires gnuplot>=5.2.0; the result is basically the same as the graph above)

### read/plot data with fixed column width
reset session

FILE = 'SO58245920.dat'

# get the file 1:1 into a datablock
FileToDatablock(f,d) = GPVAL_SYSNAME[1:7] eq "Windows" ? \
                       sprintf('< echo   %s ^<^<EOD  & type "%s"',d,f) : \
                       sprintf('< echo "\%s   <<EOD" & cat  "%s"',d,f)     # Linux/MacOS
load FileToDatablock(FILE,'$Data')

colWidths     = "6 11 7 12 8"              # specify the fixed column widths
colStarts     = (s='1', s0=0, sum[i=1:words(colWidths)] (s0=s0+word(colWidths,i),s=s.' '.(s0+1),0),s)
colStart(col) = int(word(colStarts,col))
commentChar   = "#"

fcol(col)    = real((i=int($0+1), $Data[i][1:1] eq commentChar ? NaN : \
               $Data[i][colStart(col):colStart(col)+word(colWidths, col)-1].'NaN'))
fstrcol(col) = (i=int($0+1), $Data[i][1:1] eq commentChar ? '' : \
               $Data[i][colStart(col):colStart(col)+word(colWidths, col)-1])

set datafile commentschar ''
set datafile missing NaN
set key noautotitle
set offset 1,1,1,1

plot $Data u (fcol(1)):(fcol(5)) w lp pt 7 lc "red", \
        '' u (fcol(1)):(fcol(5)):(fstrcol(4)) w labels offset 0,1.0
### end of script
