So I have a sample .dat file which contains weather data for a single month as space-separated values. The first column of the file contains the day of the month; the second contains the maximum temperature for that day, while the third contains the minimum temperature.

I also have a final row at the bottom which contains aggregate values for the entire month.

Ideally I want to write a program to find the row with the maximum spread in the .dat file, where spread would be the difference between maximum temperature and minimum temperature.

I would want my program to print the day of the month and spread to standard output.

Assuming that my program is called weather.py, then a sample run will look like:

$ python weather.py
2 16

And here is my .dat file:

    Dy MxT   MnT   AvT   HDDay  AvDP 1HrP TPcpn WxType PDir AvSp Dir MxS SkyC MxR MnR AvSLP

   1  88    59    74          53.8       0.00 F       280  9.6 270  17  1.6  93 23 1004.5
   2  79    63    71          46.5       0.00         330  8.7 340  23  3.3  70 28 1004.5
   3  77    55    66          39.6       0.00         350  5.0 350   9  2.8  59 24 1016.8
   4  77    59    68          51.1       0.00         110  9.1 130  12  8.6  62 40 1021.1
   5  90    66    78          68.3       0.00 TFH     220  8.3 260  12  6.9  84 55 1014.4
   6  81    61    71          63.7       0.00 RFH     030  6.2 030  13  9.7  93 60 1012.7
   7  73    57    65          53.0       0.00 RF      050  9.5 050  17  5.3  90 48 1021.8
   8  75    54    65          50.0       0.00 FH      160  4.2 150  10  2.6  93 41 1026.3
   9  86    32*   59       6  61.5       0.00         240  7.6 220  12  6.0  78 46 1018.6
  10  84    64    74          57.5       0.00 F       210  6.6 050   9  3.4  84 40 1019.0
  11  91    59    75          66.3       0.00 H       250  7.1 230  12  2.5  93 45 1012.6
  12  88    73    81          68.7       0.00 RTH     250  8.1 270  21  7.9  94 51 1007.0
  13  70    59    65          55.0       0.00 H       150  3.0 150   8 10.0  83 59 1012.6
  14  61    59    60       5  55.9       0.00 RF      060  6.7 080   9 10.0  93 87 1008.6
  15  64    55    60       5  54.9       0.00 F       040  4.3 200   7  9.6  96 70 1006.1
  16  79    59    69          56.7       0.00 F       250  7.6 240  21  7.8  87 44 1007.0
  17  81    57    69          51.7       0.00 T       260  9.1 270  29* 5.2  90 34 1012.5
  18  82    52    67          52.6       0.00         230  4.0 190  12  5.0  93 34 1021.3
  19  81    61    71          58.9       0.00 H       250  5.2 230  12  5.3  87 44 1028.5
  20  84    57    71          58.9       0.00 FH      150  6.3 160  13  3.6  90 43 1032.5
  21  86    59    73          57.7       0.00 F       240  6.1 250  12  1.0  87 35 1030.7
  22  90    64    77          61.1       0.00 H       250  6.4 230   9  0.2  78 38 1026.4
  23  90    68    79          63.1       0.00 H       240  8.3 230  12  0.2  68 42 1021.3
  24  90    77    84          67.5       0.00 H       350  8.5 010  14  6.9  74 48 1018.2
  25  90    72    81          61.3       0.00         190  4.9 230   9  5.6  81 29 1019.6
  26  97*   64    81          70.4       0.00 H       050  5.1 200  12  4.0 107 45 1014.9
  27  91    72    82          69.7       0.00 RTH     250 12.1 230  17  7.1  90 47 1009.0
  28  84    68    76          65.6       0.00 RTFH    280  7.6 340  16  7.0 100 51 1011.0
  29  88    66    77          59.7       0.00         040  5.4 020   9  5.3  84 33 1020.6
  30  90    45    68          63.6       0.00 H       240  6.0 220  17  4.8 200 41 1022.7
mo 82.9 60.5 71.7 16 58.8 0.00 6.9 5.3

My problem is that I'm trying to figure out how to get the maximum spread. So far I've read the file and printed out its contents. What would be my next steps to get the maximum spread?

My code so far:

#!/usr/bin/env python


# read and print weather file
filename = "weather.dat"

with open(filename) as fn:
    content = fn.readlines()

print(content)

Any leads or assistance with this would be helpful.

  • Iterate over the file; split each line on whitespace; extract the day and the temperature extremes and subtract; compare the result with the largest spread saved so far; if it is bigger, save this day and its spread; otherwise continue. Commented Jan 16, 2017 at 6:10
  • Is this a homework question? Commented Jan 16, 2017 at 6:11
  • Are you suggesting I use some kind of for loop? Commented Jan 16, 2017 at 6:13
  • stackoverflow.com/a/17949545/2823755 Commented Jan 16, 2017 at 6:31
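
The steps from the first comment (iterate, split, subtract, keep the largest) can be sketched in plain Python as follows. This is only a minimal sketch: the function name max_spread and the inline sample are illustrative assumptions, and it assumes that data rows start with a day number and that a trailing * on a reading is just a flag to strip.

```python
def max_spread(lines):
    """Return (day, spread) for the row with the largest MxT - MnT."""
    best_day, best_spread = None, None
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header, and the monthly summary row ("mo"):
        # only rows whose first field is a day number are considered.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        day = int(fields[0])
        # Some readings carry a trailing "*" flag (e.g. "97*"); strip it.
        max_t = int(fields[1].rstrip("*"))
        min_t = int(fields[2].rstrip("*"))
        spread = max_t - min_t
        # Keep this row only if it beats the largest spread seen so far.
        if best_spread is None or spread > best_spread:
            best_day, best_spread = day, spread
    return best_day, best_spread


# A few rows copied from the question's file (first columns only).
sample = """\
  Dy MxT   MnT   AvT
   8  75    54    65
   9  86    32*   59
  26  97*   64    81
mo 82.9 60.5 71.7
"""

day, spread = max_spread(sample.splitlines())
print(day, spread)  # prints: 9 54
```

To run it against the real file, an open file object can be passed in place of the sample lines, for example with open("weather.dat") as fh: print(*max_spread(fh)), since iterating over a file yields its lines.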

1 Answer

You can try with pandas like so:

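import pandas as pd

# Split columns on runs of whitespace; the raw string avoids an invalid-escape warning
df = pd.read_csv('your_file.dat', sep=r'\s+')
# Keep the first two characters of each reading, which drops the '*' flag (e.g. '97*')
df[['MxT', 'MnT']] = df[['MxT', 'MnT']].apply(lambda x: x.str[:2].astype(int))
# Spread per row, then the index label(s) of the row(s) where it is largest
a = df.MxT - df.MnT
b = a.index[a == a.max()].tolist()
df.loc[b]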

Output:

(The original screenshot of the resulting row is not reproduced here.)

If you just want the Day, MxT and MnT, you can get it like this:

df.loc[b][['Dy', 'MxT', 'MnT']].unstack().tolist()

Output:

[9, 86, 32]
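
As a breakdown of the cleaning step above: x.str[:2] keeps the first two characters of every string in the column, which is what removes the * flag from readings such as 97*. The plain-Python sketch below mimics that slicing on a few strings taken from the question's file and compares it with rstrip("*"), an alternative that is not part of the answer and is shown here only as an assumption about what one might prefer when a value is not exactly two digits long (107 comes from a non-temperature column and is used purely for illustration).

```python
# Strings as they appear in the file: plain, flagged with "*", three-digit,
# and a decimal from the monthly summary row.
readings = ["88", "32*", "97*", "107", "82.9"]

# Equivalent of x.str[:2]: keep only the first two characters.
sliced = [r[:2] for r in readings]
print(sliced)    # ['88', '32', '97', '10', '82']

# Alternative: remove only a trailing "*" and leave the rest untouched.
stripped = [r.rstrip("*") for r in readings]
print(stripped)  # ['88', '32', '97', '107', '82.9']
```

Slicing works for this file because every daily temperature has exactly two digits; it would silently truncate a three-digit reading, whereas stripping the flag would not.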

7 Comments

OK, could you explain why you used the pandas import, and also give a breakdown of the solution? Thanks.
@kimaiga Pandas is a Data Analysis tool. Read here more about it: pandas.pydata.org/pandas-docs/stable/10min.html Or watch this series: youtube.com/…
@kimaiga Check updated solution. The max and min spreads are already marked as * in your .dat file. You can use that if you are allowed.
Well, it doesn't print any output on my console as I expected; it gives me blank output.
Use a print statement on the last line. I don't need one because I'm using an IPython notebook. For example: print(df2.unstack().tolist())