I wrote a working (but long) piece of code that checks whether a file exists in a directory and, if so, extracts two string values and converts them to float.

try:
    # use the file name from earlier and append .HDR
    filename = sel_file +'.HDR'

    # open the text file if found, read in the file
    f = open(filename, 'r+')
    data = f.read()
    f.close()

    # these regexes look for the 2 strings and extract the values in quotation marks after them
    # the value in quotes is a number written as a string. There is only 1 CTRW & several PTR
    c_pat = r'CTRW,\"(.*?)\"'
    p_pat = r'\nPTR,\"(.*?)\"'

    # implements the regex on the file
    c = re.findall(c_pat, data)
    p = re.findall(p_pat, data)

    # for some reason the c & p are saved as lists so this converts to a string and then a float
    ctrw_str = int(float( ''.join(map(str,c))))
    ptr_str = int(float(''.join(map(str,p))))

    # saves to the necessary variable for use later in the code
    CTRW = ctrw_str
    PTR = ptr_str

except IOError:
    # If the file is not in the directory then the following values are used
    CTRW = 120
    PTR = 197
    pass

The file is read in as a string, but for some unknown reason the regex search returns a list. That is not a huge deal, but it requires the extra step of converting to a string and then to a float.

The search for PTR returns several matches within the source file, but the one I want is exactly PTR, which is why I used \n in the p_pat regex and not in the c_pat regex.

I would like to see if anyone here has a good idea of how to shrink this into fewer lines and make it more Pythonic.

  • Can you clarify: does the file have multiple instances of PTR,"123"? Commented Nov 9, 2020 at 21:47
  • Yes, it is written in the comments: "There is only 1 CTRW & several PTR", but you also mention you need "to extract two string values and convert them to float". Which is true? Commented Nov 9, 2020 at 22:03
  • A sample within the larger file that I am looking at has a single instance of CTRW,"200" and a single instance of exactly PTR,"175.00", but there are other versions of the string PTR combined with other characters. Within one text file there are ~11 instances of the string PTR, including PTR, PTRXA, PTRXB, DNPTR1, and TWCPTR. I am trying to grab the numbers in quotes so I can use them later in calculations with voltage and current values located in a separate COMTRADE file. Commented Nov 9, 2020 at 22:52

1 Answer

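It is no wonder you get lists: re.findall returns a list of all matches (with one capture group in the pattern, a list of the captured strings), even when there is only a single match.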
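As a small illustration of that behaviour, here is a sketch that runs the question's two patterns over a made-up sample string. The sample text is an assumption based on the values mentioned in the comments, not the real .HDR file.

```python
import re

# Hypothetical sample text, modelled on the values mentioned in the comments.
sample = 'CTRW,"200"\nPTR,"175.00"\nPTRXA,"1.00"\n'

# With one capture group, findall returns a list of the captured strings,
# even when the pattern matches only once.
ctrw_matches = re.findall(r'CTRW,"(.*?)"', sample)
ptr_matches = re.findall(r'\nPTR,"(.*?)"', sample)

print(ctrw_matches)  # ['200']
print(ptr_matches)   # ['175.00']
```

Each result is a one-element list, which is why the original code needed the ''.join(...) step before converting to a number.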

If, as you say, you want "to extract two string values and convert them to float", use re.search to fetch only the first match:

import re

try:
    filename = sel_file + '.HDR'
    # read-only mode is enough; 'r+' would also require write permission
    with open(filename, 'r') as f:
        data = f.read()
    # re.search returns the first match (or None), so no list handling is needed
    c = re.search(r'CTRW,"([^"]*)"', data)
    if c:
        CTRW = int(float(c.group(1)))
    p = re.search(r'\nPTR,"([^"]*)"', data)
    if p:
        PTR = int(float(p.group(1)))
except IOError: # If the file is not in the directory then the following values are used
    CTRW = 120
    PTR = 197
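One gap worth noting: if the file exists but one of the patterns is not found, CTRW or PTR is never assigned. A possible variation, offered only as a sketch, wraps the logic in a function that starts from the fallback values. The function name read_hdr_values and its parameters are hypothetical, and using ^ with re.MULTILINE in place of \n is a swapped-in choice so that a PTR entry on the very first line would also match.

```python
import re


def read_hdr_values(filename, default_ctrw=120, default_ptr=197):
    """Return (CTRW, PTR) read from an .HDR file, falling back to defaults."""
    ctrw, ptr = default_ctrw, default_ptr
    try:
        with open(filename, 'r') as f:
            data = f.read()
    except OSError:  # IOError is an alias of OSError in Python 3
        return ctrw, ptr

    c = re.search(r'CTRW,"([^"]*)"', data)
    # ^ with re.MULTILINE anchors at the start of any line, so DNPTR1 or
    # TWCPTR cannot match; the comma after PTR excludes PTRXA and PTRXB.
    p = re.search(r'^PTR,"([^"]*)"', data, re.MULTILINE)
    if c:
        ctrw = int(float(c.group(1)))
    if p:
        ptr = int(float(p.group(1)))
    return ctrw, ptr
```

It could then be called as CTRW, PTR = read_hdr_values(sel_file + '.HDR'). The defaults apply both when the file is missing and when a value is not found in it.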
2 Comments

The above worked and definitely looks better than my hacked together version. Appreciate it.
@Neil Please kindly upvote the answer if it is helpful.
