I have been creating a few regex patterns to search a file. I need to search each line of a text file as a string of values. The issue is that the regexes I have created work when used against a list of values; however, the same regex finds nothing when I search a string instead. I'm not sure what I am missing. My test code is below. The regex works against list_primary, but when I change it to string2, the regex does not find the date value I'm looking for.

import re

list_primary = ["Wi-Fi", "goat", "Access Point", "(683A1E320680)", "detected", "Access Point detected",  "2/5/2021", "10:44:45 PM",  "Local",  "41.289227",  "-72.958748"]
string1 = "Wi-Fi Access Point (683A1E320680) detected puppy Access Point detected 2/5/2021 10:44:45 PM Local 41.289227 -72.958748"
#Lattitude = re.findall("[0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
#Longitude = re.findall("[-][0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
string2 = string1.split('"')
# print(string2)

list1 = []

for item in string2:

    data_dict = {}

    date_field = re.search(r"(\d{1})[/.-](\d{1})[/.-](\d{4})$",item)
    print(date_field)

    if date_field is not None:
        date = date_field.group()
    else:
        date = None
  • for item in string2: means you iterate over each char in string1. You need to re.search against string1 Commented Mar 23, 2021 at 14:47
  • I gave that a shot. It just prints a none value for each object. However, if I run that against the list that has the date "2/5/2021", the regex finds the value. Commented Mar 23, 2021 at 15:50
  • See ideone.com/itV7XS. To use it with a list you need something like rx = re.compile(r"(?<!\d)\d{1,2}[/.-]\d{1,2}[/.-]\d{4}(?!\d)") and then print(list(filter(rx.search, list_primary))) Commented Mar 23, 2021 at 15:53

1 Answer

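For your current expression to work on the string, you need to remove the dollar sign from the end. The $ anchors the match to the end of the input, so the pattern only matches when the date is the last thing in the string. That is true for the list item "2/5/2021", but not for the full line, where the time and coordinates follow the date. Also, to find dates with two-digit day or month values (for example 11/20/2018), you need to change your repetitions from {1} to {1,2}, since your regex can only match single-digit values such as 2/5/2011: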

import re

list_primary = ["Wi-Fi", "goat", "Access Point", "(683A1E320680)", "detected", "Access Point detected",  "2/5/2021", "10:44:45 PM",  "Local",  "41.289227",  "-72.958748"]
string1 = "Wi-Fi Access Point (683A1E320680) detected puppy Access Point detected 2/5/2021 10:44:45 PM Local 41.289227 -72.958748"
#Lattitude = re.findall("[0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
#Longitude = re.findall("[-][0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
string2 = string1.split('"')
# print(string2)

list1 = []

for item in string2:

    data_dict = {}

    date_field = re.search(r"(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})",item)
    print(date_field)

    if date_field is not None:
        date = date_field.group()
    else:
        date = None

Output:

<re.Match object; span=(71, 79), match='2/5/2021'>
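
To see the effect of the anchor in isolation, here is a minimal sketch that runs the question's original pattern with and without the trailing $ against a single list item and against a whole line. The names anchored, unanchored, list_item and line are illustrative and not part of the original code.

```python
import re

# Pattern from the question: `$` requires the date to end the input.
anchored = re.compile(r"(\d{1})[/.-](\d{1})[/.-](\d{4})$")
# Same pattern with the end-of-string anchor removed.
unanchored = re.compile(r"(\d{1})[/.-](\d{1})[/.-](\d{4})")

list_item = "2/5/2021"  # the date is the entire string
line = "Access Point detected 2/5/2021 10:44:45 PM Local 41.289227 -72.958748"

print(anchored.search(list_item))        # a match: the date ends the string
print(anchored.search(line))             # None: more text follows the date
print(unanchored.search(line).group())   # 2/5/2021
```

This is why the pattern appeared to work on list_primary: each list element is searched on its own, so the date element ends exactly where the anchor expects.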

If you want to extract the date from your string (rather than just check whether it exists), use re.findall with a single capturing group around the whole expression, so that each date is returned as one string and not as a tuple of three separate numbers:

date_field = re.findall(r"(\d{1,2}[/.-]\d{1,2}[/.-]\d{4})",string1)
print(date_field)

Output:

['2/5/2021']
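
The shape of the re.findall result depends on how many capturing groups the pattern contains. The following sketch compares three variants of the same date pattern on string1. The variable names three_groups, one_group and no_groups are illustrative only.

```python
import re

string1 = ("Wi-Fi Access Point (683A1E320680) detected puppy Access Point "
           "detected 2/5/2021 10:44:45 PM Local 41.289227 -72.958748")

# Several groups: findall returns one tuple of group values per match.
three_groups = re.findall(r"(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})", string1)

# Exactly one group: findall returns that group's text for each match.
one_group = re.findall(r"(\d{1,2}[/.-]\d{1,2}[/.-]\d{4})", string1)

# No groups at all: findall returns the whole matched text.
no_groups = re.findall(r"\d{1,2}[/.-]\d{1,2}[/.-]\d{4}", string1)

print(three_groups)  # [('2', '5', '2021')]
print(one_group)     # ['2/5/2021']
print(no_groups)     # ['2/5/2021']
```

Wrapping the whole expression in one group and dropping the groups entirely give the same list here. The tuple form is only useful when the month, day and year are needed separately.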

1 Comment

Removing the $ worked perfectly. Thank you.
