python find digits string in line

Question

I have this kind of long log file.

2012-02-03 18:35:34 SampleClass6 [INFO] everything normal for id 174025851
2012-02-03 18:35:34 SampleClass4 [FATAL] system problem at id 1991740254
2012-02-03 18:35:34 SampleClass3 [DEBUG] detail for id 1304807656
2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740

I want to find id=1740 exactly, and print line, but id=174025851 also count in it. how can I find exactly string id=1740 in a line and print line.

for line in f: 
    if str(id) in line: 
        print(line)

it also print the first and second line but I just want 4th line only with exactly id 1740

If you mean exactly then line.strip().endsWith(“id 1740”) or something similar? (Not at a machine to test this atm.) — user3456014
– user3456014, Commented Jun 23, 2020 at 17:27
if you're using the re module then wouldn't re.search(r'\b1740\b',line) suffice? — Umar.H
– Umar.H, Commented Jun 23, 2020 at 17:28
Zaman Azam: I updated my answer to reflect what you wrote under pciunkiewicz's answer regarding the fact that the "id" could occur in the middle of the string as well as the end. This detail should have been stated up front in the question, because not knowing it invalidates a number of answers. Please can you edit your question now to reflect the requirements. — alani
– alani, Commented Jun 23, 2020 at 18:16

alani · Accepted Answer · 2020-06-23 18:13:43Z

1

At risk of adding yet another answer to a question which has many already, here is how I think that a regular expression parser is best used here:

import re

the_id = 1740

with open("test.txt") as f:
    for line in f:
        match = re.search("id\s+(\d+)\s*$", line)
        if match and the_id == int(match.group(1)):
            print(line, end='')

This gives:

2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740

What you are doing here is using the parser to look for lines which end with the following: "id", followed by whitespace, followed by one or more digits (which you capture in a group), optionally followed by any amount of whitespace.

The captured group is then converted to int and compared with the id.

Incidentally, the id is stored in variable called the_id, because id is the name of a builtin function so is not a good choice of variable name (interferes with use of the builtin).

UPDATE

The asker has now clarified that the ID can appear in the middle of the line, not necessarily at the end.

This can easily be handled by a simple tweak to the regular expression. Changing the relevant line in the above code to:

        match = re.search("id\s+(\d+)", line)

now removes any check on what should come after the digits.

Because the + meaning "one or more" is also greedy (that is, it matches the part of the pattern to which it relates as many times as possible), the whole of the ID is matched by the bracketed group, without need to specify anything about what follows it.

Given the input file

2012-02-03 18:35:34 SampleClass6 [INFO] everything normal for id 174025851
2012-02-03 18:35:34 SampleClass4 [FATAL] system problem at id 1991740254
2012-02-03 18:35:34 SampleClass3 [DEBUG] detail for id 1304807656
2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740
2012-02-03 19:11:02 id 1740 SampleClass5 [TRACE] verbose detail

this will now output:

2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740
2012-02-03 19:11:02 id 1740 SampleClass5 [TRACE] verbose detail

edited Jun 23, 2020 at 18:13

answered Jun 23, 2020 at 17:55

alani

13.2k3 gold badges18 silver badges34 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

Zaman Azam Over a year ago

is this also wor on this line

Zaman Azam Over a year ago

Feb 23 07:38:19 router1 snmpd[359]: SNMPD_TRAP_COLD_START: trap_generate_cold: SNMP 1740 trap: cold start

alani Over a year ago

@ZamanAzam That does not even contain the string id. How is the code supposed to know that that is an id?

Zaman Azam Over a year ago

So I have different log files. Which I filter using .log. Then I read first file. And every line in file contains different number strings. The number(Id) is given by user. If it’s exactly same in the line. Then it print line

Zaman Azam Over a year ago

snmpd[359]: SNMPD_TRAP_WARM_START:SNMP trap: warm start SNMPD_THROTTLE_QUEUE_DRAINED:() cleared all throttled traps SNMPD_TRAP_WARM_START:SNMP trap:(1740;543;544) warm start SNMPD_TRAP_COLD_START:SNMP trap:(1740737)cold start

|

Amit Kumar · Accepted Answer · 2020-06-23 17:30:13Z

0

You can use regex something like id followed by space. Or if the id is always at the end of the line. Then use If line.endswith('id '+id) is true then do your logic.

answered Jun 23, 2020 at 17:30

Amit Kumar

6433 silver badges15 bronze badges

Comments

Polypro · Accepted Answer · 2020-06-23 17:33:40Z

0

You can use regular expressions

import re
text = """
2012-02-03 18:35:34 SampleClass6 [INFO] everything normal for id 174025851
2012-02-03 18:35:34 SampleClass4 [FATAL] system problem at id 1991740254
2012-02-03 18:35:34 SampleClass3 [DEBUG] detail for id 1304807656
2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740
"""

# the \s means the char after 0 must be a space, tab or newline (so, not a number)
p = re.compile(r'.*id 1740\s') 
ls = p.findall(text)

answered Jun 23, 2020 at 17:33

Polypro

872 silver badges7 bronze badges

Comments

Adrien Wehrlé · Accepted Answer · 2020-06-23 17:37:11Z

0

Here is another possibility:

import pandas as pd

data=pd.read_csv('/path/to/data.txt', header=None)

for i in range(0,len(data)):
    if data.iloc[i,0].split(' ')[-1][:4]=='1740':
        print(data.iloc[i,0])

I use csv even if not comma-separated so that I keep lines as single strings! Then check within a loop.

answered Jun 23, 2020 at 17:37

Adrien Wehrlé

3011 silver badge4 bronze badges

Comments

David Erickson · Accepted Answer · 2020-06-23 17:38:32Z

0

You could do something like this and split the line and take the last value:

for line in f: 
    if '1740' in line:
        a = line.split(' ')[-1]
        if a == '1740': 
            print(line)

edited Jun 23, 2020 at 17:38

answered Jun 23, 2020 at 17:28

David Erickson

16.7k2 gold badges21 silver badges37 bronze badges

1 Comment

alani Over a year ago

I don't think want you to hard-code the value '1740' (especially not having the same constant hard-coded in two places), but you might want to take the opportunity to suggest a better variable name for it than the id used in the question (which is the name of a builtin).

Philip Ciunkiewicz · Accepted Answer · 2020-06-23 17:41:05Z

0

Based on the structure of your logs, if the id is always at the very end of the line you can always change it to look for lines ending with your exact query:

for line in f: 
    if line.endswith(f"id {id}"): 
        print(line)

Edit:

As mentioned by @mrblewog, if the line has trailing whitespace, we can pre-process it using rstrip or strip:

for line in f:
    line = line.rstrip()
    ### Rest of the logic ###

edited Jun 23, 2020 at 17:41

answered Jun 23, 2020 at 17:29

Philip Ciunkiewicz

2,8183 gold badges15 silver badges25 bronze badges

6 Comments

user3456014 Over a year ago

Potentially need to strip whitespace off the line before endswith

David Erickson Over a year ago

what if the line id to search for is 1740 but the line ends with 1991740? This wouldn't work.

Philip Ciunkiewicz Over a year ago

@DavidErickson Ahh yeah you're right! I have updated it to be more precise, starting all the way back from id XXXXXXXX.

Zaman Azam Over a year ago

id can also appear in middle, its a long log file

Zaman Azam Over a year ago

2012-02-03 19:11:02 id 1740 SampleClass5 [TRACE] verbose detail

|

Abhishek Bhagate · Accepted Answer · 2020-06-23 18:00:04Z

0

You could do it with regex as -

import re
file = ['2012-02-03 18:35:34 SampleClass6 [INFO] everything normal for id 174025851',
'2012-02-03 18:35:34 SampleClass4 [FATAL] system problem at id 1991740254',
'2012-02-03 18:35:34 SampleClass3 [DEBUG] detail for id 1304807656',
'2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740']
for line in file :
    num = re.findall(r'\d+', line)[-1]
    if(num == '1740'):
        print(line)

Output :

2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740

With this code, you will find the last number that occurs on each line even if the string is not ending with a number.

The following will check if 1740 occurs anywhere in the line.

import re
file = ['2012-02-03 18:35:34 SampleClass6 [INFO] everything normal for id 174025851',
'2012-02-03 18:35:34 SampleClass4 [FATAL] system problem at id 1991740254',
'2012-02-03 18:35:34 SampleClass3 [DEBUG] detail for id 1304807656',
'2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740',
'2012-02-03 19:11:02 id 1740 SampleClass5 [TRACE] verbose detail ']
for line in file :
    num = re.findall(r'\d+', line)
    if('1740' in num):
        print(line)

Or if you are sure that each line ends with a number then you can simple split the string and compare with the last element of the split as -

file = ['2012-02-03 18:35:34 SampleClass6 [INFO] everything normal for id 174025851',
'2012-02-03 18:35:34 SampleClass4 [FATAL] system problem at id 1991740254',
'2012-02-03 18:35:34 SampleClass3 [DEBUG] detail for id 1304807656',
'2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740']

for line in file :
    num = line.split()[-1]
    if(num == '1740'):
        print(line)

Output :

2012-02-03 18:35:34 SampleClass3 [WARN] missing id 1740

edited Jun 23, 2020 at 18:00

answered Jun 23, 2020 at 17:32

Abhishek Bhagate

5,8063 gold badges18 silver badges33 bronze badges

5 Comments

Zaman Azam Over a year ago

id can be anywhere in file. like in middle of strin line.

Abhishek Bhagate Over a year ago

Then the regex code would work for you(provided its the last number that occurs on that line)

Zaman Azam Over a year ago

like id can appears like this in file,

Zaman Azam Over a year ago

2012-02-03 19:11:02 id 1740 SampleClass5 [TRACE] verbose detail

Abhishek Bhagate Over a year ago

@ZamanAzam See if the second one that I added now helps. It works for the case you provided

Collectives™ on Stack Overflow

python find digits string in line

7 Answers 7

UPDATE

8 Comments

Comments

Comments

Comments

1 Comment

6 Comments

5 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

7 Answers 7

UPDATE

8 Comments

Comments

Comments

Comments

1 Comment

6 Comments

5 Comments

Your Answer

Sign up or log in

Post as a guest

Related