group and count in python

Question

I am trying to do a group-by and count in Python, but the counts do not seem to be grouped for some reason.

I am using Python 2.7.

#!/usr/bin/env python
counts = {}
logfile = open("/tmp/test.out", "r")

for line in logfile:
    if line.startswith("20") in line:
        seq = line.strip()
        substr = seq[0:13]
        if substr not in counts:
            counts[substr] = 0
            counts[substr] += 1
            for substr, count in counts.items():
                print(count,substr)

I would like output like the following, with one count per hour:

 6 2019-06-17T00
 13 2019-06-17T01
  9 2019-06-17T02
  7 2019-06-17T03
  6 2019-06-17T04

Can you share a sample of the file's contents and the output you're getting for it?
– Mureinik, Commented Jun 21, 2019 at 11:14
The file has many random lines. I am picking up only lines like the ones below: 2019-06-19T09:56:04.378+0000: [Times: user=153.84 sys=1.15, real=18.13 secs] 2019-06-19T09:59:46.370+0000: [Times: user=154.93 sys=1.24, real=18.65 secs] 2019-06-19T10:00:05.074+0000: [Times: user=155.21 sys=1.39, real=20.03 secs]
– user345270, Commented Jun 21, 2019 at 11:18
I am interested only in the hour and the number of occurrences. Thanks.
– user345270, Commented Jun 21, 2019 at 11:19
I am getting the output below, and it is not grouped: ('2019-06-16T10', 1) ('2019-06-15T19', 1) ('2019-06-16T13', 1) ('2019-06-16T12', 1)
– user345270, Commented Jun 21, 2019 at 11:20

Sam Hollenbach · Accepted Answer · 2019-06-21 11:32:17Z

2

The line that increments the count is indented one block too far:

for line in logfile:
    if line.startswith("20"):
        seq = line.strip()
        substr = seq[0:13]
        if substr not in counts:
            counts[substr] = 0
        # Un-indented below
        counts[substr] += 1

# Print output only after loop completes
for substr, count in counts.items():
    print(count,substr)

Before, the increment ran only when the substring was not yet in the counts dictionary, so each key was incremented exactly once and every count stayed at 1.
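To make the effect of the indentation concrete, here is a minimal sketch that runs both versions on a few sample lines. The `lines` list and the function names `count_nested` and `count_fixed` are illustrative assumptions, not part of the question. The sketch uses plain `line.startswith("20")` as the filter, since `startswith` already returns a boolean.

```python
# Hypothetical sample input, shaped like the log lines quoted in the question.
lines = [
    "2019-06-19T09:56:04.378+0000: [Times: user=153.84 sys=1.15, real=18.13 secs]",
    "2019-06-19T09:59:46.370+0000: [Times: user=154.93 sys=1.24, real=18.65 secs]",
    "2019-06-19T10:00:05.074+0000: [Times: user=155.21 sys=1.39, real=20.03 secs]",
    "some unrelated line",
]


def count_nested(lines):
    """Increment nested under the 'not in' check, as in the question."""
    counts = {}
    for line in lines:
        if line.startswith("20"):
            key = line.strip()[0:13]
            if key not in counts:
                counts[key] = 0
                counts[key] += 1  # runs only the first time a key is seen
    return counts


def count_fixed(lines):
    """Increment un-indented one level, as in the answer."""
    counts = {}
    for line in lines:
        if line.startswith("20"):
            key = line.strip()[0:13]
            if key not in counts:
                counts[key] = 0
            counts[key] += 1  # runs for every matching line
    return counts


print(count_nested(lines))
print(count_fixed(lines))
```

With the nested increment every hour ends up with a count of 1, while the fixed version counts the two lines that fall in the same hour together.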

edited Jun 21, 2019 at 11:32

answered Jun 21, 2019 at 11:22


7 Comments

Sam Hollenbach Over a year ago

@MadPhysicist I agree unless they want to see the progress on each iteration, so I left it there.

user345270 Over a year ago

Thanks Sam, that works, but the output seems to be looping continuously:

('2019-06-17T05', 1) ('2019-06-17T03', 7) ('2019-06-17T07', 1) ('2019-06-17T06', 3) ('2019-06-16T02', 1) ('2019-06-16T00', 1) ('2019-06-17T02', 10)

Mad Physicist Over a year ago

The desired output does not indicate that they want to see it at every iteration

user345270 Over a year ago

I am looking for something like a unique count. In Unix I achieve this with: awk '{print substr($1,1,13)}' | sort | uniq -c

Sam Hollenbach Over a year ago

@user345270 Does this answer work properly now? Is there something that is not working still?
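The `sort | uniq -c` part of the pipeline mentioned in the comments above can be approximated in Python by sorting the keys and counting runs of equal values with `itertools.groupby`. This is only a sketch: the function name `uniq_c`, the `width` parameter, and the sample lines are assumptions for illustration, and the `startswith("20")` filter is taken from the question's code rather than from the awk command.

```python
from itertools import groupby


def uniq_c(lines, width=13):
    """Return (count, key) pairs sorted by key, similar to `sort | uniq -c`."""
    # Keep the first `width` characters of each matching line, then sort
    # so that equal keys sit next to each other.
    keys = sorted(line[:width] for line in lines if line.startswith("20"))
    # groupby yields one group per run of equal keys; the run length is the count.
    return [(len(list(group)), key) for key, group in groupby(keys)]


# Hypothetical sample input for demonstration.
sample = [
    "2019-06-19T10:00:05.074+0000: [Times: user=155.21 sys=1.39, real=20.03 secs]",
    "2019-06-19T09:56:04.378+0000: [Times: user=153.84 sys=1.15, real=18.13 secs]",
    "not a timestamp line",
    "2019-06-19T09:59:46.370+0000: [Times: user=154.93 sys=1.24, real=18.65 secs]",
]

for count, key in uniq_c(sample):
    print("%7d %s" % (count, key))
```

Sorting first matters because `groupby` only merges adjacent equal items, just as `uniq` only collapses adjacent duplicate lines.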


Shreya Gupta · Accepted Answer · 2019-06-21 11:27:17Z

0

counts = {}
logfile = open("/tmp/test.out", "r")

for line in logfile:
    if line.startswith("20"):
        seq = line.strip()
        substr = seq[0:13]
        if substr not in counts:
            counts[substr] = 0
        counts[substr] += 1
for substr, count in counts.items():
    print(count,substr)

I think this would work
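For comparison, the same counting can be sketched with `collections.Counter` from the standard library (available in Python 2.7), which removes the need for the explicit `not in counts` check. The function name `count_by_hour`, the temporary demo file, and its contents are illustrative assumptions, not the asker's real data.

```python
import os
import tempfile
from collections import Counter


def count_by_hour(path):
    """Count lines starting with "20", keyed by their first 13 characters."""
    counts = Counter()
    with open(path, "r") as logfile:  # the file is closed automatically
        for line in logfile:
            if line.startswith("20"):
                counts[line.strip()[0:13]] += 1  # missing keys start at 0
    return counts


# Write a small hypothetical log to a temporary file for the demo.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write(
        "2019-06-17T00:01:02.000+0000: first\n"
        "2019-06-17T00:05:00.000+0000: second\n"
        "noise line\n"
        "2019-06-17T01:00:00.000+0000: third\n"
    )

demo_counts = count_by_hour(demo_path)
os.remove(demo_path)

# Print once, after the whole file has been read, sorted by hour.
for hour, count in sorted(demo_counts.items()):
    print("%3d %s" % (count, hour))
```

Formatting each row with `%` instead of `print(count, substr)` avoids the tuple-style output that Python 2.7 produces for a parenthesized pair.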

answered Jun 21, 2019 at 11:27


7 Comments

Mad Physicist Over a year ago

Why should that work? An explanation is more valuable than the solution, especially when your solution is basically a copy of the existing answer.

Shreya Gupta Over a year ago

Basically, at the end of the complete iteration over the file that we opened, we want to print the number of times we encountered each string that starts with "20".

Shreya Gupta Over a year ago

What your solution does is print the counts in each iteration, so if there are 50 lines in the file, your loop iterates 50 times and prints each time. If you want to do that, then why are you counting the strings?

Shreya Gupta Over a year ago

@MadPhysicist you updated your solution after seeing my solution so basically you copied my solution.

Shreya Gupta Over a year ago

@user345270 if this solution worked for you, then you may mark this solution as right.
