if I want to count the lines of code, the trivial thing is
cat *.c *.h | wc -l
But what if I have several subdirectories?
You should probably use SLOCCount or cloc for this; they're designed specifically for counting lines of source code in a project, regardless of directory structure. Either
sloccount .
or
cloc .
will produce a report on all the source code starting from the current directory.
If you want to use find and wc, GNU wc has a nice --files0-from option:
find . -name '*.[ch]' -print0 | wc --files0-from=- -l
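As a minimal sketch of the --files0-from approach (assumes GNU coreutils; the directory layout and file names here are made up for illustration), wc reads the NUL-separated file list from stdin, prints one count per file, and ends with a total line:

```shell
# Build a tiny throwaway tree with a known number of lines.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'a\nb\nc\n' > "$dir/main.c"      # 3 lines
printf 'x\ny\n'    > "$dir/sub/util.h"  # 2 lines

# wc takes the NUL-separated list on stdin; the last output line is the total.
total=$(find "$dir" -name '*.[ch]' -print0 | wc --files0-from=- -l |
        awk 'END { print $1 }')
echo "$total"
rm -rf "$dir"
```

The awk END block just grabs the first field of the last line, which is the grand total when more than one file is counted.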
(Thanks to SnakeDoc for the cloc suggestion!)
sloccount /tmp/stackexchange (created again on May 17 after my most recent reboot) says that the estimated cost to develop the sh, perl, awk, etc. files it found is $11,029, and that doesn't include the one-liners that never made it into a script file.
As the wc command can take multiple arguments, you can just pass all the filenames to wc using the + argument of the -exec action of GNU find:
find . -type f -name '*.[ch]' -exec wc -l {} +
Alternately, in bash, using the shell option globstar to traverse the directories recursively:
shopt -s globstar
wc -l **/*.[ch]
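Here is a small self-contained sketch of the globstar approach (requires Bash 4+; the sample tree is invented for illustration). With globstar on, ** matches files at any depth, including the current directory:

```shell
# Throwaway tree with a known line count.
dir=$(mktemp -d); mkdir -p "$dir/deep/deeper"
printf '1\n2\n'    > "$dir/a.c"              # 2 lines
printf '1\n2\n3\n' > "$dir/deep/deeper/b.h"  # 3 lines

shopt -s globstar   # recursive ** is off by default in Bash
cd "$dir"
# wc prints one line per file plus a final "total" line; grab the total.
total=$(wc -l **/*.[ch] | awk 'END { print $1 }')
echo "$total"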
Other shells either traverse recursively by default (e.g. zsh) or have a similar option to globstar; most do, at least.
If you are in an environment where you don't have access to cloc etc I'd suggest
find -name '*.[ch]' -type f -exec cat '{}' + | grep -c '[^[:space:]]'
Run-through: find searches recursively for all regular files whose names end in .c or .h and runs cat on them. The output is piped through grep to count all the non-blank lines (those containing at least one non-whitespace character).
You can use find together with xargs and wc:
find . -type f \( -name '*.h' -o -name '*.c' \) | xargs wc -l
(Note that you may get more than one total line if several wcs are being invoked.)
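One pitfall worth demonstrating: in find, the implicit -a binds tighter than -o, so -type f -name '*.h' -o -name '*.c' means "(-type f AND -name '*.h') OR -name '*.c'", and a directory whose name ends in .c would slip through. A minimal sketch (file names invented for illustration):

```shell
dir=$(mktemp -d)
mkdir "$dir/fake.c"            # a directory, not a file
printf '1\n' > "$dir/real.c"   # a regular file

# Without grouping, the -type f test only applies to the *.h branch,
# so the directory fake.c matches too.
ungrouped=$(find "$dir" -type f -name '*.h' -o -name '*.c' | wc -l)

# With explicit parentheses, -type f applies to both name patterns.
grouped=$(find "$dir" -type f \( -name '*.h' -o -name '*.c' \) | wc -l)

echo "$ungrouped $grouped"
rm -rf "$dir"
```

The ungrouped form finds two matches here, the grouped form only one, which is why the parentheses matter.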
The multiple-wc-invocation problem can be addressed by piping find into a while read FILENAME; do ... done structure and using wc -l inside the loop. The rest is summing the line counts into a variable and displaying the total.
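A minimal sketch of that while-read approach (the process substitution at the bottom requires Bash; the sample files are invented so the sketch is self-contained):

```shell
# Throwaway tree with a known line count.
dir=$(mktemp -d); mkdir -p "$dir/sub"
printf '1\n2\n'    > "$dir/a.c"      # 2 lines
printf '1\n2\n3\n' > "$dir/sub/b.h"  # 3 lines

total=0
while IFS= read -r f; do
    n=$(wc -l < "$f")        # reading from stdin makes wc print only the number
    total=$((total + n))
done < <(find "$dir" -type f -name '*.[ch]')  # process substitution keeps $total
echo "Total lines: $total"
rm -rf "$dir"
```

Feeding the loop via process substitution rather than a pipe matters: a pipe would run the loop in a subshell and the summed total would be lost.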
As has been pointed out in the comments, cat file | wc -l is not equivalent to wc -l file because the former prints only a number whereas the latter prints a number and the filename. Likewise cat * | wc -l will print just a number, whereas wc -l * will print a line of information for each file.
In the spirit of simplicity, let's revisit the question actually asked:
if I want to count the lines of code, the trivial thing is
cat *.c *.h | wc -l
But what if I have several subdirectories?
Firstly, you can simplify even your trivial command to:
cat *.[ch] | wc -l
And finally, the many-subdirectory equivalent is:
find . -name '*.[ch]' -exec cat {} + | wc -l
This could perhaps be improved in many ways, such as restricting the matched files to regular files only (not directories) by adding -type f; but the given find command is the exact recursive equivalent of cat *.[ch].
Sample using awk:
find . -name '*.[ch]' -exec wc -l {} \; |
awk '{SUM+=$1}; END { print "Total number of lines: " SUM }'
Using + in place of \; makes find run wc -l for groups of files, rather as xargs does, but it handles odd-ball characters (like spaces) in file names without needing either xargs or the (non-standard) -print0 and -0 options to find and xargs respectively. It's a minor optimization. The downside is that each invocation of wc outputs a total line at the end when given multiple files, and the awk script would have to deal with that. So it's not a slam-dunk, but very often, using + in place of \; with find is a good idea.
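A sketch of how the awk script can cope with those extra total lines when + is used: skip any line whose second field is the literal word "total" (this assumes no source file is itself named "total", and the file names here are invented):

```shell
dir=$(mktemp -d)
printf '1\n2\n'    > "$dir/a.c"  # 2 lines
printf '1\n2\n3\n' > "$dir/b.c"  # 3 lines

# Each batched wc run appends a "<count> total" line; filter those out
# before summing the per-file counts.
sum=$(find "$dir" -name '*.[ch]' -exec wc -l {} + |
      awk '$2 != "total" { s += $1 } END { print s }')
echo "$sum"
rm -rf "$dir"
```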
There is a limit on the number of arguments that can be passed to wc. If the number of files that will be found is unknown a priori, is there a risk of exceeding that limit, or is it somehow handled by find?
find groups the files into convenient size bundles, which won't exceed the length limit for the argument list on the platform, allowing for the environment (which comes out of the argument list length — so the length of the argument list plus the length of the environment has to be less than a maximum value). IOW, find does the job right, like xargs does the job right.
easy command:
find . -name '*.[ch]' | xargs wc -l
(Note that you may get more than one total line if several wcs are being invoked.)
If you're on Linux I recommend my own tool, polyglot. It's dramatically faster than cloc and more featureful than sloccount.
You should be able to build it on BSD as well, though there aren't any provided binaries.
You can invoke it with
poly .
A new contender to cloc is Loci.
Link to NPM package
It counts code similarly to cloc, but is faster at scale.
Also, as it's written natively in Node.js, it will run in environments without Perl (for cloc.pl) or cloc.exe.
It is in its infancy, but you can install it as an NPM CLI tool, or import it as a library into your own project.
It's great for environments where you can install script-based npm packages but are not allowed to use unapproved binaries.
find . -name \*.[ch] -print | xargs -n 1 wc -l should do the trick. There are several possible variations on that as well, such as using -exec instead of piping the output to wc.
find . -name \*.[ch] -print doesn't print the contents of the files, only the file names. So I'd be counting the number of files instead, wouldn't I? Do I need xargs?
Yes, you need xargs, and you'd also need to watch for multiple wc invocations if you have lots of files; you'd need to look for all the total lines and sum them.
find . -name \*.[ch] -print0 | xargs -0 cat | wc -l
find . -name \*.[ch] -print | wc -l counts the number of files (unless a file name contains a newline, but that's very unusual); it does not count the number of lines in the files.
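A quick sketch of why the -print0/-0 pair matters (the file names are invented): a name containing a space would be split into two bogus arguments by plain xargs, whereas NUL-delimited names pass through intact:

```shell
dir=$(mktemp -d)
printf '1\n2\n'    > "$dir/my file.c"  # file name with a space
printf '1\n2\n3\n' > "$dir/other.c"

# NUL-delimited file names survive the space in "my file.c".
count=$(find "$dir" -name '*.[ch]' -print0 | xargs -0 cat | wc -l)
echo "$count"
rm -rf "$dir"
```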
Why use cat? wc -l *.c *.h does the same thing. Use wc -l *.c *.h | tail -n 1 to get similar output. Some shells support **, so you could have used wc -l **/*.{h,c} or something similar. Note that in Bash, at least, this option (called globstar) is off by default. But also note that in this particular case, cloc or SLOCCount is a much better option. (Also, ack may be preferable to find for easily finding/listing source files.)