I am using Cygwin (bash) to create a script that finds, groups, and counts fields in multiple CSV files. Each row contains comma-separated fields, and every field follows the same convention: a numeric tag, an equals sign (=), then an alphanumeric value. A given "(number)=" tag may or may not be present in a row; if present, its position may vary, but it appears only once in that row. The value after the equals sign also varies in length.
An example best illustrates my objective. Sample CSV file:
35=D,11=ABCD1,1=ABC,55=XYZ,38=100,40=P,18=M,54=1,59=0,10=111
35=D,11=ABCD2,1=ABC,55=XYZ,38=200,40=P,18=M,54=1,44=10.00,59=0,10=133
35=D,11=ABCD3,1=ABC,55=XYZ,38=300,40=P,18=M B,54=1,44=10.00,59=0,110=200,10=113
35=D,11=ABCD4,1=ABC,55=XYZ,38=400,40=P,18=M B F,54=1,44=10.00,59=0,110=300,10=144
35=D,11=ABCD5,1=ABC,55=ZYX,38=300,40=2,54=1,44=10.00,59=3,10=132
35=D,11=ABCD6,1=ABC,55=QQQ,38=100,40=1,18=C,54=2,59=3,10=131
The "18=" field values are space-separated. I would like to have a script or one-liner that would identify each unique "18=" value and then count the appearance of each. The output using the above file would be (sort is optional):
18=M 2
18=M B 1
18=M B F 1
18=C 1
As mentioned, the script should read a number of files containing records in this format. I have tried various grep combinations and dabbled with awk, but I am less familiar with its proper use; my rough attempt is sketched below.
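This is roughly where I stand (a minimal sketch, assuming a POSIX awk, that no field value contains a comma, and with placeholder file names):

awk -F',' '{
    for (i = 1; i <= NF; i++)    # scan each comma-separated field
        if ($i ~ /^18=/)         # anchored match, so 110= or 118= would not match
            count[$i]++          # tally each distinct "18=..." value
}
END { for (v in count) print v, count[v] }' file1.csv file2.csv

I am not sure this is idiomatic, or whether it holds up reliably across many files.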
Update: the first two answers both work (thanks a lot!). Would it be possible to expand them to also aggregate the "38=" values, grouped by the same unique "18=" values as in the count output?
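For example, taking "aggregate" to mean summing the "38=" values within each group, the sample file above would give output like this (count first, then the 38= sum):

18=M 2 300
18=M B 1 300
18=M B F 1 400
18=C 1 100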