SQL count on multiple columns?

Question

I have two columns in my table, say first_name and last_name, and I want to find out how many people share the same full name. For example:

Name Count | Name
-------------------------
 12        | John Smith
  8        | Bill Gates
  4        | Steve Jobs

user359040 · Accepted Answer · 2012-12-13 14:53:57Z

3

Group by both columns - eg:

select firstname, lastname, count(*) as `Name Count`
from `table`  -- TABLE is a reserved word in MySQL, so the placeholder name needs backticks
group by firstname, lastname
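
To see the grouping at work, here is a minimal sketch against made-up data. The `people` table and its rows are assumptions for illustration only; the column names follow the query above.

```sql
-- Hypothetical sample data; `people` is not from the original question.
CREATE TEMPORARY TABLE people (
    firstname VARCHAR(50),
    lastname  VARCHAR(50)
);

INSERT INTO people (firstname, lastname) VALUES
    ('John', 'Smith'), ('John', 'Smith'), ('John', 'Smith'),
    ('Bill', 'Gates'), ('Bill', 'Gates'),
    ('Steve', 'Jobs');

-- One output row per distinct (firstname, lastname) pair.
SELECT firstname, lastname, COUNT(*) AS `Name Count`
FROM people
GROUP BY firstname, lastname
ORDER BY `Name Count` DESC;
-- John  | Smith | 3
-- Bill  | Gates | 2
-- Steve | Jobs  | 1
```

Adding HAVING COUNT(*) > 1 before the ORDER BY would keep only the names shared by more than one person.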

answered Dec 13, 2012 at 14:53

user359040


2 Comments

spencer7593 Over a year ago

+1. This is the most elegant solution. Might want to add a note that in case the collation for the firstname and/or lastname columns is case sensitive, then the OP could wrap them in a LOWER() or UPPER() function to make it case insensitive -- SELECT LOWER(firstname) AS firstname, LOWER(lastname) AS lastname, ...

user359040 Over a year ago

@spencer7593: I think you have just added that note for me - it's also implicit in Anton's answer. Other collation issues may also be significant, such as whether accented versus unaccented characters should affect the results - this is something that the OP should be best qualified to answer.
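
As a rough sketch of the collation point raised in these comments: MySQL can apply a collation inside the query with the COLLATE operator, as an alternative to LOWER(). The `people` table, its rows, and the `first_ci`/`last_ci` aliases are hypothetical, and the example assumes utf8mb4 columns.

```sql
-- Hypothetical table with a case-sensitive (binary) column collation.
CREATE TEMPORARY TABLE people (
    firstname VARCHAR(50),
    lastname  VARCHAR(50)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

INSERT INTO people (firstname, lastname) VALUES
    ('John', 'Smith'), ('john', 'SMITH'), ('Bill', 'Gates');

-- utf8mb4_general_ci compares without regard to case or accents,
-- so 'John Smith' and 'john SMITH' land in the same group.
SELECT firstname COLLATE utf8mb4_general_ci AS first_ci,
       lastname  COLLATE utf8mb4_general_ci AS last_ci,
       COUNT(*) AS name_count
FROM people
GROUP BY first_ci, last_ci
ORDER BY name_count DESC;
-- Returns two rows, with counts 2 and 1.
```

Which spelling is displayed for a merged group is not guaranteed. If accented and unaccented characters should count as different names, an accent-sensitive collation such as utf8mb4_0900_as_ci (MySQL 8.0) could be substituted.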

Anton · Accepted Answer · 2012-12-13 14:55:06Z

1

Since names can have different capitalizations (e.g. 'John' and 'john'), and the database may contain excess spaces, first use a subquery that cleans up and concatenates the first and last names, and then use COUNT and GROUP BY:

SELECT `full_name`, COUNT(*) AS `name_count`  -- include the name so each count is labelled
FROM (
    SELECT CONCAT(LOWER(TRIM(`first_name`)), ' ', LOWER(TRIM(`last_name`))) AS `full_name`
    FROM `table`
) AS `table_with_concat_names`
GROUP BY `full_name`
ORDER BY `name_count` DESC;

You'll notice I applied LOWER(TRIM()) to both the first and last names. This way, they are made all lowercase with LOWER() so that 'John Smith' and 'john smith' are the same person when compared, and also I used TRIM() to remove excess spaces, so 'John Smith ' (space after) and 'John Smith' are the same person too.

answered Dec 13, 2012 at 14:55

Anton


9 Comments

Diego Over a year ago

I believe that cleaning the data is not a job for the queries. If data is dirty, it should stay as such. Also, GROUP BY can be case insensitive (it can be specified via the collation), which makes the usage of LOWER() superfluous.

Anton Over a year ago

@Diego Great point about the GROUP BY. I didn't know what settings the asker had so I was going for the catch-all solution.

Anton Over a year ago

@Diego As far as cleaning goes, I have to disagree. In a regular application, you're right, but this sounds like an ad-hoc one-time query for the user, so it's just easy to run in SQL and let the sanitization happen on-the-fly in the query. Keeping the data dirty means having to perform such standardization in PHP later, which will make things more complicated for the asker (I'm pretty sure they're still a beginner).

Diego Over a year ago

I still believe that data should be taken as-is, and cleaned during input (if ever). Cleaning it on the fly would mean repeating the same operation over and over. It starts with a TRIM(), it ends in a complicated REGEX() to deal with "J ohn", "Jhon", "jjohn", "Johnn", "Jon" and so on. Especially because OP seems to be a beginner, it's important that he/she learns to properly sanitise data before saving it to the database.

Anton Over a year ago

Yes, but it sounds like the data already exists. I do agree that new data should be sanitized during input.


ajp · Accepted Answer · 2012-12-13 14:51:19Z

0

Use a group by clause

select (firstname + ' ' + lastname) as Name, count(*) as 'Name Count'
from table
group by (firstname + lastname)

answered Dec 13, 2012 at 14:51

ajp


1 Comment

spencer7593 Over a year ago

In MySQL, the + operator does not perform string concatenation, it performs numeric addition. For values of firstname and lastname that do not begin with a digit (as the first non-space character), the expression is going to return a 0. It's very likely this query will return one row, and not be the resultset the OP specified.
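
A MySQL version of this answer could use CONCAT() in place of +. The sketch below is one possible rewrite, not the answerer's own code; the `people` table and its rows are hypothetical.

```sql
-- In MySQL, + is numeric addition: both strings are cast to numbers.
SELECT 'John' + 'Smith';   -- 0, with "Truncated incorrect DOUBLE value" warnings

-- Hypothetical sample data for illustration.
CREATE TEMPORARY TABLE people (
    firstname VARCHAR(50),
    lastname  VARCHAR(50)
);

INSERT INTO people (firstname, lastname) VALUES
    ('John', 'Smith'), ('John', 'Smith'), ('Bill', 'Gates');

-- CONCAT() performs the string concatenation.
SELECT CONCAT(firstname, ' ', lastname) AS Name, COUNT(*) AS `Name Count`
FROM people
GROUP BY firstname, lastname
ORDER BY `Name Count` DESC;
-- John Smith | 2
-- Bill Gates | 1
```

Grouping on the two columns rather than on the concatenated string keeps pairs such as ('Mary Ann', 'Lee') and ('Mary', 'Ann Lee') apart. Note that CONCAT() returns NULL if any argument is NULL.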
