1

I have a MySQL table like:

id (UNSIGNED INT) PrimaryKey AutoIncrement
name (VARCHAR(10))
status (UNSIGNED INT) Indexed

I use the status column to represent 32 different statuses like:

0 -> open
1 -> deleted
...
31 -> something

This is convenient since I do not know how many statuses I will end up with. (Right now we support 32 statuses; we can switch to a BIGINT to support 64. More than 64 is highly unlikely :) )
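To make the encoding concrete, here is a minimal sketch of the bit layout as I read it, assuming each number in the list above is a bit position inside the status integer. Only "open" (bit 0) and "deleted" (bit 1) come from the list; the helper names are hypothetical.

```python
# Assumption: each status is one bit of the integer, so bit n has value 2**n.
OPEN_BIT = 0      # from the list above
DELETED_BIT = 1   # from the list above


def set_bit(status: int, bit: int) -> int:
    """Return status with the given bit switched on."""
    return status | (1 << bit)


def has_bit(status: int, bit: int) -> bool:
    """Return True if the given bit is switched on in status."""
    return (status >> bit) & 1 == 1


# A row that is "open" and also has bit 13 set.
status = set_bit(set_bit(0, OPEN_BIT), 13)
print(status, has_bit(status, 13), has_bit(status, DELETED_BIT))
```

With this layout, a 32-bit unsigned column holds bits 0 through 31, which is where the limit of 32 statuses comes from.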

The problem with this approach is that there is no index at the bit level, so queries selecting rows where a given bit is set are slow.

I can improve things a bit using range queries (WHERE status BETWEEN n1 AND n2).

Still, this is not a good approach.

I want to point out that I only need to search on whether a few of the 32 bits are set (let's say bits 0, 12, 13, 21 and 31).

Any ideas to improve performance?

2 Answers

3

If for some reason you cannot normalize your data as suggested by RandomSeed in the previous answer, I'm pretty sure you can just put an index on the field and search using int values (that is 2^n).

For example, if you need bits 0, 12 and 13 set, search where status = 2^0 + 2^12 + 2^13 (that is, 1 + 4096 + 8192 = 12289; the ^ here means "to the power of", not MySQL's XOR operator).

Edit: If you need to search where those bits are set, regardless of the other bits, you could try using bitwise operators, e.g. for bits 0, 12 and 13, search where status & 1 = 1 AND status & 4096 = 4096 AND status & 8192 = 8192.
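A minimal sketch of both searches, run against an in-memory SQLite database as a stand-in for MySQL (the bitwise & behaves the same way in this query). The table name items, the sample rows and the helper mask_for are hypothetical. The three separate bit tests are folded into one combined mask here, which is equivalent.

```python
import sqlite3


def mask_for(bits):
    """Combine bit positions into a single integer mask."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


conn = sqlite3.connect(":memory:")  # SQLite standing in for MySQL
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO items (name, status) VALUES (?, ?)",
    [
        ("a", 12289),              # exactly bits 0, 12, 13
        ("b", 12289 | (1 << 21)),  # bits 0, 12, 13 plus bit 21
        ("c", 4096),               # only bit 12
        ("d", 0),                  # no bits set
    ],
)

mask = mask_for([0, 12, 13])

# Exact match: only these bits are set (this form can use the index on status).
exact = [row[0] for row in conn.execute(
    "SELECT name FROM items WHERE status = ? ORDER BY name", (mask,))]

# Bitwise match: these bits are set, other bits may be anything.
at_least = [row[0] for row in conn.execute(
    "SELECT name FROM items WHERE status & ? = ? ORDER BY name", (mask, mask))]

print(mask, exact, at_least)
```

The second query is the one that answers "are these bits set?", but it applies an expression to the column, so I would not expect a plain index on status to help with it.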

However, compared to a range query, I'm not sure what the performance improvement will be (if any). So, as said before, normalization might be the only solution.


2 Comments

@rlanvin Indeed this works, but only if those specific bits are the only ones set. That's why I use range queries; I am doing the same thing.
Right... well, in this case you can use bitwise AND/OR in your search query, but if you're really after performance, it seems normalization is the only solution.

0

Normalize your data.

MainEntity:
id (UNSIGNED INT) PrimaryKey AutoIncrement
name (VARCHAR(10))

Status:
id (UNSIGNED INT) PrimaryKey AutoIncrement
label (VARCHAR(10))

EntityHasStatus (composite primary key on both columns):
entity_id (UNSIGNED INT) PrimaryKey
status_id (UNSIGNED INT) PrimaryKey

Entities having both statuses 1 and 5:

SELECT MainEntity.*
FROM MainEntity
JOIN EntityHasStatus AS Status1
    ON Status1.entity_id = MainEntity.id
    AND Status1.status_id = 1
JOIN EntityHasStatus AS Status5
    ON Status5.entity_id = MainEntity.id
    AND Status5.status_id = 5

Entities having either status 4 or 6:

SELECT MainEntity.*
FROM MainEntity
LEFT JOIN EntityHasStatus AS Status4
    ON Status4.entity_id = MainEntity.id
    AND Status4.status_id = 4
LEFT JOIN EntityHasStatus AS Status6
    ON Status6.entity_id = MainEntity.id
    AND Status6.status_id = 6
WHERE
    Status4.status_id IS NOT NULL
    OR Status6.status_id IS NOT NULL

These queries should be virtually instant (prefer the first form when possible, as it is a tad bit more efficient).
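For anyone who wants to try this out, here is a minimal runnable sketch of the schema and both queries, using an in-memory SQLite database as a stand-in for MySQL. The sample rows are made up, the Status label table is left out for brevity, and every entity_id is qualified with its join alias so the column reference is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite standing in for MySQL
conn.executescript("""
CREATE TABLE MainEntity (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE EntityHasStatus (
    entity_id INTEGER NOT NULL,
    status_id INTEGER NOT NULL,
    PRIMARY KEY (entity_id, status_id)
);
-- Made-up sample data: 'a' has statuses 1 and 5, 'b' has 1, 'c' has 4.
INSERT INTO MainEntity (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO EntityHasStatus (entity_id, status_id) VALUES
    (1, 1), (1, 5), (2, 1), (3, 4);
""")

# Entities having both statuses 1 and 5: one inner join per required status.
both = [row[0] for row in conn.execute("""
    SELECT MainEntity.name
    FROM MainEntity
    JOIN EntityHasStatus AS Status1
        ON Status1.entity_id = MainEntity.id AND Status1.status_id = 1
    JOIN EntityHasStatus AS Status5
        ON Status5.entity_id = MainEntity.id AND Status5.status_id = 5
    ORDER BY MainEntity.name
""")]

# Entities having either status 4 or 6: left joins, keep rows where one matched.
either = [row[0] for row in conn.execute("""
    SELECT MainEntity.name
    FROM MainEntity
    LEFT JOIN EntityHasStatus AS Status4
        ON Status4.entity_id = MainEntity.id AND Status4.status_id = 4
    LEFT JOIN EntityHasStatus AS Status6
        ON Status6.entity_id = MainEntity.id AND Status6.status_id = 6
    WHERE Status4.status_id IS NOT NULL OR Status6.status_id IS NOT NULL
    ORDER BY MainEntity.name
""")]

print(both, either)
```

The composite primary key on (entity_id, status_id) is what lets each join be resolved by an index lookup rather than a scan.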

11 Comments

Thanks for the answer. I have already thought of that, but it's not possible right now.
@rlanvin This is a very valid concern. Readability is often to be preferred over raw performance. But in this case we are specifically asked for a performance improvement. It should be fairly easy to automate the generation of such a query anyway, and a judicious comment could offset the readability issue.
If for some reason you cannot normalize your data, I'm pretty sure you can just put an index on the field and search using int values (that is 2^n). For example, if you need bits 0, 12 and 13 set, search where status = 2^0 + 2^12 + 2^13. That should work fine, no?
@rlanvin, that will only work if you are searching for values where only those bits are flipped. If that's what the OP wanted, I totally misunderstood.
@gosom Then send a link to this post to your 'non-technical reason', so that I can spank him in public :)