
I am trying to design a notifications architecture where each notification has a UID and needs to be delivered to multiple users. Each user device has a local cache of the latest notifications. When the user device comes online it always checks for any new notifications and pulls all of them meant for that user. The device keeps the UID of the latest notification it synced and uses that UID to fetch newer notifications from the server.

I am wondering about the best way to implement this in MySQL tables so that it scales to more than 500K users.

I have a notification details table where the notification UID is the auto-increment primary key. I would like suggestions about the user mapping table, which could look like this (ignoring the foreign key constraints):

CREATE TABLE user_notifications_mapping ( 
    user_id INT UNSIGNED NOT NULL, 
    notification_id BIGINT UNSIGNED NOT NULL,
    UNIQUE KEY (user_id, notification_id)
) ENGINE=InnoDB;

but I am skeptical that it would give the best performance for a query like

SELECT notification_id FROM user_notifications_mapping WHERE user_id = <user-id> AND notification_id > <last-notification-uid>

1 Answer


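If the table is properly indexed, this design is well suited to the query. The UNIQUE KEY on (user_id, notification_id) already provides the index you need: the query filters on user_id, ranges over notification_id, and reads no other column. Assuming that only a "small" number of notifications is returned to a given device on each synchronisation, a mid-range server should be able to handle hundreds of such requests per second, even if the table is huge (millions of rows).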
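One way to check the indexing claim is to ask MySQL for the query plan. The following is only a sketch: the user id 42, the last synced UID 1000, and the page size of 100 are placeholder values that do not come from the question.

```sql
-- Placeholder values: user 42, last synced notification 1000.
-- LIMIT 100 is an assumed page size, added so a long-offline device
-- does not pull an unbounded result in one request.
EXPLAIN
SELECT notification_id
FROM user_notifications_mapping
WHERE user_id = 42
  AND notification_id > 1000
ORDER BY notification_id
LIMIT 100;
```

If the plan names the unique key in the `key` column and shows `Using index` in `Extra`, the query is answered from the index alone, without touching the table rows.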

Now, this table is going to grow very large. But I believe a given notification needs to be sent to a given device only once. I would consider removing records from this table (or archiving them in another table) once a notification has been sent. Conceptually, this table then becomes something like pending_notifications.

[edit]

Given the new information, this table is likely to grow beyond a practical size, and you need to take a different approach. For example, there is probably a way to group your notifications (e.g. they are of a given type, or they originate from a given entity in your application). The same concept can be applied to your users: maybe you want some notifications to be sent to, e.g., all "customers" or all "administrators".

The underlying idea is to establish the n-n relationship between two entities of smaller cardinality. You wouldn't model the case "some users receive some notifications" but rather "some user groups receive some types of notifications".

Example:

  • notification can be an "Announcement", a "Notice" or a "Warning" (notification type)
  • users can be "Administrators" or "Customers" (user group)

Then the notifications_mapping table would look like this:

    +-----------------------+
    | notifications_mapping |
    +-----------------------+
    | notification_type     |
    | group_id              |
    +-----------------------+
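As a rough sketch, the table above could be declared as follows. Only the table and column names come from the diagram; the column types, the primary key, and the supporting index `idx_type_id` are assumptions, and the index further assumes that the notifications table has `type` and `notification_id` columns.

```sql
-- Hypothetical column types; only the names come from the diagram.
CREATE TABLE notifications_mapping (
    notification_type TINYINT UNSIGNED NOT NULL,
    group_id          INT UNSIGNED NOT NULL,
    -- One row per (group, type) pair; leading with group_id suits
    -- lookups that start from a user's group.
    PRIMARY KEY (group_id, notification_type)
) ENGINE=InnoDB;

-- Assumed supporting index: lets a lookup by type find the
-- notifications newer than a given id without a full table scan.
CREATE INDEX idx_type_id ON notifications (type, notification_id);
```

The mapping table stays tiny (number of groups times number of types), which is the point of the approach: the large per-user fan-out is never stored.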

And the corresponding query could be:

SELECT notifications.notification_id
FROM notifications_mapping AS map
JOIN user ON user.group_id = map.group_id
JOIN notifications ON notifications.type = map.notification_type
WHERE user.user_id = <user-id> AND notifications.notification_id > <last-notification-uid>

6 Comments

I can't remove the record, because the user might have multiple devices that sync at different times. Making the server aware of all the devices would probably require a more complex infrastructure. That's why there will be many entries per user, one for each notification, and that is precisely why I think performance might take a hit.
Thank you for suggesting the new approach. But I am not sure it works in my case, because there are no specific groups that I can create out of users. It's completely user driven: anyone can tag any number of users for a particular notification. And this is itself a special type of notification, with no sub-categories possible. Also, I was wondering how good horizontal partitioning based on user id would be. Is that a good option?
"anyone can tag any number of users for a particular notification": I take it this tagging is manual, so for a given notification only a relatively "small" number of users is likely to be notified. Now the questions are: 1. On average, how many users will be notified per notification? 2. How many notifications per day are you expecting? Multiply 1. by 2. and you get the number of new records in this table per day. 3. Can you really not abandon notifications if they are older than, say, 6 months?
Oh, and yes, partitioning is a very valid option. Actually, now that I come to think about it, if you are going to handle 500k users, you will probably need a large database infrastructure. Consider clustering.
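For reference, partitioning the question's mapping table by user id might look like the sketch below. The partition count of 16 is an arbitrary placeholder, not a recommendation from this thread.

```sql
-- Same columns as in the question, with hash partitioning on user_id.
-- MySQL requires every unique key to include the partitioning column,
-- which UNIQUE KEY (user_id, notification_id) already does.
CREATE TABLE user_notifications_mapping (
    user_id INT UNSIGNED NOT NULL,
    notification_id BIGINT UNSIGNED NOT NULL,
    UNIQUE KEY (user_id, notification_id)
) ENGINE=InnoDB
PARTITION BY HASH (user_id)
PARTITIONS 16;  -- assumed value; tune to data volume
```

A query with `user_id = <user-id>` in its WHERE clause only has to read the single partition that holds that user. Note that partitioned InnoDB tables do not support foreign keys, so the constraints the question set aside could not be added to this table.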
Oh yes, we will have a backend job that cleans up the notifications and mapping tables beyond 500 notifications per user. We don't want a time-based limit because some users might not be as active as others, but we can probably assume no one will go back beyond 500 or maybe 1000 notifications. Hopefully partitioning will work well.
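A per-user cleanup step of that kind could be sketched as follows, assuming the job processes one user at a time. The user id 42 is a placeholder, and the retention limit of 500 is the figure mentioned in the comment.

```sql
-- Keep the 500 newest mapping rows for one user, delete the rest.
-- The inner query finds the id of the 500th newest notification;
-- wrapping it in a derived table lets MySQL read from the same
-- table that the DELETE is modifying.
DELETE FROM user_notifications_mapping
WHERE user_id = 42
  AND notification_id < (
      SELECT cutoff FROM (
          SELECT notification_id AS cutoff
          FROM user_notifications_mapping
          WHERE user_id = 42
          ORDER BY notification_id DESC
          LIMIT 1 OFFSET 499
      ) AS newest
  );
```

If the user has fewer than 500 rows, the subquery yields NULL, the comparison matches nothing, and no rows are deleted.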