This code suffers from over-generalization: the belief that you need to account for every possible rendering task. That is not useful from a performance perspective.
For example, take your VAO here. Each rendered object gets its own VAO. Why? Lots of rendered objects share the same vertex format, and many of them will share the same bound buffers. When rendering such objects, you shouldn't change the VAO between them.
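To make that concrete, here is a minimal CPU-only sketch of the idea: sort the draws so that objects sharing a vertex format and buffers are adjacent, and only switch the VAO when that key actually changes. `DrawItem`, its id fields, and `vaoFor` are hypothetical names I made up for illustration; your renderer will have its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical draw record: which vertex format and buffers an object uses.
struct DrawItem {
    std::uint32_t vertexFormatId; // assumed: one id per distinct vertex layout
    std::uint32_t bufferSetId;    // assumed: one id per set of bound buffers
    std::uint32_t objectId;
};

// Sorts the draws so objects with the same format and buffers are adjacent,
// then walks them and counts how many VAO switches the frame would need.
inline std::size_t sortAndCountVaoBinds(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return std::tie(a.vertexFormatId, a.bufferSetId) <
                         std::tie(b.vertexFormatId, b.bufferSetId);
              });

    std::size_t binds = 0;
    bool haveBound = false;
    std::uint32_t boundFormat = 0;
    std::uint32_t boundBuffers = 0;
    for (const DrawItem& item : items) {
        if (!haveBound || item.vertexFormatId != boundFormat ||
            item.bufferSetId != boundBuffers) {
            // A real renderer would call glBindVertexArray(vaoFor(item)) here.
            boundFormat = item.vertexFormatId;
            boundBuffers = item.bufferSetId;
            haveBound = true;
            ++binds;
        }
        // The draw for item.objectId is issued with the VAO already bound.
    }
    assert(binds <= items.size()); // never more binds than draws
    return binds;
}
```

With five objects that use only three distinct format/buffer combinations, this needs three binds instead of five.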
Consider your sprite example. You create a single quad, and render different sprites via separate uniforms. This is terrible from a performance standpoint; that's dozens if not hundreds of draw calls per frame, with state changes between the draws. You'll get better performance by just writing each sprite's data to a buffer object (using appropriate streaming techniques, of course), and then rendering them all at once with a single draw command. Oh sure, each sprite will have duplicate color information for its four vertices. But so what? Even a 16MB double-buffer (8MB per buffer) should be able to store 400,000+ sprites.
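A rough sketch of the CPU side of that batching might look like the following. The vertex layout, the `Sprite` struct, and the `SpriteBatch` name are all assumptions on my part (your format may well be tighter), and I've used an index list so each sprite keeps four vertices. The actual upload to the streamed buffer object and the single draw call are left out so the snippet stays self-contained.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed vertex layout; pick whatever your shader actually needs.
struct SpriteVertex {
    float x, y;           // position
    float u, v;           // texture coordinate
    std::uint8_t rgba[4]; // color, duplicated on all four corners
};

// Hypothetical per-sprite description supplied by the game code.
struct Sprite {
    float x, y, width, height;         // rectangle to draw
    float u0, v0, u1, v1;              // rectangle in the texture
    std::array<std::uint8_t, 4> color; // one color for the whole sprite
};

// CPU-side staging for one frame's sprites.
struct SpriteBatch {
    std::vector<SpriteVertex> vertices;
    std::vector<std::uint32_t> indices;

    void append(const Sprite& s) {
        assert(vertices.size() % 4 == 0); // every sprite owns four vertices
        const std::uint32_t base = static_cast<std::uint32_t>(vertices.size());
        const float xs[4] = {s.x, s.x + s.width, s.x + s.width, s.x};
        const float ys[4] = {s.y, s.y, s.y + s.height, s.y + s.height};
        const float us[4] = {s.u0, s.u1, s.u1, s.u0};
        const float vs[4] = {s.v0, s.v0, s.v1, s.v1};
        for (int i = 0; i < 4; ++i) {
            SpriteVertex vert{xs[i], ys[i], us[i], vs[i],
                              {s.color[0], s.color[1], s.color[2], s.color[3]}};
            vertices.push_back(vert);
        }
        // Two triangles per quad: corners 0-1-2 and 2-3-0.
        const std::uint32_t quad[6] = {0, 1, 2, 2, 3, 0};
        for (std::uint32_t offset : quad) {
            indices.push_back(base + offset);
        }
    }

    std::size_t spriteCount() const { return vertices.size() / 4; }
};
```

Once every sprite for the frame has been appended, `vertices` is what gets written into the streamed buffer, and the whole batch goes out with one indexed draw (for example `glDrawElements` with `indices.size()` indices) rather than one draw per sprite.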
You shouldn't be sticking all objects into a queue. You should be separating your rendering into hard-coded types of objects: skinned meshes, non-skinned meshes, sprites, lines, etc. Each type of object has its own effective queue, and each queue has its own kind of data.
For example, all sprites should source any texture data from a single texture atlas, and they should all render with the same shader. As such, there doesn't need to be any individual per-sprite data; that's all stored in the buffer object. So the only thing the sprite queue needs to know is which part of the buffer to read from and how many vertices the rendering command ought to have.
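As a sketch of what "one queue per kind of object" could look like, assuming made-up names throughout (`StaticMeshDraw`, `SkinnedMeshDraw`, `SpriteRange`, `RenderQueues`): the mesh queues carry per-object data, while the sprite queue is nothing more than a range in the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-object records; each kind carries only what it needs.
struct StaticMeshDraw {
    std::uint32_t meshId;
    std::uint32_t materialId;
    std::uint32_t transformIndex;
};

struct SkinnedMeshDraw {
    std::uint32_t meshId;
    std::uint32_t materialId;
    std::uint32_t firstBone;
    std::uint32_t boneCount;
};

// The sprite "queue" is just a range of vertices already in the buffer.
struct SpriteRange {
    std::size_t firstVertex; // where this frame's sprite data starts
    std::size_t vertexCount; // how many vertices the draw should cover
};

struct RenderQueues {
    std::vector<StaticMeshDraw> staticMeshes;
    std::vector<SkinnedMeshDraw> skinnedMeshes;
    SpriteRange sprites{0, 0};

    // Starts a frame whose sprite data begins at firstVertex in the buffer
    // (for a double buffer, the start of whichever half is being written).
    void beginFrame(std::size_t firstVertex) {
        assert(firstVertex % 4 == 0); // assumes four vertices per sprite
        staticMeshes.clear();
        skinnedMeshes.clear();
        sprites = SpriteRange{firstVertex, 0};
    }

    // Records that count more sprites were written to the buffer.
    void addSprites(std::size_t count) { sprites.vertexCount += count * 4; }
};
```

At draw time each queue gets its own hard-coded loop with its own shader and state setup; the sprite pass needs only `sprites.firstVertex` and `sprites.vertexCount`.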