
I'm writing a Minecraft clone of sorts and I really want to implement multithreading for things like world generation, updating blocks, light propagation etc. I store all the loaded chunks in a hash map "chunk_map".

Putting a mutex on the chunk_map would defeat the whole purpose of multithreading, since most of each thread's work is iterating over chunk_map.

If what I'm thinking is correct, inserting a new chunk into the map shouldn't be a problem (in the worst-case scenario a thread might skip a chunk that has just been added), but deleting a chunk would definitely be a problem.

Would making a hash map implementation that uses shared_ptr instead of iterator_type solve the problem of removing an element from the map while another thread iterates over that map? Or is there some different, easier approach?

Edit: I want to avoid synchronizing threads entirely as I don't want the world generation, block updates, etc. to cap the rendering performance.

I want the main thread to render all the chunks that are currently loaded. I also want the "updater thread" to update each block in every loaded chunk, etc., and the "world thread" to load and unload chunks.
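Roughly, the setup looks like this (a minimal sketch with placeholder names, not my actual code):

```cpp
// Minimal sketch of the setup described above; all names are placeholders.
#include <cstdint>
#include <functional>
#include <thread>
#include <unordered_map>

struct ChunkPos {
    std::int32_t x, z;
    bool operator==(const ChunkPos& o) const { return x == o.x && z == o.z; }
};

struct ChunkPosHash {
    std::size_t operator()(const ChunkPos& p) const {
        return std::hash<std::int64_t>{}(
            (static_cast<std::int64_t>(p.x) << 32) ^ static_cast<std::uint32_t>(p.z));
    }
};

struct Chunk { /* block data, light data, mesh, ... */ };

// Shared by all threads, currently with no synchronization at all.
std::unordered_map<ChunkPos, Chunk, ChunkPosHash> chunk_map;

void update_loop() { /* updater thread: iterate chunk_map and update blocks */ }
void world_loop()  { /* world thread: insert newly generated chunks, erase far ones */ }
void render_loop() { /* main thread: iterate chunk_map and draw every loaded chunk */ }

int main() {
    std::thread update(update_loop), world(world_loop);
    render_loop();              // rendering stays on the main thread
    update.join();
    world.join();
}
```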

  • "inserting a new chunk into the map shouldn't be problem" << this is incorrect. Inserting a new item into a hashmap invalidates all iterators if rehashing occurs. Commented Jul 25, 2019 at 15:50
  • @David Thank you for your answer! I didn't know that and I guess that defeats the purpose of using a hash map Commented Jul 25, 2019 at 15:53
  • np. It's difficult to answer your actual question without more context btw, might be worth describing what you are trying to achieve (seems like it's an XY problem: meta.stackexchange.com/questions/66377/what-is-the-xy-problem) Commented Jul 25, 2019 at 15:57
  • @David Question edited. Commented Jul 25, 2019 at 16:27
  • Just to make sure you are aware of this: std::map is not a hashmap, but a balanced tree (typically implemented as a red-black tree). If you want a hash map, use std::unordered_map instead. Commented Jul 26, 2019 at 15:06

3 Answers


Your question is, unfortunately, based on a false premise:

If what I'm thinking is correct, inserting a new chunk into the map shouldn't be a problem (in the worst-case scenario a thread might skip a chunk that has just been added), but deleting a chunk would definitely be a problem.

No, you're wrong. The worst case is much worse than that. Consider:

  1. Thread A constructs a new chunk.
  2. Thread A adds that chunk to the map.
  3. Thread B sees the entry for the new chunk in the map.
  4. Thread B tries to access the chunk, but it doesn't see the writes thread A made in step 1 due to optimizations in the code and in the hardware that re-order the writes.

Boom, your program just crashed.
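To make the scenario concrete, here is a minimal sketch of that publication race and the simplest way to avoid it: a plain mutex held around both the insert and the lookup. The names are made up, and this is only an illustration of one correct alternative, not a recommendation of a particular design.

```cpp
// Sketch of the race in steps 1-4 and the simplest correct alternative:
// do both the insert and the lookup under the same mutex, so construction
// of the chunk happens-before any other thread's access to it.
#include <cstddef>
#include <mutex>
#include <unordered_map>
#include <vector>

struct Chunk { std::vector<int> blocks; };

std::unordered_map<long long, Chunk> chunk_map;
std::mutex chunk_map_mutex;

// Thread A: build the chunk first, then publish it while holding the lock.
void publish_chunk(long long key, Chunk chunk) {
    std::lock_guard<std::mutex> lock(chunk_map_mutex);
    chunk_map.emplace(key, std::move(chunk));
}

// Thread B: look the chunk up (and read it) while holding the same lock.
bool read_block(long long key, std::size_t index, int& out) {
    std::lock_guard<std::mutex> lock(chunk_map_mutex);
    auto it = chunk_map.find(key);
    if (it == chunk_map.end() || index >= it->second.blocks.size()) return false;
    out = it->second.blocks[index];
    return true;
}
```

Holding the same mutex on both sides is what creates the happens-before relationship that the unsynchronized version lacks.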

The key flaw in your reasoning, and this is a very common flaw that you absolutely must fix if you want to be a successful programmer, is thinking that if you break explicit rules, the only things that can go wrong are things you can foresee. This line of reasoning is absolute death in the computer programming field.

So, in fact, the absolute worst case is even worse than what I said above. You are violating the requirements of a class you are using, a class whose implementation will vary across platforms. There is absolutely no way you can know what will go wrong.


2 Comments

Thanks, I will keep that in mind in the future. This is my first attempt at writing anything even remotely advanced and I did misunderstand quite a lot of things. I have now made sure that no resource is accessed by more than one thread at the same time.
Good advice in the middle paragraph

You can generate your chunks (update blocks, or whatever) on multiple threads (using std::packaged_task or std::async), and then copy the results into your map on the main thread.

The expensive part in your case isn't the map access; it's the data processing.
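A minimal sketch of this approach, with hypothetical names (generate_chunk, chunk_map): the workers only produce chunk data, and the main thread is the only one that ever touches the map, so the map needs no locking at all.

```cpp
// Workers generate chunk data via std::async; only the main thread owns
// and modifies chunk_map, so there is no shared mutable map state.
#include <future>
#include <unordered_map>
#include <utility>
#include <vector>

struct Chunk { std::vector<int> blocks; };

Chunk generate_chunk(int x, int z) {
    Chunk c;
    c.blocks.assign(16 * 16 * 256, 0);   // the expensive work happens off the main thread
    return c;
}

int main() {
    std::unordered_map<long long, Chunk> chunk_map;   // owned by the main thread only

    // Kick off generation of a few chunks in parallel.
    std::vector<std::pair<long long, std::future<Chunk>>> pending;
    for (int x = 0; x < 4; ++x)
        for (int z = 0; z < 4; ++z)
            pending.emplace_back((static_cast<long long>(x) << 32) | static_cast<unsigned>(z),
                                 std::async(std::launch::async, generate_chunk, x, z));

    // Collect the results on the main thread and insert them into the map.
    for (auto& [key, fut] : pending)
        chunk_map.emplace(key, fut.get());
}
```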

2 Comments

I agree that the best solution would not involve a shared resource but a resource for every thread. The critical part becomes the algorithm then, not thread-safety.
Thanks for your answer, I ended up making each thread use its own copy of the map and it works fine

Deep reading of the container specs is required to work out which write operations need to be protected by a mutex, and which read operations are safe even while a write operation is in progress.

[Ah you said hashmap]

In a nutshell:

Hashmaps, vectors, and strings are a total pain. It is possible to do things that keep your object pointers valid (like pre-allocation), but even single-threaded it is hard work to keep references valid across an insert or delete.

For the traditional maps, sets, lists:

Construction and destruction of the container has to be exclusive, obviously.

If you aren't making any modifications to your map, you can multithread as many reads as you like, of course. As long as you use find() and not operator[] you will be fine.
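For example, a small illustration of why operator[] counts as a write: on a missing key it default-inserts an element.

```cpp
// find() is a pure read; operator[] on a missing key is a write.
#include <map>
#include <string>

int main() {
    std::map<int, std::string> m{{1, "one"}};

    auto it = m.find(2);    // pure read: m is unchanged, it == m.end()
    std::string s = m[2];   // write: inserts {2, ""} into m as a side effect
    (void)it;
    (void)s;
}
```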

The behaviour of map and list: if you are inserting new objects, any existing iterators remain valid, but you can't call find() or advance an iterator (or do other iterator arithmetic) during the insert() (assuming the insert() succeeds, so it is often worth doing a find() before the insert()), and you can't do multiple (successful) inserts in parallel. So you need a one-writer, multi-reader locking model: this can be as simple as an atomic counter around find()/iterator++ and a mutex that separates the first/last find from any insert/delete.
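Here is a sketch of that one-writer/multi-reader model using std::shared_mutex; the class and names are hypothetical, and the atomic-counter scheme above is essentially a hand-rolled version of the same thing.

```cpp
// Readers take a shared lock around find(); insert()/erase() take an
// exclusive lock, so at most one writer runs and never alongside a reader.
#include <map>
#include <shared_mutex>
#include <string>
#include <utility>

class ChunkTable {
public:
    bool find(int key, std::string& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);   // many readers at once
        auto it = chunks_.find(key);
        if (it == chunks_.end()) return false;
        out = it->second;
        return true;
    }
    void insert(int key, std::string value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);   // exclusive for writers
        chunks_.emplace(key, std::move(value));
    }
    void erase(int key) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        chunks_.erase(key);
    }
private:
    mutable std::shared_mutex mutex_;
    std::map<int, std::string> chunks_;
};
```

Readers can overlap freely; any insert or erase waits until all readers have released the shared lock.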

If you are deleting, then any iterator pointing to the deleted object becomes invalid, which is also true single-threaded, but the parallel usage becomes more problematic. Your code needs to have absolute certainty that you don't have any iterators to the deleted element. Depending on your use-case for delete this may not be an issue, or it may be a design killer. I don't see why you would want to delete a resource that you still need/use in another thread, but then I also don't know how your code will know whether it is still needed. You might need atomic usage counters in each instance, for example, to know which instances are safe to delete at any moment.

If you are doing other mutating operations, you will have to make your own design decisions. These are the main operations, and how I feel they can be used safely.

You can cut it tighter than this with deeper understanding, but these are good performance vs complexity compromises generally.

This applies to most of the containers, except vector and string (and also hashmap, but for different reasons). For vector and string, even single-threaded, insert operations will invalidate existing iterators completely if the storage is moved, and insert and delete operations invalidate any iterators at a higher index. With hashmap the issue is that, following a rebucket operation, pretty much everything you knew before is invalid.

Use of map rather than hashmap in this case can be easier to lock efficiently, but you then have to decide whether the general log(n) performance hit is worth the general low-locking benefit. Alternatively, arrange that your hashmap only contains pointers, so that your code can safely keep pointers to the actual objects rather than iterators or pointers to the contained objects (which are not safe in hashmaps), and blindly lock all indexing calls.
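A sketch of that pointer-holding variant using std::shared_ptr (hypothetical names); this also gives you the per-instance usage counting mentioned above, because erasing the map entry only drops the map's reference, and the chunk is destroyed when the last thread releases it.

```cpp
// The map stores shared_ptr<Chunk>; only the map operations are locked, and
// each thread keeps its own shared_ptr to the chunks it is working on.
#include <memory>
#include <mutex>
#include <unordered_map>

struct Chunk { /* block data ... */ };

std::unordered_map<long long, std::shared_ptr<Chunk>> chunk_map;
std::mutex chunk_map_mutex;

std::shared_ptr<Chunk> get_chunk(long long key) {
    std::lock_guard<std::mutex> lock(chunk_map_mutex);
    auto it = chunk_map.find(key);
    return it == chunk_map.end() ? nullptr : it->second;   // copying bumps the refcount
}

void unload_chunk(long long key) {
    std::lock_guard<std::mutex> lock(chunk_map_mutex);
    chunk_map.erase(key);   // safe even if another thread still holds the chunk
}
```

Note that the data inside each Chunk still needs its own synchronization if several threads mutate it concurrently.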

6 Comments

@FrançoisAndrieux - which is why you need the locking. But strategic locking can produce much better performance than blind locking all reads and writes, and the actual threading weaknesses of the stl map and list are properly documented. Use of map vs hashmap in this case can be easier to efficiently lock, and you then have to decide whether the log(n) general performance hit is worth the general low-locking benefit. Or, your hashmap only contains pointers, so that your code can safely keep pointers to the actual objects.
@FrançoisAndrieux Hmm, I'm curious that you feel the iterator invalidation rules don't apply to multithreading? Are you saying that if the rules say your iterator is good when a particular write op is done, it could potentially not be good if the same write operation is done in a different thread; or that the iterated-to object is unsafe during the write operation, but good before and after? Both readings seem somewhat paranoid?
@FrançoisAndrieux Is this not exactly what I say? Ok, I may have only covered certain operations, but I say that insert and delete operations must be synchronised with each other and with iterator math on a one-writer & many-reader basis?
I cover these operations because they are the ones I commonly do.
Ok, either I misunderstood what you wrote, or something changed in the edits (I didn't go back to read all of the revisions). I guess when reading "Deep reading is required for the containers to work out what write operations need to be mutexed, and what read operations should be safe even during a write operation." my first interpretation was that "read" and "write" operations relate to the container and not the elements. With that interpretation, the statement is wrong because the rules are simple. Perhaps you just need to clarify that part to make it clear you mean read/write to elements.
