
I have an incredibly weird NullReferenceException being thrown when reading a value from a public field on an object which I know exists.

Edit: I realized I forgot to mention something important. This does not happen every time I try to read the Tag value, only sometimes: often enough that I can reproduce it on every run of the code, but not immediately after the code starts.

The basic flow is this:

  • Server receives a message (Worker thread)
  • The connection that sent the message gets set as the Tag field on the message object (Worker thread)
  • The message gets put in a "ReceivedMessages" queue (a normal Queue object which is guarded by locks for serialized access) (Worker thread)
  • The message gets read (Main Thread)
  • I try to read the Tag field of the message to get the connection. This sometimes returns null and throws an exception, but when the exception is thrown and I inspect the Message object, I can see the Connection object (the object stored in the Tag field) there clear as day (Main Thread)

If you look at this picture, you will see it clear as day:

[Screenshot: Weird thread behavior]

In the part marked with the green box, I try to read the message.Tag field in three different ways. They all return null, as you can see in the part marked with the blue box.

However, if you look at the two areas marked in red, you can see that the object actually exists. And, just to clear up any confusion, the part where the message gets put on the received messages queue looks like this:

As you can see, I even tried doing a Thread.VolatileWrite to make sure the value gets written:

message.Tag = buffer.Tag;

Thread.VolatileWrite(ref message.Tag, buffer.Tag);

if (message.Tag == null)
{
    isNullLog.Add(message.Id);
}

// Queue into received messages
lock (peer.ReceivedMessages)
{
    peer.ReceivedMessages.Enqueue(message);
}

The snippet above all happens in the worker thread. As you can see, I copy buffer.Tag over to message.Tag. I even set up a small runtime check for debugging, which checks message.Tag for null and, if it is null, adds the message's ID to a list called "isNullLog". When the NullReferenceException gets thrown in the main thread, this list is empty.

You can also see that I lock the peer.ReceivedMessages queue and push the message onto it only after I have set the message.Tag field.

Also, to be even more clear here is the function that is used to read a message out from the peer.ReceivedMessages queue:

public bool TryGetMessage(out TIncomingMessage message)
{
    lock (ReceivedMessages)
    {
        if (ReceivedMessages.Count > 0)
        {
            message = ReceivedMessages.Dequeue();
            return true;
        }
    }

    ReceivedMessageEvent.Reset();

    message = null;
    return false;
}

You can see that I lock the queue even before I check the count. If it is not empty, I set the out parameter and return true; otherwise I return false.

Honestly, I am completely stumped. I have written several multi-threaded applications before and have never encountered this.

A bit of an update: I have also tried marking the Tag field as volatile, so that it is declared as public volatile object Tag; but this does not seem to help.

  • I am not a scholar, but I think you are mixing locks with volatile reads and writes, and I think this is the problem. If you use volatile, use it all the way; if you use locks, use only locks. Commented Nov 27, 2011 at 10:47
  • Hm, not sure if I agree. I have locks where all the threads meet, and the volatile part doesn't seem to affect it at all (as in nothing changes if I remove it or add it). Commented Nov 27, 2011 at 10:53
  • Is there another thread which operates on the message.Tag field (e.g. setting it to null)? Commented Nov 27, 2011 at 10:54
  • You are right thr, I just noticed the first line: message.Tag = buffer.Tag; Commented Nov 27, 2011 at 10:56
  • Hans, there is, but it only operates on it when the message gets pushed back into the "used messages" queue, which doesn't happen until I call "TryRecycleMessage(out TIncomingMessage message)". And if the other thread was nulling it, why would it show up WITH a value in the inspector when the exception gets thrown? Commented Nov 27, 2011 at 10:57

1 Answer


I have actually fixed this now. As always when dealing with threads, you need to take great care when reading and writing values. I was forgetting to clear the local message variable in the receive loop, so I ended up "reusing" the same message in the next loop iteration. The loop has an if (message == null) { /* create new message */ } check before each iteration, and because I was not clearing the variable, the reading thread trampled all over the "old" message that was still stored there when it tried to write a new message into it!
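For anyone who wants to see the mistake in isolation, here is a minimal single-threaded sketch of it. Everything in it is an assumption made for illustration rather than the real code from the question: the Message class, the Receive method, and the clearAfterEnqueue flag are all hypothetical. It only shows the aliasing: when the local is not cleared after enqueueing, the next iteration writes into an object the queue already holds. In the real application that write raced with the main thread, which would presumably explain why the failure was intermittent and why the debugger could show a Tag value by the time the object was inspected.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the message type; only the Tag field comes from the question.
class Message
{
    public object Tag;
}

static class ReceiveLoopSketch
{
    // Runs a simplified receive loop over the given tags and returns the queue it filled.
    public static Queue<Message> Receive(object[] tags, bool clearAfterEnqueue)
    {
        var received = new Queue<Message>();
        Message message = null;

        foreach (object tag in tags)
        {
            // Only allocates when the local is null, like the check described above.
            if (message == null)
            {
                message = new Message();
            }

            message.Tag = tag;

            lock (received)
            {
                received.Enqueue(message);
            }

            // The fix: drop the local reference once the message has been handed off.
            if (clearAfterEnqueue)
            {
                message = null;
            }
        }

        return received;
    }

    static void Main()
    {
        var tags = new object[] { "connection A", "connection B" };

        Queue<Message> buggy = Receive(tags, clearAfterEnqueue: false);
        Message first = buggy.Dequeue();
        Message second = buggy.Dequeue();
        Console.WriteLine(ReferenceEquals(first, second)); // True: one instance enqueued twice
        Console.WriteLine(first.Tag);                       // connection B: the first Tag was overwritten

        Queue<Message> fixedQueue = Receive(tags, clearAfterEnqueue: true);
        Console.WriteLine(fixedQueue.Dequeue().Tag);        // connection A
    }
}
```

The lock around the queue protects the queue itself, not the objects stored in it, so no amount of locking or volatile on the Tag field helps once two threads share the same Message instance.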

1 Comment

You can have too many checks :) Don't bother checking or clearing message* object instance variables. Just get into the habit of creating a new instance at the start, and again as the very next step after queueing one off.
