When I call Write() on the Aerospike client, I get this error:
22 AS_PROTO_RESULT_FAIL_FORBIDDEN

The error occurs only when the write is called after a Truncate(), and only on specific keys. I tried to:

  • change the key type (string, long, small numbers, big numbers)
  • change the Key type passed (Value, long, string)
  • change the number of retries on the WritePolicy
  • add a delay (200ms, 500ms) before every write
  • generate completely new keys (Guid.NewGuid().ToString())

None of these solved the problem, so I think the only cause is the Truncate operation.

The error is systematic: for the same set of keys, the writes fail on exactly the same keys.

The error also occurs when I wait X seconds after calling Truncate, even though the Management Console shows the number of objects in the Set as "0".

I have to wait several minutes (1 to 5) before I can be sure that running the process no longer produces the error.

The cluster has 3 nodes with a replication factor of 2 and SSD persistence.

I'm using the C# Aerospike.Client NuGet package, v3.4.4.

Running the process on a single local node (docker, in memory) does not give any error.

How can I know when the Truncate() process (the delete operation behind it) has completely finished and I can safely use the Set?

[Solution]
As suggested, our devops checked the time synchronization. He found that NTP was not enabled on the machine images (by mistake).
He enabled it, we tested again, and there were no more errors.

Thanks,

Alex

    There shouldn't be any need to wait for the truncate process to finish. Once truncate is applied to a set or namespace, all applicable records are immediately treated as deleted (the counters will take some time to reflect this). Commented Sep 29, 2017 at 16:19

1 Answer

Sounds like a potential issue with time synchronization across nodes; make sure you have NTP set up correctly. That would be my only guess at this point, especially as you mention that it works on a single node. The truncate command captures the current time (if you don't specify a time) and uses it to prevent records with a timestamp 'prior' to that time from being written. Check (from the top of my head, sorry if it's not exactly this) /opt/aerospike/smd/truncate.smd on each node to see the timestamp of the truncate command, and check the time across the different nodes.

[Thanks @kporter for the comment. So the time would be the same in every truncate.smd file, but a time discrepancy between machines would still cause writes to fail against some of the nodes.]
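
To check the time across the nodes, one option is to sample each node's clock at roughly the same moment and compare the readings. Below is a minimal sketch in Python, assuming the readings have already been collected somehow (for example by running `date +%s` on each host). The function name `max_clock_skew`, the node names, and the sample values are all hypothetical, not part of any Aerospike tool.

```python
from typing import Dict, Tuple


def max_clock_skew(readings: Dict[str, float]) -> Tuple[float, str, str]:
    """Return (skew_seconds, slowest_node, fastest_node) for per-node clock readings.

    `readings` maps a node name to the clock value (in seconds) that node
    reported when all nodes were sampled at roughly the same moment.
    """
    if len(readings) < 2:
        raise ValueError("need readings from at least two nodes")
    slowest = min(readings, key=readings.get)
    fastest = max(readings, key=readings.get)
    return readings[fastest] - readings[slowest], slowest, fastest


# Hypothetical sample: node3's clock runs about two minutes ahead of the others.
sample = {"node1": 1506701940.0, "node2": 1506701940.2, "node3": 1506702065.0}
skew, slowest, fastest = max_clock_skew(sample)
print(f"max skew: {skew:.1f}s ({fastest} is ahead of {slowest})")
```

A skew of more than a second or so between nodes would be a reason to look at the NTP configuration before anything else.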

5 Comments

The timestamp would be the same on all nodes even if there is skew. There is a namespace-level current_time statistic that the user can poll with asadm: asadm -e 'show stat like current_time'
Also, if the nodes' clocks are not synchronized, I don't understand the error. By default (and I checked the value) the WritePolicy recordExistsAction is UPDATE, so there should be no problem if records are still there. The error description says: "Operation not allowed at this time. For writes, the set is in the middle of being deleted, or the set's stop-write is reached;" but I checked and the Set is EMPTY. The problem also occurs with only 100 records: 44 ids (integers from 1 to 100) fail the Write. It is as if the Set were locked for a certain amount of time, but ONLY for specific keys.
There are no locks on the set or on specific keys. But if the time is skewed between nodes and you issue the truncate on one node at, let's say, time t, this propagates to the other nodes. If that initial node's clock was ahead of the others, the other nodes would then prevent any records from being written until their clocks advance past the time that was set by the truncate command. In other words, the truncate command prevents records from being written 'in the past'.
OK, now it is clear. But it sounds strange. Are you saying that if I call Truncate specifying a future time T (which should usually be in the past), this will prevent the set from being written to? This sounds weird!
If a write with a timestamp before the truncate LUT (last-update time) were permitted, the record would be deleted later anyway, because its LUT is before the truncate LUT.
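
The behaviour described in these comments can be illustrated with a toy model. The Python sketch below is only a simulation of the explanation given here, not the server's actual implementation. The `Node` class, the millisecond offsets, and the `key % len(nodes)` placement (a stand-in for real partition ownership) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Node:
    name: str
    clock_offset_ms: int      # how far this node's clock is from true time
    truncate_lut_ms: int = 0  # truncate timestamp currently known to this node

    def now(self, true_time_ms: int) -> int:
        return true_time_ms + self.clock_offset_ms

    def accepts_write(self, true_time_ms: int) -> bool:
        # The write is stamped with the local clock. If that stamp is not
        # later than the truncate timestamp, the record would count as
        # truncated, so the model rejects it.
        return self.now(true_time_ms) > self.truncate_lut_ms


def truncate(nodes: List[Node], issuer: Node, true_time_ms: int) -> int:
    """Capture the issuing node's local time and share it with every node."""
    lut = issuer.now(true_time_ms)
    for node in nodes:
        node.truncate_lut_ms = lut
    return lut


def owner(nodes: List[Node], key: int) -> Node:
    # Stand-in for real partition ownership (assumption, not Aerospike's scheme).
    return nodes[key % len(nodes)]


def rejected_keys(nodes: List[Node], keys: Iterable[int], true_time_ms: int) -> List[int]:
    return [k for k in keys if not owner(nodes, k).accepts_write(true_time_ms)]


# Node A runs two minutes ahead; the truncate is issued through A.
nodes = [Node("A", 120_000), Node("B", 0), Node("C", 0)]
lut = truncate(nodes, issuer=nodes[0], true_time_ms=1_000_000)
keys = range(1, 101)
soon = rejected_keys(nodes, keys, true_time_ms=1_000_500)
later = rejected_keys(nodes, keys, true_time_ms=1_000_000 + 120_001)
print(len(soon), len(later))
```

In this toy setup, 67 of the 100 keys are rejected half a second after the truncate: exactly the keys owned by the two nodes whose clocks lag behind the issuing node, and the same keys on every run. Once the lagging clocks pass the truncate timestamp (about two minutes later here), no key is rejected. The exact split depends on the made-up key placement, so it is not expected to match the 44 failures reported above.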
