mv iterator hot threads
- merged to develop -
- code complete -
- development started - June 16, 2016
This fix is independent of other recent iterator hang corrections. It deals with iterator-specific code within Basho's hot threads feature. The iterator-specific code was only partially ported when the eleveldb hot threads code and leveldb hot threads code merged.
eleveldb's iterator objects, MoveItems, are reusable. A MoveItem tells the hot threads logic that it wants to be reused via the resubmit() property. When resubmit() returns true, hot threads executes the same task again immediately. This is how eleveldb's iterators implement prefetch iterations (reading the next iterator key/value in the background while Erlang processes the current key/value).
Prior to merging eleveldb's hot threads with leveldb's hot threads, only eleveldb's code supported the resubmit() property. The support required an extra five lines of code within the thread loop routine. Unfortunately, leveldb had two thread loop routines. Only one of the two received the extra five lines during the merge. This branch adds the five lines supporting the resubmit() property to leveldb's second thread loop.
Five code lines from HotThread::ThreadRoutine() now also exist in QueueThread::QueueThreadRoutine(). The block of code begins with "if (submission->resubmit())".
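The effect of the missing resubmit() check can be illustrated with a small single-threaded sketch. It is not the leveldb code: WorkTask, PrefetchTask, DrainQueue and the honor_resubmit flag are hypothetical names invented for this illustration, and a plain deque stands in for the hot threads queue. Only the "if (submission->resubmit())" test mirrors the block named above.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-in for a hot threads work item (not the leveldb class).
struct WorkTask {
    virtual ~WorkTask() = default;
    virtual void operator()() = 0;                    // perform one unit of work
    virtual bool resubmit() const { return false; }   // true: run this task again
};

// Hypothetical reusable task: wants to run once per pending iterator move.
struct PrefetchTask : WorkTask {
    int pending;
    int runs = 0;
    explicit PrefetchTask(int moves) : pending(moves) {}
    void operator()() override { ++runs; --pending; }
    bool resubmit() const override { return pending > 0; }
};

// Simplified thread loop. honor_resubmit = false models a loop that
// never received the resubmit() check.
inline void DrainQueue(std::deque<WorkTask*>& queue, bool honor_resubmit) {
    while (!queue.empty()) {
        WorkTask* submission = queue.front();
        queue.pop_front();
        (*submission)();  // execute the job
        if (honor_resubmit && submission->resubmit()) {
            queue.push_front(submission);  // same task runs again immediately
        }
    }
}
```

In this sketch, a PrefetchTask with three pending moves runs three times when the loop honors resubmit() and only once when it does not. The second case is the toy equivalent of a task that asked to be resubmitted and was silently dropped.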
The long-term problem is that these two thread routines exist in parallel to address a race condition. Proper defensive code would eliminate the need for such parallel routines. Item #2 in leveldb's GitHub issue #181 addresses replacement logic that will possibly remove the need for QueueThreadRoutine().
MoveItem's resubmit() only happens if the Erlang thread asks for the next record before the prefetch has it. The code is never more than one fetch ahead.
In contrast, when the eleveldb thread is faster, it has the record fetched before Erlang asks for it and the MoveTask object "pauses". The Erlang thread in eleveldb.cc takes the MoveTask's data and starts the next prefetch via normal means, not resubmit().
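The two outcomes of this race can be sketched with a single atomic flag and a compare and swap, the operation the reproduction notes mention in MoveItem::DoWork(). This is an illustrative model only: Handoff, ArriveFirst, WorkerFinished, ErlangAsks and both enums are hypothetical names, and the real eleveldb state handling may differ.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical handoff flag shared by the Erlang-facing thread and the
// prefetch worker. 0: neither side has arrived, 1: one side has arrived.
struct Handoff {
    std::atomic<int> flag{0};

    // Returns true only for the first caller; the compare and swap fails
    // for whichever side arrives second.
    bool ArriveFirst() {
        int expected = 0;
        return flag.compare_exchange_strong(expected, 1);
    }
};

enum class WorkerAction { Pause, SendMessageAndResubmit };
enum class ErlangAction { WaitForMessage, TakeDataAndStartPrefetch };

// Worker side: called when the iterator move has completed.
inline WorkerAction WorkerFinished(Handoff& h) {
    // Arriving first means Erlang has not asked yet, so the task pauses.
    return h.ArriveFirst() ? WorkerAction::Pause
                           : WorkerAction::SendMessageAndResubmit;
}

// Erlang side: called when Erlang asks for the next record.
inline ErlangAction ErlangAsks(Handoff& h) {
    // Arriving first means the prefetch is not done, so Erlang waits for
    // a message from the worker.
    return h.ArriveFirst() ? ErlangAction::WaitForMessage
                           : ErlangAction::TakeDataAndStartPrefetch;
}
```

If WorkerFinished() runs before ErlangAsks(), the worker pauses and Erlang takes the data and starts the next prefetch itself. If the order is reversed, the worker is the one that sends the record and takes the resubmit() path, which is the only path that depends on the thread loop supporting resubmit().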
For the bug to hit:
A. the iterator move operation in workitems.cc has to complete after Erlang returns to ask for that record,
B. and there have to be enough work tasks that all hot threads are busy,
C. and the semaphore thread has to wake first to get the task.
High disk activity increases the likelihood of condition A. Linux "nice" reduces the chance that background compaction causes condition A.
Bug reproduction required lowering the worker threads in eleveldb from 71 to 1, then placing a 30 millisecond pause in MoveItem::DoWork() before the compare and swap. This caused the Erlang thread to always "win" the race. The eleveldb thread would then send the response via an Erlang message and call resubmit().