In an RQ-based environment, a new worker process is forked every time a job is assigned, so each job runs in an independent context.
Actually, we can run RQ without forking: the SimpleWorker class avoids the fork and executes jobs in the worker's own process.
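As a minimal sketch (the queue name my_queue is a placeholder, and this assumes a recent RQ release that exports SimpleWorker from the top-level rq package), the non-forking worker can be selected on the rq command line:

```shell
# Start a worker that runs jobs in-process instead of forking per job
rq worker --worker-class rq.SimpleWorker my_queue
```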
Often we will have several kinds of work running in the RQ environment. If we can maintain a context across runs, the work becomes simpler, and we can build some interesting workflows with ease.
In the article Handling multiple job dependencies in RQ, we came across an approach that uses a context across runs: the context is saved in Redis, retrieved whenever required, and saved back after modification. One disadvantage of this approach is that, at any given point in time, only one worker may modify the context, so we have to jump through hoops to keep the workflow correct.
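A rough sketch of that modify-and-save approach (a plain dict stands in for the Redis connection here, and the key name ctx:job42 is made up):

```python
import json

# The whole context lives under one key; every change rewrites it.
store = {}  # stand-in for Redis string storage

def load_context(key):
    raw = store.get(key)
    return json.loads(raw) if raw else {}

def save_context(key, ctx):
    store[key] = json.dumps(ctx)

# A worker must load, modify, and save the full context.  If two workers
# interleave these three steps, one worker's update is silently lost --
# hence the "only one modifier at a time" restriction.
ctx = load_context("ctx:job42")
ctx["stage"] = "train"
save_context("ctx:job42", ctx)
```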
Paradigm change
If we avoid the modify-and-save cycle on the context, we can run multiple mutually exclusive workers on the same context in parallel. Do we really need to modify the context? Can we avoid modifying it altogether?
The answer is 'yes'. Instead of modifying the existing context, we write a new one, with a timestamp or counter field in the context key. This extra field identifies the chronological order of the context appends.
To get the latest context, we construct it from the list of split contexts using this chronological order.
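One possible sketch of the append-and-reconstruct scheme, with plain Python dicts standing in for Redis and an invented ctx:job42 key pattern:

```python
# Each append is stored under "<base_key>:<counter>"; merging the entries
# in counter order yields the latest context.
store = {}  # stand-in for Redis key/value storage

def append_context(base_key, fields):
    # Next counter = number of existing appends (hypothetical numbering scheme)
    counter = sum(1 for k in store if k.startswith(base_key + ":"))
    store[f"{base_key}:{counter}"] = dict(fields)

def latest_context(base_key):
    merged = {}
    keys = sorted((k for k in store if k.startswith(base_key + ":")),
                  key=lambda k: int(k.rsplit(":", 1)[1]))
    for k in keys:           # later appends override earlier fields
        merged.update(store[k])
    return merged

append_context("ctx:job42", {"stage": "start", "input": "a.csv"})
append_context("ctx:job42", {"stage": "train"})
print(latest_context("ctx:job42"))  # {'stage': 'train', 'input': 'a.csv'}
```

Because appends never overwrite one another, concurrent workers can add entries without the read-modify-write race.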
If we design the context more efficiently, we can avoid constructing the latest context at all:
- Save the initial information as a base context
- When new work starts in the same context, identify the bare minimum information to save as a context append; avoid any information that can be derived from the previous contexts.
- Using a lock (implemented with Redis), we can avoid conflicting executions even with this split-context approach.
If the context spans a long time and the number of split contexts grows, it is better to combine/merge them periodically. Use a lock to avoid conflicts while merging.
Redis-based lock
Redis runs commands in a single-threaded environment, which makes each Redis operation atomic. We can acquire a lock by running the incr operation on a well-defined key: if incr returns 1, the lock is acquired; otherwise, add a wait loop or report failure to acquire the lock. Releasing the lock is simply deleting the key from Redis.
Finally…
We wrote an ML workflow using this split-context concept. We used actor-model style jobs, where each job has its own context, and a job's context is updated when it receives a message.