Split Task Context on python-rq Environment

Lijo Jose
2 min read · Jan 1, 2021

--

In an RQ-based environment, a new worker process is forked every time work is assigned. Each job therefore runs in an independent context.

RQ can also run without forking: SimpleWorker executes each job inside the worker's own process.
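A minimal sketch of starting a non-forking worker is shown here. It assumes a Redis server on localhost with default settings and a queue named "default"; the function name run_in_process is hypothetical.

```python
def run_in_process(queue_name="default"):
    """Drain one RQ queue without forking a child process per job."""
    # Imports live inside the function so that merely loading this
    # module does not require rq or redis to be installed.
    from redis import Redis
    from rq import Queue, SimpleWorker

    connection = Redis()  # assumption: Redis on localhost:6379
    queue = Queue(queue_name, connection=connection)

    # SimpleWorker runs each job in the worker's own process.
    worker = SimpleWorker([queue], connection=connection)

    # burst=True makes the worker exit once the queue is empty.
    worker.work(burst=True)
```

Because the jobs share the worker's process, module-level state survives between jobs, but a crashing job can also take the worker down with it.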

We often have several kinds of work running in the RQ environment. If we can maintain a context across runs, the work becomes simpler, and some interesting workflows become easy to build.

The article Handling multiple job dependencies in RQ describes one approach to sharing a context across runs: save the context in Redis, retrieve it whenever it is required, and save it back after modification. One disadvantage of this approach is that only one worker may modify the context at any given time, so we have to jump through hoops to keep the workflow correct.

Paradigm change

If we avoid modifying and re-saving the context, we can run multiple independent workers against the same context in parallel. Do we need to modify a context at all?

Can we avoid modifying the context?

The answer is ‘yes’. Instead of modifying the context, we can write each change as a new context entry, with a timestamp or counter field in the context key. The extra field identifies the chronological order of the context appends.

To get the latest context, we reconstruct it from the list of split contexts, applying them in chronological order.

If we design the context carefully, we can avoid reconstructing the latest context altogether:
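The append-and-reconstruct idea can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the key layout (ctx:<id>:part:<counter>), the function names, and the InMemoryStore class are all hypothetical. InMemoryStore only mimics the few client methods used here so the example runs without a Redis server.

```python
import json


class InMemoryStore:
    """Stand-in for a Redis client; mimics only incr/set/get/keys."""

    def __init__(self):
        self._data = {}

    def incr(self, key):
        self._data[key] = int(self._data.get(key, 0)) + 1
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def keys(self, pattern):
        # Supports only a trailing "*" wildcard, which is all we need here.
        prefix = pattern.rstrip("*")
        return [k for k in self._data if k.startswith(prefix)]


def append_context(store, context_id, fragment):
    """Write a new context fragment instead of modifying an existing one."""
    # The counter gives every fragment a unique, ordered position.
    seq = store.incr(f"ctx:{context_id}:seq")
    # Zero-padding keeps lexicographic key order equal to numeric order.
    key = f"ctx:{context_id}:part:{seq:010d}"
    store.set(key, json.dumps(fragment))
    return key


def latest_context(store, context_id):
    """Rebuild the latest context by applying fragments oldest-first."""
    merged = {}
    for key in sorted(store.keys(f"ctx:{context_id}:part:*")):
        merged.update(json.loads(store.get(key)))
    return merged


store = InMemoryStore()
append_context(store, "job42", {"stage": "train", "epochs": 5})
append_context(store, "job42", {"stage": "evaluate"})
print(latest_context(store, "job42"))
```

Later fragments override earlier ones key by key, so no worker ever rewrites data that another worker wrote. With a real Redis client, an incremental scan is usually preferable to listing keys by pattern on a large keyspace.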

  • Save the initial information as a base context.
  • When new work starts in the same context, identify the bare minimum of information to save as a context append. Leave out anything that can be derived from previous contexts.
  • Use a Redis-based lock to avoid conflicting execution, even with this split-context approach.
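As a small illustration of the "bare minimum" rule in the list above, the sketch below keeps only the entries that the already-known context does not hold. The function name minimal_append and the flat-dictionary shape of the context are assumptions made for this example.

```python
_MISSING = object()  # sentinel: distinguishes "absent" from a stored None


def minimal_append(known, update):
    """Return only the entries of `update` that differ from `known`."""
    return {
        key: value
        for key, value in update.items()
        if known.get(key, _MISSING) != value
    }


base = {"model": "resnet", "epochs": 5}
new_info = {"model": "resnet", "epochs": 10, "lr": 0.01}
print(minimal_append(base, new_info))
```

Only the changed and new entries are written as the append; unchanged values stay in the base context.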

If the context spans a long time and the number of split contexts grows, it is better to merge the contexts periodically. Use a lock to avoid conflicts while merging.

Redis-based lock

Redis executes commands one at a time on a single thread, which makes each individual Redis command atomic. We can acquire a lock with the incr operation on a well-defined key. If incr returns 1, the lock is acquired. If not, either wait and retry in a loop or report a failure to lock. Releasing the lock means deleting the key from Redis.
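The incr-based lock described above can be sketched like this. The names acquire_lock, release_lock, and InMemoryStore are hypothetical; InMemoryStore stands in for a Redis client so the example runs on its own, and any client with compatible incr and delete methods should work in its place.

```python
import time


class InMemoryStore:
    """Stand-in for a Redis client; mimics only incr/delete."""

    def __init__(self):
        self._data = {}

    def incr(self, key):
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]

    def delete(self, key):
        self._data.pop(key, None)


def acquire_lock(client, key, attempts=5, delay=0.1):
    """Try to take the lock; return True on success, False on failure."""
    for attempt in range(attempts):
        # incr is atomic, so only one caller can ever observe the value 1.
        if client.incr(key) == 1:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next try
    return False


def release_lock(client, key):
    """Release the lock by deleting its key."""
    client.delete(key)


store = InMemoryStore()
if acquire_lock(store, "lock:ctx:job42"):
    try:
        pass  # work on the context while holding the lock
    finally:
        release_lock(store, "lock:ctx:job42")
```

One caveat with this sketch: the key has no expiry, so a holder that crashes before releasing leaves the lock stuck. Setting the key with an expiry (for example SET with the NX and EX options) is a common way to guard against that.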

Finally…

We wrote an ML workflow using this split-context concept. We used actor-model-style jobs, each with its own context. A job's context is updated whenever the job receives a message.
