Best Practices

Moving tasks offline requires some shifts in the way you develop your code. There are also some useful tricks for integrating Alligator.

If you have suggestions for other best practices, please submit a pull request at https://github.com/toastdriven/alligator/pulls!

Configure One Gator

This is alluded to in the Alligator Tutorial, but unless you have advanced needs, you’re probably best off configuring a single Gator instance in your code. Then you can import that instance wherever you need it.

Generally speaking, you’ll want to create a new file for just this, though if you have a utils.py or other common file, you can add it there. For example:

# Create a new file, like ``myapp/gator.py``
from alligator import Gator

gator = Gator('redis://localhost:6379/0')

Then your code elsewhere imports it:

# ``myapp/views.py``
from myapp.gator import gator

# ...Later...
def previously_slow_view(request):
    gator.task(expensive_cache_rebuild, user_id=request.user.pk)

This helps DRY up your code. It also helps you avoid having to change many files if you change backends or configuration.

Use Environment Variables or Settings for the Gator DSN

Instead of hard-coding the DSN for each Gator instance, rely on a configuration setting.

If you’re using plain old Python or subscribe to the Twelve-Factor App methodology, you might lean on environment variables set in the shell. For instance, the Alligator test suite does:

import os

from alligator import Gator


# Lean on the ENV variable.
gator = Gator(os.environ['ALLIGATOR_CONN'])

Then when running your app, you could do the following in development, for ease of setup:

$ export ALLIGATOR_CONN=locmem://
$ python myapp.py

But the following in production, for handling large loads:

$ export ALLIGATOR_CONN=redis://some.dns.name.com:6379/0
$ python myapp.py

If you’re using something like Django, you could lean on settings instead, like:

from alligator import Gator

from django.conf import settings


# Lean on the settings variable.
gator = Gator(settings.ALLIGATOR_CONN)

And have differing settings files for development vs. production.
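If you would rather keep one settings file than several, a minimal sketch is to read the DSN from the environment and fall back to a development default. The helper name alligator_conn and the choice of locmem:// as the default are assumptions made for illustration, not something Alligator provides:

```python
# Hypothetical ``myapp/settings.py`` fragment.
import os


def alligator_conn(environ=os.environ, default='locmem://'):
    """Return the Gator DSN, falling back to an in-memory backend."""
    return environ.get('ALLIGATOR_CONN', default)


# Read by ``Gator(settings.ALLIGATOR_CONN)`` elsewhere in the app.
ALLIGATOR_CONN = alligator_conn()
```

With this in place, production only needs to export ALLIGATOR_CONN, while development works with no setup at all.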

Use Environment Variables or Settings for Task.async

If you’re just using gator.task & trying to write tests, you may have a hard time verifying behavior in an integration test (though you should be able to just unit test the task function).

On the other hand, if you use the gator.options context manager & supply an async=False execution option, integration tests become easy, at the expense of possibly committing that by accident & causing issues in production.

The best approach is to use the gator.options context manager, but use an environment variable/setting to control if things run asynchronously.

import os

# Using the above tip of a single import...
from myapp.gator import gator


def some_view(request):
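    # Environment variables are always strings, and the string 'False' is
    # truthy, so convert the value to a real boolean first.
    run_async = os.environ['ALLIGATOR_ASYNC'].lower() != 'false'
    # NOTE: ``async`` is a reserved word on Python 3.7+, where this line is
    # a syntax error; check your Alligator version's docs for the option name.
    with gator.options(async=run_async) as opts: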
        opts.task(expensive_thing)

This allows you to set export ALLIGATOR_ASYNC=False in development/testing (so the task runs right away in-process) while tasks still queue appropriately in production.
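Because environment variables always arrive as strings, a value such as 'False' is truthy unless it is converted first. One way to handle that is a small parsing helper. The sketch below is an assumption for illustration: env_flag and its list of "false" spellings are hypothetical & not part of Alligator.

```python
import os

# Spellings treated as "off"; anything else counts as "on".
_FALSE_VALUES = {'', '0', 'false', 'no', 'off'}


def env_flag(name, default=True, environ=os.environ):
    """Interpret the environment variable ``name`` as a boolean."""
    raw = environ.get(name)

    if raw is None:
        # Variable not set at all: use the caller's default.
        return default

    return raw.strip().lower() not in _FALSE_VALUES
```

The result of env_flag('ALLIGATOR_ASYNC') can then be passed as the asynchronous execution option to gator.options. Defaulting to True keeps production queueing tasks even if the variable is never set.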

Simple Task Parameters

When creating task functions, keep the arguments passed to them simple & remove as many assumptions as possible.

You may be tempted to try to save queries by passing full objects or large lists of things as a parameter.

However, you must remember that the task may run at a very different time (perhaps hours in the future if you’re overloaded) or on a completely different machine than the one scheduling the task. Data goes stale easily & few things are as frustrating to debug as stale data being re-written over the top of new data.

Where possible, do the following things:

  • Pass primary keys or identifiers instead of rich objects
  • Persist large collections in the database or elsewhere, then pass a lookup identifier to the task
  • Use simple data types, as they serialize well & result in smaller queue payloads, meaning faster scheduling & consuming of tasks
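To make the stale-data problem concrete, here is a small self-contained sketch. It uses a plain dictionary as a stand-in for a database, and the task functions are called directly rather than through a queue. All names here (USERS, bump_follows_with_object, bump_follows_with_pk) are hypothetical.

```python
import copy

# Hypothetical stand-in for a database table, keyed by primary key.
USERS = {1: {'email': 'old@example.com', 'follow_count': 0}}


def bump_follows_with_object(user_id, user):
    """Anti-pattern: saves the snapshot captured at scheduling time."""
    user['follow_count'] += 1
    USERS[user_id] = user


def bump_follows_with_pk(user_id):
    """Preferred: look up fresh data when the task actually runs."""
    user = USERS[user_id]
    user['follow_count'] += 1


# Snapshot taken at the moment the task is scheduled.
snapshot = copy.deepcopy(USERS[1])

# The record changes while the task waits in the queue.
USERS[1]['email'] = 'new@example.com'

# Passing only the primary key keeps the newer email intact.
bump_follows_with_pk(1)
fresh_email = USERS[1]['email']

# Passing the rich object overwrites the newer email with stale data.
bump_follows_with_object(1, snapshot)
stale_email = USERS[1]['email']
```

The primary-key version also serializes to a single integer, whereas the object version must carry the entire record through the queue.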

Re-use the Gator.options Context Manager

All the examples in the Alligator docs show creating a single task within a gator.options(...) context manager. So you might be tempted to write code like:

with gator.options(retries=3) as opts:
    opts.task(send_mass_mail, list_id)

with gator.options(retries=3) as opts:
    opts.task(update_follow_counts, request.user.pk)

However, you can reuse that context manager to provide the same execution options to all tasks within the block. So we can clean up & shorten our code to:

with gator.options(retries=3) as opts:
    opts.task(send_mass_mail, list_id)
    opts.task(update_follow_counts, request.user.pk)

Two unique tasks will still be created, but both will run with retries=3 to better ensure they succeed.