When you code a dynamic application, you will soon face its trade-off: it is dynamic.
Each time a user makes a request, your server performs all sorts of calculations (database queries, template rendering and so on) to create the final response. For most web applications, this is not a big deal, but when your application grows and gets a lot of traffic you will want to limit the overhead on your machines.
That's where caching comes in.
The main idea behind caching is simple: we store the result of an expensive calculation somewhere to avoid repeating the calculation if we can. But, honestly, designing a good caching scheme is mostly a PITA, since it involves many complex decisions about what you should store, where to store it, and so on.
So how can weppy help you with this? It provides some tools out of the box that let you focus your development energy on what to cache and not on how you should do that.
The caching system in weppy consists of a single class named Cache. Consequently, the first step in configuring cache in your application is to create an instance of this class:
from weppy.cache import Cache
cache = Cache()
By default, weppy stores your cached content into the RAM of your machine, but you can also use the disk or redis as your storage system. Let's see these three handlers in detail.
As we just saw, the RAM cache is the default cache mechanism of weppy. Initializing a Cache instance without arguments is the same as using the RamCache handler:
from weppy.cache import Cache, RamCache
cache = Cache(ram=RamCache())
The RamCache also accepts some parameters you might take advantage of:
parameter | default value | description |
---|---|---|
prefix | | allows you to specify a common prefix for caching keys |
threshold | 500 | set a maximum number of objects stored in the cache |
default_expire | 300 | set a default expiration (in seconds) for stored objects |
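To make the threshold and default_expire parameters more concrete, here is a minimal, self-contained sketch of an in-memory cache that honors both. It is not weppy's RamCache: the TinyRamCache name, the injectable clock argument, and the eviction policy (drop expired entries first, then the oldest ones) are all assumptions made for illustration.

```python
import time


class TinyRamCache:
    """Illustrative in-memory cache; NOT weppy's RamCache implementation."""

    def __init__(self, prefix='', threshold=500, default_expire=300,
                 clock=time.time):
        self.prefix = prefix
        self.threshold = threshold
        self.default_expire = default_expire
        self._clock = clock  # injectable clock (assumption, eases testing)
        self._data = {}      # full key -> (expires_at, value)

    def _prune(self):
        now = self._clock()
        # First drop everything that has already expired.
        for key in [k for k, (exp, _) in self._data.items() if exp <= now]:
            del self._data[key]
        # Still full: drop the oldest entries (dicts keep insertion order).
        while self._data and len(self._data) >= self.threshold:
            del self._data[next(iter(self._data))]

    def set(self, key, value, duration=None):
        if duration is None:
            duration = self.default_expire
        full_key = self.prefix + key
        self._data.pop(full_key, None)  # re-setting a key refreshes its age
        self._prune()
        self._data[full_key] = (self._clock() + duration, value)

    def get(self, key):
        full_key = self.prefix + key
        item = self._data.get(full_key)
        if item is None:
            return None
        expires_at, value = item
        if expires_at <= self._clock():
            del self._data[full_key]
            return None
        return value
```

With threshold=2, storing a third object pushes the oldest one out, and any object older than its expiration is treated as missing.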
Note on multi-processing: When you store data in the RAM cache, you are actually using the Python process's memory. If you're running your web application using multiple processes/workers, every process will have its own cache and the data you store won't be available to the other ones.
If you need to have a shared cache between processes, you should use the disk or redis ones.
The disk cache is actually slower than the RAM or the redis ones, but if you need to cache large amounts of data, it fits the role perfectly. Here is how to use it:
from weppy.cache import Cache, DiskCache
cache = Cache(disk=DiskCache())
The DiskCache class accepts some parameters too:
parameter | default value | description |
---|---|---|
cache_dir | 'cache' | allows you to specify the directory in which data will be stored |
threshold | 500 | set a maximum number of objects stored in the cache |
default_expire | 300 | set a default expiration (in seconds) for stored objects |
Redis is quite a good system for caching: it is really fast (really) and, if you're running your application with several workers, your data will be shared between your processes. To use it, you just initialize the Cache class with the RedisCache handler:
from weppy.cache import Cache, RedisCache
cache = Cache(redis=RedisCache(host='localhost', port=6379))
As we saw with the other handlers, the RedisCache class accepts some parameters too:
parameter | default value | description |
---|---|---|
host | 'localhost' | the host of the redis backend |
port | 6379 | the port of the redis backend |
db | 0 | the database number to use on the redis backend |
prefix | 'cache:' | allows you to specify a common prefix for caching keys |
default_expire | 300 | set a default expiration (in seconds) for stored objects |
As you probably guessed, you can use multiple caching systems together. Let's say you want to use the three systems we just described. You can do it simply:
from weppy.cache import Cache, RamCache, DiskCache, RedisCache
cache = Cache(
    ram=RamCache(),
    disk=DiskCache(),
    redis=RedisCache()
)
You can also tell weppy which handler should be used when none is specified, thanks to the default parameter:
cache = Cache(m=RamCache(), r=RedisCache(), default='r')
The quickest way to use the cache is to apply it to a simple action, such as a select on the database or a computation. Let's say, for example, that you have a blog and a certain function that exposes the last ten posts:
@app.route("/last")
def last():
    rows = Post.all().select(orderby=~Post.date, limitby=(0, 10))
    return dict(posts=rows)
Now, since the performance bottleneck here is the call to the database, you can limit the overhead by caching the select result for 30 seconds, so you decrease the number of calls to your database:
@app.route("/last")
def last():
    def _get():
        return Post.all().select(orderby=~Post.date, limitby=(0, 10))
    return dict(posts=cache('last_posts', _get, 30))
Here's how it works: you encapsulate the action you want to cache into a function, and then call your cache instance with a key, the function, and the amount of time in seconds you want to store the result of your function. weppy will take care of the rest.
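If you are curious about what "the rest" amounts to, the pattern can be sketched in a few lines of plain Python. This is only an illustration of the compute-on-miss idea, not weppy's actual code: SketchCache and its clock argument are hypothetical names.

```python
import time


class SketchCache:
    """Illustrative compute-on-miss cache; NOT weppy's Cache class."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable clock (assumption, eases testing)
        self._store = {}     # key -> (expires_at, value)

    def __call__(self, key, function, duration):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            # Still valid: return the stored value without calling function.
            return entry[1]
        # Missing or expired: run the expensive function and remember it.
        value = function()
        self._store[key] = (now + duration, value)
        return value
```

Calling such an object twice within the duration runs the function only once; once the duration has elapsed, the next call computes the value again.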
– OK, dude. What if I have multiple handlers? Where does weppy store the result?
– You can choose that.
As we saw before, by default weppy stores your cached content into the handler chosen as default. But you can choose on which handler you want to store data:
cache = Cache(
    ram=RamCache(),
    disk=DiskCache(),
    redis=RedisCache(),
    default='ram'
)
v_ram = cache('my_key', my_f, my_time)
v_ram = cache.ram('my_key', my_f, my_time)
v_disk = cache.disk('my_key', my_f, my_time)
v_redis = cache.redis('my_key', my_f, my_time)
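The dispatching shown above (calling the instance for the default handler, or an attribute for a specific one) can be sketched as follows. Both classes are hypothetical stand-ins written for this example, and falling back to the first handler when no default is given is an assumption rather than documented weppy behavior.

```python
class DictHandler:
    """Hypothetical stand-in for a handler such as RamCache or DiskCache."""

    def __init__(self):
        self.store = {}

    def __call__(self, key, function, duration=None):
        # Expiration is ignored here to keep the sketch short.
        if key not in self.store:
            self.store[key] = function()
        return self.store[key]


class MultiCache:
    """Illustrative dispatcher; NOT weppy's Cache class."""

    def __init__(self, default=None, **handlers):
        if not handlers:
            raise ValueError('at least one handler is required')
        self._handlers = handlers
        # Assumption: use the first handler when no default is named.
        self._default = default if default is not None else next(iter(handlers))

    def __getattr__(self, name):
        # Only reached when normal lookup fails: expose handlers by name.
        try:
            return self.__dict__['_handlers'][name]
        except KeyError:
            raise AttributeError(name)

    def __call__(self, key, function, duration=None):
        return self._handlers[self._default](key, function, duration)
```

Note that, in this sketch, each handler keeps its own storage, so the same key can hold different values on different handlers.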
New in version 1.2
weppy's cache can also be used as a decorator. For example, we can rewrite the above example as follows:
@cache(duration=30)
def last_posts():
    return Post.all().select(orderby=~Post.date, limitby=(0, 10))

@app.route("/last")
def last():
    return dict(posts=last_posts())
and the result would be the same. The notation, in the case you want to specify the handler to use, is the same:
# use redis handler
@cache.redis()
# use ram handler
@cache.ram()
When using the decorator notation, weppy will use the arguments you pass to the decorated method to build different results. This means that if we decorate a method that accepts arguments like:
@cache()
def cached_method(a, b, c='foo', d='bar'):
    # some code
    pass
then weppy will cache different contents in case you call cached_method(1, 2, c='a') and cached_method(1, 3, c='b').
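One way to picture this behavior is a decorator that derives the cache key from the function name and the call arguments. The sketch below is an assumption about the general technique, not weppy's implementation: the cached helper and the plain dict used as storage are hypothetical, and arguments must be hashable for it to work.

```python
import functools


def cached(store):
    """Illustrative argument-aware cache decorator; NOT weppy's decorator."""
    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            # Sort keyword arguments so their order doesn't change the key.
            key = (function.__qualname__, args, tuple(sorted(kwargs.items())))
            if key not in store:
                store[key] = function(*args, **kwargs)
            return store[key]
        return wrapper
    return decorator


store = {}
calls = []


@cached(store)
def cached_method(a, b, c='foo', d='bar'):
    calls.append((a, b, c, d))  # track real executions
    return a + b


cached_method(1, 2, c='a')  # computed and stored
cached_method(1, 3, c='b')  # different arguments: computed and stored
cached_method(1, 2, c='a')  # same arguments: served from the store
```

A limitation of this simple keying is that cached_method(1, 2, 'a') and cached_method(1, 2, c='a') end up under different keys even though they are the same call.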
New in version 1.2
Sometimes you need to cache an entire response from your application. weppy provides the Cache.response decorator for that. Let's rewrite the example we used above: this time, instead of caching just the database selection, we will cache the entire page that weppy will produce from our route:
@app.route("/last")
@cache.response()
def last():
    posts = Post.all().select(orderby=~Post.date, limitby=(0, 10))
    return dict(posts=posts)
The main difference from the above examples is that, when cached content is available, everything that would happen inside your route and template code won't be executed; instead, weppy will return the final response body and its headers from the ones available in the cache.
Note: this also means that nothing contained in the pipe, on_pipe_success and on_pipe_failure methods of the pipes in your route pipeline will be executed. If you need code to run on cached routes, you should use the open and close methods of the pipes.
Mind that weppy will only cache contents on GET and HEAD requests that return a 200 response code. This is intended to avoid unwanted caching behavior in your application.
The Cache.response method also accepts some parameters you might want to use:
parameter | default value | description |
---|---|---|
duration | 'default' | the duration (in seconds) the cached content should be considered valid |
query_params | True | tells weppy to consider the request's query parameters to generate different cached contents |
language | True | tells weppy to consider the client's language to generate different cached contents |
hostname | False | tells weppy to consider the request's hostname to generate different cached contents |
headers | [] | an additional list of headers weppy should use to generate different cached contents |
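To see how these options can lead to different cached contents, consider the following sketch, which builds a cache key from the parts of a request and applies the GET/HEAD plus 200 rule described above. It is a hypothetical illustration: is_cacheable, response_cache_key and the key layout are assumptions, and weppy's internal key format may well differ.

```python
import hashlib

CACHEABLE_METHODS = {'GET', 'HEAD'}


def is_cacheable(method, status_code):
    """Only GET/HEAD requests answered with a 200 are cached."""
    return method.upper() in CACHEABLE_METHODS and status_code == 200


def response_cache_key(path, query_params=None, language=None,
                       hostname=None, headers=None):
    """Build a key from the request parts that should vary the cache.

    Pass None for a component to leave it out of the key.
    """
    parts = ['path=' + path]
    if query_params is not None:
        # Sort so that parameter order doesn't produce different keys.
        pairs = sorted(query_params.items())
        parts.append('query=' + '&'.join(f'{k}={v}' for k, v in pairs))
    if language is not None:
        parts.append('lang=' + language)
    if hostname is not None:
        parts.append('host=' + hostname)
    if headers is not None:
        # Header names are case-insensitive, so normalize them.
        pairs = sorted((k.lower(), v) for k, v in headers.items())
        parts.append('headers=' + '&'.join(f'{k}={v}' for k, v in pairs))
    return hashlib.sha1('|'.join(parts).encode('utf-8')).hexdigest()
```

Two requests to /last with different page query parameters get different keys here, while leaving a component out (passing None) makes the cache ignore it entirely.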
In some cases, you might need to cache all the routes contained in an application module. In order to achieve this, you can use the cache parameter when you define your module:
mod = app.module(__name__, 'mymodule', cache=cache.response())
Changed in version 1.2
As we saw in the sections above, the common usage of cache is to call the Cache instance with a callable object that will produce the cached contents in case they are not available in the cache.
Whenever you need to perform operations on the cache differently, you can use the exposed methods of the Cache instance and its handlers. Let's see them in detail.
Every time you need to access contents from cache, you can use the get method:
value = cache.get('key')
If no contents are available, this method will return None.
When you need to manually set contents in cache, you can use the set method:
cache.set('key', 'value', duration=300)
Note: if you want to store the result of a callable object, you should invoke it yourself.
You can implement a manual check-and-set policy using the get and set methods:
value = cache.get('key')
# get returns None on a miss; testing "is None" keeps falsy cached values
if value is None:
    value = 'somevalue'
    cache.set('key', value, duration=300)
The last example can be written in a compact way using the get_or_set method:
value = cache.get_or_set('key', 'somevalue', duration=300)
Note: as we saw for the set method, if you want to store the result of a callable object, you should invoke it yourself.
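Invoking the callable yourself has a consequence worth spelling out: Python evaluates the argument before get_or_set even runs, so the expensive work happens on every call, including cache hits. The sketch below shows this with a hypothetical dict-backed SketchStore that only mimics the get, set and get_or_set methods described here; it is not weppy's implementation.

```python
class SketchStore:
    """Hypothetical stand-in exposing get/set/get_or_set; NOT weppy's Cache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, duration=None):
        # Expiration is ignored here to keep the sketch short.
        self._data[key] = value

    def get_or_set(self, key, value, duration=None):
        cached = self.get(key)
        if cached is None:
            self.set(key, value, duration)
            return value
        return cached


calls = []


def expensive():
    calls.append(1)  # track how many times the work is really done
    return 'somevalue'


# Eager: expensive() runs before get_or_set is entered, on every call.
store = SketchStore()
store.get_or_set('key', expensive(), duration=300)
store.get_or_set('key', expensive(), duration=300)
eager_calls = len(calls)

# Manual check-and-set: expensive() runs only on a miss.
calls.clear()
lazy_store = SketchStore()
for _ in range(2):
    value = lazy_store.get('key')
    if value is None:
        value = expensive()
        lazy_store.set('key', value, duration=300)
lazy_calls = len(calls)
```

So when producing the value is costly, the manual check-and-set form (or calling the cache instance with the callable, as in the earlier sections) avoids the repeated work.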
Whenever you need to manually delete contents from cache, you can use the clear method:
cache.clear('key')
And if you need to clear the entire cache you can invoke the clear method without arguments.
Note: on redis, a key containing * will mean clearing all the existing keys with that pattern. So calling cache.clear('user*') will delete all the contents for keys starting with user.
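The three flavors of clearing (one key, a pattern, everything) can be illustrated on a plain dictionary. This is only a sketch: the clear function below is hypothetical, and it uses Python's fnmatch module, whose patterns behave like redis ones for * but are not guaranteed to match redis in every other detail.

```python
from fnmatch import fnmatchcase


def clear(store, key=None):
    """Illustrative clearing on a dict; NOT weppy's clear method."""
    if key is None:
        # No key: wipe everything.
        store.clear()
    elif '*' in key:
        # Pattern: collect matches first, then delete them.
        for name in [k for k in store if fnmatchcase(k, key)]:
            del store[name]
    else:
        # Exact key: delete it if present, ignore it otherwise.
        store.pop(key, None)
```

For example, with the keys user:1, user:2 and post:1 stored, clear(store, 'user*') leaves only post:1 behind.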