Usage notes
Persistence
By default, a Redis key is generated when you create a new collection:
>>> from redis_collections import Dict
>>> D = Dict()
>>> D['answer'] = 42
>>> D.key
'fe267c1dde5d4f648e7bac836a0168fe'
If you specify a key when creating a collection you can retrieve what was stored there previously:
>>> E = Dict(key='fe267c1dde5d4f648e7bac836a0168fe')
>>> E['answer']
42
This works across processes: if your Python script terminates, you can retrieve its data from Redis again later.
Each collection allows you to delete its Redis key with the clear method:
>>> D.clear()
>>> list(D.items())
[]
Note
Stored objects are serialized with Python-standard pickling. By default, the highest protocol version is used. It’s not recommended to retrieve objects created by one version of Python with another version. If you attempt to do that, be sure to set the pickle_protocol keyword argument to a version that both Python versions support when declaring a collection.
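To make the protocol choice in the note concrete, here is a minimal sketch that uses only the standard pickle module, not the collections themselves. The sample record and the choice of protocol 2 are illustrative assumptions; protocol 2 is simply a version that every Python 3 release can read.

```python
import pickle

record = {'answer': 42}

# The newest protocol this interpreter supports; older interpreters
# may be unable to read data written with it.
newest_bytes = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)

# An older protocol chosen because both interpreters understand it.
portable_bytes = pickle.dumps(record, protocol=2)

# Data written with the older protocol loads normally.
restored = pickle.loads(portable_bytes)
```

The number used here for protocol is the kind of value the note suggests passing as pickle_protocol when declaring a collection.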
Redis connection
By default, collections create a new Redis connection when they are instantiated. This requires no configuration, but is inefficient if you are using multiple collections. To share a connection among multiple collections, create one (with redis.StrictRedis) and pass it using the redis keyword when creating the collections.
>>> from redis import StrictRedis
>>> from redis_collections import Dict, List
>>> conn = StrictRedis()
>>> D = Dict(redis=conn)
>>> L = List(redis=conn)
A collection’s copy method creates a new instance that uses the same Redis connection as the original object:
>>> conn = StrictRedis()
>>> list_01 = List([1, 2], redis=conn)
>>> list_02 = list_01.copy() # result is using the same connection
Operations on two collections backed by different Redis servers will be performed in Python:
>>> list_1 = List((1, 2, 3), redis=StrictRedis(port=6379))
>>> list_2 = List((4, 5, 6), redis=StrictRedis(port=6380))
>>> list_1.extend(list_2)
Synchronization
Storing a mutable object like a list in a Dict or a List can lead to surprising behavior. Because of Python semantics, it’s impossible to automatically write to Redis when such an object is retrieved and modified.
>>> D = Dict({'key': [1, 2]}) # Store a mutable object
>>> D['key'].append(3) # Retrieve and modify the object
>>> D['key'] # Retrieve the object from Redis again
[1, 2]
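The behavior above can be reproduced without Redis. The sketch below uses a hypothetical PickledStore class (not part of the library) that keeps values as pickled bytes in an ordinary dictionary, which is assumed to be a reasonable stand-in for a collection that serializes on write and deserializes on read.

```python
import pickle


class PickledStore:
    """Hypothetical stand-in for a Redis-backed Dict (illustration only)."""

    def __init__(self):
        self._backend = {}  # plays the role of the Redis server

    def __setitem__(self, key, value):
        # Writing serializes the value, as a remote store would.
        self._backend[key] = pickle.dumps(value)

    def __getitem__(self, key):
        # Reading builds a fresh object from the stored bytes.
        return pickle.loads(self._backend[key])


store = PickledStore()
store['key'] = [1, 2]

# The append mutates a temporary copy; nothing is written back.
store['key'].append(3)
after_append = store['key']

# Assigning the modified object back is what triggers a write.
value = store['key']
value.append(3)
store['key'] = value
after_reassign = store['key']
```

Each read returns a new list, so the store never sees the append. Only an explicit assignment reaches the backend in this sketch.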
If you plan to work with mutable objects, be sure to specify writeback=True when instantiating your collection. This will keep a local cache that is flushed to Redis when the sync method is called:
>>> D = Dict({'key': [1, 2]}, writeback=True)
>>> D['key'].append(3)
>>> D['key'] # Modifications are retrieved from the cache
[1, 2, 3]
>>> D.sync() # Flush cache to Redis
You may also use a with block to automatically call the sync method.
>>> with Dict({'key': [1, 2]}, writeback=True) as D:
...     D['key'].append(3)
>>> D['key'] # Changes were automatically synced
[1, 2, 3]
The writeback option is automatically enabled for DefaultDict objects.
Hashing dictionary keys and set elements
Python takes care to make sure that equal numeric values, such as 1.0 and 1, have the same hash value. If you add 1.0 to a set or a dict, you will not be able to add 1, as an equal value is already stored.
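The hashing rule can be checked directly with built-in types. This short sketch uses only the standard library; the variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# Five numeric types, all equal to one.
values = [1, 1.0, complex(1, 0), Decimal(1), Fraction(1, 1)]

# Equal numbers share a single hash value.
distinct_hashes = {hash(v) for v in values}

# Adding 1 after 1.0 leaves the set unchanged: the first element stays.
numbers = set()
numbers.add(1.0)
numbers.add(1)
kept = next(iter(numbers))
```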
The Redis-backed Dict and Set classes in this library attempt to follow this behavior, but there are some differences. For the built-in Python collections, you get back the first value you stored:
>>> python_dict = {}
>>> python_dict[1.0] = 'one' # 1.0 stored first
>>> python_dict[1] = 'One' # 1 stored second
>>> list(python_dict.keys()) # 1.0 is retrieved
[1.0]
For the Redis-backed collections, you’ll get back the integer:
>>> redis_dict = Dict()
>>> redis_dict[1.0] = 'one' # 1.0 stored first
>>> redis_dict[1] = 'One' # 1 stored second
>>> list(redis_dict.keys()) # 1 is retrieved
[1]
This behavior applies to complex, float, Decimal, and Fraction values that have an integer equivalent. It doesn’t apply to values that don’t have an integer equivalent (such as 1.1 or complex(1, 1)).
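One way to picture this rule is a normalization step applied to a key before it is stored. The function below is a hypothetical sketch of such a step, written to match the behavior described above; it is not taken from the library's source, and the name normalize_numeric is an assumption.

```python
from decimal import Decimal
from fractions import Fraction


def normalize_numeric(value):
    """Return the int equivalent of value if one exists, else value itself.

    Hypothetical helper for illustration; not the library's implementation.
    """
    candidate = value
    if isinstance(candidate, complex):
        if candidate.imag != 0:
            return value  # e.g. complex(1, 1) has no integer equivalent
        candidate = candidate.real
    if isinstance(candidate, (float, Decimal, Fraction)):
        try:
            as_int = int(candidate)
        except (ValueError, OverflowError):
            return value  # NaN and infinities cannot be converted
        if as_int == candidate:
            return as_int
    return value


normalized = [normalize_numeric(v) for v in (1.0, complex(1, 0), Decimal(1), Fraction(1, 1))]
unchanged = [normalize_numeric(v) for v in (1.1, complex(1, 1))]
```

Under this sketch, every value in the first list becomes the integer 1, while 1.1 and complex(1, 1) pass through untouched.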
Security considerations
Collections use pickle, which means you should never retrieve data from a source you don’t trust.
For example, suppose you maintain a web application that has user profiles. Users can submit their name, birthday, and a brief biography, and this information is ultimately stored in a Redis hash. Do not attach a redis_collections.Dict instance to that hash key: a user could construct a string that gives them the ability to execute arbitrary code with your Python process’s privileges.
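The risk is a property of pickle itself and can be shown with the standard library alone. In the sketch below, the Payload class is a harmless stand-in for attacker-controlled data, and NoGlobalsUnpickler follows the "restricting globals" pattern from the Python pickle documentation. Nothing here is part of this library, and the collections are not claimed to support a restricted unpickler.

```python
import io
import pickle


class Payload:
    """Stand-in for attacker-controlled data (harmless here)."""

    def __reduce__(self):
        # pickle calls eval("6 * 7") while loading; a real attacker
        # would substitute a far more damaging callable.
        return (eval, ("6 * 7",))


untrusted_bytes = pickle.dumps(Payload())

# Loading runs the embedded call instead of rebuilding a Payload.
result = pickle.loads(untrusted_bytes)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so no callable can be resolved."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


try:
    NoGlobalsUnpickler(io.BytesIO(untrusted_bytes)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The plain pickle.loads call evaluates the expression chosen by whoever produced the bytes, which is why a collection should never be pointed at a key that users can write to.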
Subclass customization
Collections use uuid.uuid4() for generating unique keys. If you are not satisfied with that function’s collision probability you may subclass a collection and override its _create_key() method.
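As a rough illustration of what an alternative key generator might return, the sketch below compares a uuid4-based key with a longer random key. Both function names are hypothetical, and it is an assumption that an overridden _create_key() only needs to return a unique string.

```python
import secrets
import uuid


def uuid4_key():
    # 32 hex characters carrying 122 random bits.
    return uuid.uuid4().hex


def longer_key(num_bytes=32):
    # 2 hex characters per byte: 64 characters, 256 random bits by default.
    return secrets.token_hex(num_bytes)


default_style = uuid4_key()
extended = longer_key()
```

A subclass could return a value like extended from its _create_key() method if a lower collision probability is wanted.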
If you don’t like how pickle does serialization, you may override the _pickle* and _unpickle* methods on the collection classes. Using other serializers will limit the objects you can store or retrieve.
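The limitation is easy to see with JSON as the alternative serializer. The two functions below are hypothetical replacements written against the standard json module; which of the _pickle* and _unpickle* methods they would be wired into is left open here.

```python
import json


def dump_json(value):
    """Hypothetical serializer: Python object to UTF-8 encoded JSON bytes."""
    return json.dumps(value).encode('utf-8')


def load_json(data):
    """Hypothetical deserializer: UTF-8 encoded JSON bytes to Python object."""
    return json.loads(data.decode('utf-8'))


# A tuple survives the trip only as a list.
round_tripped = load_json(dump_json({'point': (1, 2)}))

# A set cannot be serialized at all.
try:
    dump_json({1, 2, 3})
    set_supported = True
except TypeError:
    set_supported = False
```

JSON avoids the code-execution risk of pickle, but the stored values are restricted to strings, numbers, booleans, None, lists, and dictionaries with string keys.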