In computer science, in the context of data storage, serialization (or serialisation) is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer) or transmitted (for example, across a network connection link) and reconstructed later (possibly in a different computer environment).
- Wikipedia
Taking data objects and storing them on disk or preparing them for transport over the network is very common in programming. This process is called serialization, and while there are many ways to do it in Python, Pickle is easy to use and ships with the standard library.
As Pickle is included by default in both Python 2 and Python 3, you will not need to install any modules. You will need to import Pickle and then you can use it to load and save data.
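The following data types can be pickled:
- None, True, and False
- integers, floating-point numbers, and complex numbers
- strings, bytes, and bytearrays
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module (using def, not lambda)
- built-in functions defined at the top level of a module
- classes defined at the top level of a module
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable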
Source: Pickle documentation
Importing Pickle is super easy, barely an inconvenience.
import pickle
Saving data with Pickle is as simple as calling dump() with the object to dump and the file handle as parameters.
my_object = [1, 2, 3, 4, 5]
# 'wb' overwrites the file; 'ab' would append a second pickle that a single load() never reads
with open('mydata', 'wb') as file:
    pickle.dump(my_object, file)
You can serialize almost any object, but you must open the file in binary mode.
Loading data with Pickle is very similar to saving it. Most of the time load() takes a single parameter, the file handle, and returns the deserialized object.
# The with block closes the file even if load() raises an exception
with open('mydata', 'rb') as file:
    my_object = pickle.load(file)
As with dump(), you need to ensure the file handle is opened in binary mode.
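To tie saving and loading together, here is a minimal round-trip sketch. The dictionary contents and the temporary directory are assumptions chosen for illustration; only dump() and load() come from the steps above.

```python
import os
import pickle
import tempfile

# Example data (hypothetical): a dict mixing several picklable types
original = {"name": "example", "values": [1, 2, 3], "nested": (True, None)}

# Write into a throwaway temporary directory so nothing real is overwritten
path = os.path.join(tempfile.mkdtemp(), "mydata")

# Save: binary write mode
with open(path, "wb") as file:
    pickle.dump(original, file)

# Load: binary read mode
with open(path, "rb") as file:
    restored = pickle.load(file)

print(restored == original)  # True: the contents are equal
print(restored is original)  # False: load() builds a new object
```

The loaded object is an equal copy, not the same object, which is what you would expect from data that has been written out and read back.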
A pickle can instruct the unpickler to import and call arbitrary functions while it rebuilds an object. This means you need to be careful when loading pickled data from unknown sources, as you are potentially running untrusted code.
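A harmless sketch of why this matters: an object can define __reduce__() to tell Pickle which callable to invoke when it is loaded. The class name Surprise and the expression are made up for this illustration; a real attacker would choose something far less friendly than arithmetic.

```python
import pickle


class Surprise:
    """Hypothetical class that hijacks its own unpickling."""

    def __reduce__(self):
        # Tell pickle: "to rebuild me, call eval('6 * 7')"
        return (eval, ("6 * 7",))


payload = pickle.dumps(Surprise())

# Loading the bytes runs eval() and returns its result,
# not a Surprise instance
result = pickle.loads(payload)
print(result)  # 42
```

Nothing in the loading code asked for eval() to run; the pickled bytes did. Swap the expression for a shell command and the danger is obvious.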
Because of this, you should only use Pickle for data you trust and never for data from unknown sources. You can also sign pickled data with hmac to verify that it has not been tampered with before you load it.
There is also the third-party EncryptedPickle module for using encryption with Pickle.
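One way the hmac approach could look is sketched below. The helper names sign_pickle and load_signed_pickle, the key, and the choice of SHA-256 are all assumptions for this example, not a fixed recipe. The signature is checked before pickle.loads() is ever called.

```python
import hashlib
import hmac
import pickle

# Hypothetical shared secret; in practice keep this out of source control
SECRET_KEY = b"replace-with-a-real-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign_pickle(obj, key=SECRET_KEY):
    """Pickle obj and prepend an HMAC-SHA256 signature of the pickled bytes."""
    data = pickle.dumps(obj)
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return digest + data


def load_signed_pickle(blob, key=SECRET_KEY):
    """Verify the signature first; only unpickle if it matches."""
    digest, data = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing
    if not hmac.compare_digest(digest, expected):
        raise ValueError("signature mismatch: data was modified or key is wrong")
    return pickle.loads(data)


blob = sign_pickle([1, 2, 3, 4, 5])
print(load_signed_pickle(blob))  # [1, 2, 3, 4, 5]
```

Signing only proves the data came from someone who holds the key and was not altered afterwards. It does not make a pickle from an untrusted author safe to load.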