23 Aug 2015
I came across a blog post titled "PBKDF2: performance matters"
where the author discusses how most implementations of PBKDF2 are slower than
they otherwise could be.
After reading the blog post, I decided to write some Python bindings to
see how much of a performance increase I can obtain over the standard library's
hashlib.pbkdf2_hmac implementation. My goal is a library with an interface
that is compatible with hashlib.pbkdf2_hmac.
The results are surprisingly good. With a basic benchmarking script
on CPython 3.4.1, my implementation is about 3 times as fast as the standard
library's.
100 loops, best of 3: 60.2 msec per loop
100 loops, best of 3: 20.3 msec per loop
With PyPy 2.6.0, the results are even better.
100 loops, best of 3: 242 msec per loop
100 loops, best of 3: 19.2 msec per loop
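For anyone who wants to reproduce this kind of comparison, here is a rough sketch of a timing script. It is not the script that produced the numbers above: the hash name, password, salt and round count are placeholder assumptions, and fastpbkdf2 is only timed if it happens to be installed.

```python
import hashlib
import timeit

try:
    # Optional: only available when the fastpbkdf2 package is installed.
    from fastpbkdf2 import pbkdf2_hmac as fast_pbkdf2_hmac
except ImportError:
    fast_pbkdf2_hmac = None

# Placeholder parameters (assumptions), not the ones behind the figures above.
PASSWORD = b"password"
SALT = b"salt"
ROUNDS = 20000


def best_time(func, repeat=3, number=5):
    """Return the best per-call time in seconds over several timing runs."""
    timer = timeit.Timer(lambda: func("sha256", PASSWORD, SALT, ROUNDS))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def run_benchmark():
    """Time hashlib, plus fastpbkdf2 when it could be imported."""
    results = {"hashlib": best_time(hashlib.pbkdf2_hmac)}
    if fast_pbkdf2_hmac is not None:
        results["fastpbkdf2"] = best_time(fast_pbkdf2_hmac)
    return results


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print("%-10s %.1f msec per call" % (name, seconds * 1000))
```

Absolute numbers will vary with the interpreter, the OpenSSL build and the hardware, so only the ratio between the two rows is worth comparing.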
I have since released my library as a PyPI package and the code is available on
Simply install the package with
and import the function
from fastpbkdf2 import pbkdf2_hmac
The interface is exactly the same as
hashlib.pbkdf2_hmac and should be a drop-in replacement.
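Because the two functions share an interface, one possible pattern is to prefer fastpbkdf2 when it is available and fall back to the standard library otherwise. The snippet below is a sketch of that idea; the password, salt and round count are made-up example values.

```python
import hashlib

try:
    # Use the faster implementation when the package is installed.
    from fastpbkdf2 import pbkdf2_hmac
except ImportError:
    # Otherwise fall back to the standard library, which has the same interface.
    from hashlib import pbkdf2_hmac

# Example inputs only (assumptions for illustration).
derived = pbkdf2_hmac("sha256", b"correct horse", b"per-user-salt", 10000, dklen=32)

# Whichever import succeeded, the result should match the standard library.
reference = hashlib.pbkdf2_hmac("sha256", b"correct horse", b"per-user-salt", 10000, dklen=32)
print(derived == reference)
```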
29 Dec 2014
Cryptography libraries often have complicated APIs with many different options
to tweak. It is a goal of PyCA's
cryptography library to provide safe and easy
to use APIs for common cryptographic tasks. To that end, the
package has a
Fernet recipe for symmetric encryption derived from the
original Ruby implementation and specification. However, the
Fernet recipe lacks the ability to authenticate (without encrypting) arbitrary
data.
To make up for that use case not being covered by Fernet, I have written and
released on PyPI a library called
aead. It can be installed with
The aead library is based on an IETF Internet Draft from David
McGrew. It is essentially
HMAC_SHA_256 composed with an
encrypt-then-mac construction. It relies on the
cryptography library for
the cryptographic primitives.
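To give a feel for the authentication half of an encrypt-then-mac construction, here is a standard-library sketch. It is loosely modeled on my reading of the draft and is not the code used inside aead: the function names, the 16 byte tag length and the exact MAC input layout are assumptions for illustration, and the ciphertext is treated as opaque bytes produced by a separate encryption step.

```python
import hashlib
import hmac
import struct

TAG_LENGTH = 16  # assumption: the HMAC-SHA-256 output is truncated to 16 bytes


def compute_tag(mac_key, associated_data, ciphertext):
    """MAC the associated data and the ciphertext together (encrypt-then-mac)."""
    # Appending the bit length of the associated data as a 64-bit big-endian
    # integer keeps the boundary between the two inputs unambiguous.
    ad_length = struct.pack(">Q", len(associated_data) * 8)
    mac = hmac.new(mac_key, associated_data + ciphertext + ad_length, hashlib.sha256)
    return mac.digest()[:TAG_LENGTH]


def verify_tag(mac_key, associated_data, ciphertext, tag):
    """Raise ValueError unless the tag matches; check this before decrypting."""
    expected = compute_tag(mac_key, associated_data, ciphertext)
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed")
```

The point of the construction is that the tag covers data that is never encrypted, so tampering with either the ciphertext or the associated data is detected before any decryption happens.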
It has a simple to use API heavily inspired by the Fernet recipe in the
The module contains a single class that can be imported.
The class requires an encryption key to be initialized. The key has to be
32 bytes long and encoded with base64url as specified in RFC 4648. The library
provides a generate_key() classmethod to generate a suitable random key.
cryptor = AEAD(AEAD.generate_key())
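If you are curious what a key of that shape looks like, the following sketch builds one with only the standard library. The helper names are hypothetical and this is not necessarily how AEAD.generate_key() is implemented; it only illustrates "32 random bytes, base64url encoded".

```python
import base64
import os


def generate_key_sketch():
    """Return 32 random bytes from the OS CSPRNG, encoded as base64url."""
    return base64.urlsafe_b64encode(os.urandom(32))


def is_suitable_key(key):
    """Check that a key decodes from base64url to exactly 32 bytes."""
    try:
        raw = base64.urlsafe_b64decode(key)
    except (ValueError, TypeError):
        # binascii.Error (bad padding) is a subclass of ValueError.
        return False
    return len(raw) == 32
```

In real code, prefer the library's own classmethod over a hand-rolled helper like this one.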
After initializing the object, encryption can be done by calling the
.encrypt() method. The
.encrypt() method takes two parameters, the first
being the data you want to encrypt and the second being associated data that
you want to authenticate but not encrypt. The second parameter is optional and
can be left out if there isn't any data to authenticate.
ct = cryptor.encrypt(b"Hello, World!", b"Additional Data")
.encrypt() returns base64url encoded cipher text.
Decrypting any data encrypted with
aead is similar. Simply call
.decrypt() in place of
.encrypt(). The .decrypt() method takes two parameters, the
first being the cipher text that needs decrypting and the second being the
associated data that was authenticated.
If the cipher text is corrupted or the associated data provided during the
decryption process does not match the associated data provided during
encryption, a ValueError is raised; otherwise the decrypted plain text is
returned.
The repository for
aead can be found on GitHub and the
README.md file in the repository should be treated as the source of truth
if any information there differs from this blog post due to changes over time.
28 Sep 2014
For Python programmers, downloading Python packages from PyPI, the Python
Package Index, is second nature. Tools like
pip and conventions like the
requirements.txt file that most Python projects follow provide a consistent
way of specifying project dependencies.
However, installing random packages from PyPI is actually very dangerous,
a fact that not many people are aware of. There are a few factors that
contribute to this.
Python packages can execute arbitrary Python code during the installation
process.
PyPI packages are not moderated. Unlike the package managers used in Linux
distros, anyone can register an account and upload Python packages without
going through a review process. While this is one factor contributing to
PyPI's success as a package repository, you will have to trust the maintainer
of the package that the package is safe.
As a proof of concept, I have written a
setup.py file that connects to a
Metasploit listener and downloads a Meterpreter shell during installation.
This demonstrates that it is trivial for someone to execute arbitrary code
on a machine through the installation of a Python package. You can obtain the
code from GitHub.
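For a harmless illustration of the same mechanism, the sketch below writes a stand-in setup.py to a temporary directory and runs it the way an installer would. This is not my proof of concept: the file contents, the marker file name and the run_demo helper are all made up for this example, and the only side effect is a small text file.

```python
import os
import subprocess
import sys
import tempfile

# Contents of a benign stand-in setup.py (hypothetical, for illustration only).
SETUP_PY = '''\
import os

# Module-level code runs the moment an installer executes this file.
here = os.path.dirname(os.path.abspath(__file__))
euid = os.geteuid() if hasattr(os, "geteuid") else "unknown"
with open(os.path.join(here, "marker.txt"), "w") as fh:
    fh.write("setup.py ran with euid=%s" % euid)

# A real package would go on to call setuptools.setup(...) here.
'''


def run_demo():
    """Write the stand-in setup.py to a temp dir, run it, return the marker."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "setup.py")
        with open(script, "w") as fh:
            fh.write(SETUP_PY)
        # Installing from source boils down to executing the file like this.
        subprocess.run([sys.executable, script], check=True)
        with open(os.path.join(tmp, "marker.txt")) as fh:
            return fh.read()


if __name__ == "__main__":
    print(run_demo())
```

Swap the file write for a network connection and you have the shape of the attack, which is why the effective user ID recorded in the marker matters.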
Run the Metasploit listener.
msf > use exploit/multi/handler
msf exploit(handler) > set payload python/meterpreter/reverse_tcp
msf exploit(handler) > set LHOST 127.0.0.1
Finally, run the setup.py file.
You should obtain a Meterpreter shell with the same privileges that you ran the
setup.py script with.
While my example involves connecting to a Metasploit listener on
the same attack can be extended to install malware from remote systems or do
almost anything a Python script can do.
The problematic thing about this attack is that there are valid reasons for
Python packages to execute code during installation. This ranges from things
like OS version checks to compiling C code for packages that rely on C
extensions. Restricting setuptools to a subset of Python during installation
isn't exactly foolproof as demonstrated by the numerous Python sandbox escape
techniques. Moderating PyPI isn't a solution either as that will greatly
diminish PyPI's attractiveness as a package repository.
Here are two recommendations to limit the potential of such attacks.
NEVER install Python packages as root. This limits the privileges an
attacker has if the attack succeeds.
virtualenv is incredibly useful for
If you are in an organization with larger resources, audit the
third-party packages you depend on. Mirror trusted packages on an internal
devpi server instead of installing packages
directly from PyPI.
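The first recommendation can be turned into a small pre-flight check. The sketch below is one way to do it, with hypothetical helper names: it reports whether the current interpreter is running as root and whether it is inside a virtual environment.

```python
import os
import sys


def running_as_root():
    """True when the effective user ID is 0 (always False on Windows)."""
    geteuid = getattr(os, "geteuid", None)  # os.geteuid does not exist on Windows
    return geteuid is not None and geteuid() == 0


def in_virtualenv():
    """True when the interpreter is running inside a virtual environment."""
    # venv and recent virtualenv releases make sys.prefix differ from
    # sys.base_prefix; older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base


def safe_to_install():
    """A conservative policy: only install inside a virtualenv, never as root."""
    return in_virtualenv() and not running_as_root()


if __name__ == "__main__":
    print("root:", running_as_root(), "virtualenv:", in_virtualenv())
```

A check like this does not make a malicious package safe; it only limits what the package can touch when it runs.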
While I am limiting the details in this post to Python packages as that is the
ecosystem I am most familiar with, I believe that this issue also extends to
other languages and ecosystems such as Ruby and the gems ecosystem. While there
has been an increased focus over the years on paying attention to good security
practices when writing code, many forget about third-party code. This worries
me because third-party code represents such a large attack surface open to
exploits. As we all know, security is only as strong as the weakest link.
02 Aug 2014
A common scenario in web applications involves using a single password as a
means of authentication as well as a means to derive a secret for use in
Many strong key derivation functions have
properties that make them strong password hashing functions as well. However,
the same derived value cannot be used as an encryption key and a password hash.
The password hash value has to be stored by the server to compare against the
provided password in future authentication attempts. If this same value is
used as an encryption key, an attacker that compromises the server will be
able to decrypt the data easily.
There is an easy solution for this problem. While I will be using pbkdf2 here,
pbkdf2 can be substituted for any strong algorithm like
1. Generate a random key k using a cryptographically secure random number
   generator. This means using CryptGenRandom on Windows and the kernel's
   random device on *nix operating systems. This random key k will be used
   for encryption.
2. Generate two salts s1 and s2 and store them in plaintext.
3. Compute pbkdf2(password, s1) and store this value. This will be the
   password hash you use to compare against for future authentication attempts.
4. Compute pbkdf2(password, s2) xor k and store this value.
5. When the random key k is required for encrypting or decrypting data,
   xor the value of pbkdf2(password, s2) against the value computed in
   step 4.
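The scheme above can be sketched with the standard library as follows. Treat it as an illustration rather than production code: the round count, the salt and key lengths, the dictionary layout and the function names are all assumptions I picked for the example.

```python
import hashlib
import hmac
import os

ROUNDS = 100000  # illustrative work factor (assumption)
KEY_LEN = 32     # length in bytes of k and of each derived value
SALT_LEN = 16


def kdf(password, salt):
    """pbkdf2(password, salt) from the description above."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ROUNDS, dklen=KEY_LEN)


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def enroll(password):
    """Create the values the server stores, plus the encryption key k."""
    k = os.urandom(KEY_LEN)                            # the random key k
    s1, s2 = os.urandom(SALT_LEN), os.urandom(SALT_LEN)  # the two salts
    record = {
        "s1": s1,
        "s2": s2,
        "password_hash": kdf(password, s1),              # used for authentication
        "wrapped_key": xor_bytes(kdf(password, s2), k),  # pbkdf2(password, s2) xor k
    }
    return record, k


def authenticate(record, password):
    """Compare against the stored password hash in constant time."""
    return hmac.compare_digest(kdf(password, record["s1"]), record["password_hash"])


def recover_key(record, password):
    """Recover k by xoring pbkdf2(password, s2) against the stored value."""
    if not authenticate(record, password):
        raise ValueError("wrong password")
    return xor_bytes(kdf(password, record["s2"]), record["wrapped_key"])
```

Note that the server only ever stores the record, never k itself, so a stolen record does not reveal the encryption key without the password.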
The advantage of this scheme is that the encryption key
k is not tied to the
password. This means that passwords can be changed without re-encrypting
the data with a new key, by repeating steps 2 - 4 with the existing key k. This
is a very useful property to have
in the event of a server compromise where passwords have to be reset en masse.
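A password change under this scheme might look like the sketch below. It assumes k has already been recovered with the old password; the function names, round count and record layout are hypothetical choices for this example.

```python
import hashlib
import os

ROUNDS = 100000  # illustrative work factor (assumption)


def kdf(password, salt):
    """pbkdf2(password, salt) with a 32 byte output."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ROUNDS, dklen=32)


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def change_password(k, new_password):
    """Build a fresh stored record for the existing key k and a new password."""
    s1, s2 = os.urandom(16), os.urandom(16)  # new salts
    return {
        "s1": s1,
        "s2": s2,
        "password_hash": kdf(new_password, s1),              # new password hash
        "wrapped_key": xor_bytes(kdf(new_password, s2), k),  # same k, new wrapping
    }
```

Since k itself never changes, none of the data encrypted under it has to be touched.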