Phoenix Ecto Encryption Example

Why?

Encrypting User/Personal data stored by your Web App is essential for security/privacy.

If your app offers any personalised content or interaction that depends on "login", it is storing personal data (by definition). You might be tempted to think that data is "safe" in a database, but it's not. There is an entire ("dark") army/industry of people (cybercriminals) who target websites/apps attempting to "steal" data by compromising databases. All the time you spend building your app, they spend trying to "break" apps like yours.
Don't let the people using your app be the victims of identity theft; protect their personal data! (It's both the "right" thing to do and the law ...)

What?

This example/tutorial is intended as a comprehensive answer to the question:

"How to Encrypt/Decrypt Sensitive Data in Elixir Apps Before Inserting (Saving) it into the Database?"

Technical Overview

We are not "re-inventing encryption" or using our "own algorithm"; everyone knows that's a "bad idea": https://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own
We are following a battle-tested industry-standard approach and applying it to our Elixir/Phoenix App.
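We are using:

  • Advanced Encryption Standard (AES) in Galois Counter Mode (GCM) with 256 bit keys to encrypt the data.
  • SHA-256 to hash the email address (for fast lookups).
  • Argon2 to hash passwords.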

¯\_(ツ)_/¯...? Don't be "put off" if any of these terms/algorithms are unfamiliar to you;
this example is "step-by-step" and we are happy to answer/clarify any (relevant and specific) questions you have!

OWASP Cryptographic Rules?

This example/tutorial follows the Open Web Application Security Project (OWASP) Cryptographic and Password rules:

  • Use "strong approved Authenticated Encryption" based on an AES algorithm.
    • Use GCM mode of operation for symmetric key cryptographic block ciphers.
    • Keys used for encryption must be rotated at least annually.
  • Only use approved public algorithm SHA-256 or better for hashing.
  • Argon2 is the winner of the password hashing competition and should be your first choice for new applications.

See:

Who?

This example/tutorial is for any developer (or technical decision maker / "application architect")
who takes personal data protection seriously and wants a robust/reliable and "transparent" way
of encrypting data before storing it, and decrypting when it is queried.

Prerequisites?

If you are totally new to (or "rusty" on) Elixir, Phoenix or Ecto, we recommend going through our Phoenix Chat Example (Beginner's Tutorial) first: https://github.com/dwyl/phoenix-chat-example

Crypto Knowledge?

You will not need any "advanced" mathematical knowledge; we are not "inventing" our own encryption or going into the "internals" of any cyphers/algorithms/schemes.

You do not need to understand how the encryption/hashing algorithms work,
but it is useful to know the difference between encryption vs. hashing and plaintext vs. ciphertext.

The fact that the example/tutorial follows all OWASP crypto/hashing rules (see: "OWASP Cryptographic Rules?" section above), should be "enough" for most people who just want to focus on building their app and don't want to "go down the rabbit hole".

However ... We have included 30+ links in the "Useful Links" section at the end of this readme. The list includes several common questions (and answers) so if you are curious, you can learn.

Note: in the @dwyl Library we have "Applied Cryptography" (https://www.schneier.com/books/applied_cryptography). So, if you're really curious, let us know!

Time Requirement?

Simply reading ("skimming") through this example will only take 15 minutes.
Following the examples on your computer (to fully understand it) will take around 1 hour
(including reading a few of the links).

Invest the time up-front to avoid the embarrassment and fines of a data breach.

How?

These are "step-by-step" instructions; don't skip any step(s).

1. Create the encryption App

In your Terminal, create a new Phoenix application called "encryption":

mix phx.new encryption

When you see Fetch and install dependencies? [Yn],
type y and press the [Enter] key to download and install the dependencies.
You should see the following in your terminal:

* running mix deps.get
* running mix deps.compile
* running cd assets && npm install && node node_modules/webpack/bin/webpack.js --mode development

We are almost there! The following steps are missing:

    $ cd encryption

Then configure your database in config/dev.exs and run:

    $ mix ecto.create

Start your Phoenix app with:

    $ mix phx.server

You can also run your app inside IEx (Interactive Elixir) as:

    $ iex -S mix phx.server

Follow the first instruction and change into the encryption directory:

cd encryption

Next create the database for the App using the command:

mix ecto.create

You should see the following output:

Compiling 13 files (.ex)
Generated encryption app
The database for Encryption.Repo has been created

2. Create the user Schema (Database Table)

In our example user database table, we are going to store 3 (primary) pieces of data.

  • name: the person's name (encrypted)
  • email: their email address (encrypted)
  • password_hash: the hashed password (so the person can login)

In addition to the 3 "primary" fields, we need two more fields to store "metadata":

  • email_hash: so we can check ("lookup") if an email address is in the database without having to decrypt the email(s) stored in the DB.
  • key_id: the id of the encryption key used to encrypt the data stored in the row. As this is an id we use an :integer to store it in the DB. [1]

Create the user schema using the following generator command:

mix phx.gen.schema User users email:binary email_hash:binary name:binary password_hash:binary key_id:integer

The reason we are creating the encrypted/hashed fields as :binary is that the data stored in them will be encrypted, and :binary is the most efficient Ecto/SQL data type for storing encrypted data. Storing it as a String would take up more bytes for the same data, i.e. it would be wasteful without any benefit to security or performance.
see: https://dba.stackexchange.com/questions/56934/what-is-the-best-way-to-store-a-lot-of-user-encrypted-data
and: https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
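
To make the storage argument concrete, here is a small sketch comparing the size of raw bytes with two common text encodings of the same bytes. The 32 random bytes are only a stand-in for ciphertext and are an assumption made for illustration:

```elixir
# 32 random bytes standing in for a piece of ciphertext (illustrative only).
raw = :crypto.strong_rand_bytes(32)

IO.inspect(byte_size(raw), label: "raw binary")                  # raw binary: 32
IO.inspect(byte_size(Base.encode64(raw)), label: "Base64 text")  # Base64 text: 44
IO.inspect(byte_size(Base.encode16(raw)), label: "hex text")     # hex text: 64
```

Any text encoding adds overhead on top of the raw bytes, which is why the columns are declared as :binary.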

Next we need to update our newly created migration file. Open priv/repo/migrations/{timestamp}_create_users.exs.

Your migration file will have a slightly different name to ours as migration files are named with a timestamp when they are created but it will be in the same location.

Update the file from:

defmodule Encryption.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users) do
      add(:email, :binary)
      add(:email_hash, :binary)
      add(:name, :binary)
      add(:password_hash, :binary)
      add(:key_id, :integer)

      timestamps()
    end
  end
end

To

defmodule Encryption.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users) do
      add(:email, :binary)
      add(:email_hash, :binary)
      add(:name, :binary)
      add(:password_hash, :binary)
      add(:key_id, :integer)

      timestamps()
    end

    create(unique_index(:users, [:email_hash]))
  end
end

The newly added line ensures that we will never be allowed to enter duplicate email_hash values into our database.

Run the "migration" task to create the tables in the Database:

mix ecto.migrate

Running the mix ecto.migrate command will create the users table in your encryption_dev database.
You can view this (empty) table in a PostgreSQL GUI such as pgAdmin.

[1] key_id: The key_id column in our users database table indicates which encryption key was used to encrypt the data. For this example/demo we are using two encryption keys to demonstrate key rotation. Key rotation is a "best practice" that limits the amount of data an "attacker" can decrypt if the database were ever "compromised" (provided we keep the encryption keys safe, that is!).

3. Define The 6 Functions

We need 6 functions for encrypting, decrypting, hashing and verifying the data we will be storing:

  1. Encrypt - to encrypt any personal data we want to store in the database.
  2. Decrypt - decrypt any data that needs to be viewed.
  3. Get Key - get the latest encryption/decryption key (or a specific older key where data was encrypted with a different key)
  4. Hash Email (deterministic & fast) - so that we can "lookup" an email without "decrypting". The hash of an email address should always be the same.
  5. Hash Password (pseudorandom & slow) - the output of the hash should always be different and relatively slow to compute.
  6. Verify Password - check a password against the stored password_hash to confirm that the person "logging-in" has the correct password.

The next 6 sections of the example/tutorial will walk through the creation of (and testing) these functions.

Note: If you have any questions on these functions, please ask:
github.com/dwyl/phoenix-ecto-encryption-example/issues

3.1 Encrypt

Create a file called lib/encryption/aes.ex and copy-paste (or hand-write) the following code:

defmodule Encryption.AES do
  @aad "AES256GCM" # Use AES 256 Bit Keys for Encryption.

  def encrypt(plaintext) do
    iv = :crypto.strong_rand_bytes(16) # create random Initialisation Vector
    key = get_key()    # get the *latest* key in the list of encryption keys
    {ciphertext, tag} =
      :crypto.block_encrypt(:aes_gcm, key, iv, {@aad, to_string(plaintext), 16})
    iv <> tag <> ciphertext # "return" iv with the cipher tag & ciphertext
  end

  defp get_key do # this is a "dummy function" we will update it in step 3.3
    <<109, 182, 30, 109, 203, 207, 35, 144, 228, 164, 106, 244, 38, 242,
    106, 19, 58, 59, 238, 69, 2, 20, 34, 252, 122, 232, 110, 145, 54,
    241, 65, 16>> # return a fixed 32 Byte / 256 bit binary to use as key.
  end
end

The encrypt/1 function for encrypting plaintext into ciphertext is quite simple; (the "body" is only 4 lines).

Let's "step through" these lines one at a time:

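  • First we create a random Initialisation Vector (IV) of 16 bytes using the Erlang crypto library's strong_rand_bytes function. The IV ensures that each time the same plaintext is encrypted, the resulting ciphertext is different.

Having different ciphertext each time plaintext is encrypted is essential for "semantic security", whereby repeated use of the same encryption key and algorithm does not allow an "attacker" to infer relationships between segments of the encrypted message. Cryptanalysis techniques are well "beyond scope" for this example/tutorial, but we highly encourage you to check out the "Background Reading" links at the end and read up on the subject for deeper understanding.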

  • Next we use the get_key/0 function to retrieve the latest encryption key so we can use it to encrypt the plaintext (the "real" get_key/0 is defined below in section 3.3).

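  • Then we use the Erlang block_encrypt function to encrypt the plaintext,
    using :aes_gcm ("Advanced Encryption Standard Galois Counter Mode"):

    • @aad is a "module attribute" (Elixir's equivalent of a "constant") defined in aes.ex as @aad "AES256GCM".
      It is passed to the cipher as "Additional Authenticated Data" (AAD): data that is covered by the cipher tag but is not itself encrypted. Here it is a fixed label naming the scheme we are using, which breaks down into 3 parts:
      • AES = Advanced Encryption Standard.
      • 256 = "256 Bit Key"
      • GCM = "Galois Counter Mode"
    • The 256 bit key size itself comes from the 32 byte key returned by get_key/0, not from the @aad value.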
  • Finally we "return" the iv with the ciphertag & ciphertext, this is what we store in the database. Including the IV and ciphertag is essential for allowing decryption, without these two pieces of data, we would not be able to "reverse" the process.
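
The steps above can be sketched as a standalone script. Note one assumption: :crypto.block_encrypt was removed in Erlang/OTP 24, so this sketch uses :crypto.crypto_one_time_aead/6 (available from OTP 22) instead of the call shown in the tutorial. The module name Sketch.Encrypt and the explicit key argument are hypothetical and not part of the example app:

```elixir
defmodule Sketch.Encrypt do
  @aad "AES256GCM" # additional authenticated data, same label as the tutorial

  # `key` must be a 32 byte binary (AES-256).
  def encrypt(plaintext, key) do
    iv = :crypto.strong_rand_bytes(16) # fresh random IV for every call

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)

    iv <> tag <> ciphertext # same layout as the tutorial: IV, tag, ciphertext
  end
end

key = :crypto.strong_rand_bytes(32) # throwaway key for the demonstration
blob = Sketch.Encrypt.encrypt("hello", key)

# 16 bytes of IV + 16 bytes of tag + 5 bytes of ciphertext
IO.inspect(byte_size(blob)) # 37
```

Because GCM does not pad, the ciphertext has the same length as the plaintext, so the stored value is always 32 bytes longer than the data it protects.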

Note: in addition to this encrypt/1 function, we have defined an encrypt/2 "sister" function which accepts a specific (encryption) key_id so that we can use the desired encryption key for encrypting a block of text. For the purposes of this example/tutorial, it's not strictly necessary, but it is included for "completeness".

Test the encrypt/1 Function

Create a file called test/lib/aes_test.exs and copy-paste the following code into it:

defmodule Encryption.AESTest do
  use ExUnit.Case
  alias Encryption.AES

  test ".encrypt includes the random IV in the value" do
    <<iv::binary-16, ciphertext::binary>> = AES.encrypt("hello")

    assert byte_size(iv) == 16
    assert byte_size(ciphertext) > 0
    assert is_binary(ciphertext)
  end

  test ".encrypt does not produce the same ciphertext twice" do
    assert AES.encrypt("hello") != AES.encrypt("hello")
  end
end

Run these two tests by running the following command:

mix test test/lib/aes_test.exs

The full function definitions for AES encrypt/1 & encrypt/2 are in: lib/encryption/aes.ex
And tests are in: test/lib/aes_test.exs

3.2 Decrypt

The decrypt function reverses the work done by encrypt; it accepts a "blob" of ciphertext (which, as you may recall, has the IV and cipher tag prepended to it) and returns the original plaintext.

In the lib/encryption/aes.ex file, copy-paste (or hand-write) the following decrypt/1 function definition:

def decrypt(ciphertext) do
  <<iv::binary-16, tag::binary-16, ciphertext::binary>> = ciphertext
  :crypto.block_decrypt(:aes_gcm, get_key(), iv, {@aad, ciphertext, tag})
end

The first step (line) is to "split" the IV and the tag from the ciphertext using Elixir's binary pattern matching.

If you are unfamiliar with Elixir binary pattern matching syntax: <<iv::binary-16, tag::binary-16, ciphertext::binary>> read the following guide: https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
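
As a quick illustration of that pattern, the following sketch splits a made-up binary into a 16 byte prefix, a second 16 byte segment and the remainder. The contents of blob are placeholder text, not real ciphertext:

```elixir
# Build a fake "blob": 16 bytes of "I", 16 bytes of "T", then a payload.
blob = String.duplicate("I", 16) <> String.duplicate("T", 16) <> "payload"

# binary-16 takes exactly 16 bytes; the final ::binary takes whatever is left.
<<iv::binary-16, tag::binary-16, rest::binary>> = blob

IO.inspect({byte_size(iv), byte_size(tag), rest}) # {16, 16, "payload"}
```

If the binary on the right is shorter than 32 bytes the match fails with a MatchError, which is worth keeping in mind when decrypting values of unknown origin.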

The :crypto.block_decrypt(:aes_gcm, get_key(), iv, {@aad, ciphertext, tag}) line is very similar to the encrypt function.

The ciphertext is decrypted using block_decrypt/4 passing in the following parameters:

  • :aes_gcm = encryption algorithm
  • get_key() = get the encryption key used to encrypt the plaintext
  • iv = the original Initialisation Vector used to encrypt the plaintext
  • {@aad, ciphertext, tag} = a Tuple with the Additional Authenticated Data (@aad), the ciphertext and the tag that was produced when the plaintext was encrypted.

Finally return just the original plaintext.
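
A round trip can be sketched as follows. As an assumption for newer Erlang/OTP releases (block_encrypt and block_decrypt were removed in OTP 24), the sketch uses :crypto.crypto_one_time_aead/6 and /7; the module name Sketch.RoundTrip and the explicit key argument are hypothetical. It also shows what the tag is for: if the stored value is altered, decryption returns :error instead of plaintext.

```elixir
defmodule Sketch.RoundTrip do
  @aad "AES256GCM"

  # Encrypts with AES-256-GCM and returns iv <> tag <> ciphertext.
  def encrypt(plaintext, key) do
    iv = :crypto.strong_rand_bytes(16)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, plaintext, @aad, true)

    iv <> tag <> ciphertext
  end

  # Returns the plaintext, or :error when the tag does not verify.
  def decrypt(blob, key) do
    <<iv::binary-16, tag::binary-16, ciphertext::binary>> = blob
    :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, @aad, tag, false)
  end
end

key = :crypto.strong_rand_bytes(32) # throwaway key for the demonstration
blob = Sketch.RoundTrip.encrypt("hello", key)
IO.inspect(Sketch.RoundTrip.decrypt(blob, key)) # "hello"

# Change the final byte to simulate corrupted or tampered data.
head_size = byte_size(blob) - 1
<<head::binary-size(head_size), last>> = blob
tampered = head <> <<rem(last + 1, 256)>>
IO.inspect(Sketch.RoundTrip.decrypt(tampered, key)) # :error
```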

Note: as above with the encrypt/2 function, we have defined a decrypt/2 "sister" function which accepts a specific (encryption) key_id so that we can use the desired encryption key for decrypting the ciphertext. For the purposes of this example/tutorial, it's not strictly necessary, but it is included for "completeness".

Test the decrypt/1 Function

In the test/lib/aes_test.exs add the following test:

test "decrypt/1 ciphertext that was encrypted with default key" do
  plaintext = "hello" |> AES.encrypt |> AES.decrypt()
  assert plaintext == "hello"
end

Re-run the tests mix test test/lib/aes_test.exs and confirm they pass.

The full encrypt & decrypt function definitions with @doc comments are in: lib/encryption/aes.ex
And tests are in: test/lib/aes_test.exs (https://github.com/dwyl/phoenix-ecto-encryption-example/blob/master/test/lib/aes_test.exs)

3.3 Get (Encryption) Key

You will have noticed that both encrypt and decrypt functions call a get_key/0 function.
We wrote a "dummy" function in Step 3.1; we need to define the "real" function now!

defp get_key do
  keys = Application.get_env(:encryption, Encryption.AES)[:keys]
  count = Enum.count(keys) - 1 # get the last/latest key from the key list
  get_key(count) # use get_key/1 to retrieve the desired encryption key.
end

defp get_key(key_id) do
  keys = Application.get_env(:encryption, Encryption.AES)[:keys] # cached call
  Enum.at(keys, key_id) # retrieve the desired key by key_id from keys list.
end

We define get_key twice in lib/encryption/aes.ex as per Erlang/Elixir standard, once for each "arity" or number of "arguments". In the first case get_key/0 assumes you want the latest Encryption Key. The second case get_key/1 lets you supply the key_id to be "looked up".

Both versions of get_key call the Application.get_env function: Application.get_env(:encryption, Encryption.AES)[:keys] specifically. For this to work we need to define the keys as an Environment Variable and make it available to our App in config.exs.
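
The lookup logic can be tried in isolation. In this sketch the key list is passed in as an argument instead of being read from the application config, and the module and function names (Sketch.Keys, latest_key/1, latest_key_id/1, key_by_id/2) are hypothetical. The "keys" are placeholder strings, not real key material:

```elixir
defmodule Sketch.Keys do
  # The latest key is the last element of the list, as in get_key/0.
  def latest_key(keys), do: List.last(keys)

  # The id of the latest key is its (zero-based) position in the list.
  def latest_key_id(keys), do: length(keys) - 1

  # Any key can be retrieved by its position, as in get_key/1.
  def key_by_id(keys, key_id), do: Enum.at(keys, key_id)
end

old_keys = ["key-A"]             # placeholder strings, not real keys
new_keys = old_keys ++ ["key-B"] # rotation appends a new key at the end

IO.inspect(Sketch.Keys.latest_key(new_keys))    # "key-B"
IO.inspect(Sketch.Keys.latest_key_id(new_keys)) # 1
IO.inspect(Sketch.Keys.key_by_id(new_keys, 0))  # "key-A"
```

Because new keys are appended, a key_id stored alongside older rows keeps pointing at the key that encrypted them, which is what makes rotation possible without re-encrypting everything at once.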

For the complete file containing these functions see: lib/encryption/aes.ex

ENCRYPTION_KEYS Environment Variable

In order for our get_key/0 and get_key/1 functions to work, they need to be able to "read" the encryption keys.

We need to "export" an Environment Variable containing a (comma-separated) list of (one or more) encryption key(s).

Copy-paste (and run) the following command in your terminal:

echo "export ENCRYPTION_KEYS='nMdayQpR0aoasLaq1g94FLba+A+wB44JLko47sVQXMg=,L+ZVX8iheoqgqb22mUpATmMDsvVGtafoAeb0KN5uWf0='" >> .env && echo ".env" >> .gitignore

For now, copy-paste this command exactly as it is.
When you are deploying your own App, generate your own AES encryption key(s); see the "How To Generate AES Encryption Keys?" section below for how to do this.

Note: there are two encryption keys separated by a comma. This is to demonstrate that it's possible to use multiple keys.
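
To see what the application will eventually do with this value, here is a sketch that parses the same comma-separated list and checks the size of each decoded key. The variable names are illustrative; the last two lines show one possible way to produce a fresh key of the same shape, which is an assumption and not necessarily the method described later in this readme:

```elixir
# The value of ENCRYPTION_KEYS exactly as written to the .env file above.
env_value =
  "'nMdayQpR0aoasLaq1g94FLba+A+wB44JLko47sVQXMg=,L+ZVX8iheoqgqb22mUpATmMDsvVGtafoAeb0KN5uWf0='"

keys =
  env_value
  |> String.replace("'", "")      # strip the single quotes
  |> String.split(",")            # one entry per key
  |> Enum.map(&Base.decode64!/1)  # Base64 text -> raw bytes

IO.inspect(Enum.map(keys, &byte_size/1)) # [32, 32]

# One way to generate a new random 256 bit key in the same format:
new_key = :crypto.strong_rand_bytes(32) |> Base.encode64()
IO.inspect(byte_size(new_key)) # 44
```

Each key decodes to 32 bytes, which is the size AES-256 expects.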

We prefer to store our Encryption Keys as Environment Variables; this is consistent with the "12 Factor App" best practice: https://en.wikipedia.org/wiki/Twelve-Factor_App_methodology

Update the config/config.exs to load the environment variables from the .env file into the application. Add the following code to your config file just above import_config "#{Mix.env()}.exs":

# run shell command to "source .env" to load the environment variables.
try do                                     # wrap in "try do"
  File.stream!("./.env")                   # in case .env file does not exist.
    |> Stream.map(&String.trim_trailing/1) # remove excess whitespace
    |> Enum.each(fn line -> line           # loop through each line
      |> String.replace("export ", "")     # remove "export" from line
      |> String.split("=", parts: 2)       # split on *first* "=" (equals sign)
      |> Enum.reduce(fn(value, key) ->     # stackoverflow.com/q/33055834/1148249
        System.put_env(key, value)         # set each environment variable
      end)
    end)
rescue
  _ -> IO.puts "no .env file found!"
end

# Set the Encryption Keys as an "Application Variable" accessible in aes.ex
config :encryption, Encryption.AES,
  keys: System.get_env("ENCRYPTION_KEYS") # get the ENCRYPTION_KEYS env variable
    |> String.replace("'", "")  # remove single-quotes around key list in .env
    |> String.split(",")        # split the CSV list of keys
    |> Enum.map(fn key -> :base64.decode(key) end) # decode the key.

Test the get_key/0 and get_key/1 Functions?

Given that get_key/0 and get_key/1 are both defp (i.e. "private") they are not "exported" with the AES module and therefore cannot be invoked outside of the AES module.

The get_key/0 and get_key/1 are invoked by encrypt/2 and decrypt/2 and thus provided these (public) latter functions are tested adequately, the "private" functions will be too.

Re-run the tests mix test test/lib/aes_test.exs and confirm they still pass.

The full encrypt & decrypt function definitions with @doc comments are in: lib/encryption/aes.ex
And tests are in: test/lib/aes_test.exs (https://github.com/dwyl/phoenix-ecto-encryption-example/blob/master/test/lib/aes_test.exs)

3.4 Hash Email Address

The idea behind hashing email addresses is to allow us to perform a lookup (in the database) to check if the email has already been registered/used for the app/system.

Imagine that alex@example.com has previously used your app. The SHA256 hash (encoded as base64) is: "bbYebcvPI5DkpGr0JvJqEzo77kUCFCL8euhukTbxQRA="

try it for yourself in iex:

iex(1)> email = "alex@example.com"
"alex@example.com"
iex(2)> email_hash = :crypto.hash(:sha256, email) |> Base.encode64
"bbYebcvPI5DkpGr0JvJqEzo77kUCFCL8euhukTbxQRA="

If we store the email_hash in the database, when Alex wants to log-in to the App/System, we simply perform a "lookup" in the users table:

hash  = :crypto.hash(:sha256, email) |> Base.encode64
query = "SELECT * FROM users WHERE email_hash = $1"
user  = Ecto.Adapters.SQL.query!(Encryption.Repo, query, [hash])
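
The property that makes this lookup possible is that the hash is deterministic. A minimal sketch, using a hypothetical address (person@example.com) and a hypothetical hash_email helper:

```elixir
# Deterministic hash: the same input always yields the same digest.
hash_email = fn email -> :crypto.hash(:sha256, email) |> Base.encode64() end

first = hash_email.("person@example.com")
second = hash_email.("person@example.com")
other = hash_email.("someone.else@example.com")

IO.inspect(first == second) # true  (same email, same hash)
IO.inspect(first == other)  # false (different email, different hash)
IO.inspect(byte_size(:crypto.hash(:sha256, "person@example.com"))) # 32
```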

Note: there's a "built-in" Ecto function, Repo.get_by, to perform this type of
"SELECT ... WHERE field = value" query effortlessly.

Generate the SECRET_KEY_BASE

All Phoenix apps have a secret_key_base for sessions. see: http://phoenixframework.org/blog/sessions

Run the following command to generate a new phoenix secret key:

mix phx.gen.secret

Copy-paste the output (a 64 character String) into your .env file after the "equals sign" on the line for SECRET_KEY_BASE:

export SECRET_KEY_BASE={YourSecretKeyBaseGeneratedUsing-mix_phx.gen.secret}

Your .env file should look similar to: .env_sample

Note: We are using an .env file, but if you are using a "Cloud Platform" to deploy your app,
you could consider using their "Key Management Service" for managing encryption keys.

We now need to update our config files again. Open your config.exs file and change the following: from

  secret_key_base: "3PXN/6k6qoxqQjWFskGew4r74yp7oJ1UNF6wjvJSHjC5Y5LLIrDpWxrJ84UBphJn",
  # your secret_key_base will be different but that is fine.

To

  secret_key_base: System.get_env("SECRET_KEY_BASE"),

As mentioned above, all Phoenix applications come with a secret_key_base. Instead of using this default one, we have told our application to use the new one that we added to our .env file.

Now we need to edit our config/test.exs file. Change the following: from

config :encryption, EncryptionWeb.Endpoint,
  http: [port: 4001],
  server: false

To

config :encryption, EncryptionWeb.Endpoint,
  http: [port: 4001],
  server: false,
  secret_key_base: "3PXN/6k6qoxqQjWFskGew4r74yp7oJ1UNF6wjvJSHjC5Y5LLIrDpWxrJ84UBphJn"

By adding the previous code block we will now have a secret_key_base which we will be able to use for testing.

Next, create a file called lib/encryption/hash_field.ex and include the following code:

defmodule Encryption.HashField do

  def hash(value) do
    :crypto.hash(:sha256, value <> get_salt(value))
  end

  # Get/use Phoenix secret_key_base as "salt" for one-way hashing Email address
  # use the *value* to create a *unique* "salt" for each value that is hashed:
  defp get_salt(value) do
    secret_key_base =
      Application.get_env(:encryption, EncryptionWeb.Endpoint)[:secret_key_base]
    :crypto.hash(:sha256, value <> secret_key_base)
  end
end

The hash/1 function uses Erlang's crypto library hash/2 function.

  • First we tell the hash/2 function that we want to use :sha256. "SHA 256" is the most widely used/recommended hash; it's both fast and "secure".
  • We then concatenate the value passed in to the hash/1 function (we defined) with a "salt" and hash the result. The "salt" comes from the get_salt/1 function, which reads the secret_key_base from the application config and hashes it together with the value to compute a unique "salt" for each value.

We use the SHA256 one-way hash for speed. We "salt" the email address so that the hash has some level of "obfuscation", in case the DB is ever "compromised" the "attacker" still has to "compute" a "rainbow table" from scratch.
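
The effect of the "salt" can be checked with a standalone sketch. Here the secret is passed in as an argument instead of being read from the Phoenix config; the module name Sketch.SaltedHash and the two secret strings are hypothetical:

```elixir
defmodule Sketch.SaltedHash do
  # Same construction as HashField.hash/1, with the secret as a parameter.
  def hash(value, secret) do
    salt = :crypto.hash(:sha256, value <> secret) # per-value "salt"
    :crypto.hash(:sha256, value <> salt)
  end
end

email = "person@example.com" # hypothetical address
salted = Sketch.SaltedHash.hash(email, "secret-one")

# Deterministic for the same secret, so it still works as a lookup key:
IO.inspect(salted == Sketch.SaltedHash.hash(email, "secret-one")) # true

# Different from the unsalted hash, and different under another secret:
IO.inspect(salted == :crypto.hash(:sha256, email))                # false
IO.inspect(salted == Sketch.SaltedHash.hash(email, "secret-two")) # false
```

This is why a precomputed table of plain SHA256 email hashes is of no use against the stored email_hash values, as long as the secret_key_base stays secret.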

Note: Don't forget to export your SECRET_KEY_BASE environment variable (see instructions above)

The full file containing these two functions is: lib/encryption/hash_field.ex
And the tests for the functions are: test/lib/hash_field_test.exs

3.5 Hash Password

When hashing passwords, we want to use the strongest hashing algorithm and we also want the hashed value (or "digest") to be different each time the same plaintext is hashed (unlike when hashing the email address where we want a deterministic digest).

Using argon2 makes "cracking" a password (in the event of the database being "compromised") far less likely, as it has both a CPU-bound "work-factor" and a "Memory-hard" algorithm which will significantly "slow down" the attacker.

Add the argon2 Dependency

In order to use argon2 we must add it to our mix.exs file: in the defp deps do (dependencies) section, add the following line:

{:argon2_elixir, "~> 1.3"},  # securely hashing & verifying passwords

You will need to run mix deps.get to install the dependency.

Define the hash_password/1 Function

Create a file called lib/encryption/password_field.ex in your project. The first function we need is hash_password/1:

defmodule Encryption.PasswordField do

  def hash_password(value) do
    Argon2.Base.hash_password(to_string(value),
      Argon2.gen_salt(), [{:argon2_type, 2}])
  end

end

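hash_password/1 accepts a password to be hashed and invokes Argon2.Base.hash_password/3 passing in 3 arguments:

  • to_string(value) - the password, converted to a String.
  • Argon2.gen_salt() - a freshly generated random salt, so the same password produces a different hash each time.
  • [{:argon2_type, 2}] - options selecting the Argon2id variant (type 2).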

3.5.1 Test the hash_password/1 Function?

In order to test the PasswordField.hash_password/1 function we use the Argon2.verify_pass function to verify a password hash.

Create a file called test/lib/password_field_test.exs and copy-paste (or hand-type) the following test:

defmodule Encryption.PasswordFieldTest do
  use ExUnit.Case
  alias Encryption.PasswordField, as: Field

  test "hash_password/1 uses Argon2id to Hash a value" do
    password = "EverythingisAwesome"
    hash = Field.hash_password(password)
    verified = Argon2.verify_pass(password, hash)
    assert verified
  end

end

Run the test using the command:

mix test test/lib/password_field_test.exs

The test should pass; if not, please re-trace the steps.

3.6 Verify Password

The corresponding function to check (or "verify") the password is verify_password/2. We need to supply both the password and stored_hash (the hash that was previously stored in the database when the person registered or updated their password). It then runs Argon2.verify_pass which does the checking.

def verify_password(password, stored_hash) do
  Argon2.verify_pass(password, stored_hash)
end

hash_password/1 and verify_password/2 functions are defined in: lib/encryption/password_field.ex

Test for verify_password/2

To test that our verify_password/2 function works as expected, open the file: test/lib/password_field_test.exs
and add the following code to it:

test "verify_password checks the password against the Argon2id Hash" do
  password = "EverythingisAwesome"
  hash = Field.hash_password(password)
  verified = Field.verify_password(password, hash)
  assert verified
end

test ".verify_password fails if password does NOT match hash" do
  password = "EverythingisAwesome"
  hash = Field.hash_password(password)
  verified = Field.verify_password("LordBusiness", hash)
  assert !verified
end

Run the tests: mix test test/lib/password_field_test.exs and confirm they pass.

If you get stuck, see: test/lib/password_field_test.exs

4. Create EncryptedField Custom Ecto Type

Writing a few functions to encrypt, decrypt and hash data is a good start,
however the real "magic" comes from defining these functions as Custom Ecto Types.

When we first created the Ecto Schema for our "user" in Step 2 (above), the generator created the lib/encryption/user.ex file with the following schema:

schema "users" do
  field :email, :binary
  field :email_hash, :binary
  field :key_id, :integer
  field :name, :binary
  field :password_hash, :binary

  timestamps()
end

The default Ecto field types (:binary) are a good start. But we can do so much better if we define custom Ecto Types!

Ecto Custom Types are a way of automatically "pre-processing" data before inserting it into (and reading from) a database. Examples of "pre-processing" include:

  • Custom Validation e.g: phone number or address format.
  • Encrypting / Decrypting
  • Hashing

A custom type expects 4 "callback" functions to be implemented in the file:

  • type/0 - define the Ecto Type we want Ecto to use to store the data for our Custom Type. e.g: :integer or :binary
  • cast/1 - "typecasts" (converts) the given data to the desired type e.g: Integer to String.
  • dump/1 - performs the "processing" on the raw data before it gets "dumped" into the Ecto Native Type.
  • load/1 - called when loading data from the database; it receives an Ecto native type.
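
To get a feel for how the four callbacks fit together, here is a toy sketch that can be run without Ecto. It is not a real Ecto type: it deliberately omits @behaviour Ecto.Type, and hex encoding stands in for the "processing" step purely for illustration. The module name Sketch.HexField is hypothetical:

```elixir
defmodule Sketch.HexField do
  # Storage type the database column would use for this toy example.
  def type, do: :string

  # cast/1: normalise whatever the caller passed in to a String.
  def cast(value), do: {:ok, to_string(value)}

  # dump/1: transform the value on its way INTO the database.
  def dump(value), do: {:ok, Base.encode16(value)}

  # load/1: reverse the transformation on the way OUT of the database.
  def load(value), do: Base.decode16(value)
end

{:ok, casted} = Sketch.HexField.cast(42)       # "42"
{:ok, stored} = Sketch.HexField.dump(casted)   # "3432"
{:ok, loaded} = Sketch.HexField.load(stored)   # "42"
IO.inspect({casted, stored, loaded})           # {"42", "3432", "42"}
```

An encrypting type follows the same shape, with encryption in dump/1 and decryption in load/1.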

Create a file called lib/encryption/encrypted_field.ex and add the following:

defmodule Encryption.EncryptedField do
  alias Encryption.AES  # alias our AES encrypt & decrypt functions (3.1 & 3.2)

  @behaviour Ecto.Type  # Check this module conforms to Ecto.type behavior.
  def type, do: :binary # :binary is the data type ecto uses internally

  # cast/1 simply calls to_string on the value and returns a "success" tuple
  def cast(value) do
    {:ok, to_string(value)}
  end

  # dump/1 is called when the field value is about to be written to the database
  def dump(value) do
    ciphertext = value |> to_string |> AES.encrypt
    {:ok, ciphertext} # ciphertext is :binary data
  end

  # load/1 is called when the field is loaded from the database
  def load(value) do
    {:ok, AES.decrypt(value)} # decrypted data is :string type.
  end

  # load/2 is called with a specific key_id when the field is loaded from DB
  def load(value, key_id) do
    {:ok, AES.decrypt(value, key_id)}
  end
end

Let's step through each of these:

type/0

The best data type for storing encrypted data is :binary (storing the same ciphertext as text, e.g. hex or Base64 encoded, would take up more "space").

cast/1

Cast any data type to_string before encrypting it. (the encrypted data "ciphertext" will be of :binary type)

dump/1

Calls the AES.encrypt/1 function we defined in section 3.1 (above) so data is encrypted before we insert into the database.

load/1

Calls the AES.decrypt/1 function so data is decrypted when it is read from the database.

load/2

Calls the AES.decrypt/2 function so we can decrypt the ciphertext using a specific encryption key. Note: Ecto does not invoke this function directly, we are using it in our user.ex file. (see below)

Note: the load/2 function is not required for Ecto Type compliance. Further reading: https://hexdocs.pm/ecto/Ecto.Type.html

Your encrypted_field.ex Custom Ecto Type should look like this: lib/encryption/encrypted_field.ex
Try to write the tests for the callback functions; if you get "stuck", take a look at: test/lib/encrypted_field_test.exs

5. Use EncryptedField Ecto Type in User Schema

Now that we have defined a Custom Ecto Type EncryptedField, we can use the Type in our User Schema. Add the following line to "alias" the Type and a User in the lib/encryption/user.ex file:

alias Encryption.{EncryptedField, User}

Update the lines for :email and :name in the schema
from:

schema "users" do
  field :email, :binary
  field :email_hash, :binary
  field :key_id, :integer
  field :name, :binary
  field :password_hash, :binary

  timestamps()
end

To:

schema "users" do
  field :email, EncryptedField
  field :email_hash, :binary
  field :key_id, :integer
  field :name, EncryptedField
  field :password_hash, :binary

  timestamps()
end

We need to make two further changes:

First, we need a function to encrypt the :email and :name fields. In the user.ex file add the encrypt_fields/1 function:

defp encrypt_fields(changeset) do
  case changeset.valid? do
    true ->
      {:ok, encrypted_email} = EncryptedField.dump(changeset.data.email)
      {:ok, encrypted_name} = EncryptedField.dump(changeset.data.name)
      changeset
      |> put_change(:email, encrypted_email)
      |> put_change(:name, encrypted_name)
    _ ->
      changeset
  end
end

Second, we need to update the changeset function to include a line calling the encrypt_fields/1 function:
From:

def changeset(user, attrs) do
  user
  |> cast(attrs, [:name, :email, :email_hash])
  |> validate_required([:name, :email, :email_hash])
end

To:

def changeset(%User{} = user, attrs \\ %{}) do
  user
  |> Map.merge(attrs) # merge any attributes into
  |> cast(attrs, [:name, :email])
  |> validate_required([:name, :email])
  |> encrypt_fields   # encrypt the :name and :email fields prior to DB insert
end

Adding |> Map.merge(attrs) to the changeset function will merge any additional attributes before further checks are performed and adding |> encrypt_fields will encrypt the :name and :email fields prior to the user being inserted into the database.

Note we have only added the code to encrypt the :name and :email fields on the changeset. We still need to decrypt the data when it is retrieved from the database. Decryption on data retrieval is covered below.

6. Create HashField Ecto Type for Hashing Email Address

We already added the hash/1 function to (SHA256) hash the email address above in step 3.4,
now we are going to use it in an Ecto Type.

As we did for the EncryptedField Ecto Type in section 4 (above), the HashField needs the same four "ecto callbacks":

  • type/0 - :binary is appropriate for hashed data
  • cast/1 - Cast any data type to_string before hashing it. (the hashed data will be stored as :binary type)
  • dump/1 - Calls the hash/1 function we defined in section 3.4 (above).
  • load/1 - Returns the {:ok, value} tuple (unmodified) because a hash cannot be "undone".

The code is pretty straightforward. Update the lib/encryption/hash_field.ex file to:

defmodule Encryption.HashField do
  @behaviour Ecto.Type

  def type, do: :binary

  def cast(value) do
    {:ok, to_string(value)}
  end

  def dump(value) do
    {:ok, hash(value)}
  end

  def load(value) do
    {:ok, value}
  end

  def hash(value) do
    :crypto.hash(:sha256, value <> get_salt(value))
  end

  # Get/use Phoenix secret_key_base as "salt" for one-way hashing Email address
  # use the *value* to create a *unique* "salt" for each value that is hashed:
  defp get_salt(value) do
    secret_key_base =
      Application.get_env(:encryption, EncryptionWeb.Endpoint)[:secret_key_base]
    :crypto.hash(:sha256, value <> secret_key_base)
  end
end
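The reason a hash is useful for looking up an email address is that it is deterministic: the same input always produces the same output. The sketch below reproduces the salting construction from hash/1 above in a standalone module. The HashSketch module name and the hard-coded secret are assumptions for illustration; the tutorial reads secret_key_base from the application config instead.

```elixir
defmodule HashSketch do
  # Hypothetical secret for illustration only, never hard-code one in real code.
  @secret "an-illustrative-secret-not-for-production"

  # Same construction as Encryption.HashField.hash/1:
  # sha256(value <> sha256(value <> secret))
  def hash(value) when is_binary(value) do
    :crypto.hash(:sha256, value <> salt(value))
  end

  defp salt(value), do: :crypto.hash(:sha256, value <> @secret)
end

a = HashSketch.hash("alex@example.com")
b = HashSketch.hash("alex@example.com")
c = HashSketch.hash("alice@example.com")

IO.inspect(a == b)        # same input, same hash
IO.inspect(a == c)        # different input, different hash
IO.inspect(byte_size(a))  # a SHA-256 digest is 32 bytes
```

Because the digest is stable, a unique index on :email_hash can detect duplicate email addresses without the database ever seeing the plaintext.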

7. Use HashField Ecto Type in User Schema

First add the alias for HashField near the top of the lib/encryption/user.ex file. e.g:

alias Encryption.{User, Repo, EncryptedField, HashField}

Next, in the lib/encryption/user.ex file, update the lines for email_hash in the users schema
from:

schema "users" do
  field :email, EncryptedField
  field :email_hash, :binary
  field :key_id, :integer
  field :name, EncryptedField
  field :password_hash, :binary
  timestamps()
end

To:

schema "users" do
  field :email, EncryptedField
  field :email_hash, HashField
  field :key_id, :integer
  field :name, EncryptedField
  field :password_hash, :binary

  timestamps()
end

Then we need to create a function to perform the hashing of the :email field:

defp set_hashed_fields(changeset) do
  case changeset.valid? do
    true ->
      changeset
      |> put_change(:email_hash, HashField.hash(changeset.data.email))
    _ ->
      changeset # return unmodified
  end
end

Finally, add the set_hashed_fields/1 function call to the changeset/2 pipeline. From:

def changeset(%User{} = user, attrs \\ %{}) do
  user
  |> Map.merge(attrs) # merge any attributes into
  |> cast(attrs, [:name, :email])
  |> validate_required([:name, :email])
  |> encrypt_fields   # encrypt the :name and :email fields prior to DB insert
end

To:

def changeset(%User{} = user, attrs \\ %{}) do
  user
  |> Map.merge(attrs)
  |> cast(attrs, [:name, :email])
  |> validate_required([:name, :email])
  |> set_hashed_fields              # set the email_hash field
  |> unique_constraint(:email_hash) # check email_hash is not already in DB
  |> encrypt_fields
end

We should test this new functionality. Create the file test/user/user_test.exs and add the following:

  test "inserting a user sets the :email_hash field" do
    user = Repo.insert! User.changeset(%User{}, @valid_attrs)
    assert user.email_hash == Encryption.HashField.hash(@valid_attrs.email)
  end

  test "changeset validates uniqueness of email through email_hash" do
    Repo.insert! User.changeset(%User{}, @valid_attrs) # first insert works.
    # Now attempt to insert the *same* user again:
    {:error, changeset} = Repo.insert User.changeset(%User{}, @valid_attrs)
    {:ok, message} = Keyword.fetch(changeset.errors, :email_hash)
    msg = List.first(Tuple.to_list(message))
    assert "has already been taken" == msg
  end

For the full user tests please see: test/user/user_test.exs

8. Create PasswordField Ecto Type for Hashing Passwords

We already added the hash_password/1 function in step 3.5, now we are going to use it in an Ecto Type.

As for the EncryptedField and HashField Ecto Types in sections 4 and 6 (above), the PasswordField needs the same four "ecto callbacks":

  • type/0 - :binary is appropriate for hashed data
  • cast/1 - Cast any data type to_string before hashing it. (the hashed data will be stored as :binary type)
  • dump/1 - Calls the hash_password/1 function we defined in section 3.5 (above).
  • load/1 - Returns the {:ok, value} tuple (unmodified) because a hash cannot be "undone".

The code is pretty straightforward. Update the lib/encryption/password_field.ex file to:

defmodule Encryption.PasswordField do
  @behaviour Ecto.Type

  def type, do: :binary

  def cast(value) do
    {:ok, to_string(value)}
  end

  def dump(value) do
    {:ok, hash_password(value)}
  end

  def load(value) do
    {:ok, value}
  end

  def hash_password(value) do
    Argon2.Base.hash_password(to_string(value),
      Argon2.gen_salt(), [{:argon2_type, 2}])
  end

  def verify_password(password, stored_hash) do
    Argon2.verify_pass(password, stored_hash)
  end
end

9. Use PasswordField Ecto Type in User Schema

As before, we need to use the PasswordField in our User Schema. Remember to alias the module at the top of the lib/encryption/user.ex file. e.g:

alias Encryption.{User, Repo, EncryptedField, HashField, PasswordField}

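Now we simply extend the set_hashed_fields/1 function we defined in part 7 (above) to set the :password_hash field on the changeset. This reads changeset.data.password, so the schema needs the virtual :password field shown in the schema in step 10.2 (below). From: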

defp set_hashed_fields(changeset) do
  case changeset.valid? do
    true ->
      changeset
      |> put_change(:email_hash, HashField.hash(changeset.data.email))
    _ ->
      changeset # return unmodified
  end
end

To:

defp set_hashed_fields(changeset) do
  case changeset.valid? do
    true ->
      changeset
      |> put_change(:email_hash, HashField.hash(changeset.data.email))
      |> put_change(:password_hash,
        PasswordField.hash_password(changeset.data.password))
    _ ->
      changeset # return unmodified
  end
end

That's it.

For the full user.ex code see: lib/encryption/user.ex and for the tests see: test/user/user_test.exs


10. Refactor set_hashed_fields/1 and encrypt_fields/1...?

One of the best ways to confirm that you understood the code is to attempt to refactor it.

This step is optional (you can skip it if you're not confident with your Elixir skills), however it is recommended you at least read through it.

Note: in practice we don't tend to re-factor our code until we have shipped it, encountered a "bottleneck" (a need for optimisation) or we need to extend some code and want to make it "DRY" first.

Remember: it's only "refactoring" if there are complete tests otherwise it's "roulette"! (changing code when you don't have tests, will almost always result in bugs because without tests, not all test cases are considered ...)

In order for this refactor to succeed we need to follow these 4 steps:

  1. Do not touch the tests.
  2. Update the Schema (to ensure the data that needs to be hashed is not encrypted before we try to hash it!)
  3. Create a "generic" function to perform all our data transformations that will replace set_hashed_fields/1 and encrypt_fields/1.
  4. Update the changeset/2 function to use the new function and remove the calls to set_hashed_fields/1 and encrypt_fields/1.

10.1 Ensure All Tests Pass

Typically we will create a git commit (if we don't already have one) for the "known state" where the tests were passing (before starting the refactor).

The commit before refactoring the example is: https://github.com/dwyl/phoenix-ecto-encryption-example/tree/3659399ec32ca4f07f45d0552b9cf25c359a2456

The corresponding Travis-CI build for this commit is: https://travis-ci.org/dwyl/phoenix-ecto-encryption-example/jobs/379887597#L833

Note: if you are new to Travis-CI see: https://github.com/dwyl/learn-travis

10.2 Re-order :email_hash Field in User Schema

We need to re-order the fields in the User schema so that :email_hash comes before :email. That way the email address is hashed while it is still plaintext; if it were encrypted before being hashed, the hash would always be different!
From:

schema "users" do
  field :email, EncryptedField # :binary
  field :email_hash, Encryption.HashField # :binary
  field :key_id, :integer
  field :name, EncryptedField # :binary
  field :password, :binary, virtual: true # virtual means "don't persist"
  field :password_hash, Encryption.PasswordField # :binary

  timestamps() # creates columns for inserted_at and updated_at timestamps. =)
end

To:

schema "users" do
  field :email_hash, HashField # :binary
  field :email, EncryptedField # :binary
  field :key_id, :integer
  field :name, EncryptedField # :binary
  field :password, :binary, virtual: true # virtual means "don't persist"
  field :password_hash, PasswordField # :binary

  timestamps() # creates columns for inserted_at and updated_at timestamps. =)
end
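The re-ordering above matters because encrypting the same plaintext twice gives two different ciphertexts when a fresh random IV is used each time, so a hash of the ciphertext is useless for lookups. The sketch below illustrates this with AES-256-GCM. It is a standalone illustration, not the tutorial's AES module: the OrderSketch name, the 12-byte IV and the iv <> tag <> ciphertext layout are assumptions, and :crypto.crypto_one_time_aead/6 needs Erlang/OTP 22 or later.

```elixir
defmodule OrderSketch do
  # Encrypt with AES-256-GCM using a fresh random IV on every call.
  # Returns iv <> tag <> ciphertext (a layout chosen for this sketch only).
  def encrypt(plaintext, key) when byte_size(key) == 32 do
    iv = :crypto.strong_rand_bytes(12)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, plaintext, "", true)

    iv <> tag <> ciphertext
  end

  # Plain SHA-256, enough to show the ordering problem.
  def hash(value), do: :crypto.hash(:sha256, value)
end

key = :crypto.strong_rand_bytes(32)
email = "alex@example.com"

# Hashing the plaintext is stable across calls:
IO.inspect(OrderSketch.hash(email) == OrderSketch.hash(email))

# Hashing the ciphertext is not, because each encryption differs:
first = OrderSketch.hash(OrderSketch.encrypt(email, key))
second = OrderSketch.hash(OrderSketch.encrypt(email, key))
IO.inspect(first == second)
```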

10.3 Create One Generic (DRY) Function that Replaces Two Specific (WET)

In the user.ex file we have two functions that perform similar tasks, preparing data to be inserted into the database. Specifically: set_hashed_fields/1 and encrypt_fields/1 which perform hashing and encryption respectively. We can replace both with a single prepare_fields/1 function:

defp prepare_fields(changeset) do
  case changeset.valid? do # don't bother transforming the data if invalid.
    true ->
      struct = changeset.data.__struct__  # get name of Ecto Struct. e.g: User
      fields = struct.__schema__(:fields) # get list of fields in the Struct
      # create map of data transforms stackoverflow.com/a/29924465/1148249
      changes = Enum.reduce fields, %{}, fn field, acc ->
        type = struct.__schema__(:type, field)
        # only transform fields whose type is one of our custom "Encryption." types
        if String.contains? Atom.to_string(type), "Encryption." do
          primary = case type do
            Encryption.HashField -> # "primary" field for :email_hash is :email
              :email
            Encryption.PasswordField ->
              :password
            _ ->
              field
          end
          data = Map.get(changeset.data, primary)    # get plaintext data
          {:ok, transformed_value} = type.dump(data) # dump (encrypt/hash)
          Map.put(acc, field, transformed_value)     # assign key:value to Map
        else
          acc  # always return the accumulator to avoid "nil is not a map!"
        end
      end
      %{changeset | changes: changes} # apply the changes to the changeset
    _ ->
      changeset # return the changeset unmodified for the next function in pipe
  end
end

This function uses "type introspection" to determine which fields are on the User struct (schema). It then loops through those fields and determines which dump function needs to be applied to each one. We know that hashed fields need the plaintext data, so for :email_hash and :password_hash we read from the "primary" fields :email and :password respectively. Finally we apply the changes to the changeset.
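The core of prepare_fields/1 is an Enum.reduce that builds a map of transformed values, skipping fields that need no transformation and looking up the "primary" plaintext field for hashed ones. The sketch below shows that pattern on a plain map so it runs without Ecto. The PrepareSketch module, the @transforms table and the placeholder "encrypted:" prefix are all hypothetical stand-ins for the schema introspection and the custom types' dump/1 functions.

```elixir
defmodule PrepareSketch do
  # Hypothetical field => transform table standing in for the schema's
  # field types; nil marks fields that are left untouched (like :key_id).
  @transforms [
    email_hash: :hash,
    email: :encrypt,
    key_id: nil,
    name: :encrypt
  ]

  # Build a map of transformed values from a map of plaintext data.
  def prepare(data) when is_map(data) do
    Enum.reduce(@transforms, %{}, fn
      # no transform for this field: return the accumulator unchanged
      {_field, nil}, acc ->
        acc

      {field, transform}, acc ->
        plaintext = Map.fetch!(data, primary(field))
        {:ok, value} = dump(transform, plaintext)
        Map.put(acc, field, value)
    end)
  end

  # A hashed field is derived from another ("primary") plaintext field.
  defp primary(:email_hash), do: :email
  defp primary(field), do: field

  # Stand-ins for the custom types' dump/1 functions.
  defp dump(:hash, value), do: {:ok, :crypto.hash(:sha256, value)}
  # Placeholder only, this is NOT real encryption.
  defp dump(:encrypt, value), do: {:ok, "encrypted:" <> value}
end

changes = PrepareSketch.prepare(%{email: "alex@example.com", name: "Alex", key_id: 1})
IO.inspect(Map.keys(changes))
```

Returning the accumulator in every clause is what keeps the reduce from ending up with nil, which is the same reason the else branch in prepare_fields/1 returns acc.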

10.4 Update changeset/2 function to use prepare_fields/1

The last step is the easiest one. Simply update the changeset/2 function, from:

def changeset(%User{} = user, attrs \\ %{}) do
  user
  |> Map.merge(attrs)
  |> cast(attrs, [:name, :email])
  |> validate_required([:name, :email])
  |> set_hashed_fields              # set the email_hash field
  |> unique_constraint(:email_hash) # check email_hash is not already in DB
  |> encrypt_fields
end

To:

def changeset(%User{} = user, attrs \\ %{}) do
  user
  |> Map.merge(attrs)
  |> cast(attrs, [:name, :email])
  |> validate_required([:name, :email])
  |> prepare_fields # hash and/or encrypt the personal data before db insert!
  |> unique_constraint(:email_hash) # only after the email has been hashed!
end

Done! Re-run the tests! You should see: https://travis-ci.org/dwyl/phoenix-ecto-encryption-example/builds/380557211#L833

The user.ex file now has fewer lines of code which are arguably more maintainable. The end state of the file after the refactor: user.ex

Conclusion

We have gone through how to create custom Ecto Types in order to define our own functions for handling (transforming) specific types of data.

Our hope is that you have understood the flow.

We plan to extend this tutorial to include a User Interface; please "star" the repo if you would find that useful.



How To Generate AES Encryption Keys?

Encryption keys should be the appropriate length (in bits) as required by the chosen algorithm.

An AES 128-bit key can be expressed as a hexadecimal string with 32 characters.
It will require 24 characters in base64.

An AES 256-bit key can be expressed as a hexadecimal string with 64 characters.
It will require 44 characters in base64.

see: https://security.stackexchange.com/a/45334/117318

Open iex in your Terminal and paste the following line (then press enter)

:crypto.strong_rand_bytes(32) |> :base64.encode

You should see terminal output similar to the following:

[screenshot: elixir-generate-encryption-key]

We generated 3 keys for demonstration purposes:

  • "h6pUk0ZccS0pYsibHZZ4Cd+PRO339rMA7sMz7FnmcGs="
  • "nMd/yQpR0aoasLaq1g94FL/a+A+wB44JLko47sVQXMg="
  • "L+ZVX8iheoqgqb22mUpATmMDsvVGt/foAe/0KN5uWf0="

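These two Erlang functions are described in the Erlang documentation for the crypto module (strong_rand_bytes/1) and the base64 module (encode/1).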

Base64 encoding the bytes generated by strong_rand_bytes will make the output human-readable (whereas bytes are less user-friendly).
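As a quick check of the key lengths mentioned above, the following sketch generates keys and confirms their Base64 length and decoded byte size. The KeySketch module and its function names are hypothetical helpers for illustration; only :crypto.strong_rand_bytes/1, :base64.encode/1 and :base64.decode/1 come from Erlang.

```elixir
defmodule KeySketch do
  # Generate `bytes` random bytes and return them Base64-encoded.
  def generate(bytes \\ 32) do
    bytes |> :crypto.strong_rand_bytes() |> :base64.encode()
  end

  # Decode a Base64 key back to raw bytes, as needed before encrypting.
  def decode(encoded), do: :base64.decode(encoded)
end

key = KeySketch.generate()

IO.inspect(String.length(key))                     # 44 characters for a 256-bit key
IO.inspect(byte_size(KeySketch.decode(key)))       # 32 bytes once decoded
IO.inspect(String.length(KeySketch.generate(16)))  # 24 characters for a 128-bit key
```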



Useful Links, FAQ & Background Reading

Running a Single Test

To run a single test (e.g: while debugging), use the following syntax:

mix test test/user/user_test.exs:9

For more detail, please see: https://hexdocs.pm/phoenix/testing.html

Ecto Validation Error format

When Ecto changeset validation fails, for example if there is a "unique" constraint on email address (so that people cannot re-register with the same email address twice), Ecto returns the changeset with an errors key:

#Ecto.Changeset<
  action: :insert,
  changes: %{
    email: <<224, 124, 228, 125, 105, 102, 38, 170, 15, 199, 228, 198, 245, 189,
      82, 193, 164, 14, 182, 8, 189, 19, 231, 49, 80, 223, 84, 143, 232, 92, 96,
      156, 100, 4, 7, 162, 26, 2, 121, 32, 187, 65, 254, 50, 253, 101, 202>>,
    email_hash: <<21, 173, 0, 16, 69, 67, 184, 120, 1, 57, 56, 254, 167, 254,
      154, 78, 221, 136, 159, 193, 162, 130, 220, 43, 126, 49, 176, 236, 140,
      131, 133, 130>>,
    key_id: 1,
    name: <<2, 215, 188, 71, 109, 131, 60, 147, 219, 168, 106, 157, 224, 120,
      49, 224, 225, 181, 245, 237, 23, 68, 102, 133, 85, 62, 22, 166, 105, 51,
      239, 198, 107, 247, 32>>,
    password_hash: <<132, 220, 9, 85, 60, 135, 183, 155, 214, 215, 156, 180,
      205, 103, 189, 137, 81, 201, 37, 214, 154, 204, 185, 253, 144, 74, 222,
      80, 158, 33, 173, 254>>
  },
  errors: [email_hash: {"has already been taken", []}],
  data: #Encryption.User<>,
  valid?: false
>

The errors part is:

[email_hash: {"has already been taken", []}]

A tuple wrapped in a keyword list.

Why this construct? A changeset can have multiple errors, so they're stored as a keyword list, where the key is the field, and the value is the error tuple.
The first item in the tuple is the error message, and the second is another keyword list, with additional information that we would use when mapping over the errors in order to make them more user-friendly (though here, it's empty).
See the Ecto docs for add_error/4 and traverse_errors/2 for more information.

So to access the error message "has already been taken" we need some pattern-matching and list popping:

{:error, changeset} = Repo.insert User.changeset(%User{}, @valid_attrs)
{:ok, message} = Keyword.fetch(changeset.errors, :email_hash)
msg = List.first(Tuple.to_list(message))
assert "has already been taken" == msg
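The same message can also be extracted by pattern matching on the error tuple directly, which avoids converting the tuple to a list. This is a minimal sketch on a plain keyword list shaped like changeset.errors, so it runs without Ecto; the ErrorSketch module and message/2 function are hypothetical names for illustration.

```elixir
defmodule ErrorSketch do
  # Return the message for `field` from a changeset-style errors list,
  # or nil when the field has no error.
  def message(errors, field) when is_list(errors) do
    case Keyword.fetch(errors, field) do
      {:ok, {msg, _opts}} -> msg
      :error -> nil
    end
  end
end

errors = [email_hash: {"has already been taken", []}]

IO.inspect(ErrorSketch.message(errors, :email_hash))
IO.inspect(ErrorSketch.message(errors, :name))
```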

To see this in action run:

mix test test/user/user_test.exs:40

Stuck / Need Help?

If you get "stuck", please open an issue on GitHub: https://github.com/nelsonic/phoenix-ecto-encryption-example/issues describing the issue you are facing with as much detail as you can.



Credits

Inspiration/credit/thanks for this example goes to Daniel Berkompas @danielberkompas for his post:
https://blog.danielberkompas.com/2015/07/03/encrypting-data-with-ecto

Daniel's post is for Phoenix v0.14.0 which is quite "old" now ...
therefore a few changes/updates are required.
e.g: There are no more "Models" in Phoenix 1.3 or Ecto callbacks.

Also, his post only includes the "sample code"; it is not a complete example
and does not explain the functions & Custom Ecto Types.
This means anyone following the post needs to manually copy-paste the code ... and "figure out" the "gaps" themselves to make it work.
We prefer to include the complete "end state" of any tutorial (not just "samples")
so that anyone can git clone and run the code locally to fully understand it.

Still, props to Daniel for his post, a good intro to the topic!
