Subdomain-based SaaS with Phoenix - Part 1

Eric Sullivan

Tagged as severus, subdomain, saas, oauth2, elixir, authentication

Yesterday’s announcement regarding Organizations and OAuth2 hid a lot of complexity. It was a net change of about 10k LOC (and at one point double that), but I used a neat trick to re-use my LiveView test suite. I wanted to cover some of the technical changes, as there have not been many posts about using Phoenix with subdomains (when I searched, the second result was an article from 2015).

Hosting

I’m using Namecheap for DNS and Fly.io for hosting. They both have links to describe the wildcard DNS Setup and SSL Certificate process. For local development, I just added a bunch of /etc/hosts entries:

127.0.0.1 severus.local
127.0.0.1 org-1.severus.local
127.0.0.1 org-2.severus.local
...

A Different top-level Domain

One interesting decision I made was to host on a slightly different domain (getseverus.org vs getseverus.com). This was pragmatic, as I was already using a subdomain (hydra), and I wasn’t sure what names I’d want to reserve for future use. Many products, such as Slack and GitLab, use reserved names, and there are recommended lists. I skipped that decision by using a different domain. I did reserve intever and severus though.

This did cause a small issue with LiveView, as it did not like a host that was different from the endpoint’s. You can fix that by setting check_origin on the Endpoint:

  host = System.fetch_env!("HOST")
  org_host = System.fetch_env!("ORG_HOST")

  config :severus_web, SeverusWeb.Endpoint,
    url: [host: host, port: 443, scheme: "https"],
    org_url: [host: org_host, port: 443, scheme: "https"],
    check_origin: ["https://#{host}", "https://*.#{org_host}"]

In development I set both HOST and ORG_HOST to severus.local to match /etc/hosts.

Recognizing the Subdomain

I also added the org_url configuration to the Endpoint so I could reference it in my subdomain plug. That 2015 article I mentioned provided the base for this module:

defmodule SeverusWeb.Plug.Subdomain do
  import Plug.Conn

  @doc false
  def init(options), do: options

  @doc false
  def call(conn, _opts) do
    case get_subdomain(conn.host) do
      subdomain when byte_size(subdomain) > 0 ->
        case Severus.Accounts.get_organization_by_name(subdomain) do
          {:ok, organization} ->
            organization =
              case organization.oauth_client do
                %Severus.OAuth.Client{managed: true} = client ->
                  %{
                    organization
                    | client_id: client.client_id,
                      client_secret: client.managed_client_secret
                  }

                nil ->
                  organization
              end

            conn
            |> assign(:current_organization, organization)

          {:error, :not_found} ->
            conn
            |> put_resp_content_type("text/html")
            |> send_resp(404, "Organization Not Found - #{subdomain}")
            |> halt()
        end

      _ ->
        conn
        |> assign(:current_organization, nil)
    end
  end

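  defp get_subdomain(host) do
    subdomain_host = SeverusWeb.Endpoint.config(:org_url)[:host]
    suffix = "." <> subdomain_host

    # Only hosts of the form "<name>.<org host>" carry a subdomain. A plain
    # suffix match avoids interpolating the host (and its dots) into a regex.
    if String.ends_with?(host, suffix) do
      String.replace_suffix(host, suffix, "")
    else
      ""
    end
  end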
end

I installed it in my Endpoint, right before the router:

plug SeverusWeb.Plug.Subdomain
plug SeverusWeb.Router

The only significant difference is the oauth_client case statement, which I’m using to provide dynamic credentials to an Überauth client. I’ll open source that soon; I just need to rename the code base from Spidersilk to Severus (which was the code-name for this product when it was still a baby repo).
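The host parsing is the part of the plug that is easiest to get subtly wrong, so it helps to check it in isolation. Here is a minimal sketch, assuming a hypothetical SubdomainSketch module that takes the org host as an argument instead of reading it from the Endpoint config, and that uses a plain suffix match:

```elixir
defmodule SubdomainSketch do
  @moduledoc "Standalone sketch of the host parsing; not the module from this post."

  # Returns the part of `host` in front of `org_host`, or "" when the
  # request is for the bare org host or for an unrelated host.
  def extract(host, org_host) do
    suffix = "." <> org_host

    if String.ends_with?(host, suffix) do
      String.replace_suffix(host, suffix, "")
    else
      ""
    end
  end
end

IO.inspect(SubdomainSketch.extract("org-1.severus.local", "severus.local"))
IO.inspect(SubdomainSketch.extract("severus.local", "severus.local"))
IO.inspect(SubdomainSketch.extract("example.com", "severus.local"))
```

The first call yields the organization name, and the other two yield an empty string, which the plug treats as "no organization".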

Adding the current_organization to the LiveView

Once the current_organization was set, I also needed to make sure it was available in the LiveView. I saw an example of passing the host using get_connect_params/1 or handle_params, but I already had the organization, and just wanted to pass that (or its ID) to the LiveView. I used the live_session/3 on_mount and session options in the router:

  live_session :default,
    session: {SeverusWeb.InitAssigns, :session, []},
    on_mount: {SeverusWeb.InitAssigns, :user} do
    scope "/", SeverusWeb do
      pipe_through [:browser, :require_authenticated_user]

      # ... routes that require an authenticated user ...

    end
  end

With this implementation:

defmodule SeverusWeb.InitAssigns do
  import Phoenix.LiveView

  alias Severus.Accounts
  alias Severus.Accounts.User
  alias SeverusWeb.Router.Helpers, as: Routes

  # current_user is optional
  def on_mount(:default, _params, session, socket) do
    socket =
      socket
      |> assign_new(:current_organization, fn ->
        find_organization(session)
      end)

    socket =
      socket
      |> assign_new(:current_user, fn ->
        find_current_user(session, socket.assigns.current_organization)
      end)

    {:cont, socket}
  end

  # current_user is required
  def on_mount(:user, _params, session, socket) do
    socket =
      socket
      |> assign_new(:current_organization, fn ->
        find_organization(session)
      end)

    socket =
      socket
      |> assign_new(:current_user, fn ->
        find_current_user(session, socket.assigns.current_organization)
      end)

    case socket.assigns.current_user do
      # An organization can not sign-in
      %User{organization: false} ->
        {:cont, socket}

      # No user, redirect to sign-in path
      nil ->
        {:halt,
         socket
         |> put_flash(:error, "You must log in to access this page.")
         |> redirect(to: Routes.user_session_path(socket, :new))}
    end
  end

  defp find_current_user(session, organization) do
    with user_token when not is_nil(user_token) <- session["user_token"],
         %User{} = user <-
           Accounts.get_user_by_session_token(
             user_token,
             organization: organization
           ) do
      user
    end
  end

  def find_organization(session) do
    case session["current_organization_id"] do
      nil ->
        nil

      organization_id ->
        Severus.Accounts.get_organization!(organization_id)
    end
  end

  # Copy conn assigns to the session
  def session(conn) do
    if conn.assigns[:current_organization] do
      %{
        "current_organization_id" => conn.assigns.current_organization.id
      }
    else
      %{
        "current_organization_id" => nil
      }
    end
  end
end

The usage of the session is a neat trick. Quoted from the documentation:

“All session data currently in the connection is automatically available in LiveViews. You can use this option to provide extra data. Remember all session data is serialized and sent to the client, so you should always keep the data in the session to a minimum.”

LiveView mounts twice: first during the regular HTTP request, where it has access to the Plug conn, and then again over the WebSocket, where it does not. The session option lets you selectively copy data from the conn into that second mount. That pairs with assign_new: on the initial render it picks up the current_organization already assigned by the plug, and on the second render it runs the function, which looks the organization up from the current_organization_id session value.
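To make the two renders concrete, here is a simplified model in plain Elixir. It is only a sketch: AssignNewSketch and its functions are hypothetical stand-ins that operate on a plain map, not LiveView’s own assign_new/3, and the lookup fakes the database call.

```elixir
defmodule AssignNewSketch do
  # Simplified stand-in for assign_new/3 on a plain map: the function only
  # runs when the key is not already present.
  def assign_new(assigns, key, fun) do
    Map.put_new_lazy(assigns, key, fun)
  end

  # Stand-in for a database lookup keyed by the session value.
  def find_organization(%{"current_organization_id" => nil}), do: nil
  def find_organization(%{"current_organization_id" => id}), do: %{id: id, name: "org-#{id}"}
end

session = %{"current_organization_id" => 1}
lookup = fn -> AssignNewSketch.find_organization(session) end

# First (HTTP) render: the plug already assigned the organization, so the
# lookup function is skipped.
from_conn = %{current_organization: %{id: 1, name: "from-plug"}}
first = AssignNewSketch.assign_new(from_conn, :current_organization, lookup)

# Second (WebSocket) render: assigns start empty, so the session value is used.
second = AssignNewSketch.assign_new(%{}, :current_organization, lookup)

IO.inspect(first.current_organization.name)
IO.inspect(second.current_organization.name)
```

Only the ID travels through the session, which keeps the serialized session small, as the documentation quote above recommends.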

The user_token is a little different as it’s actually stored in the session, so it doesn’t need to be copied over. That code is mostly from the mix phx.gen.auth task. The only notable difference is that get_user_by_session_token was modified to accept an organization. I added a foreign key to the organization on the user_token so I could track which subdomain the session was for (and make sure it wasn’t used for another domain). It also allowed for a dashboard that would show a user which subdomains they’ve signed into and revoke individual tokens.
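The effect of scoping a token to an organization can be shown with an in-memory model. This is a sketch under assumptions: TokenScopeSketch, its rows, and user_id_for/2 are hypothetical, the real lookup is a database query, and treating a nil organization_id as "signed in on the main domain" is my guess for illustration.

```elixir
defmodule TokenScopeSketch do
  # Hypothetical in-memory rows standing in for the user tokens table.
  @tokens [
    %{token: "abc", user_id: 10, organization_id: 1},
    %{token: "def", user_id: 10, organization_id: nil}
  ]

  # A token only resolves to a user when it was issued for the same
  # organization (or for no organization at all).
  def user_id_for(token, organization_id) do
    Enum.find_value(@tokens, fn row ->
      if row.token == token and row.organization_id == organization_id, do: row.user_id
    end)
  end
end

IO.inspect(TokenScopeSketch.user_id_for("abc", 1))
IO.inspect(TokenScopeSketch.user_id_for("abc", 2))
```

A token issued for one subdomain resolves to nothing when presented on another, so a session cannot be replayed across organizations.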

Updating the Views with an Owner

With this setup, all my LiveViews have current_organization in their assigns, but it might be blank. Making this work consistently involved introducing an owner (the organization if present, otherwise the user) and using that in place of the current_user. For example, here’s a mount:

  @impl true
  def mount(_params, _session, socket) do
    instrument(__MODULE__, "mount", socket, fn ->
      current_user = socket.assigns.current_user
      current_organization = socket.assigns.current_organization

      owner = current_organization || current_user

      if connected?(socket) && owner do
        [
          "user-address.updated.user-id:#{owner.id}",
          "user-address.deleted.user-id:#{owner.id}"
        ]
        |> Enum.each(&Phoenix.PubSub.subscribe(Severus.PubSub, &1))
      end

      {:ok, socket}
    end)
  end

From this example you can likely tell that an organization is just a user with a boolean organization flag set. This made the implementation much easier, as all the foreign keys point to the same users table. The alternative would be some type of owner polymorphism (for example, a contact has a user_id or an organization_id). As this code base is already pretty large for a single developer, I wanted to keep it simple.

LOC

As an aside, I just ran:

git ls-files | xargs wc -l

It’s at 81k…

We’ll revisit the monolithic vs microservices architecture soon…

Updating the Views with an Owner (Continued)

Continuing to the new action:

  defp apply_action(socket, :new, _params) do
    current_user = socket.assigns.current_user
    current_organization = socket.assigns.current_organization

    owner = current_organization || current_user

    user_address = %UserAddress{user_id: owner.id}

    with :ok <- Bodyguard.permit(Policy, :create, current_user, {current_organization, user_address}) do
      changeset = External.change_user_address(user_address)

      socket
      |> assign(:changeset, changeset)
    else
      {:error, reason} -> handle_error(socket, reason)
    end
  end

It’s once again using the idea of the owner to set the user_id on the record being created. As I’m writing this I realize I should have added a current_owner to the assigns and saved myself some duplication. You may now be thinking you know why it’s 81k lines of code, and I feel a need to justify myself, so I’ll cover some meta-programming when I discuss the test suite in part 2. Before I finish, though, I need to cover authorization.
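A current_owner assign could be computed once, right after current_organization and current_user are set. Here is a minimal sketch on a plain assigns map; OwnerSketch and put_current_owner/1 are hypothetical names, and in a LiveView the same expression would go through assign/3 in on_mount rather than Map.put/3.

```elixir
defmodule OwnerSketch do
  # Adds :current_owner to a plain assigns map: the organization when one is
  # present, otherwise the signed-in user (which may itself be nil).
  def put_current_owner(assigns) do
    owner = assigns[:current_organization] || assigns[:current_user]
    Map.put(assigns, :current_owner, owner)
  end
end

user = %{id: 1, organization: false}
org = %{id: 2, organization: true}

IO.inspect(OwnerSketch.put_current_owner(%{current_user: user, current_organization: org}))
IO.inspect(OwnerSketch.put_current_owner(%{current_user: user, current_organization: nil}))
```

Each mount and apply_action could then read socket.assigns.current_owner instead of repeating the `||` expression.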

Authorization with Bodyguard

You can see I’m using Bodyguard. This approach didn’t feel quite right, but it was effective. The user is still the 3rd parameter, but the 4th parameter is now a tuple with an organization (or nil) in the first element. That turned the policies into something like:

  def authorize(
        action,
        %User{id: user_id} = _user,
        {%User{organization: true, id: organization_id} = organization,
         %UserAddress{user_id: organization_id}}
      )
      when action in [:show] do
    organization.owner_id == user_id
  end

  def authorize(
        action,
        %User{id: user_id} = _user,
        {nil, %UserAddress{} = user_address}
      )
      when action in [:show] do
    user_address.user_id == user_id
  end

For an organization, it checks that the resource owner is the current_organization (by using the same variable name twice in the pattern match), and then that the current_user is the organization’s owner. The actual code has some more complex joins related to permissions (for an upcoming enterprise release), but I didn’t think showing that here added anything. For the nil organization case, it checks that the resource’s owner is the current_user.
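The repeated-variable trick is easy to try outside the app. Below is a self-contained sketch with stand-in structs; PolicySketch, its nested User and UserAddress structs, and the catch-all deny clause are assumptions for illustration, not the actual policy module.

```elixir
defmodule PolicySketch do
  defmodule User do
    defstruct [:id, :owner_id, organization: false]
  end

  defmodule UserAddress do
    defstruct [:user_id]
  end

  # Organization case: organization_id appears twice in the pattern, so the
  # clause only matches when the address belongs to that organization.
  def authorize(
        :show,
        %User{id: user_id},
        {%User{organization: true, id: organization_id, owner_id: owner_id},
         %UserAddress{user_id: organization_id}}
      ) do
    owner_id == user_id
  end

  # Personal case: no organization, so the address must belong to the user.
  def authorize(:show, %User{id: user_id}, {nil, %UserAddress{user_id: address_user_id}}) do
    address_user_id == user_id
  end

  # Anything that matched no clause above is denied.
  def authorize(_action, _user, _params), do: false
end

alias PolicySketch.{User, UserAddress}

user = %User{id: 1}
org = %User{id: 2, organization: true, owner_id: 1}

IO.inspect(PolicySketch.authorize(:show, user, {org, %UserAddress{user_id: 2}}))
IO.inspect(PolicySketch.authorize(:show, user, {org, %UserAddress{user_id: 3}}))
IO.inspect(PolicySketch.authorize(:show, user, {nil, %UserAddress{user_id: 1}}))
```

The second call is denied because the address belongs to a different owner than the current organization, so the first clause never matches and the call falls through to the catch-all.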

Conclusion

This post is getting somewhat lengthy, and it looks like I need to update my CSS, so I’ll wrap up here. I do have another post planned where I’ll cover the macro I used to keep the test suite somewhat manageable, as well as some more thoughts on multi-tenancy and why I chose this approach. GitHub launched Organization Discussions yesterday, so I enabled that and created a discussion for this article. Thanks, all.