Why Developer Marketplaces & Software Outsourcers No Longer Work

About Me: I’ve created 3 software companies: Five9 (IPO), DoctorBase (cash sale) and JetBridge (current co), much of it using offshore software developers. I’ve made a lot of expensive mistakes you should avoid.

TLDR: Adopt something like our Developer Handbook and give it to every developer before the interview. After reading it, a majority of offshore developers decline to interview with us, which is nice because it saves my HR team lots of time by automatically weeding out the bottom 75% (marketplaces and outsourcing companies recruit the bottom 50%).

Most customers who go to marketplaces or outsourcers for competent software developers do not get what was advertised to them, often leading to expensive failure for non-technical founders and professional risk for enterprise managers. Here’s how the schemes work (and how to solve for them):

“Hire ex-Googlers on our marketplace!”

Why would a software engineer who makes $250k+ for FAANG companies (and can WFH) lurk on marketplaces to work on an idiot’s idea with no stability or benefits at a fraction of the wages?

They don’t.

This marketing gimmick works because it’s human nature to go online and believe we found a $100 bill for $20, but the international labor market, especially post-Covid, has become extremely efficient. Even average developers get multiple messages from recruiters every week – they no longer need to go to marketplaces. And they’re mostly remote work offers.

So what kind of developers go onto marketplaces looking for work?

The kind that don’t know how to reverse a string (basic stuff) or will pull a bait-and-switch on you (the developer you talk to will not be the one doing the work).

Marketplaces can be very effective for sourcing UI/UX designers, graphic design, Mechanical Turk tasks and more, but increasingly, competent software developers are not on them.

“Two weeks free if you’re not satisfied!” is often a marketing gimmick by marketplaces. Two weeks is simply not enough (especially for non-technical founders) to determine if a developer is good.

If you find a developer you like on a marketplace, ask them if they’re willing to take a technical pair programming session with an onshore CTO you trust.

Try it – the response is often amusing.

And if you get an amusing response, walk away with your money still in your bank account.

“We take you from MVP to IPO!”

Almost every software outsourcer I know (except the huge ones) is trying to build its own software apps. Outsourcing can be a grueling business and many SMB outsourcers dream of becoming a SaaS company. If they knew how to build successful MVPs they wouldn’t still be in outsourcing.

Because outsourcing is a labor arbitrage game, outsourcing companies try to pay a 3:1 ratio from fees to wages. Meaning if an outsourcing company charges $75/hour, they’re paying the developers about $25/hour.

So why would a competent developer work for an outsourcer that takes 66% of the revenue created when they’re doing all of the work post-sale?

They don’t.

Enterprise outsourcing companies are places for semi-competent software developers to hide on over-staffed teams (and not give a shit), the exact kind you don’t want. Besides, the large outsourcing companies only take “digital transformation” projects that have $1M+ budgets (just look at the earnings reports of the publicly traded outsourcing companies).

SMB outsourcing companies are a great place for project managers and junior developers to cut their teeth (on your dollar).

This gimmick works because enterprises often have legacy projects that smart, ambitious software developers don’t want to work on (like 20-year-old banking applications built on J2EE), and SMB outsourcing companies hire a “Product Manager” who can speak decent English (their developers mostly will not).

Most MVPs do not need a Product/Project Manager, they need a competent Tech Lead (a senior fullstack engineer who has led teams before). Ask any experienced Tech Lead and they will confirm.

If an SMB outsourcing company quotes you for both the developers’ time as well as a Project/Product Manager, walk away. It’s a technique used to hide the fact that they don’t have competent developers.

BTW, most competent developers speak English (since most of the educational content, workshops, conferences, etc. are in English). Which also makes sense because good developers are supposed to be good at learning languages.

So what’s the solution(s)?

In my experience, and according to all of our data, anywhere from 3% – 9% of offshore software developers are commercially competent. Since software development is a non-licensed profession that often makes 10x more than most jobs in emerging economies, this makes sense.

Outsource the technical interviews to a highly competent 3rd party.

If you’re a technical manager working in an enterprise, you likely don’t have the time (or risk profile) to conduct 11 – 30 technical interviews for every 1 offer to an offshore developer. And if your company needs to make 2 offers for every 1 accepted offer, double that number. There are cost savings to be had going offshore, but the risk is much greater.

If you’re a non-technical founder, and if you have the money, first hire a technical co-founder or technical employee onshore (or an elite dev offshore). I get it, believe me: it’s incredibly difficult and expensive, and it’s easier to just hire someone on a marketplace and test your MVP idea.

The problem is this – it is highly unlikely your idea is the one with Product Market Fit. According to David Rusenko (sold his startup Weebly for $365M to Square) it will likely take 20-30 iterations of your idea before achieving a product that has any chance of making you money. This requires an in-house technical founder/employee who can then manage an offshore team. Relying solely on an offshore team just won’t work.

In 23 years of being in startups, I’ve never seen a non-technical founder without a technical founder ‘get rich’ by partnering with an outsourcing company. Not once.

And if you can’t find a technical founder, honestly ask yourself – how will you find your first 10 enterprise customers or your first 10,000 consumer users? How will you sell investors or sell your company?

And if you’re an enterprise hiring manager, outsource the professional risk by having a highly competent 3rd party conduct the first round of technical interviews for your offshore dev team.

Feel free to disagree (I’m always open to learning) at john.k@jetbridge.com

Multipart-Encoded and Python Requests

It’s easy to find many examples on the web of how to send multipart-encoded data like images/files using Python requests. Even the requests documentation has a section just for that. But a couple of days ago I struggled with the Content-type header.

The recommended header for multipart-encoded files/images is multipart/form-data, and requests already sets it for us automatically when we use the “files” parameter. Here’s an example taken from the requests documentation:

>>> import requests
>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

As you can see, you don’t even need to set the header. Moving on, we often need custom headers, like x-api-key or something else. So, we’d have:

>>> headers = {'x-auth-api-key': '<SOME_TOKEN>', 'Content-type': 'multipart/form-data'}
>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files, headers=headers)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

Right? Unfortunately, not. Most likely you will receive an error like the ones below:

ValueError: Invalid boundary in multipart form: b'' 

or

{'detail': 'Multipart form parse error - Invalid boundary in multipart: None'}

You can hit the same problem with a simple Node.js server, because it’s not a matter of language or framework. In the case of the Node.js server, request.files will be undefined because the boundary is not set.

So, what’s the catch?

The catch is that even when we need custom headers, we must not set 'Content-type': 'multipart/form-data' ourselves. If we do, requests uses our value as-is and no longer does its magic of adding the boundary field to the header.

For multipart entities the boundary directive is required, which consists of 1 to 70 characters from a set of characters known to be very robust through email gateways, and not ending with white space. It is used to encapsulate the boundaries of the multiple parts of the message. Often, the header boundary is prepended with two dashes and the final boundary has two dashes appended at the end. (source)

Here’s an example of a request containing multipart/form-data:

[Image: example of a request containing multipart/form-data]
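To make the role of the boundary concrete, here is a minimal standard-library sketch that builds a multipart/form-data body by hand. It is not how requests does it internally; the function name encode_multipart and the sample bytes are assumptions made for illustration.

```python
import uuid


def encode_multipart(field_name, filename, content,
                     content_type="application/octet-stream"):
    """Build a multipart/form-data body holding a single file part.

    Returns a (content_type_header, body_bytes) tuple.
    """
    # The boundary is an arbitrary token that must not occur in the payload.
    boundary = uuid.uuid4().hex
    lines = [
        # Each part starts with "--" followed by the boundary.
        f"--{boundary}".encode(),
        (f'Content-Disposition: form-data; name="{field_name}"; '
         f'filename="{filename}"').encode(),
        f"Content-Type: {content_type}".encode(),
        b"",  # blank line separates the part headers from the part content
        content,
        # The final boundary has two extra dashes appended.
        f"--{boundary}--".encode(),
        b"",
    ]
    body = b"\r\n".join(lines)
    # The same boundary must be announced in the Content-Type header.
    header = f"multipart/form-data; boundary={boundary}"
    return header, body


header, body = encode_multipart("file", "report.xls", b"fake spreadsheet bytes")
print(header)
print(body.decode())
```

The server reads the boundary from the Content-Type header and uses it to split the body into parts, which is why a header without a boundary cannot be parsed.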

So, there it is. When using requests to POST files and/or images, use the files param and “forget” the Content-type, because the library will handle it for you. Any other custom headers can still be passed as usual.

Nice, huh? 😄
Not when I was suffering. 😒

Building a REST API with Django REST Framework

Let’s talk about a very powerful library to build APIs: the Django Rest Framework, or just DRF!

With DRF it is possible to combine Python and Django in a flexible manner to develop web APIs in a very simple and fast way.

Some reasons to use DRF:

  • Serialization of objects from ORM sources (databases) and non-ORM (classes).
  • Extensive documentation and large community.
  • It provides a browsable interface for exploring and debugging your API.
  • Various authentication strategies, including packages for OAuth1 and OAuth2.
  • Used by large corporations such as: Heroku, EventBrite, Mozilla and Red Hat.

And it uses our dear Django as a base!

That’s why it’s interesting that you already have some knowledge of Django.

Introduction

The best way of learning a new tool is by putting your hand in the code and making a small project.

For this post I decided to join two things I really like: code and investments!

So in this post we will develop an API for consulting a type of investment: Exchange Traded Funds, or just ETFs.

Don’t know what that is? Here it goes:

An exchange traded fund (ETF) is a type of security that tracks an index, sector, commodity, or other asset, but which can be purchased or sold on a stock exchange the same as a regular stock. An ETF can be structured to track anything from the price of an individual commodity to a large and diverse collection of securities. ETFs can even be structured to track specific investment strategies. (Retrieved from: Investopedia)

That said, let’s start at the beginning: let’s create the base structure and configure the DRF.

Project Configuration

First, let’s start with the name: let’s call it ETFFinder.

So let’s go to the first steps:

# Create the folder and access it
mkdir etffinder && cd etffinder

# Create a virtual environment (here pinned to Python 3.8; adjust the path to your installed version)
virtualenv venv --python=/usr/bin/python3.8

# Activate virtual environment
source venv/bin/activate

# Install Django and DRF
pip install django djangorestframework

So far, we:

  • Created the project folder;
  • Created a virtual environment;
  • Activated the virtual environment and installed the dependencies (Django and DRF)

To start a new project, let’s use Django’s startproject command:

django-admin startproject etffinder .

This will generate the base code needed to start a Django project.

Now, let’s create a new app to separate our API responsibilities.

Let’s call it api.

We use Django’s startapp command at the root of the project (where the manage.py file is located), like this:

python3 manage.py startapp api

Also, go ahead and create the initial database structure with:

python3 manage.py migrate

Now we have the following structure:

[Image: file structure]

Run the local server to verify everything is correct:

python3 manage.py runserver

Access http://localhost:8000 in your browser and you should see the following screen:

[Image: Django’s default webpage]

Now add a superuser with the createsuperuser command (a password will be asked):

python3 manage.py createsuperuser --email admin@etffinder.com --username admin

There’s only one thing left to finish our project’s initial settings: add everything to settings.py.

To do this, open the etffinder/settings.py file and add the api, etffinder and rest_framework apps (required for DRF to work) to the INSTALLED_APPS setting, like this:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'etffinder',
    'api'
]

Well done!

With that we have the initial structure to finally start our project!

Modeling

The process of developing applications using the Django Rest Framework generally follows the following path:

  1. Modeling;
  2. Serializers;
  3. ViewSets;
  4. Routers

Let’s start with Modeling.

Well, as we are going to make a system for searching and listing ETFs, our modeling must reflect fields that make sense.

To help with this task, I chose some parameters from this Large-Cap ETF’s Table, from ETFDB website:

[Image: ETFDB ETF table]

Let’s use the following attributes:

  • Symbol: Fund identifier code.
  • Name: ETF name
  • Asset Class: ETF class.
  • Total Assets: Total amount of money managed by the fund.
  • YTD Price Change: Year-to-Date price change.
  • Avg. Daily Volume: Average daily traded volume.

With this in hand, we can create the modeling of the ExchangeTradedFund entity.

For this, we’ll use Django’s own great ORM (Object-Relational Mapping).

Our modeling can be implemented as follows (api/models.py):

from django.db import models
import uuid


class ExchangeTradedFund(models.Model):
  id = models.UUIDField(
    primary_key=True,
    default=uuid.uuid4,
    null=False,
    blank=True)

  symbol = models.CharField(
    max_length=8,
    null=False,
    blank=False)

  name = models.CharField(
    max_length=50,
    null=False,
    blank=False)

  asset_class = models.CharField(
    max_length=30,
    null=False,
    blank=False)

  total_assets = models.DecimalField(
    null=False,
    blank=False,
    max_digits=14,
    decimal_places=2)

  ytd_price_change = models.DecimalField(
    null=False,
    blank=False,
    max_digits=5,
    decimal_places=2)

  average_daily_volume = models.IntegerField(
    null=False,
    blank=False)

With this, we need to generate the Migrations file to update the database.

We accomplish this with Django’s makemigrations command. Run:

python3 manage.py makemigrations api

Now let’s apply the migration to the Database with the migrate command. Run:

python3 manage.py migrate

With the modeling ready, we can move to Serializer!

Serializer

DRF serializers are essential components of the framework.

They translate complex entities such as querysets and class instances into simple representations, such as JSON and XML, that can travel over the web. We call this process Serialization.

Serializers also work in the opposite direction: Deserialization. This transforms simple representations (like JSON and XML) back into complex ones, for example by instantiating objects.
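To illustrate the two directions without any framework, here is a minimal plain-Python sketch. It is not DRF code: the Fund dataclass and the serialize/deserialize functions are hypothetical stand-ins that only show the idea of turning an object into JSON and back.

```python
import json
import uuid
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Fund:
    """Hypothetical stand-in for a model instance."""
    id: uuid.UUID
    symbol: str
    total_assets: Decimal


def serialize(fund):
    """Complex object -> JSON text (serialization)."""
    data = {
        # UUID and Decimal are not JSON types, so they travel as strings.
        "id": str(fund.id),
        "symbol": fund.symbol,
        "total_assets": str(fund.total_assets),
    }
    return json.dumps(data)


def deserialize(text):
    """JSON text -> complex object (deserialization)."""
    data = json.loads(text)
    return Fund(
        id=uuid.UUID(data["id"]),
        symbol=data["symbol"],
        total_assets=Decimal(data["total_assets"]),
    )


fund = Fund(uuid.uuid4(), "SPY", Decimal("372251000000.00"))
payload = serialize(fund)
print(payload)
print(deserialize(payload) == fund)
```

A DRF serializer does the same kind of work, plus validation, driven by the model definition instead of hand-written conversion code.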

Let’s create the file where our API’s serializers will be.

Create a file called serializers.py inside the api/ folder.

DRF provides several types of serializers that we can use, such as:

  • BaseSerializer: Base class for building generic Serializers.
  • ModelSerializer: Helps the creation of model-based serializers.
  • HyperlinkedModelSerializer: Similar to ModelSerializer, however returns a link to represent the relationship between entities (ModelSerializer returns, by default, the id of the related entity).

Let’s use the ModelSerializer to build the serializer of the entity ExchangeTradedFund.

For that, we need to declare which model that serializer will operate on and which fields it should be concerned with.

A serializer can be implemented as follows:

from rest_framework import serializers
from api.models import ExchangeTradedFund


class ExchangeTradedFundSerializer(serializers.ModelSerializer):
  class Meta:
    model = ExchangeTradedFund
    fields = [
      'id',
      'symbol',
      'name',
      'asset_class',
      'total_assets',
      'ytd_price_change',
      'average_daily_volume'
    ]  

In this Serializer:

  • model = ExchangeTradedFund defines which model this serializer must serialize.
  • fields chooses the fields to serialize.

Note: It is possible to serialize all fields of the model by using fields = '__all__' (a string, not a list), however I prefer to show the fields explicitly.

With this, we conclude another step of our DRF guide!

Let’s go to the third step: creating Views.

ViewSets

A ViewSet defines which REST operations will be available and how your system will respond to API calls.

ViewSets inherit from Django’s default Views and add logic on top of them.

Their responsibilities are:

  • Receive Request data (JSON or XML format)
  • Validate the data according to the rules defined in the modeling and in the Serializer
  • Deserialize the Request and instantiate objects
  • Process Business related logic (this is where we implement the logic of our systems)
  • Formulate a Response and respond to whoever called the API

I found a very interesting image on Reddit that shows the DRF class inheritance diagram, which helps us better understand the internal structure of the framework:

[Image: DRF class inheritance diagram]

In the image:

  • On the top, we have Django’s default View class.
  • APIView and ViewSet are DRF classes that inherit from View and bring some specific settings to turn them into APIs, like get() method to handle HTTP GET requests and post() to handle HTTP POST requests.
  • Just below, we have GenericAPIView – which is the base class for generic views – and GenericViewSet – which is the base for ViewSets (the right part in purple in the image).
  • In the middle, in blue, we have the Mixins. They are the code blocks responsible for actually implementing the desired actions.
  • Then we have the Views that provide the features of our API, as if they were Lego blocks. They extend from Mixins to build the desired functionality (whether listing, deleting, etc.)

For example: if you want to create an API that only provides listing of a certain Entity you could choose ListAPIView.

Now if you need to build an API that provides only create and list operations, you could use the ListCreateAPIView.

Now if you need to build an “all-in” API (ie: create, delete, update, and list), choose the ModelViewSet (notice that it extends all available Mixins).

To better understand:

  • Mixins are like the components of Subway sandwiches 🍅🍞🍗🥩
  • Views are similar to Subway: you assemble your sandwich, component by component 🍞
  • ViewSets are like McDonalds: your sandwich is already assembled 🍔
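The Lego-block idea can be sketched in plain Python. This is not DRF's implementation: ListMixin, CreateMixin, BaseView and the two composed views below are hypothetical names that only illustrate how mixins add one action each to a common base.

```python
class ListMixin:
    """Adds a list() action on top of whatever provides get_queryset()."""

    def list(self):
        return list(self.get_queryset())


class CreateMixin:
    """Adds a create() action."""

    def create(self, item):
        self.get_queryset().append(item)
        return item


class BaseView:
    """Stand-in for a generic base view: it only knows where the data lives."""

    def __init__(self, items):
        self.queryset = list(items)

    def get_queryset(self):
        return self.queryset


class ListOnlyView(ListMixin, BaseView):
    """Assembled from one block: listing only."""


class ListCreateView(ListMixin, CreateMixin, BaseView):
    """Assembled from two blocks: listing and creation."""


read_only = ListOnlyView(["SPY", "IVV"])
print(read_only.list())
print(hasattr(read_only, "create"))  # no create action was mixed in

read_write = ListCreateView(["SPY"])
read_write.create("VTI")
print(read_write.list())
```

Choosing which mixins to combine decides which operations the resulting view exposes.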

DRF provides several types of Views and Viewsets that can be customized according to the system’s needs.

To make our life easier, let’s use the ModelViewSet!

In DRF, by convention, we implement Views/ViewSets in the views.py file inside the app in question.

This file was already created by the startapp command we ran earlier, so we don’t need to create it.

Now, see how difficult it is to create a ModelViewSet (don’t be amazed by the complexity):

from api.serializers import ExchangeTradedFundSerializer
from rest_framework import viewsets, permissions
from api.models import ExchangeTradedFund


class ExchangeTradedFundViewSet(viewsets.ModelViewSet):
  queryset = ExchangeTradedFund.objects.all()
  serializer_class = ExchangeTradedFundSerializer
  permission_classes = [permissions.IsAuthenticated]

That’s it!

You might be wondering:

Whoa, and where’s the rest?

All the code for handling Requests, serializing and deserializing objects and formulating HTTP Responses is within the classes that we inherited directly and indirectly.

In our class ExchangeTradedFundViewSet we just need to declare the following parameters:

  • queryset: Sets the base queryset to be used by the API. It is used in the action of listing, for example.
  • serializer_class: Configures which Serializer should be used to consume data arriving at the API and produce data that will be sent in response.
  • permission_classes: List containing the permissions needed to access the endpoint exposed by this ViewSet. In this case, it will only allow access to authenticated users.

With that we kill the third step: the ViewSet!

Now let’s go to the URLs configuration!

Routers

Routers help us generate URLs for our application.

As REST has well-defined patterns of structure of URLs, DRF automatically generates them for us, already in the correct pattern.

So, let’s use it!

To do that, first create the urls.py file in api/urls.py.

Now see how simple it is!

from rest_framework.routers import DefaultRouter
from api.views import ExchangeTradedFundViewSet


app_name = 'api'

router = DefaultRouter(trailing_slash=False)
router.register(r'funds', ExchangeTradedFundViewSet)

urlpatterns = router.urls

Let’s understand:

  • app_name is needed to give context to generated URLs. This parameter specifies the namespace of the added URLConfs.
  • DefaultRouter is the Router we chose for automatic URL generation. The trailing_slash=False parameter makes the generated URLs end without a trailing slash /.
  • The register method takes two parameters: the first is the prefix that will be used in the URL (in our case, once the app is included in the project: http://localhost:8000/api/v1/funds) and the second is the ViewSet that will respond to the URLs with that prefix.
  • Lastly, we have Django’s urlpatterns, which we use to expose this app’s URLs.

Now we need to add our api app-specific URLs to the project.

To do this, open the etffinder/urls.py file and add the following lines:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
  path('api/v1/', include('api.urls', namespace='api')),
  path('api-auth/', include('rest_framework.urls', namespace='rest_framework')),
  path('admin/', admin.site.urls),
]

Note: As a good practice, always use the prefix api/v1/ to maintain compatibility in case you need to evolve your api to V2 (api/v2/)!

Using just these lines of code, look at the bunch of endpoints that DRF automatically generated for our API:

URL                     HTTP Method  Action
/api/v1                 GET          API’s root path
/api/v1/funds           GET          List all elements
/api/v1/funds           POST         Create a new element
/api/v1/funds/{lookup}  GET          Retrieve an element by ID
/api/v1/funds/{lookup}  PUT          Update an element by ID
/api/v1/funds/{lookup}  PATCH        Partially update an element by ID
/api/v1/funds/{lookup}  DELETE       Delete an element by ID
Automatically generated routes.

Here, {lookup} is the parameter used by DRF to uniquely identify an element.

Let’s assume that a Fund has id=ef249e21-43cf-47e4-9aac-0ed26af2d0ce.

We can delete it by sending an HTTP DELETE request to the URL:

http://localhost:8000/api/v1/funds/ef249e21-43cf-47e4-9aac-0ed26af2d0ce

Or we can create a new Fund by sending a POST request to the URL http://localhost:8000/api/v1/funds with the field values in the request body, like this:

{
  "symbol": "SPY",
  "name": "SPDR S&P 500 ETF Trust",
  "asset_class": "Equity",
  "total_assets": "372251000000.00",
  "ytd_price_change": "15.14",
  "average_daily_volume": "69599336"
}

This way, our API would return an HTTP 201 Created code, meaning that an object was created, and the response would be:

{
  "id": "a4139c66-cf29-41b4-b73e-c7d203587df9",
  "symbol": "SPY",
  "name": "SPDR S&P 500 ETF Trust",
  "asset_class": "Equity",
  "total_assets": "372251000000.00",
  "ytd_price_change": "15.14",
  "average_daily_volume": 69599336
}

We can test our URL in different ways: through Python code, through a Frontend (Angular, React, Vue.js) or through Postman, for example.
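As one way of doing it through Python code, here is a minimal standard-library sketch that builds the POST request described above. It assumes the development server is running locally, that DRF's default Basic authentication is still enabled, and that the superuser is called admin; the password "change-me" and the function name build_create_request are placeholders.

```python
import base64
import json
import urllib.request

API_URL = "http://localhost:8000/api/v1/funds"

payload = {
    "symbol": "SPY",
    "name": "SPDR S&P 500 ETF Trust",
    "asset_class": "Equity",
    "total_assets": "372251000000.00",
    "ytd_price_change": "15.14",
    "average_daily_volume": 69599336,
}


def build_create_request(url, data, username, password):
    """Build (but do not send) a JSON POST request with Basic auth."""
    body = json.dumps(data).encode("utf-8")
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )


# "change-me" is a placeholder for the password chosen in createsuperuser.
request = build_create_request(API_URL, payload, "admin", "change-me")
print(request.get_method(), request.full_url)

# With the development server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.status, json.load(response))
```

The sending step is left commented out so the snippet can be read and run without a server.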

And how can I see this all running?

So let’s go to the next section!

Browsable interface

One of the most impressive features of DRF is its Browsable Interface.

With it, we can test our API and check its values in a very simple and visual way.

To access it, navigate in your browser to: http://localhost:8000/api/v1.

You should see the following:

[Image: DRF Browsable Interface – API Root]

Go there and click on http://127.0.0.1:8000/api/v1/funds!

You should see the following message:

{
  "detail": "Authentication credentials were not provided."
}

Remember the permission_classes setting we used to configure our ExchangeTradedFundViewSet?

It defined that only authenticated users (permissions.IsAuthenticated) can interact with the API.

Click on the upper right corner, on “Log in” and use the credentials registered in the createsuperuser command, which we executed at the beginning of the post.

Now, look how this is useful! You should be seeing:

[Image: DRF Browsable Interface – ETF Form]

Play a little, add data and explore the interface.

When adding data and updating the page, an HTTP GET API request is triggered, returning the data you just registered:

[Image: DRF Browsable Interface – ETF List]

Specific Settings

It is possible to configure various aspects of DRF through some specific settings.

We do this by adding a REST_FRAMEWORK dictionary to the settings.py file and configuring it there.

For example, if we want to add pagination to our API, we can simply do this:

REST_FRAMEWORK = {
  'DEFAULT_PAGINATION_CLASS': 'rest_framework.pagination.PageNumberPagination',
  'PAGE_SIZE': 10
}

Now the result of a call, for example, to http://127.0.0.1:8000/api/v1/funds goes from:

[
    {
        "id": "0e149f99-e5a5-4e3a-b89b-8b65ae7c6cf4",
        "symbol": "IVV",
        "name": "iShares Core S&P 500 ETF",
        "asset_class": "Equity",
        "total_assets": "286201000000.00",
        "ytd_price_change": "15.14",
        "average_daily_volume": 4391086
    },
    {
        "id": "21af5504-55bf-4326-951a-af51cd40a2f9",
        "symbol": "VTI",
        "name": "Vanguard Total Stock Market ETF",
        "asset_class": "Equity",
        "total_assets": "251632000000.00",
        "ytd_price_change": "15.20",
        "average_daily_volume": 3760095
    }
]

To:

{
    "count": 2,
    "next": null,
    "previous": null,
    "results": [
        {
            "id": "0e149f99-e5a5-4e3a-b89b-8b65ae7c6cf4",
            "symbol": "IVV",
            "name": "iShares Core S&P 500 ETF",
            "asset_class": "Equity",
            "total_assets": "286201000000.00",
            "ytd_price_change": "15.14",
            "average_daily_volume": 4391086
        },
        {
            "id": "21af5504-55bf-4326-951a-af51cd40a2f9",
            "symbol": "VTI",
            "name": "Vanguard Total Stock Market ETF",
            "asset_class": "Equity",
            "total_assets": "251632000000.00",
            "ytd_price_change": "15.20",
            "average_daily_volume": 3760095
        }
    ]
}

Fields were added to help pagination:

  • count: The total number of results across all pages;
  • next: The URL of the next page (null if there is none);
  • previous: The URL of the previous page (null if there is none);
  • results: The current result page.
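The shape of that envelope can be sketched in plain Python. This is not DRF's PageNumberPagination implementation, only an illustration of how the four fields relate; the paginate function and the sample symbols are hypothetical.

```python
def paginate(items, page, page_size, base_url):
    """Return one page of `items` in a count/next/previous/results envelope."""
    start = (page - 1) * page_size
    end = start + page_size
    has_next = end < len(items)
    has_previous = page > 1
    return {
        # count is the size of the whole collection, not of this page.
        "count": len(items),
        "next": f"{base_url}?page={page + 1}" if has_next else None,
        "previous": f"{base_url}?page={page - 1}" if has_previous else None,
        "results": items[start:end],
    }


symbols = [f"ETF{i}" for i in range(25)]
second_page = paginate(symbols, 2, 10, "http://localhost:8000/api/v1/funds")
print(second_page["count"])
print(second_page["next"])
print(second_page["previous"])
print(second_page["results"])
```

With 25 items and a page size of 10, the second page holds items 10 to 19 and links to both a previous and a next page.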

There are several other very useful settings!

Here are some:

👉 DEFAULT_AUTHENTICATION_CLASSES is used to configure the API authentication method:

REST_FRAMEWORK = {
  # ... other settings ...
  'DEFAULT_AUTHENTICATION_CLASSES': [
    'rest_framework.authentication.SessionAuthentication',
    'rest_framework.authentication.BasicAuthentication',
  ],
  # ... other settings ...
}

👉 DEFAULT_PERMISSION_CLASSES is used to set permissions needed to access the API (globally).

REST_FRAMEWORK = {
  # ... other settings ...
  'DEFAULT_PERMISSION_CLASSES': ['rest_framework.permissions.AllowAny'],
  # ... other settings ...
}

Note: It is also possible to define this configuration per View, using the attribute permission_classes (which we use in our ExchangeTradedFundViewSet).

👉 DATE_INPUT_FORMATS is used to set date formats accepted by the API:

REST_FRAMEWORK = {
  # ... other settings ...
  'DATE_INPUT_FORMATS': ['%d/%m/%Y', '%Y-%m-%d', '%d-%m-%y', '%d-%m-%Y'],
  # ... other settings ...
}

The above configuration will make the API accept dates such as '25/10/2006', '2006-10-25', '25-10-06' and '25-10-2006', for example.
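These format strings use the same directives as Python's datetime.strptime, so a quick way to check which inputs a given list of formats would match is the standard-library sketch below. The parse_date helper is hypothetical and is not part of DRF.

```python
from datetime import datetime

DATE_INPUT_FORMATS = ["%d/%m/%Y", "%Y-%m-%d", "%d-%m-%y", "%d-%m-%Y"]


def parse_date(text, formats=DATE_INPUT_FORMATS):
    """Return the first successful parse of `text`, or None if no format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


for candidate in ["25/10/2006", "2006-10-25", "25-10-06", "25-10-2006", "10/25/2006"]:
    print(candidate, "->", parse_date(candidate))
```

Note that all four formats are day-first or year-first, so a month-first string such as 10/25/2006 does not match any of them.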

See more settings in the DRF documentation.

Frameworkless Web Applications

Since we have (mostly) advanced beyond CGI scripts and PHP, the default tool many people reach for when building a web application is a framework. Like drafting a standard legal contract or making a successful Hollywood film, it’s good to have a template to work off of. A framework lends structure to your application and saves you from having to reinvent a bunch of wheels. It’s a solid foundation to build on, which can be a substantial “batteries included” model (Rails, Django, Spring Boot, Nest) or a lightweight “slap together whatever shit you need outta this” sort of deal (Flask, Express).

Foundations can be handy.

The idea of a web framework is that there are certain basic features that most web apps need and that these services should be provided as part of the library. Nearly all web frameworks will give you some custom implementation of some or all of:

  • Configuration
  • Logging
  • Exception trapping
  • Parsing HTTP requests
  • Routing requests to functions
  • Serialization
  • Gateway adaptor (WSGI, Rack, WAR)
  • Authentication
  • Middleware architecture
  • Plugin architecture
  • Development server

There are many other possible features but these are extremely common. Just about every framework has its own custom code to route a parsed HTTP request to a handler function, as in “call hello() when a GET request comes in for /hello.”
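As a rough idea of how little code that routing step needs, here is a minimal sketch of a route table in plain Python. It does not reproduce any particular framework's API; the route decorator, ROUTES table and dispatch function are hypothetical names.

```python
ROUTES = {}


def route(method, path):
    """Register the decorated function as the handler for (method, path)."""
    def decorator(func):
        ROUTES[(method, path)] = func
        return func
    return decorator


@route("GET", "/hello")
def hello():
    return 200, "Hello, world!"


def dispatch(method, path):
    """Look up the handler for a parsed request and call it."""
    handler = ROUTES.get((method, path))
    if handler is None:
        return 404, "Not Found"
    return handler()


print(dispatch("GET", "/hello"))
print(dispatch("POST", "/hello"))
```

Real frameworks add path parameters, middleware and error handling on top, but the core is a lookup from a method and path to a function.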

There are many great things to say about this approach. The ability to run your application on any sort of host from DigitalOcean to Heroku to EC2 is something we take for granted, as well as being able to easily run a web server on your local environment for testing. There is always some learning curve as you learn the ins and outs of how you register a URL route in this framework or log a debug message in that framework or add a custom serializer field.

But maybe we shouldn’t assume that our web apps always need to be built with a framework. Instead of being the default tool we grab without a moment’s reflection, now is a good time to reevaluate our assumptions.

Serverless

What struck me is that a number of the functions that frameworks provide are not needed if I go all-in on AWS. Long ago I decided I’m fine with Bezos owning my soul and acceded to writing software for this particular vendor, much as many engineers have built successful applications locked in to various layers of software abstraction. Early programmers had to decide which ISA or OS they wanted to couple their application to; today we’re still forced to make non-portable decisions, but at a higher layer of abstraction. My Python or JavaScript code will run on any CPU architecture or UNIX OS, but features from my cloud provider may restrict me to that cloud. Which I am totally fine with.

I’ve long been a fan of and written about serverless applications on my blog because I enjoy abstracting out as much of my infrastructure as possible so as to focus on the logic of my application that I’m interested in. My time is best spent concerning myself with business logic and not wrangling containers or deployments or load balancer configurations or gunicorn.

We’ve had a bit of a journey over the years adopting the serverless mindset, but one thing has been holding me back and it’s my attachment to web frameworks. While it’s quite common and appropriate to write serverless functions as small self-contained scripts in AWS Lambda, building a larger application in this fashion feels like trying to build a house without a foundation. I’ve done considerable experimentation mostly with trying to cram Flask into Lambda, where you still have all the comforts of your familiar framework and it handles all the routing inside a single function. You also have the flexibility to easily take your application out of AWS and run it elsewhere.

There are a number of issues with the approach of putting a web framework into a Lambda function. For one, it’s cheating. For another, when your application grows large enough the cold start time becomes a real problem. Web frameworks have the side-effect of loading your entire application code on startup, so any time a request comes in and there isn’t a warm handler to process it, the client must wait for your entire app to be imported before the request is handled. This means users occasionally experience an extra few seconds of delay on a request, which is not good from a performance standpoint. There are simple workarounds like provisioned concurrency, but needing them is a clear sign there is a flaw in the architecture.
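To make the cold start cost concrete, here is a toy simulation, not a benchmark. The route names and the makeModule helper are invented for illustration; each thunk stands in for an expensive import, and the loaded array records which ones actually ran.

```typescript
type Handler = (path: string) => string

// Records which "modules" were initialized; stands in for import cost.
const loaded: string[] = []

// Hypothetical helper: returns a thunk that simulates importing a module.
function makeModule(name: string): () => Handler {
  return () => {
    loaded.push(name)
    return (path: string) => `${name} handled ${path}`
  }
}

const modules: Record<string, () => Handler> = {
  "/albums": makeModule("albums"),
  "/users": makeModule("users"),
  "/billing": makeModule("billing"),
}

// Framework-in-a-Lambda style: every module loads before the first request is served.
function monolithColdStart(path: string): string {
  const handlers: Record<string, Handler> = {}
  for (const route of Object.keys(modules)) {
    handlers[route] = modules[route]()
  }
  const handler = handlers[path]
  if (!handler) throw new Error(`no route for ${path}`)
  return handler(path)
}

// Function-per-route style: a cold start loads only what this route needs.
function perRouteColdStart(path: string): string {
  const load = modules[path]
  if (!load) throw new Error(`no route for ${path}`)
  return load()(path)
}
```

Both functions return the same response for the same path. The difference is that the first one fills loaded with every module in the app before answering, while the second touches only the one it needs.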

Classic web frameworks are not appropriate for building a truly serverless application. It’s the wrong tool for the architecture.

The Anti-Framework

Assuming you are fully bought in to AWS and have embraced the lock-in lifestyle, life is great. AWS acts like a framework of its own providing all of the facilities one needs for a web application but in the form of web services of the Amazonian variety. If we’re talking about RESTful web services, it’s possible to put together an extremely scalable, maintainable, and highly available application.

No Docker, Kubernetes, or load balancers to worry about. You can even skip the VPC if you use the Aurora Data API to run SQL queries.

The above list could go on for a very long time but you get the point. If we want to be as lazy as possible and leverage cloud services as much as possible then what we really want is a tool for composing these services in an expressive and familiar fashion. Amazon’s new Cloud Development Kit (CDK) is just the tool for that. If you’ve never heard of CDK you can read a friendly introduction here or check out the official docs.

In short, CDK lets you write high-level code in Python, TypeScript, Java or .NET and synthesize it into a CloudFormation template that describes your infrastructure. A brief TypeScript example from cursed-webring:

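// imports assumed for CDK v1; the excerpt runs inside a construct, where `this` is the scope
import * as apigateway from "@aws-cdk/aws-apigateway";
import { LambdaIntegration, RestApi } from "@aws-cdk/aws-apigateway";
import { Tracing } from "@aws-cdk/aws-lambda";
import { NodejsFunction } from "@aws-cdk/aws-lambda-nodejs";

// API Gateway with CORS enabled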
const api = new RestApi(this, "cursed-api", {
  restApiName: "Cursed Service",
  defaultCorsPreflightOptions: {
    allowOrigins: apigateway.Cors.ALL_ORIGINS,
  },
  deployOptions: { tracingEnabled: true },
});

// defines the /sites/ resource in our API
const sitesResource = api.root.addResource("sites");

// get all sites handler, GET /sites/
const getAllSitesHandler = new NodejsFunction(
  this,
  "GetCursedSitesHandler",
  {
    entry: "resources/cursedSites.ts",
    handler: "getAllHandler",
    tracing: Tracing.ACTIVE,
  }
);
sitesResource.addMethod("GET", new LambdaIntegration(getAllSitesHandler));

Is CDK a framework? It depends how you define “framework” but I consider it closer to infrastructure as code. By allowing you to effortlessly wire up the services you want in your application, CDK removes the need for a traditional web framework when it comes to features like routing or responding to HTTP requests.
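As a rough mental model of “code in, template out”, the sketch below turns a list of function descriptions into a CloudFormation-shaped object. This is not how CDK is implemented. FunctionSpec and synthesize are made-up names, and the output is heavily abbreviated: a real AWS::Lambda::Function resource also needs code and an execution role, among other things.

```typescript
// Hypothetical input: a high-level description of one Lambda function.
interface FunctionSpec {
  logicalId: string
  handler: string
  memorySize: number
}

// Minimal shape of a template: logical IDs mapped to typed resources.
interface Template {
  Resources: Record<string, { Type: string; Properties: Record<string, unknown> }>
}

// Toy "synth" step: plain objects in, declarative template out.
function synthesize(specs: FunctionSpec[]): Template {
  const template: Template = { Resources: {} }
  for (const spec of specs) {
    template.Resources[spec.logicalId] = {
      Type: "AWS::Lambda::Function",
      Properties: { Handler: spec.handler, MemorySize: spec.memorySize },
    }
  }
  return template
}

const template = synthesize([
  { logicalId: "GetCursedSitesHandler", handler: "index.getAllHandler", memorySize: 128 },
])
console.log(JSON.stringify(template, null, 2))
```

The point is only that the imperative code runs once at build time and what gets deployed is the declarative document it produced.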

While CDK provides a great way to glue AWS services together it has little to say when it comes to your application code itself. I believe we can sink even lower into the proverbial couch by decorating our application code with metadata that generates the CDK resources our application declares, specifically Lambda functions and API Gateway routes. I call it an anti-framework.

@JetKit/CDK

To put this into action we’ve created an anti-framework called @jetkit/cdk, a TypeScript library that lets you decorate functions and classes as if you were using a traditional web framework, with AWS resources automatically generated from application code.

The concept is straightforward. You write functions as usual, then add metadata with AWS-specific integration details such as Lambda configuration or API routes:

import { HttpMethod } from "@aws-cdk/aws-apigatewayv2"
import { Lambda, ApiEvent } from "@jetkit/cdk"

// a simple standalone function with a route attached
export async function aliveHandler(event: ApiEvent) {
  return "i'm alive"
}
// define route and lambda properties
Lambda({
  path: "/alive",
  methods: [HttpMethod.GET],
  memorySize: 128,
})(aliveHandler)

If you want a Lambda function to be responsible for related functionality you can build a function with multiple routes and handlers using a class-based view. Here is an example:

import { HttpMethod } from "@aws-cdk/aws-apigatewayv2"
import { badRequest, methodNotAllowed } from "@jdpnielsen/http-error"
import { ApiView, SubRoute, ApiEvent, ApiResponse, ApiViewBase, apiViewHandler } from "@jetkit/cdk"

@ApiView({
  path: "/album",
  memorySize: 512,
  environment: {
    LOG_LEVEL: "DEBUG",
  },
  bundling: { minify: true, metafile: true, sourceMap: true },
})
export class AlbumApi extends ApiViewBase {
  // define POST handler
  post = async () => "Created new album"

  // custom endpoint in the view
  // routes to the ApiViewBase function
  @SubRoute({
    path: "/{albumId}/like", // will be /album/123/like
    methods: [HttpMethod.POST, HttpMethod.DELETE],
  })
  async like(event: ApiEvent): ApiResponse {
    const albumId = event.pathParameters?.albumId
    if (!albumId) throw badRequest("albumId is required in path")

    const method = event.requestContext.http.method

    // POST - mark album as liked
    if (method == HttpMethod.POST) return `Liked album ${albumId}`
    // DELETE - unmark album as liked
    else if (method == HttpMethod.DELETE) return `Unliked album ${albumId}`
    // should never be reached
    else return methodNotAllowed()
  }
}

export const handler = apiViewHandler(__filename, AlbumApi)

The decorators aren’t magical; they simply save your configuration as metadata on the class. They do the same thing as the Lambda() function above. This metadata is later read when the corresponding CDK constructs are generated for you. ApiViewBase contains some basic functionality for dispatching to the appropriate method inside the class based on the incoming HTTP request.
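Here is a minimal sketch of the “save metadata now, read it later” idea, using a plain wrapper function instead of decorator syntax. It is not the @jetkit/cdk implementation; RouteConfig, withRoute, routeMetadata and describeResources are hypothetical names, and the WeakMap is just one plausible place to keep the metadata.

```typescript
// Hypothetical config shape, loosely modeled on the options shown above.
interface RouteConfig {
  path: string
  methods: string[]
  memorySize?: number
}

type AnyFunction = (...args: never[]) => unknown

// Metadata lives beside the function, not inside it.
const routeMetadata = new WeakMap<AnyFunction, RouteConfig>()

// Wrapper in the style of Lambda(): records the config, returns the function unchanged.
function withRoute(config: RouteConfig) {
  return <T extends AnyFunction>(fn: T): T => {
    routeMetadata.set(fn, config)
    return fn
  }
}

// A later "generator" step reads the metadata back and describes what it would create.
function describeResources(fns: AnyFunction[]): string[] {
  const described: string[] = []
  for (const fn of fns) {
    const config = routeMetadata.get(fn)
    if (!config) continue // functions without metadata produce nothing
    for (const method of config.methods) {
      described.push(`${method} ${config.path} -> ${fn.name}`)
    }
  }
  return described
}

function aliveHandler(): string {
  return "i'm alive"
}
withRoute({ path: "/alive", methods: ["GET"], memorySize: 128 })(aliveHandler)
```

Calling describeResources([aliveHandler]) yields one line per method and route. The handler itself is untouched and can still be called or unit tested like any other function.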

Isn’t this “routing?” Sort of. The AlbumApi class is a single Lambda function for the purposes of organizing your code and keeping the number of resources in your CloudFormation stack at a more reasonable size. It does however create multiple API Gateway routes, so API Gateway is still handling the primary HTTP parsing and routing. If you are a purist you can of course create a single Lambda function per route with the Lambda() wrapper. The goal here is simplicity.
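The in-class dispatch that a base class like ApiViewBase performs can be pictured roughly like this. It is a sketch under simplifying assumptions, not the library’s code: MiniEvent is a trimmed-down, hypothetical event shape, and MiniViewBase and AlbumView are illustrative names.

```typescript
// Hypothetical, trimmed-down event; a real API Gateway event has many more fields.
interface MiniEvent {
  method: string // e.g. "GET"
  path: string
}

const HTTP_METHODS = ["get", "post", "put", "patch", "delete"]

class MiniViewBase {
  // Picks the method on the subclass whose name matches the HTTP verb.
  dispatch(event: MiniEvent): string {
    const name = event.method.toLowerCase()
    // Only look up known HTTP verbs, so other members can never be invoked by a request.
    const candidate =
      HTTP_METHODS.indexOf(name) >= 0
        ? (this as unknown as Record<string, unknown>)[name]
        : undefined
    if (typeof candidate !== "function") {
      return `405: ${event.method} not allowed on ${event.path}`
    }
    return String(candidate.call(this, event))
  }
}

class AlbumView extends MiniViewBase {
  get(): string {
    return "List of albums"
  }
  post(event: MiniEvent): string {
    return `Created new album via ${event.path}`
  }
}

const view = new AlbumView()
```

By the time dispatch runs, API Gateway has already matched the path. All that remains inside the function is choosing between a handful of methods, which is a much smaller job than a framework router does.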

The reason Lambda() is not a decorator is that TypeScript only supports decorators on classes and class members. Function decorators do not currently exist, due to complications arising from function hoisting.
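One way to see the hoisting complication: a function declaration is usable before the line where it appears, so a wrapper applied at that line would run after callers may have already used the function. The register helper below is hypothetical and only records names.

```typescript
const registered: string[] = []

// Hypothetical wrapper in the style of Lambda(): records the function, returns it unchanged.
function register<T extends (...args: never[]) => unknown>(fn: T): T {
  registered.push(fn.name)
  return fn
}

// Function declarations are hoisted, so this call is legal even though
// greet is declared further down and has not been registered yet.
const greeting = greet()
const registeredAtCallTime = registered.length // nothing has been registered at this point

function greet(): string {
  return "hello"
}

// The wrapper only takes effect once execution reaches this line.
register(greet)
```

Classes are not usable before their declaration is evaluated, so a class decorator always gets to run first. An explicit Lambda()(fn) call sidesteps the question by making the ordering visible in the source.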

Why TypeScript?

As an aside, TypeScript is now my preferred choice for backend development. JavaScript no, but TypeScript yes. The rapid evolution and improvements in the language with Microsoft behind it have been impressive. The language is as strict as you want it to be. Having one set of tooling, CI/CD pipelines, docs, libraries and language experience in your team is much easier than supporting two. All the frontends we work with are React and TypeScript, so why not use the same linters, type checking, commit hooks, package repository, formatting configuration, and build tools instead of maintaining, say, one set for a Python backend and another for a TypeScript frontend?

Python is totally fine except for its lack of type safety. Do not even attempt to blog at me ✋🏻 about mypy or pylance. It is like saying a Taco Bell is basically a real taqueria. Might get you through the day but it’s not really the same thing 🌮

Construct Generation

So we’ve seen the decorated application code. How does it get turned into cloud resources? With the ResourceGeneratorConstruct, a CDK construct that takes your functions and classes as input and generates AWS resources as output.

import { CorsHttpMethod, HttpApi } from "@aws-cdk/aws-apigatewayv2"
import { Construct, Duration, Stack, StackProps, App } from "@aws-cdk/core"
import { ResourceGeneratorConstruct } from "@jetkit/cdk"
import { aliveHandler, AlbumApi } from "../backend/src"  // your app code

export class InfraStack extends Stack {
  constructor(scope: App, id: string, props?: StackProps) {
    super(scope, id, props)

    // create API Gateway
    const httpApi = new HttpApi(this, "Api", {
      corsPreflight: {
        allowHeaders: ["Authorization"],
        allowMethods: [CorsHttpMethod.ANY],
        allowOrigins: ["*"],
        maxAge: Duration.days(10),
      },
    })

    // transmute your app code into infrastructure
    new ResourceGeneratorConstruct(this, "Generator", {
      resources: [AlbumApi, aliveHandler], // supply your API views and functions here
      httpApi,
    })
  }
}

It is necessary to explicitly pass the functions and classes you want resources for to the generator because otherwise esbuild will optimize them out of existence.

Try It Out

@jetkit/cdk is MIT-licensed, open-source, and has documentation and great tests. It doesn’t actually do much at all and that’s the point.

If you want to try it out as fast as humanly possible you can clone the TypeScript project template to get a modern serverless monorepo using NPM v7 workspaces.

(Image: a woodworker-built tiny house boat called the Le Koroc)
Maybe a foundation isn’t needed after all