How to complete a side project and turn it into a business (Level 1)

Almost four years ago I wrote about Lessons learned: A failed side project, and when I stumbled across the post How to *never* complete anything several days ago, I felt it was time for an update. Ewan’s post covers lessons similar to the ones I wrote about in mine, and I have heard about many side projects that failed because of the same mistakes.

There was an argument on Hacker News about what a side project actually is, so I want to clarify this up front: I am talking about side projects written with some intent of turning them into a business, as opposed to side projects started purely for learning new technical skills.

Three years ago I started working on a PHP profiling and monitoring side project, which I turned into a business called Tideways. In this post I want to share some of the reasons why I think this project was successful.

The idea for “XHProf as a Service” was on my “Side Project Ideas” Trello board, where I write down everything that I want to try or experiment with. I picked it for implementation as my new side project because that month I would have needed it for a large-scale load-testing project at my job with Qafoo. It was not the first time I needed this myself; I regularly felt the pain that a product like this didn’t exist.

Not wanting to make the same mistakes again, I applied all the lessons learned from 2013:

  • I picked a small scope: The first version contained very few features, re-used the existing xhprof PHP extension (Tideways now has its own) and, instead of a daemon collecting data in realtime, used a cronjob that collected data from a directory every minute. After six months of development I removed half of the features I had experimented with to reduce the scope again.
  • I did not compete with the established competition on features; instead I focused on the two features I thought were the most important: response time monitoring and callgraph profiling. By focusing on a niche (PHP instead of all languages) I was able to provide a better solution than the existing generic tools (obviously I am biased).
  • I did not work on the project all alone. Instead I immediately enlisted alpha users who gave valuable feedback after one month of work, some of whom are still our most active customers; brought in my employer Qafoo, who is now a shareholder in the new business; and formed business partnerships with companies that are now our biggest customers and resellers.
  • I did not keep the idea secret: when I announced our beta program publicly in June 2014, the launch newsletter list received 250 sign-ups and we gained 60 Twitter followers within a few hours.
  • I chose boring technology: using a monolithic Symfony, PHP and MySQL backend with a jQuery frontend allowed me to iterate very fast on new features with technology I had already mastered. I spent three innovation tokens on using Golang for the daemon, using Elasticsearch as a time-series database, and later learning C for the PHP extension.

Making these decisions did not magically turn my side project into a profit-generating business. However, it helped me avoid all the problems that I wrote about in my failed side project lessons post four years ago.

The fast iterations on a small scope, combined with early user feedback, showed that this idea could work as a business and that it was worthwhile to keep pushing for two years, until the project generated enough revenue after all other costs to pay my full-time salary.

This felt like passing level 1 in getting a bootstrapped side project off the ground (there are many more levels after that).

More about: SideProjects / Bootstrapping

Explicit Global State with Context Objects

Global state is considered bad for the maintainability of software. Side effects on global state can cause a very nasty class of bugs. Context objects are one flavour of global state. For example, I remember that Symfony1 had a particularly nasty context object: a global singleton containing references to many services of the framework.

As with every concept in programming there are no absolute truths though, and there are many use cases where context objects make sense. This blog post tries to explain my reasons for using context objects.

What is a Context?

Context is defined as all the information and circumstances in which something can be fully understood. In daily programming this is mostly related to the state of variables and databases. Some examples in the world of PHP include the superglobals $_GET and $_POST.

Any piece of code is always running in some context and the question is how much of it is explicit in the code and how much is hidden.

A context object is a way to make the existing context explicit for your code.
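
As a small illustration (the function and class names here are made up), compare a function that silently reads hidden context from a superglobal with one that receives the same context explicitly:

<?php

// Hidden context: the function silently depends on the $_GET superglobal.
function renderGreetingImplicit()
{
    $locale = isset($_GET['locale']) ? $_GET['locale'] : 'en';

    return $locale === 'de' ? 'Hallo!' : 'Hello!';
}

// Explicit context: the dependency is visible in the signature and easy to test.
class RequestContext
{
    private $locale;

    public function __construct($locale)
    {
        $this->locale = $locale;
    }

    public function getLocale()
    {
        return $this->locale;
    }
}

function renderGreetingExplicit(RequestContext $context)
{
    return $context->getLocale() === 'de' ? 'Hallo!' : 'Hello!';
}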

Context Objects

Let’s take a look at the definition of a context object:

A context object encapsulates the references/pointers to services and configuration information used/needed by other objects. It allows the objects living within a context to see the outside world. Objects living in a different context see a different view of the outside world.

Besides the obvious use of encapsulating services and config variables, this definition mentions two important properties:

  1. It allows objects to see the outside world, which does not mean they can change it. In my opinion it is essential that context objects are immutable, to avoid side effects.
  2. The possibility of objects living in different contexts, seeing different context objects, suggests that a context object should never be a singleton.

By using objects instead of global variables for context, we can use encapsulation to achieve immutability.

Context already exists in PHP through the existence of superglobals. Frameworks usually wrap them to achieve the properties mentioned above: immutability and not being a singleton.

The Symfony Request object is one good example, where these properties (almost) hold. Wrapping the superglobals with this object even allowed creating Subrequests inside PHP requests.
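
A rough sketch of both properties in action, assuming the symfony/http-foundation component is installed:

<?php

use Symfony\Component\HttpFoundation\Request;

// Wrap the superglobals exactly once at the framework boundary ...
$request = Request::createFromGlobals();

// ... and build a completely separate context for a subrequest,
// without touching $_GET/$_POST at all.
$subRequest = Request::create('/fragments/sidebar', 'GET', array('page' => 2));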

Application Context

Now that we have defined context, let’s make use of context objects in our applications.

Anything that is important global information in your application is relevant for the context:

  1. Building a wiki? I actually got the idea for this approach from FitNesse, a testing tool based on the wiki idea maintained by Uncle Bob and his team. Their context object provides access to the current and the root page nodes.
  2. Building a shop? The user’s basket id, selected language/locale and current campaign (newsletter, Google, social media?) can be information that should be available all the time.
  3. Building an analytics software? The currently selected date/time-range is probably important for all queries.
  4. Building a CMS/blog? The current page/post id, the root page id and the user’s language/locale seem to be good candidates for an application context. Wordpress does this, although its context is global and not encapsulated in an object.
  5. Building a multi-tenant app? The tenant id and the configuration for this tenant (selected product plan, activated features, ...) are good candidates for the context.

Real World Example: Context in Tideways

How do you introduce such a context? We could create an object in our application, for example the way we did it in Tideways with the selected tenant (organization), application, date range and server environment:

<?php

class PageContext
{
    /**
     * @var \Xhprof\Bundle\OrganizationBundle\Entity\Organization
     */
    private $organization;

    /**
     * @var \Xhprof\Bundle\OrganizationBundle\Entity\Application
     */
    private $application;

    /**
     * @var \Xhprof\Common\Date\DateRange
     */
    private $selectedDateRange;

    /**
     * @var \Xhprof\Bundle\ProfilerBundle\View\EnvironmentView
     */
    private $selectedEnvironment;

    // constructor and getters
}

This object is created during request boot, in my case with a framework listener. The listener checks for access rights and security constraints, showing the 403/access denied page when necessary. This makes 90% of the access control checks that usually clutter the controllers unnecessary.
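
A stripped-down sketch of such a listener (the repository service and its methods are hypothetical; the event class is the Symfony2-era GetResponseEvent):

<?php

use Symfony\Component\HttpKernel\Event\GetResponseEvent;
use Symfony\Component\HttpKernel\Exception\AccessDeniedHttpException;

class PageContextListener
{
    private $organizations; // hypothetical repository service

    public function __construct($organizations)
    {
        $this->organizations = $organizations;
    }

    public function onKernelRequest(GetResponseEvent $event)
    {
        $request = $event->getRequest();

        $organization = $this->organizations->findOneBySlug(
            $request->attributes->get('organization')
        );

        if ($organization === null) {
            throw new AccessDeniedHttpException();
        }

        // build the immutable context once and attach it to the request
        $request->attributes->set('pageContext', new PageContext(/* organization, application, ... */));
    }
}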

The context is then made available to the application by using a Symfony parameter converter; every controller action can get access to the context by type-hinting for it:

<?php

class ApplicationController
{
    public function showAction(PageContext $pageContext)
    {
        return array('application' => $pageContext->getApplication());
    }
}
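
The wiring behind that type-hint is not shown above; a minimal converter could look roughly like this, assuming SensioFrameworkExtraBundle’s ParamConverterInterface and that a listener previously stored the context in the request attributes:

<?php

use Sensio\Bundle\FrameworkExtraBundle\Configuration\ParamConverter;
use Sensio\Bundle\FrameworkExtraBundle\Request\ParamConverter\ParamConverterInterface;
use Symfony\Component\HttpFoundation\Request;

class PageContextConverter implements ParamConverterInterface
{
    public function supports(ParamConverter $configuration)
    {
        return $configuration->getClass() === 'PageContext';
    }

    public function apply(Request $request, ParamConverter $configuration)
    {
        // hand the previously built context to the type-hinted controller argument
        $request->attributes->set($configuration->getName(), $request->attributes->get('pageContext'));

        return true;
    }
}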

The beauty of this approach is avoiding global state and passing the context around in a non-singleton way. Depending on the framework you use, it might be hard to achieve this kind of context injection.

Now when I build lightweight Symfony2 controllers in my applications, using a context object allows me to use even fewer services and move repetitive find and access control code out of the controllers.

I have also written a Twig extension that gives me access to the context object, so I don’t have to return it from every controller, and created a wrapper for the URL generation that appends context information (the current date range and environment) to every URL:

<h1>{{ pageContext.application.name }}</h1>

<a href="{{ page_path("some_route") }}">Link with Context query arguments</a>
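
A sketch of such an extension could look like this (Twig 1.x style; the getUrlParameters() helper on PageContext is a hypothetical method returning the current date range and environment as query parameters):

<?php

class PageContextExtension extends \Twig_Extension
{
    private $pageContext;
    private $urlGenerator;

    public function __construct(PageContext $pageContext, $urlGenerator)
    {
        $this->pageContext = $pageContext;
        $this->urlGenerator = $urlGenerator;
    }

    public function getGlobals()
    {
        // expose the context to every template as "pageContext"
        return array('pageContext' => $this->pageContext);
    }

    public function getFunctions()
    {
        return array(new \Twig_SimpleFunction('page_path', array($this, 'pagePath')));
    }

    public function pagePath($route, array $parameters = array())
    {
        // append the current date range and environment to every generated URL
        $parameters = array_merge($parameters, $this->pageContext->getUrlParameters());

        return $this->urlGenerator->generate($route, $parameters);
    }

    public function getName()
    {
        return 'page_context';
    }
}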

Conclusion

A context object can help you make global state explicit and control access to it. Good requirements for a context object are immutability and not being a singleton.

When used correctly this pattern can save you a lot of redundant code and simplify both controllers and views massively.

The pattern has its drawbacks: you have to be careful not to put too powerful objects into the context, and if the context can be modified, you will probably introduce nasty side effects at some point. Additionally, if you don’t make sure that creating the context is a very fast operation, you will suffer performance hits, because the context is created on every request, maybe fetching expensive data that isn’t even used.

Composer Monorepo Plugin (previously called Fiddler)

I have written about monorepos in this blog before, presented a talk about the topic and released a standalone tool called “Fiddler” that helps integrate Composer with a monolithic repository.

At the beginning of the year, somebody in the #composer-dev IRC channel on Freenode pointed me towards Composer plugins as the way to ship Fiddler, and it was an easy change to make.

With the help of a new Composer v1.1 feature that allows plugins to add custom commands, Fiddler is now “gone” and I have renamed the repository to the practical beberlei/composer-monorepo-plugin package name on GitHub. After you install this plugin, you can maintain subpackages and their dependencies in a single repository.

$ composer require beberlei/composer-monorepo-plugin

To use the plugin, add a monorepo.json file to each subpackage directory and use a format similar to composer.json to declare dependencies on a) external Composer packages that you have listed in your global Composer file and b) other subpackages in the current monorepo. See this example for a demonstration:

{
    "deps": [
        "vendor/symfony/http-foundation",
        "components/Foo"
    ],
    "autoload": {
        "psr-0": {"Bar": "src/"}
    }
}

This subpackage, defined in a hypothetical file components/Bar/monorepo.json, has dependencies on Symfony HttpFoundation and on another subpackage Foo with its own components/Foo/monorepo.json. Notice how we don’t need to specify versions (they are implicit) and how other dependencies are imported using their relative path from the global composer.json.

The monorepo plugin is integrated with Composer, so every time you perform the install, update or dump-autoload commands, the subpackages are updated as well and each gets its own autoloader that can be included from vendor/autoload.php relative to the subpackage’s root directory, as usual.
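
For illustration, a script inside the Bar subpackage from the example above could then bootstrap itself like this (the Foo class name is made up):

<?php
// components/Bar/bin/example.php

// each subpackage gets its own generated autoloader, relative to its root
require __DIR__ . '/../vendor/autoload.php';

// classes from Bar itself, from the Foo subpackage and from the external
// dependencies listed in "deps" are now autoloadable
$request = \Symfony\Component\HttpFoundation\Request::createFromGlobals();
$foo = new \Foo\ExampleService();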

More about: Monorepos / Composer

How I use Wordpress with Git and Composer

I maintain two Wordpress blogs for my wife and wanted to find a workflow to develop, update, version-control and maintain them with Git and Composer, like I am used to with everything else I work on.

The resulting process is a combination of several blog posts and my own additions, and it seemed worth writing down for the next person interested in this topic.

It turns out this is quite simple if you re-arrange the Wordpress directory layout a little bit and use some fantastic open-source projects to combine Wordpress and Composer.

Initialize Repository

As a first step, create a new directory and git repository for your blog:

$ mkdir myblog
$ cd myblog
$ git init

Create a docroot directory that is publicly available for the webserver:

$ mkdir htdocs

Place the index.php file in it that delegates to Wordpress (installed later):

<?php
// htdocs/index.php
// Front to the WordPress application. This file doesn't do anything, but loads
// wp-blog-header.php which does and tells WordPress to load the theme.

define('WP_USE_THEMES', true);
require( dirname( __FILE__ ) . '/wordpress/wp-blog-header.php' );

Create the wp-content directory inside the docroot; it will be configured to live outside the Wordpress installation.

$ mkdir htdocs/wp-content -p

And then create a .gitignore file with the following ignore paths:

/htdocs/wordpress/
/htdocs/wp-content/uploads
/htdocs/wp-content/plugins
/htdocs/wp-content/themes

If you want to add a custom theme or plugin you need to use git add -f to force the ignored path into Git.

Don’t forget to include the uploads directory in your backup, when deploying this blog to production.

Your directory tree should now look like this:

.
├── .git
├── .gitignore
└── htdocs
    ├── index.php
    └── wp-content

In the next step we will use Composer to install Wordpress and plugins.

Setup Composer

Several people have done amazing work to make Wordpress and all the plugins and themes on Wordpress.org available through Composer. To utilize this work, we create a composer.json file in our repository root. There the file is outside of the webserver’s reach, so users of your blog cannot download it.

{
    "require": {
        "ext-gd": "*",
        "wpackagist-plugin/easy-media-gallery": "1.3.*",
        "johnpbloch/wordpress-core-installer": "^0.2.1",
        "johnpbloch/wordpress": "^4.4"
    },
    "extra": {
        "installer-paths": {
            "htdocs/wp-content/plugins/{$name}/": ["type:wordpress-plugin"],
            "htdocs/wp-content/themes/{$name}/": ["type:wordpress-theme"]
        },
        "wordpress-install-dir": "htdocs/wordpress"
    },
    "repositories": [
        {
            "type": "composer",
            "url": "http://wpackagist.org"
        }
    ]
}

This composer.json uses the excellent Wordpress Core Installer by John P. Bloch and the WPackagist project by Outlandish.

The extra configuration in the file configures Composer for placing Wordpress Core and all plugins in the correct directories. As you can see we put core into htdocs/wordpress and plugins into htdocs/wp-content/plugins.

Now run the Composer install command; you should see installation output similar to the next excerpt:

$ composer install
Loading composer repositories with package information
Installing dependencies (including require-dev)
  - Installing composer/installers (v1.0.23)
    Loading from cache

  - Installing johnpbloch/wordpress-core-installer (0.2.1)
    Loading from cache

  - Installing wpackagist-plugin/easy-media-gallery (1.3.93)
    Loading from cache

  - Installing johnpbloch/wordpress (4.4.2)
    Loading from cache

Writing lock file
Generating autoload files

The next step is to get Wordpress running using the Setup Wizard.

Setup Wordpress

Follow the Wordpress documentation to set up your Wordpress blog now; it will create the necessary database tables and give you a wp-config.php file to download. Copy this file to htdocs/wp-config.php and modify it slightly: it is necessary to adjust the WP_CONTENT_DIR, WP_CONTENT_URL and ABSPATH constants:

<?php

// generated contents of wp-config.php, salts, database and so on

define('WP_CONTENT_DIR',    __DIR__ . '/wp-content');
define('WP_CONTENT_URL',    WP_HOME . '/wp-content');

/** Absolute path to the WordPress directory. */
if ( !defined('ABSPATH') ) {
    define('ABSPATH', dirname(__FILE__) . '/wordpress');
}

/** Sets up WordPress vars and included files. */
require_once(ABSPATH . 'wp-settings.php');

Voila. You have Wordpress running from a Git repository and maintain the Wordpress Core and Plugins through Composer.

Different Development and Production Environments

The next step is introducing different environments, to allow using the same codebase in production and development, where the base urls are different, without having to change wp-config.php or the database.

By default Wordpress relies on the SITEURL and HOME configuration variables from the wp_options database table, which means it is not easily possible to use the blog under both http://myblog.local (development) and https://myblog.com (production).

But when working on the blog I want to copy the database from production and have it running on my local development machine with nothing more than exporting and importing a MySQL dump.

Luckily there is an easy workaround that allows this: You can overwrite the SITEURL and HOME variables using constants in wp-config.php.

For development I rely on the built-in PHP webserver that has been available since PHP 5.4, combined with a custom router script (I found this on a blog a long time ago, but cannot find the source anymore):

<?php
//htdocs/router.php

$root = $_SERVER['DOCUMENT_ROOT'];
chdir($root);

$path = '/' . ltrim(parse_url($_SERVER['REQUEST_URI'])['path'], '/');
set_include_path(get_include_path() . ':' . __DIR__);

if (file_exists($root . $path)) {
    // requests for a directory are served through its index.php
    if (is_dir($root . $path) && substr($path, strlen($path) - 1, 1) !== '/') {
        $path = rtrim($path, '/') . '/index.php';
    }

    if (strpos($path, '.php') === false) {
        // static files (css, js, images, ...) are served by the built-in webserver itself
        return false;
    } else {
        // run PHP scripts from their own directory, like Apache/nginx would
        chdir(dirname($root . $path));
        require_once $root . $path;
    }
} else {
    // everything else goes to the Wordpress front controller
    include_once 'index.php';
}

To make your blog run flawlessly on your dev machine, open up htdocs/wp-config.php and add the following if statement to rewrite SITEURL and HOME config variables:

<?php
// htdocs/wp-config.php

// ... salts, DB user, password etc.

if (php_sapi_name() === 'cli-server' || php_sapi_name() === 'srv') {
    define('WP_ENV',        'development');
    define('WP_SITEURL',    'http://localhost:8000/wordpress');
    define('WP_HOME',       'http://localhost:8000');
} else {
    define('WP_ENV',        'production');
    define('WP_SITEURL',    'http://' . $_SERVER['SERVER_NAME'] . '/wordpress');
    define('WP_HOME',       'http://' . $_SERVER['SERVER_NAME']);
}

define('WP_DEBUG', WP_ENV === 'development');

You can now run your Wordpress blog locally using the following command-line arguments:

$ php -S localhost:8000 -t htdocs/ htdocs/router.php

Keep this command running and visit localhost:8000.

More about: Wordpress / Deployment

Monolithic Repositories with Composer and Relative Autoloading

I was just reminded on Twitter by Samuel that there is a way to handle monolithic PHP repositories with multiple components that I haven’t mentioned in my previous post.

It relies on a new composer.json for each component and uses the autoloading capabilities of Composer in a hackish way.

Assume we have two components located in components/foo and components/bar, then if bar depends on foo, it could define its components/bar/composer.json file as:

{
    "autoload": {
        "psr-0": {
            "Foo": "../foo/src/"
        }
    }
}
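
After running composer install inside components/bar, code in bar can then load Foo classes through its own autoloader, for example (the class name is made up):

<?php
// components/bar/example.php

require __DIR__ . '/vendor/autoload.php';

// "Foo" classes are resolved through the relative "../foo/src/" mapping above
$calculator = new \Foo\Calculator();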

This approach is very simple to start with, however it has some downsides you must take into account:

  • you have to redefine dependencies in every composer.json that relies on another component.
  • if foo and bar depend on different versions of some third library baz that are not compatible, then composer will not realize this and your code will break at runtime.
  • if you want to generate deployable units (tarballs, debs, ..) then you will have a hard time to collect all the implicit dependencies by traversing the autoloader for relative definitions.
  • A full checkout has multiple vendor directories with a lot of duplicated code.

I think this approach is OK if you are only sharing a small number of components that don’t define their own dependencies. The Fiddler approach, however, solves all these problems by forcing the whole project to rely on the same dependencies globally and defining them only once.

More about: Monorepos