Friday 15 April 2016

Internet of Things and The Cloud

The Internet of Things (IoT) got some bad press as a concept a couple of weeks back, as a result of Nest pulling the plug on its Revolv smart home hub, e.g. Wired, BBC, The Guardian. And indeed, when I buy a "thing" I expect it to just work - and for some time. Thermostats, light switches, smoke alarms and kitchen appliances are all things that I expect still to be working when I move house in a few years' time; lightbulbs less so, but I don't expect all of them to break at once because the light fitting or the wiring has become obsolete. And that is the issue here: the controllers themselves still work just fine - but the cloud service that they need in order to work is being switched off.

The added cost of being an early adopter probably comes with an expectation that the product may date faster than if I wait, or require more frequent fixes and upgrades - as is the case if you compare my FitBit to my old analogue watch. In any case, this story raises a wider question of what happens to IoT tech as the market evolves. As someone who's worked in "pervasive computing" and IoT for some time, this is naturally of interest to me.

There are a number of potential solutions, each with accompanying issues, that have come up in news articles and on Twitter, including:

  • Use products with standards, assuming that the idea is mature enough to have a standard. A feature of innovation is that at least part of the idea is often ahead of the standards curve.
  • Provide a refund and ease people onto a new system, assuming they are willing to invest in that on the back of their previous investment being a dud.
  • Sell or open source the cloud software so others can keep it going to service the device. However, in most cases this software may contain some significant IP that the company probably wants to sell or reuse in a pivot of their idea. If open source was the right solution it would probably have been part of the picture well in advance of pulling the product.

So, what to do?

My suggestion is that the IoT is a good example of a system where "cloud" services supporting the devices would benefit from a different approach: 
It isn't the cloud, it's a cloud.
By which I mean: don't sell me a device that connects to some centralised cloud service - unless you're really small scale and alpha-testing. Sell me something that brings up its own cloud service, maybe with a provider I choose and get billed by. The device can let my phone know where to look so the two can talk - the pointer is just a URL, and the physical thing can bootstrap handing it out.
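As a purely illustrative sketch of that bootstrap - every name below is invented, not taken from any real product - the set-up could be as small as bringing up a per-customer instance of the vendor's service somewhere I control, then telling the device where it lives:

# Illustrative only: run my own instance of the vendor's service image
# on a host or provider of my choosing ("acme/thermostat-cloud" is made up).
docker run -d --name thermostat-cloud -p 443:443 acme/thermostat-cloud

# Tell the thing where its cloud lives; it can then hand the URL on to my
# phone's app ("thermostat.local" and the /setup endpoint are also made up).
curl -X POST http://thermostat.local/setup \
     -d 'cloud_url=https://home-cloud.example.net/'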

Maybe this is a systems architect's solution, and it certainly feels obvious once you cast it as a distributed systems problem, but I can see several advantages:
  • If the vendor switches their systems off, mine can keep going. If there's some advantage to them aggregating data from many customers, that can happen by forwarding on from my cloud to theirs - and it can fail cleanly if that need changes. The crucial step is to eliminate centralised nodes in the control loop.
  • The home systems can point at a repo with appropriate security to manage centralised updates. There's no particular need to give me a login to the cloud service, so the secret sauce isn't much more exposed than it would be when centralised - especially given that one end of it is hardware in my possession. 
  • By putting my data on a system which I control I can be given greater and more plausible control over my privacy. My detailed data is visible to fewer people and a hacker has to gain access to many systems to gather large amounts of data.
  • By distributing the system over many little services, scalability comes easily and system failures affect fewer users.

Wednesday 20 January 2016

make-ing docker

I mentioned some time ago that I've been exploring Docker. I then changed my main job and the blog went rather quiet. I have a contract that gives me some time and IP to myself, and this is finally turning into a new project. Between personal projects and work I've spent a whole lot more time with Docker. So, a blog post, about building Docker images, that might be useful to someone...

A bit of background:
First, I'm a fan of make. I know, it shows my age. But, it's really quite good at handling building tasks with a minimum of tricky requirements on the build system. It can also be turned to organising installations and running jobs, so keeping related issues in one version-controllable place. It is also in a format that's quite user friendly, even when ssh-ed into a server and fixing stuff with vi. So, while I've also used Ant, Ansible and Maven to do some of these things, it remains a reliable standby.

One of the things I use Docker for is setting up groups of images which are related to each other: django; django with a different config; django with that config, but set up for running tests rather than running the server - and so on. Which leads to dependencies between images.
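For a sketch of what I mean (the image names and contents here are illustrative, not my actual builds), one Dockerfile builds on the image produced by another via its FROM line:

# docker/django_base/Dockerfile - a base image, built on a stock image
FROM python:2.7
RUN pip install django

# docker/django_test/Dockerfile - built FROM the image above, so rebuilding
# django_base ought to trigger a rebuild of this image too
FROM django_base
ENV DJANGO_SETTINGS_MODULE project.settings.test

It's this FROM relationship, rather than anything in the file names, that the build system has to pick up.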

Most of the time when I use Docker, it's in a local environment, and I just build and run images on one machine - without using Docker Hub. I like having the tool chain version controlled, and this approach fits.

Make is good at handling dependencies, but the relationship chains in Docker image definitions don't expose themselves in file names (unless you're very organised with naming schemes). The quick first approach is to build them with explicit pointers in the makefile as well as in the Dockerfiles. But eventually that level of duplication will irk. So, with a spare couple of hours to polish my build scripts, I refactored the duplication out of the Makefile.

The key parts of the makefile are below - stripped of my builds to show the principle:

DIRS := $(shell find . -mindepth 1 -maxdepth 1 -type d)
DOCKERFILES := $(addsuffix /Dockerfile,$(DIRS))
IMAGES := $(subst /,,$(subst ./,,$(dir $(DOCKERFILES))))
FLAG_FILES := $(addprefix ., $(addsuffix .docker, $(IMAGES)))
PWD := $(shell pwd)


# Docker images can depend on each other.
# A changed base image ought to trigger a rebuild of its children.
define image_dep_search
@echo "checking dependencies of $1"
@for d in $(IMAGES); do \
 from=`grep FROM $$d/Dockerfile | cut -d ' ' -f 2`; \
 if [ $1 = $$from ]; then \
  echo "dependent image $$d"; \
  touch $$d; \
  make .$$d.docker; \
 fi; \
done
endef


all: images 


# Consider all docker image directories for building
images: $(FLAG_FILES)
 @echo "Done making images."

# Build images where the directory or contents have changed since flag last set
.%.docker: % %/* 
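# $@ is ".<image>.docker": recover the image name, its FROM base, and whether that base is one of our images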
 $(eval IMAGE = $(subst .,,$(basename $@)))
 $(eval BASE = $(word 2,$(shell grep FROM $(addsuffix /Dockerfile,$(IMAGE)))))
 $(eval HAS_DEP = $(filter $(BASE),$(IMAGES)))
 @echo "building $(IMAGE)"
 @cd $(IMAGE) && docker build -t $(IMAGE) .
 @touch $@
 $(call image_dep_search,$(IMAGE))


# Utility make targets for creating containers from images 
.PHONY: run_java_bash
run_java_bash: .java_base.docker java_bash_container

.PHONY: java_bash_container
java_bash_container:
 docker run --rm -v=$(PWD)/..:/project -it --name java_bash java_base bash

clean:
 @rm -f $(FLAG_FILES)

In order, this contains:
  1. Some definitions, which list the image directories (each assumed to contain a Dockerfile) and derive image names from them. Files called ".<imgname>.docker" will be created to mark the latest build.
  2. A definition to use later. For each of our image directories it finds the FROM line in the Dockerfile, extracts the base image name, and checks whether that base is the image just built; if so, it touches the directory to force a rebuild and calls make on that image.
  3. The standard make stuff, to run a build for each image directory which has changes. Once an image is built, any dependent images are found and rebuilt using the routine defined above.
  4. A phony target to run the container, to illustrate the point. It double checks the image, in case we're forgetful about running "make all" first. This provides the project root (the parent of the docker directory) as a mounted volume - which may or may not be a good thing, depending on your use case.
  5. A clean target, that gets rid of the build flag files. 

There are a couple of assumptions here:
  1. A directory of docker image definitions within the project; I usually call it "docker", and the makefile lives in it. These are the images that will be considered for dependencies; any other base images are just assumed to exist. (There's a sketch of the layout I have in mind just after this list.)
  2. The FROM and the image name in the Dockerfile are separated by a space.
  3. Images are flat directories. I think any deeper structure is probably better kept elsewhere and brought in as an archive, rather than copying lots of files one by one in the Docker build process - so this hasn't been an issue for me.
  4. If one image depends on having another already built, then another makefile rule is needed to force the correct order. I'll update this post when I've automated that as well. Simple "list of targets" rules may well be needed to build specific subsets anyway.
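To make the layout assumptions concrete, this is roughly the shape of project I have in mind (directory and file names are illustrative):

project/                     (mounted as /project by the run target above)
    docker/
        Makefile             (the makefile above lives here)
        .django_base.docker  (flag file, created once django_base is built)
        django_base/
            Dockerfile
        django_test/
            Dockerfile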
And that's it: build the images the same way as I build the code, with a minimum of drag on my effort. Easy to call from Jenkins, easy to call from the command line.
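For instance, typical invocations of the makefile above look like this (assuming an image directory such as java_base exists, as the run target expects):

# build or rebuild any images whose directories have changed
make

# throw away the flag files and rebuild everything
make clean && make

# make sure java_base is built, then open a bash shell in a container
make run_java_bash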

Addendum: With a bit of tidying up, and some example Dockerfiles, this is now on GitHub at https://github.com/danchalmers/MakeingDocker