Saturday 25 October 2014

Cut and Paste

I replied on Twitter to Grady Booch and Valentin T Mocanu earlier (Booch in turn was commenting on a Quora answer). Being Twitter it was a bit brief and slightly flippant. The thrust was "cut and past programming is fine" ... "fine for the first 2-3 years".
My contribution: using example code teaches what you ask of it.

I've learnt a lot with examples; I've taught a lot with examples. But, what I learnt by using examples was not how to touch type from a magazine, nor how to copy and paste from examples. Like most programmers that work with others, I look at what others write. Sometimes it gets me through a block. Sometimes I see something new. As I use it I mentally fit it to my existing models. Then I prod what I've got, see what breaks it, how it changes, what else the documentation says about this new thing.

When I use a new library or service I often reach for example code. OOP is very good at giving us generality: lots of classes, lots of methods. But the first use of something is usually pretty close to the common case. API documentation rarely describes that (many thanks to those that do!) and simplifying the common case sometimes gets lost in the quest for elegance. Examples show that path. Then, as before, documentation can be explored to vary from this. Exploration gives a structure to build a model from.

Testing, of course, is a great tool for doing this exploration, prodding an idea and seeing what happens. Asking "my code does X, but what I really want is Y, how do I get there?".

Blind cut and paste, of course, teaches next to nothing and doesn't solve all problems. You can get by on it, but it would be hard to tread a new path that way. I see this frequently, and in many ways it puzzles me. One of the great attractions of programming, to me as a kid, was that I had an infinite set of problems to solve to make the computer do something fun. And, unlike electronics which I also enjoyed, the chances of my programming breaking the computer in a way I couldn't fix were small. (Of course if you ask the 13 year old me you wouldn't get that as an answer!) But, some people get the fear when you suggest that they try something new with some example code. I don't think it is fear of breaking it that is the barrier really, but a self-perception of incompetence - much as is involved in shyness. A lack of belief that they have the mental tools to make that exploration, or that their tools aren't as sharp as others' tools. Leading people into sharpening those tools, and believing that they can do this is quite a lot of what teaching programming comes out as. Give people the confidence to have a go, some pointers and feedback, and a few principles for the next round of exploration and pretty soon you have Kolb's learning cycle.

The other part of my answer was "Also good for using libs that are short on principles themselves." Maybe more than slightly flippant. But, occasionally, I use some library or tool, and find myself wondering how it works. An example gets me started. But straying from the one true path leads to failure. Sometimes there's a principle which has been broken or hidden or isn't the principle I was hoping for. Sometimes it's all a bit ad-hoc, a "tool" extracted from a single project that hasn't yet evolved principles and general applicability away from the use-case of the original project. Sometimes there's a principle, but it is unevenly applied.

I'll finish with an example: Java's ImageIO Input and Output streams and the standard Input and Output streams. Getting an Image written to an output stream and passed onto an input stream elsewhere, without specifying writing to a file, really could to be easier. The ImageInputStream and ImageOutputStream don't extend InputStream and OutputStreams, indeed an ImageOutputStream is also an ImageInputStream while the non-Image versions don't relate like this. *NIX is deep enough in my blood that I want pipes to do both input and output and just work, but in this problem the streams are in the same place and Java's PipedStreams need mucking about with threads if there's much data (and plugging a piped pair of streams into an ImageOutputStream without extra threads certainly broke for me). So, here's my example (arrived at by seeing several other examples, relating to my mental model of images and I/O, exploring, adjusting and combining):
  
  ByteArrayOutputStream os = new ByteArrayOutputStream();
  doRender(data, os); 
    /* method arranges the rendering object with the right parameters leading to
       ImageIO.write(theRaster, imageType, outStream);  */
  byte[] imgBytes = os.toByteArray(); // you know where you stand with an array
  ByteArrayInputStream is = new ByteArrayInputStream(imgBytes); // clunk
  docStore.putData(data.getName(), RENDER_MIME_TYPE, is, imgBytes.length);


This works - the rendered file ends up on my docStore (S3 in this case). But there's a part of me that would really rather not see that array. My images are (already) 60-80kB. I'm aware that at some point memory use may prompt me to revisit this (but lets not over-engineer it yet).

So, there you are: an example for you, which I wrote (today as it happens) by learning from a mix of examples and API documents (some of that learning happened years ago, of course). Whether the example is necessary because I misunderstood the principles of the libraries, or because the libraries haven't best applied available principles I'll leave to you to decide!

Friday 10 October 2014

CouchDB Install

I've installed CouchDB a couple of times recently, for development VMs and then adding it to an existing Jenkins CI image now that my project has a test that needs it. This isn't as simple as "yum install..." yet, at least for the systems I've tried - although an Ubuntu package is available. Continuing the tradition of this blog documenting installations, below is a log of the Jenkins slave install, which follows what I did on my CentOS VMs but without the mistakes and dead-ends. This runs on an AWS EC2 t2.small instance, with a configuration including Amazon's 64bit Linux, Ninja Framework and PostgreSQL mostly described in previous posts. So, I assume you have a basic machine up and running.

First some package installs, mostly as advised by the documentation on the other packages to install, but starting with the inevitable update:

sudo yum update
sudo yum install autoconf curl-devel ncurses-devel openssl-devel
sudo yum install gcc gcc-c++ libicu-devel

Erlang is built from scratch: to get a current version and to avoid a crypto fail in some Linux distros (including CentOS which I have on my development VM, which allows erlang and couchdb to compile and install, but fail on starting couchdb). Couchdb is compiled against this later.

curl -O http://www.erlang.org/download/otp_src_17.3.tar.gz 
tar xzf otp_src_17.3.tar.gz
cd otp_src_17.3
export ERL_TOP=`pwd`
./configure
make
// make a cup of tea
sudo make install
cd ..

Next, we need Python, in particular the 2.7.x version, and SpiderMonkey, again not the newest one, but version 1.8.5:

curl -O https://www.python.org/ftp/python/2.7.8/Python-2.7.8.tgz
tar xzf Python-2.7.8.tgz
cd Python-2.7.8
./configure
make
sudo make install
cd ..
curl -O http://ftp.mozilla.org/pub/mozilla.org/js/js185-1.0.0.tar.gz
tar xzf js185-1.0.0.tar.gz
cd js-1.8.5/js/src
./configure
make
// start to drink tea
sudo make install
cd /etc/ld.so.conf.d
sudo vi local-x86_64.conf
and add the entry: /usr/local/lib, so that the js libraries are picked up.
sudo ldconfig

And now to CouchDB. Note that they have mirror sites, the one I used might not be the most appropriate for you.

sudo adduser --system --home /usr/local/var/lib/couchdb --no-create-home --shell /bin/bash --comment "CouchDB Administrator" couchdb
curl -O http://mirror.ox.ac.uk/sites/rsync.apache.org/couchdb/source/1.6.1/apache-couchdb-1.6.1.tar.gz
tar xzf apache-couchdb-1.6.1.tar.gz
cd apache-couchdb-1.6.1
make
// tea is about right now
sudo make install
cd /usr/local/var/run
sudo chown -R couchdb:couchdb couchdb/
sudo chmod 0770 couchdb/
cd ../log/
sudo chmod 0770 couchdb/
sudo chown -R couchdb:couchdb couchdb/
cd ../lib
sudo chown -R couchdb:couchdb couchdb/
sudo chmod 0770 couchdb/
cd ../../etc
sudo chown -R couchdb:couchdb couchdb/
sudo chmod 0770 couchdb/
cd rc.d
sudo cp couchdb /etc/rc.d/init.d
sudo service couchdb start
sudo chkconfig --levels 235 couchdb on

To check it is working, curl http://127.0.0.1:5984/ should respond with something like:

{"couchdb":"Welcome","uuid":"d420a7e9b585cb74039a54f1874b6a0f","version":"1.6.1","vendor":{"name":"The Apache Software Foundation","version":"1.6.1"}}

Finally, tidy up the download and build directories; make a new AMI; and point the Jenkins EC2 plugin's config for the slave at that AMI; and test. For the purposes of a short-lived CI slave whose firewall only talks ssh password protecting access with new admin users is probably overkill. However, for my application there is a database creation script which makes some assumptions about admin users; and some test cases which assume a test database. So, in the Jenkins config for the job pre-steps we have:

curl -sS -X PUT http://localhost:5984/_users/org.couchdb.user:dan -H "Accept: application/json" -H "Content-Type: application/json" -d '{"name": "dan", "password": "mumblemumble", "roles": [], "type": "user"}'
curl -sS -X PUT http://dan:mumblemumble@localhost:5984/test
dbConfig/create-db.couch

And observe the test passing!
For completeness, that first test is below. It doesn't do anything terribly clever: posts a new document on CouchDB and gets it back. However, it re-uses the NinjaProperties fake idea from my previous post about fakes and injection, so completes two loops!
package models;

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import models.PersistCouchDB.IDRev;
import org.junit.AfterClass;
import org.junit.Test;
import static org.junit.Assert.*;
import org.junit.BeforeClass;

/**
 * @author Dan Chalmers
 */
public class PersistCouchDBTest {
    
  public static final String DB_NAME = "test";  // note, this needs to be created on CouchDB in the test environment!

    static PersistCouchDBStub db;
    
    @Test
    public void test_postNewDoc() throws BrokenSystemException {
      HashMap map;
      String randomString;
      IDRev postResult;
      Map getResult;

      map = new HashMap();
      map.put("testing", "1two3");
      randomString = UUID.randomUUID().toString();
      map.put("random", randomString);
      postResult = db.postNewDoc(DB_NAME, map);
      assertNotNull(postResult.id);

      getResult = db.getData(DB_NAME, postResult.id);
      assertEquals("1two3", getResult.get("testing"));
      assertEquals(randomString, getResult.get("random"));
    }

   static class PersistCouchDBStub extends PersistCouchDB {
     /* 
     PersistCouchDB is abstract as it just 
       - reads the relevant Ninja configuration 
       - and contains helper methods to arrange the basic access patterns.
     This works, but doesn't do anything very helpful with real data.
    */
  }

  @BeforeClass
  public static void createDBObject() throws BrokenSystemException {
    FakeNinjaProperties props;

     props = new FakeNinjaProperties();
     props.put("couchdb.readserver", "localhost:5984");
     props.put("couchdb.writeserver", "localhost:5984");
     props.put("couchdb.connection.username", "tripvis");
     db = new PersistCouchDBStub();
     db.mockLoadProps(props);
     db.mockCreateJsonMapper();
   }
}