This is one of the engagements I look back on most fondly, and the reason is that it was a deceptively simple build that ended up doing real scientific work in the world. The project was POOL — the Pedigree of Oat Lines — a database and web tool built in 2008 for Agriculture and Agri-Food Canada, the federal department whose Central Experimental Farm in Ottawa has been the heart of Canadian agricultural research since the 1880s. POOL was for the scientists working on oat genetics: a place to put research photographs and field observations, and a way to find what other researchers around the world had observed about the same thing.
Tags instead of folders
The technical bet was simple, and at the time it was not yet the obvious choice. Earlier research tools in this lineage organised information hierarchically — variety, then year, then trial, then observation — the way a paper filing cabinet would have. That structure is fine if you already know exactly where the thing you are looking for lives. It becomes a wall the moment someone in one country needs to find what a colleague in another country observed about a problem they are both staring at for the first time.
POOL stepped sideways. Researchers uploaded a photograph or a research note, and instead of filing it into a folder tree, they tagged it with the attributes that mattered: the variety, the symptom they were observing, the geography, the year, the trial conditions, whatever discriminating terms the working vocabulary of oat genetics gave them. Behind the scenes the build was unglamorous: PHP, MySQL, tables for tags and tables joining tags to records. It was the kind of architecture a competent developer in 2008 could build in a few weeks, not the kind of formal ontology platform that would have required a research grant of its own. In effect it was what computer science was starting to call an ontology approach to data, built simply enough that the scientists could actually maintain it themselves.
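For readers who want to see the shape of that architecture, here is a minimal sketch of a many-to-many tag schema. It is not POOL's actual code: it uses Python's built-in sqlite3 module as a stand-in for the PHP and MySQL the original was built on, and every table name, column name, helper function, and sample record is hypothetical.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative, not POOL's.
# SQLite (in memory) stands in for the MySQL database the original used.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Join table: one row per (record, tag) pair gives the many-to-many link.
CREATE TABLE record_tags (
    record_id INTEGER NOT NULL REFERENCES records(id),
    tag_id    INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (record_id, tag_id)
);
""")


def add_record(title, tag_names):
    """Store a record and attach each tag, creating tags on first use."""
    record_id = conn.execute(
        "INSERT INTO records (title) VALUES (?)", (title,)
    ).lastrowid
    for name in tag_names:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO record_tags (record_id, tag_id) VALUES (?, ?)",
            (record_id, tag_id),
        )
    return record_id


def records_with_tag(name):
    """Return titles of all records carrying the given tag, oldest first."""
    rows = conn.execute(
        """
        SELECT r.title
        FROM records r
        JOIN record_tags rt ON rt.record_id = r.id
        JOIN tags t ON t.id = rt.tag_id
        WHERE t.name = ?
        ORDER BY r.id
        """,
        (name,),
    ).fetchall()
    return [title for (title,) in rows]


# Made-up sample data, for illustration only.
add_record("Leaf photo, trial plot A", ["rust", "canada", "2008"])
add_record("Field note, trial plot B", ["rust", "japan", "2007"])
add_record("Seed photo", ["canada", "2008"])
```

Calling `records_with_tag("rust")` returns both rust records regardless of country or year, which is the point of the design: a record has no single home in a tree, so any tag can serve as the way in.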
The moment the architecture earned its keep
The clearest example of why the tag-driven structure mattered came when a researcher in Canada uploaded photographs of a rust outbreak they were investigating in an oat trial. Within minutes of the upload, through nothing more clever than the same tags having been applied to other records in the system, the tool surfaced research from a Japanese colleague who had documented something with the same symptom profile months earlier. The Canadian scientist now had context that, in a hierarchical system, would have required either a literature search through journals that had not yet published the work, or the long way around: a phone call to a colleague who happened to know somebody who happened to remember reading something.
Faster cross-discovery meant faster cause-determination, which meant the scientists could move on to the part of the work that actually needed human scientific judgement rather than spending the first week of an outbreak investigation discovering the outbreak was not as novel as it had looked. That is the job of a research tool — to give the scientists back the time the structure of an older tool was quietly taking from them.
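The cross-discovery step in that story can be sketched in a few lines. This is an illustration under assumptions, not a description of how POOL implemented it: the record ids and tags are made up, the function names are hypothetical, and ranking matches by the number of shared tags is my own choice for the example.

```python
from collections import Counter

# Made-up sample data: record id -> set of tags applied by the researcher.
records = {
    "ca-rust-photos": {"rust", "leaf-lesion", "canada", "2008"},
    "jp-rust-notes": {"rust", "leaf-lesion", "japan", "2007"},
    "ca-seed-photos": {"canada", "seed"},
    "uk-lodging-note": {"lodging", "uk"},
}


def build_index(records):
    """Build an inverted index: tag -> set of record ids carrying it."""
    index = {}
    for record_id, tags in records.items():
        for tag in tags:
            index.setdefault(tag, set()).add(record_id)
    return index


def related(record_id, records, index):
    """Rank other records by how many tags they share with record_id."""
    shared = Counter()
    for tag in records[record_id]:
        for other in index[tag]:
            if other != record_id:
                shared[other] += 1
    # Most shared tags first; ties broken by id so the order is stable.
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


index = build_index(records)
matches = related("ca-rust-photos", records, index)
# matches == [("jp-rust-notes", 2), ("ca-seed-photos", 1)]
```

The Japanese notes come out on top because they share the symptom tags, even though nothing about country, year, or trial connects the two records. A record with no tags in common, like the lodging note here, never appears at all.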
Where this pattern transfers
This build was an early instance of a principle that has had a very long second life. Any system whose primary purpose is to help a user answer “what is this related to” rather than “where does this live” benefits from the same shape: medical-image research databases where a clinician needs to find prior cases that share a presentation rather than prior cases filed under the same patient; field-observation archives in conservation biology and entomology; industrial-failure case banks where the question is “where else have we seen this symptom,” not “which plant did this report come from”; cross-jurisdiction public-health systems. In short, anything where the value of the archive is in the connections between records, and the cost of the wrong structure is researcher time that nobody is ever going to get back.
- The build: Tag-driven research database and web tool for oat-genetics photography and field observations; many-to-many tag-to-record architecture; cross-discovery search
- Stack: PHP and MySQL
- Period: 2008
- Client: Agriculture and Agri-Food Canada — Central Experimental Farm, Ottawa
POOL led to an invitation to present the work to an internationally recognised panel of expert reviewers at the Central Experimental Farm in Ottawa — the kind of room a developer rarely gets to stand in, where the audience knows more about the scientific subject matter than the developer ever will and is interested only in whether the tool actually serves the science. The answer that day, on the strength of the cross-discovery work the researchers were already doing with it, was that it did.