What’s under the bonnet in CanalPlanAC: Part 2 – data formats

Over the years CanalPlanAC has accumulated a host of data formats for various purposes. At least two – a custom C-like syntax (surprisingly similar to another I’ve adopted!) and “quick load” place and link files – are entirely obsolete, but support for them still lurks in the interpreter.

Here are the main ones that anyone playing with the system might come across:

  • SQL – since summer 2008 the main CanalPlanAC database, the user records, the information about photographs, some of the program flow control and the state information for a session have all been stored in SQL databases. I’m using SQLite, which is small and can be integrated directly into the interpreter, making setup very easy.
  • XML – until summer 2008 XML was used for the master data file, from which quick-read versions were generated as a sort of unstructured text file. Those unstructured files are now obsolete too (although some support may still exist in route.c until I rip it out). XML is still used, but much more round the edges these days. In particular, it’s used for:
    • Sending data to some JavaScript utilities
    • Storing the configuration file /cgi-bin/config.xml, which tells the program where all the data should be stored and similar things
    • Storing the default user options in a file, and each user’s options (as a blob) in the user SQL database
    • Defining options for gazetteer plugins (to be described some other time!)
  • JSON – not used statically, but used for communication between processes and between pieces of code (particularly where JavaScript is involved)
  • HTML – static pages are written in HTML
  • CST – ClearSilver Templates. Since summer 2007 pages have been generated from template files. These are an extended form of HTML with ClearSilver directives built in – see the website for more.
  • HDF – this is the configuration file language for ClearSilver. I didn’t have to use it (I’ve integrated ClearSilver so tightly that I can easily pass it any format I like) but it seemed sensible. The most important HDF file is /templates/default/config.hdf, which controls all the menus and buttons on all the pages.
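To make the “user options as an XML blob in the user SQL database” idea a little more concrete, here is a minimal sketch in Python. It is not how CanalPlanAC does it (that lives in the C interpreter and its scripting language); the table name, column names, XML layout and option names are all made up for illustration. It only shows the general pattern: defaults kept separately, per-user overrides serialised to XML and stored in a single blob column.

```python
import sqlite3
import xml.etree.ElementTree as ET


def options_to_xml(options):
    """Serialise a flat dict of options to XML bytes (hypothetical layout)."""
    root = ET.Element("options")
    for name, value in sorted(options.items()):
        ET.SubElement(root, "option", name=name).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def xml_to_options(blob):
    """Parse the XML bytes produced by options_to_xml back into a dict."""
    root = ET.fromstring(blob)
    return {el.get("name"): el.text or "" for el in root.findall("option")}


def save_options(db, user, options):
    """Store one user's options as a single XML blob."""
    db.execute(
        "INSERT OR REPLACE INTO users (name, options) VALUES (?, ?)",
        (user, options_to_xml(options)),
    )


def load_options(db, user, defaults):
    """Start from the defaults, then apply whatever the user has stored."""
    row = db.execute(
        "SELECT options FROM users WHERE name = ?", (user,)
    ).fetchone()
    merged = dict(defaults)
    if row is not None:
        merged.update(xml_to_options(row[0]))
    return merged


# Hypothetical schema and option names, purely for the example.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT PRIMARY KEY, options BLOB)")
defaults = {"units": "miles", "speed": "3"}

save_options(db, "alice", {"units": "km"})
print(load_options(db, "alice", defaults))  # {'units': 'km', 'speed': '3'}
print(load_options(db, "bob", defaults))    # {'units': 'miles', 'speed': '3'}
```

The attraction of the blob approach is that adding a new option needs no change to the database schema: a user who has never set it simply picks up the default.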

So there you are. A pile of formats, but many of them widely known, and many of them only used in a few places. In the next few articles I’ll show how to add a sophisticated new feature (the Virtual Cruise) to CanalPlanAC and you’ll see how these various languages and formats are used.

What’s under the bonnet in CanalPlanAC: Part 1 – programming languages

OK, at the moment no-one is developing the program apart from myself. But it struck me that if I documented a few things as if people were, then I’d have a lot of stuff ready if ever I get a co-developer or even hand on the project.

The Canal Planner situation has, I think, shown us the danger of a project like this being all in the mind of one individual.

So what I’ll be doing from time to time is documenting how I did particular things, in a sort of “tutorial” way. But before I do that, I thought I’d start with an apology of sorts for the number of languages (big and small) used in CanalPlanAC. I thought it might be fun to document them all – but don’t let this put anyone off: you don’t need all of them at any one time.

  • C – the interpreter that underlies CanalPlanAC is written in C. This can remain unchanged for months, until I discover a bug or find the need to add a feature right into the language. What tends to happen is that I write something in the scripting language but find it too slow or too cumbersome, so add it as a language feature instead. There is – as yet – no modular extensibility to the C: there are about 35 thousand lines of C in 87 source files that I’ve written, and a chunk more in various pieces of “foreign” code that are linked in.
  • CanalPlanAC scripting language. I have no name for this language, which can be run interactively as /cgi-bin/canal/ or given a file to execute as /cgi-bin/canal file.can. This is a BASIC-like language whose syntax deserves another dozen items of its own. It was vaguely inspired by SuperBASIC, which could give you a few pointers to some of the syntax. Key things about it are:
    • An intermediate typing system where all values are one of a number of types, and get coerced from one to the other when assigned.
    • Support for associative arrays (called “lookup” tables) and lists
    • Inline operators: this is a mixed blessing – a line like «if a contains “hello” then …» is easy to read, but precedence sometimes goes wrong. So «print ‘this is me and you’ after ‘me’ contains ‘you’» produces the whole string, because it parses as: evaluate «‘me’ contains ‘you’», which is 0, then return the original string after character 0.
    • Built-in comprehension of a waterways network with the ability to plan routes around it and travel along them.
    • Built-in read/write of SQL, JSON and XML
  • There are a few small bash scripts, mainly to kick off CanalPlan programs.
  • There is the whole automake/autoconf nest of snakes in /source/ but that shouldn’t need touching.
  • Many web pages use JavaScript (both home-written and some appropriately licensed utilities) to do place name lookup, display maps etc.
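The inline-operator precedence trap mentioned above is easier to see with the grouping made explicit. The following Python sketch is only a toy model: the two operator functions and both evaluators are my own stand-ins, not the interpreter’s real parser, and the behaviour of “after” when the marker is missing is an assumption. It simply reproduces the outcome described for «‘this is me and you’ after ‘me’ contains ‘you’» and compares it with the grouping a reader would probably expect.

```python
def contains(haystack, needle):
    """Toy 'contains': 1 if needle occurs in haystack, otherwise 0."""
    return 1 if str(needle) in str(haystack) else 0


def after(text, marker):
    """Toy 'after': a number means 'after character N',
    a string means 'after the first occurrence of this text'."""
    text = str(text)
    if isinstance(marker, int):
        return text[marker:]
    _head, found, tail = text.partition(str(marker))
    return tail if found else ""  # assumption: empty string when not found


OPERATORS = {"after": after, "contains": contains}


def eval_right_grouped(tokens):
    """Evaluate 'a op1 b op2 c' as a op1 (b op2 c)."""
    if len(tokens) == 1:
        return tokens[0]
    left, op, rest = tokens[0], tokens[1], tokens[2:]
    return OPERATORS[op](left, eval_right_grouped(rest))


def eval_left_grouped(tokens):
    """Evaluate 'a op1 b op2 c' as (a op1 b) op2 c."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPERATORS[tokens[i]](result, tokens[i + 1])
    return result


# Operands sit at even positions, operator names at odd positions.
expression = ["this is me and you", "after", "me", "contains", "you"]

# 'me' contains 'you' is 0, so we get the string after character 0.
print(repr(eval_right_grouped(expression)))  # 'this is me and you'

# (' and you') contains 'you' is what was probably meant.
print(eval_left_grouped(expression))  # 1
```

Either grouping is defensible on its own; the trouble is that a line which reads like English gives no visual clue as to which one applies.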

Next – the data formats that these programs use to store data, communicate among themselves and talk to the user.
