cfvers introduction

Iustin Pop

$Id: manual.xml 71 2003-10-28 21:56:30Z iusty $

This document explains the concepts and usage of cfvers, a system tool designed to help with versioning the configuration files on a system.


Table of Contents
1. About this document
2. Introduction
3. Quick start
4. Concepts
4.1. Repository syntax
4.2. Object definitions
4.2.1. Area objects
4.2.2. Item objects
4.2.3. Area revision objects
4.2.4. Revision entries
5. Limitations
5.1. POSIX VFS layer limitations
5.2. Repository limitations

1. About this document

This is the user manual for the cfvers project; the homepage is at http://www.nongnu.org/bakonf/. You can also get new versions of this document there.

Revision: $Id: manual.xml 71 2003-10-28 21:56:30Z iusty $


2. Introduction

Making backups is an important aspect of system administration. Backup techniques are explained in any good document about system administration, and they won't be explained here again.

However, text configuration files are better suited to versioning systems than to full/incremental backups, which are targeted at binary files and miscellaneous data. Unfortunately, versioning systems are not very good at working directly on a live system: the main reasons are that they create extra files, cannot cope with special files, and do not keep permissions intact.

In the working model of the classic versioning systems there is a central repository (sometimes more than one), which is very precious, and a multitude of developers' workspaces, which hold semi-important data; by this I mean it's OK to delete or otherwise break a developer's workspace when no changes have been made to it, since all state can be restored from the central repository.

In contrast, a versioning system designed for system configuration has its priorities almost reversed: the critical data lives on the filesystem, and the repository is secondary to that. This means that such software should obey the following rules:

cfvers has been designed with these objectives in mind[1].


3. Quick start

How to create your first repository

  1. decide on which back-end to use (either sqlite or postgresql for now), and configure it in /etc/cfvers.conf, like this:

    
        [repositories]
        # For sqlite:
        ;default=sqlite:/var/lib/cfvers/repo.sqlite
        # For postgresql:
        ;default=postgres::cfvers
        # This is the default:
        ;default-area=default
    	  

  2. run cfvadmin --init in order to create the initial repository.

  3. run cfv store ITEMS... in order to register (and store the first version of) the items you want versioned.

  4. after every change to the system's configuration, rerun the cfv store command in order to update the versioned items. New items you want stored must be given in a separate call.

  5. schedule a cron job to watch for differences or do automatic commits.
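
As an illustration of the configuration format from step 1, here is a minimal sketch that reads the [repositories] section with Python's standard configparser module. The helper name load_repository and the sample text are my own assumptions for the example; cfvers itself may parse its configuration differently.

```python
import configparser

# Sample configuration text modelled on the fragment in step 1, with the
# sqlite line uncommented. In real use the file would be /etc/cfvers.conf.
SAMPLE = """\
[repositories]
# For sqlite:
default=sqlite:/var/lib/cfvers/repo.sqlite
# This is the default:
;default-area=default
"""


def load_repository(text, name="default"):
    """Return the repository string stored under `name`, or None.

    Hypothetical helper, not part of cfvers.
    """
    # Only '=' separates keys from values, so the colon inside the
    # repository string is left alone; '#' and ';' both start comments.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    return parser.get("repositories", name, fallback=None)


repo = load_repository(SAMPLE)
print(repo)
```

Lines starting with ';' are treated as comments here, so a commented-out entry such as default-area simply reads back as missing.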


4. Concepts

I tried to keep cfvers as simple as possible. It's implemented in Python, and uses an SQL repository (encapsulated of course in a class and easily replaced if someone is so inclined).


4.1. Repository syntax

The repository syntax is used whenever you need to specify a repository to cfvers: in configuration files and with the -d option to cfv and cfvadmin.

Generally speaking, the string specifying the repository is composed of two parts:

  • the backend driver, specifying the database used to store the repository

  • connection-specific information

These are joined together using a colon, like this: backend:conn_info.
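
A minimal sketch of how such a string could be taken apart, assuming the split happens at the first colon only (so that the connection information may itself contain colons). The function name split_repository is hypothetical and not taken from cfvers.

```python
def split_repository(spec):
    """Split 'backend:conn_info' at the first colon.

    Hypothetical helper; raises ValueError when no backend is given.
    """
    backend, sep, conn_info = spec.partition(":")
    if not sep or not backend:
        raise ValueError("expected backend:conn_info, got %r" % (spec,))
    return backend, conn_info


print(split_repository("sqlite:/var/lib/cfvers/repo.sqlite"))
# ('sqlite', '/var/lib/cfvers/repo.sqlite')
print(split_repository("postgres::cfvers"))
# ('postgres', ':cfvers')
```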


4.2. Object definitions

The following objects are defined:

area

Each repository consists of one or several 'areas', each area rooted at a specific point in the filesystem. Usually you'll have one area, rooted at "/".

items

Each object to be versioned is defined by an 'item'. Right now, only files are supported, but more types are planned.

area revision

Each new revision to an area has several parameters; they together form an area revision.

revision entry

The data of each revision of an item is held in a revision entry. These revision entries are linked to an area revision.


4.2.1. Area objects

The 'area' concept was introduced to make it easier to keep different sets of configuration files in one repository. In later versions, migration of config files from one area to another could be a possibility.

The properties of an area are:

  • name - text; the name of the area and primary key;

  • root - text; where in the filesystem the area is rooted; all operations will be relative to this (as if in a chroot)

  • ctime - timestamp; the creation date for this area;

  • description - text; free-flow description of this area;
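
To make the "as if in a chroot" behaviour of the root property concrete, here is a small sketch of an area object. The class layout and the to_filesystem method are assumptions made for illustration; only the four property names come from the list above.

```python
import posixpath
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Area:
    """Illustrative model of an area; not the cfvers class."""
    name: str
    root: str
    ctime: datetime
    description: str = ""

    def to_filesystem(self, item_name):
        """Map a name inside the area to a path on the real filesystem."""
        # Strip leading slashes so the name is always taken relative to root.
        return posixpath.join(self.root, item_name.lstrip("/"))


default = Area("default", "/", datetime.now())
jail = Area("jail", "/srv/chroot", datetime.now(), "files of a chrooted service")

print(default.to_filesystem("/etc/fstab"))  # /etc/fstab
print(jail.to_filesystem("/etc/fstab"))     # /srv/chroot/etc/fstab
```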


4.2.2. Item objects

Each versioned item is represented by this object. In the SQL repository, it is represented by a row in the 'items' table.

The basic properties of an item are:

  • id - integer; represents the primary key

  • area - text; represents the area this item belongs to

  • name - text; the name of the item; until we implement renames (and deletions) this is equal to the name in all of the item's revision entries.

  • ctime - timestamp; the creation time of the item; useful for knowing when it has entered the repository
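
A sketch of what the 'items' table could look like in an SQLite back-end, using an in-memory database. The column types, the use of epoch seconds for ctime, and the add_item helper are assumptions; the real cfvers schema may differ.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Assumed schema: column names follow the property list, types are guesses.
conn.execute(
    """
    CREATE TABLE items (
        id    INTEGER PRIMARY KEY,
        area  TEXT NOT NULL,
        name  TEXT NOT NULL,
        ctime INTEGER NOT NULL
    )
    """
)


def add_item(conn, area, name):
    """Register a new item and return its id (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO items (area, name, ctime) VALUES (?, ?, ?)",
        (area, name, int(time.time())),
    )
    return cur.lastrowid


item_id = add_item(conn, "default", "/etc/fstab")
row = conn.execute(
    "SELECT area, name FROM items WHERE id = ?", (item_id,)
).fetchone()
print(item_id, row)
```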


4.2.3. Area revision objects

The parameters common to all item revisions that share a revision number are gathered in the area revision object. These include attributes like the log message, the timestamp of the revision, committer information, etc.

Table 1. Area revision attributes

Name      Type       Description
--------  ---------  -----------
area      text       the name (primary key) of the parent area
revno     integer    the ID of the area revision
logmsg    text       the log message for this revision
ctime     timestamp  the creation date of this revision
uid, gid  integer    the uid, gid of the creation process
commiter  text       free-form description of commit type; can be used to differentiate between manual and automatic commits
server    text       the hostname of the server on which the commit was made
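
The attributes in Table 1 can all be collected from the running process. The following is a minimal sketch under the assumption that ctime is stored as epoch seconds; the function name new_area_revision and the dictionary representation are mine, not cfvers's.

```python
import os
import socket
import time


def new_area_revision(area, revno, logmsg, commiter="manual"):
    """Gather the Table 1 attributes for a new area revision.

    Hypothetical helper; the key names follow the table.
    """
    return {
        "area": area,
        "revno": revno,
        "logmsg": logmsg,
        "ctime": int(time.time()),
        "uid": os.getuid(),   # POSIX only
        "gid": os.getgid(),   # POSIX only
        "commiter": commiter,
        "server": socket.gethostname(),
    }


rev = new_area_revision("default", 1, "initial import")
print(rev)
```

A cron job could pass commiter="automatic" here to distinguish its commits from manual ones.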

4.2.4. Revision entries

Each revision of each item is represented in the database using a revision entry. This object is stored in the 'revisions' table.

The metadata is stored in various fields (attributes) of the table (objects). For regular files, the contents of the file are stored in various ways, depending on the contents, in order not to violate the constraints of each backend. ASCII text files are stored as-is, while binary files are encoded (using either base64 or quoted-printable, whichever is shorter).
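
The storage rule above could be sketched as follows. I assume here that "ASCII text" means the bytes decode cleanly as ASCII, and the encoding labels ("none", "base64", "quoted-printable") are made up for the example; cfvers may detect and label things differently.

```python
import base64
import quopri


def encode_contents(data):
    """Return (text, encoding) for a file's raw bytes."""
    try:
        # Pure ASCII is stored as-is.
        return data.decode("ascii"), "none"
    except UnicodeDecodeError:
        pass
    b64 = base64.b64encode(data)
    qp = quopri.encodestring(data)
    # Keep whichever encoded form is shorter.
    if len(qp) < len(b64):
        return qp.decode("ascii"), "quoted-printable"
    return b64.decode("ascii"), "base64"


def decode_contents(text, encoding):
    """Inverse of encode_contents."""
    if encoding == "none":
        return text.encode("ascii")
    if encoding == "base64":
        return base64.b64decode(text)
    return quopri.decodestring(text.encode("ascii"))


print(encode_contents(b"hello\n")[1])
print(encode_contents(b"caf\xe9 au lait\n")[1])
print(encode_contents(bytes(range(256)))[1])
```

Mostly-text files with a few high bytes tend to come out shorter in quoted-printable, while truly binary data is shorter in base64.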

Table 2. Revision entries attributes

Name                 Type     Description
-------------------  -------  -----------
item                 integer  the ID of the item to which this revision belongs
revno                integer  the revision number of this revision entry
filename             text     the filename this entry represents
filetype             integer  the file type of this entry, one of the S_IF* values
filecontents         text     the encoded contents of the file, for file types that have such a thing
sha1sum              text     the SHA1 checksum over the unencoded filecontents
size                 integer  the size of the file
mode                 integer  the st_mode entry in the stat result
mtime, atime, ctime  integer  the modification, access and change time for this inode
inode                integer  the inode number of this file
device               integer  the device on which the inode resides
nlink                integer  number of links to this inode
uid, gid             integer  the UID/GID of the owner/group of this file
rdev                 integer  for device files, their major/minor numbers
blocks, blksize      integer  the number of blocks occupied by this file and the size of the blocks, if the operating system/file system reports these
encoding             text     information about how the filecontents has been encoded
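
Most of these attributes map directly onto the result of a stat call. Below is a sketch of how a revision entry's metadata might be gathered; the function name stat_entry, the choice of lstat (so that symlinks are recorded rather than followed), and the integer truncation of timestamps are assumptions.

```python
import hashlib
import os
import stat
import tempfile


def stat_entry(path):
    """Collect Table 2 style metadata for one path (hypothetical helper)."""
    st = os.lstat(path)
    entry = {
        "filename": path,
        "filetype": stat.S_IFMT(st.st_mode),  # e.g. stat.S_IFREG
        "size": st.st_size,
        "mode": st.st_mode,
        "mtime": int(st.st_mtime),
        "atime": int(st.st_atime),
        "ctime": int(st.st_ctime),
        "inode": st.st_ino,
        "device": st.st_dev,
        "nlink": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
        # Not every platform reports these three fields.
        "rdev": getattr(st, "st_rdev", None),
        "blocks": getattr(st, "st_blocks", None),
        "blksize": getattr(st, "st_blksize", None),
        "sha1sum": None,
    }
    if stat.S_ISREG(st.st_mode):
        with open(path, "rb") as fh:
            entry["sha1sum"] = hashlib.sha1(fh.read()).hexdigest()
    return entry


with tempfile.NamedTemporaryFile(delete=False) as fh:
    fh.write(b"hello\n")
    sample_path = fh.name
try:
    entry = stat_entry(sample_path)
finally:
    os.unlink(sample_path)
print(entry["size"], entry["sha1sum"])
```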

5. Limitations

This section should be very big. It's small because I didn't have time to fill it, not because cfvers is complete :-)


5.1. POSIX VFS layer limitations

These are limitations or design decisions inherent to the POSIX specification or the GNU/Linux implementation. While developing cfvers, I found:

  • You can't change the ctime of an inode. This is by design in the POSIX filesystem layer: the ctime records metadata modifications, while the mtime/atime pair records data write/read accesses. Since the ctime is itself part of the metadata, setting it would count as a metadata modification and update the ctime again, making the change useless :). A separate timestamp for metadata reads would be inappropriate, I think, because such reads happen very frequently.

  • utimes(2) and chmod(2) act on the target of a symlink (when given an argument which is a symlink). I can't think why anyone would want this: you could always expand the symlink yourself using readlink, but right now you can't act on the symlink itself!
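
The ctime limitation is easy to observe. The following sketch sets a file's atime and mtime to a fixed point in the past and then checks which timestamps actually took that value; the helper name set_times and the chosen timestamp are arbitrary, and the behaviour described is what I would expect on a typical GNU/Linux filesystem.

```python
import os
import tempfile

OLD = 1_000_000_000  # an arbitrary moment in September 2001


def set_times(path, when):
    """Set atime and mtime to `when` and return the fresh stat result."""
    os.utime(path, (when, when))
    return os.stat(path)


with tempfile.NamedTemporaryFile(delete=False) as fh:
    path = fh.name
try:
    result = set_times(path, OLD)
    mtime_restored = int(result.st_mtime) == OLD
    ctime_restored = int(result.st_ctime) == OLD
finally:
    os.unlink(path)

print("mtime restored:", mtime_restored)
print("ctime restored:", ctime_restored)
```

The mtime can be put back, but the ctime stays at the moment of the utime call, so a restored file can never carry its original ctime.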


5.2. Repository limitations

  • SQLite: file size is limited to 1 MiB. This should not be of great concern: the file is compressed with bzip2 first, and the commit is aborted only if the compressed and base64-encoded size still exceeds the limit. Besides, cfvers is aimed at small configuration files, but still...
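
A sketch of the size check described above, assuming the limit applies to the length of the bzip2-compressed, base64-encoded text. The names MAX_STORED_SIZE and pack_for_sqlite are invented for the example and the real cfvers code may be organised differently.

```python
import base64
import bz2
import os

MAX_STORED_SIZE = 1024 * 1024  # 1 MiB, the limit quoted above


def pack_for_sqlite(data):
    """Compress and encode `data`, refusing results over the limit."""
    packed = base64.b64encode(bz2.compress(data))
    if len(packed) > MAX_STORED_SIZE:
        raise ValueError(
            "encoded size %d exceeds %d bytes" % (len(packed), MAX_STORED_SIZE)
        )
    return packed.decode("ascii")


# Highly compressible data fits easily, even when it starts out large.
zeros_ok = len(pack_for_sqlite(b"\0" * (2 * 1024 * 1024))) < MAX_STORED_SIZE

# Random data does not compress, so 2 MiB of it is rejected.
try:
    pack_for_sqlite(os.urandom(2 * 1024 * 1024))
    random_rejected = False
except ValueError:
    random_rejected = True

print(zeros_ok, random_rejected)
```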

Notes

[1]

However, nobody said it attained these goals - after all, it's software!