Version: 0.2
Last modified: September 30, 2003
Author: D. de Leeuw Duarte
This document is subject to change. Please report errors and omissions to the author. Your suggestions and requests regarding this Howto are more than welcome.
This document intends to teach you how to write custom modules for the Strongroom Document Management System. Provided that you meet the prerequisites outlined below, you should be able to create a custom Strongroom module by following the instructions outlined in this document. If you have questions or suggestions regarding this document, please contact the author.
Module development is limited only by your imagination, so the time it takes depends closely on the nature of your plans. If you want to write a single-threaded user module for local shell clients, module development can take as little as a few hours. On the other hand, if you plan on writing a distributed networking fileserver, you might be looking at a few weeks. To give a rough indication: the Strongroom Team (3 people) wrote the first four modules (InetUserServer, ODBCSecurityPlugin, ODBCMetadataPlugin, LocalFilePlugin) as well as the complete Strongroom infrastructure (StrongroomCore and module base classes), a C++ and a Java client library, a Java shell client and all the additional utility classes in slightly over eight weeks (40 hours per week). All in all, writing a module is a very tractable project, even for a single developer.
In this section we will look at the things you need to know to successfully write a Strongroom module. Paragraph 1.2.1 describes the things you should know about the Strongroom system itself. Paragraph 1.2.2 describes the skills necessary to complete this project.
Before you read this document, we recommend that you familiarise yourself with Strongroom's general design. The implementation of a module is a fairly self descriptive process, but it is good to have a general idea of the major subsystems in Strongroom and their functions. Other documents are available that describe Strongroom's design. (FIXME: insert name of that document here)
We assume that you are familiar with the system documentation on your platform (info and/or man pages).
One of Strongroom's major design objectives was to make it easily extensible. The modular architecture and the APIs were all designed to make third party development as easy and clean as possible. Because of this, the development of a Strongroom module should be well within the reach of even the moderately skilled developer. On the other hand, the module concept allows for maximum design flexibility to provide an interesting challenge to the expert as well. In practice, the skill level required is greatly dependent on the complexity of the specific extension you have in mind. Because of its design criteria, Strongroom provides an excellent opportunity to gain experience in Free Software development.
The native language of the Strongroom system is C++. In order to write a module, you should have at least a decent working knowledge of this language. Such a basic knowledge is assumed throughout this text from here on forward.
Strongroom makes use of the GNU Build Tools in order to provide maximum portability. Since writing the proper files for these tools can be rather tricky, we will provide detailed information on how to do this. Therefore, you do not need to be familiar with the ins and outs of Automake, Autoconf, Libtool and Autoheader. A general knowledge of their existence and function will suffice.
A running Strongroom system is made up of a core system and five modules. The core system is addressed throughout this document as 'StrongroomCore' or just 'the Core'. Each of the five modules has a unique responsibility within the system. Therefore, there are actually five different types or categories of modules: the User Module, the File Module, the Metadata Module, the Security Module and the Search Module.
"The User Module is responsible for spawning threads."
" The bio-neural implant User module will be released next month."
Since Strongroom modules are dynamically loaded at runtime as plugins, we will also use the words 'plugin' and 'module' interchangeably in this text.
Before you begin to work on your own module, we will present a bird's eye view of the procedure.
Strongroom modules are completely defined on the outside by abstract C++ base classes. This means that, if you want to write your own module, you will need to extend the appropriate base class and give useful implementations of each of the pure virtual methods described in this base class. This way, the compiler forces you to supply the appropriate functionality to the rest of the Strongroom system, in particular to the StrongroomCore. The actual behaviour of your module with respect to these methods will obviously be more or less unique to your module.
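The mechanism can be illustrated with a minimal sketch. The classes ExamplePlugin and ExampleModule below are hypothetical stand-ins and are not part of the Strongroom headers; they only show how a pure virtual method in a base class obliges a derived module to supply an implementation.

```cpp
#include <string>

// Hypothetical stand-in for a Strongroom module base class.
// The real base classes are declared in the Strongroom source tree.
class ExamplePlugin
{
public:
    virtual ~ExamplePlugin() {}

    // Pure virtual: a derived class that does not implement this
    // method stays abstract and cannot be instantiated.
    virtual bool init() = 0;

    // Ordinary virtual: may be overridden, but has a default.
    virtual std::string name() const { return "unnamed"; }
};

// Hypothetical concrete module.
class ExampleModule : public ExamplePlugin
{
public:
    ExampleModule() : initialised_( false ) {}

    bool init()
    {
        initialised_ = true;
        return true;
    }

    std::string name() const { return "example"; }

    bool initialised() const { return initialised_; }

private:
    bool initialised_;
};
```

If init() were removed from ExampleModule, any attempt to create an ExampleModule object would be rejected at compile time. The rest of the system only ever needs a pointer to the base class.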
In the next section, we will start by obtaining the necessary files. These files include the header files containing the definition of the module base classes, the definition of the Strongroom errors, the various utility classes, etc. You will also become familiar with the structure of Strongroom's source tree, to the extent necessary.
From there, we will look at how the base class for your new Strongroom module should be declared. We will proceed to build a 'skeleton' module with empty methods. We will show how to make this skeleton module compilable and dynamically linkable (don't worry, it is very easy).
Once you have come that far, you can start to give your module a meaningful implementation. To aid you in this process, some additional information is supplied, describing the quirks, the exact responsibilities, the challenges and the caveats of each specific module type.
While in theory you do not nearly need all the files in the Strongroom source tree to successfully compile a module, we recommend that you do get the complete tree. In fact, in the remainder of this document, we will assume that you have the complete source at your disposal. Details on where to get the source can be found on the Strongroom Project homepage (FIXME: Provide link). To write a module, it is not necessary to have full CVS access. A local copy will suffice, unless of course you plan to work on the module in a team.
The 'src' directory in the Strongroom tree contains all the source code, as you might expect. When applicable, we will look at some of its subdirectories, but for now, go to the 'src/modules' subdirectory. The 'modules' directory is where all implementations of modules reside and so will yours. There are subdirectories in 'modules' for each type of module. If you want to create a User Module for example, you will have to make a new subdirectory in 'src/modules/user/' with an appropriate, meaningful name. Go ahead and do so.
The subdirectory you just created is yours. Whatever you do in there is entirely up to you. Later on, you will have to create a few necessary files there, but otherwise, you are free to structure your sources at will. For the remainder of this tutorial, you will only create and alter files in this directory, unless specifically stated otherwise.
Creating a module is as easy as extending a C++ base class. In this step, you will create a new, empty header file and define a class that extends the appropriate base class. Next, you will make the module loadable and provide a set of empty methods. The skeleton module will then be ready to compile.
We will use an example project in the remainder of this section. Our exemplary module will be of the User Module variety. This new module will serve remote users that connect to Strongroom through a special hardware device from wrist watches that run Linux. Our module will be called WristWatchUserModule. We will roughly assume that you are smart enough to substitute the module variety and file names that apply to your situation in these examples.
Step one, as mentioned earlier, is to define your own class. Step zero, however, is to create a file called WristWatchUserModule.h in your directory. To make this file 'safe' to include, we will adopt the standard practice of including the whole file in ifndef/define/endif blocks. After steps zero and one, your file will hopefully look something like this:
#ifndef __WRISTWATCHUSERMODULE_H_
#define __WRISTWATCHUSERMODULE_H_

using namespace std;

class WristWatchUserModule
{
public:
    WristWatchUserModule();
    ~WristWatchUserModule();

private:
};

#endif /*__WRISTWATCHUSERMODULE_H_*/
Next, you will have to identify the appropriate base class. The base classes are all defined in appropriately named header files in the 'src/include' directory. Since our wrist watch server is a User Module, we will extend this class from the UserPlugin base class. (Yes, the base classes are all called XXXPlugin, not XXXModule). This class is, not surprisingly, defined in the 'UserPlugin.h' file in 'src/include'. (If you are writing a security module, you will be extending from SecurityPlugin, defined in SecurityPlugin.h, etc.)
Since we will supply neat makefiles later, we will only have to include the line..
#include <UserPlugin.h>
..at the top of the file that contains your new module class. Don't forget to make your class extend the base class. Change the declaration to:
class WristWatchUserModule : public UserPlugin { ... };
Now that you have correctly declared your module, you will need to declare the necessary methods as well. To find out which methods you must implement, you will need to find out which methods were declared pure virtual in the base class of your module. Further, all module base classes are themselves base classes of the Plugin class. This class also requires you to implement some methods.
For our example, we will open the Plugin.h file in 'src/include/plugin'. You will also need to do this. In this very small base class, we see only one pure virtual method: virtual bool init(). (A pure virtual method is identified by the virtual keyword before, and '= 0;' behind its declaration.) This means that we will have to add it to WristWatchUserModule if it is not implemented in the UserPlugin base class (trust me, it isn't!). Next, we open the UserPlugin.h file in the same directory. Here, we see two pure virtual methods: run() and shutdown(). (attachCore() is virtual too, but not pure virtual.) This means that we will have to add all three methods, init(), run() and shutdown(), to WristWatchUserModule. (Note that the User Module base class has only two pure virtual methods of its own, but other module types can have around 15.) Let's have a look at our class declaration in WristWatchUserModule.h after this:
class WristWatchUserModule : public UserPlugin
{
public:
    //Constructor/Destructor
    WristWatchUserModule();
    ~WristWatchUserModule();

    //Required by Plugin.h:
    bool init();

    //Required by UserPlugin.h:
    CoreError run();
    void shutdown();

private:
};
At this point, you might want to take a look at 'include/error.h'. This file contains the declaration and description of the CoreError type. This file will be included through UserPlugin.h automatically, but it will not hurt if you include it again.
Now that you have declared your module and all of its methods, you can create a .cpp file in your module's directory (or in a subdirectory of it, if you desire). Later on, this file will contain the actual implementation of the module. For now, we will fill it with 'empty' implementations of all the methods and the destructor. (We will handle the constructor afterward.):
#include "WristWatchUserModule.h"

WristWatchUserModule::~WristWatchUserModule()
{
    //To do.
}

bool WristWatchUserModule::init()
{
    //To do.
    return true; //Placeholder so that the skeleton returns a value.
}

CoreError WristWatchUserModule::run()
{
    //To do.
    return CORE_MODULE_ERROR;
}

void WristWatchUserModule::shutdown()
{
    //To do.
}
Next, we will add the constructor. This constructor must call the base class' constructor as well. This gives us a bit of a problem, because the base class requires some kind of Config object to be passed along. This config object is something that Strongroom will pass to your module when it is loaded. It contains the configuration information from your module's config file. This way, an administrator can pass configuration details to your module. To allow for this mechanism to work, and to solve our little problem, we will change the module's main constructor in WristWatchUserModule.h so that it always expects a Config object:
class WristWatchUserModule : public UserPlugin
{
public:
    WristWatchUserModule( const Config &cfg );
    ...
};
Likewise, we will include an empty implementation in WristWatchUserModule.cpp, that calls the base class:
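WristWatchUserModule::WristWatchUserModule( const Config &cfg )
    : UserPlugin( cfg )
{
    //To do.
}
The last step is to make the module loadable. Add a factory function with C linkage to WristWatchUserModule.cpp that returns a new instance of your module:
// Strongroom loadable module support
extern "C"
{
    UserPlugin *initPlugin( Config &config )
    {
        return new WristWatchUserModule( config );
    }
}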
..and that's it! Your module is now correctly declared, compilable and dynamically loadable.
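To see why the factory function needs C linkage, consider the following sketch. The Config and UserPlugin types here are simplified stand-ins and not the real Strongroom declarations. The typedef InitPluginFn is also an assumption: it shows the function pointer type a loader could use after looking up the symbol 'initPlugin' by name (on POSIX systems this is commonly done with dlopen and dlsym; how StrongroomCore does it is not covered here).

```cpp
#include <string>

// Simplified stand-ins; the real Config and UserPlugin classes are
// declared in the Strongroom headers and are more elaborate.
struct Config
{
    std::string xml;
};

class UserPlugin
{
public:
    explicit UserPlugin( const Config &cfg ) : config_( cfg ) {}
    virtual ~UserPlugin() {}
    virtual bool init() = 0;

protected:
    Config config_;
};

class WristWatchUserModule : public UserPlugin
{
public:
    explicit WristWatchUserModule( const Config &cfg ) : UserPlugin( cfg ) {}
    bool init() { return true; }
};

extern "C"
{
    // Hypothetical pointer type a loader would cast the symbol to.
    typedef UserPlugin *( *InitPluginFn )( Config & );

    // C linkage keeps the symbol name unmangled, so it can be found
    // under the plain name "initPlugin" in the compiled library.
    UserPlugin *initPlugin( Config &config )
    {
        return new WristWatchUserModule( config );
    }
}
```

The loader never sees the WristWatchUserModule type. It only calls the factory through the function pointer and works with the returned base class pointer, which is what allows modules to be swapped without recompiling the Core.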
The final step in the creation of your skeleton module is to build the code you have just written. The correct way to do this is to use the GNU Build Tools. This way, your module can be included seamlessly into the complete Strongroom package. This in turn makes compilation and installation a breeze on most end-user systems, including your own. Therefore, it is also recommended to use these tools (as opposed to 'classic' makefiles) even if you do not intend to redistribute your module.
The GNU Build Tools were developed to make the lives of users easier at the cost of making developer life slightly more difficult. Luckily, we already know what files you should create and what to put in them, so we will just present a walkthrough here. If you want to learn more about the GNU Build Tools, please consult their documentation. (FIXME: provide link)
First, you will have to create an empty file named 'Makefile.am' in your module's root directory. In our case the Makefile.am file would go in the 'src/modules/user/wristwatchmodule/' directory. Open this file in your editor. The first step is to specify the locations of all the files you have included in your module. To do this, add a line like this to your 'Makefile.am':
INCLUDES = -I$(top_srcdir)/src/include
This example only includes the main include directory that holds the base class definitions and other general Strongroom includes. If you used files in a subdirectory of your own directory or something else in the source tree, you will need to add extra '-I$(top_srcdir)/somedirectory/somesubdir' clauses on the same line.
Next, you will have to specify the directory where your compiled library module will end up. To do this, just copy the exact line below:
libdir = $(STRONGROOM_PLUGINDIR)
The next step is to specify the name of the library that will contain your module. Just choose a name that is descriptive and appropriate and add the line:
lib_LTLIBRARIES = strongroom_user_wristwatch.la
Next up, you will have to specify a list of all the source (.cpp) files you use in your module (excluding those you did not write yourself.) For instance, if you wrote some kind of utility class named Foo, in addition to your module's .cpp file, you will add:
strongroom_user_wristwatch_la_SOURCES = WristWatchUserModule.cpp Foo.cpp
Next, please add the line:
strongroom_user_wristwatch_la_LDFLAGS = -module -avoid-version
The next step is to list all the libraries you used. For instance, if you used threads, you will want to link to the thread library. To accomplish this, add (for example):
strongroom_user_wristwatch_la_LIBADD = -lstdc++ -lpthread
Now, you need to specify where the configuration file for your module is to be stored, and what its name is (choose a name yourself). You accomplish this by adding these two lines:
moduleconfdir = $(sysconfdir)/strongroom
moduleconf_DATA = user_wristwatch.xml
Read the remainder of this document for further details on the configuration file.
The last thing you need to do in order to actually build your module is to add the name of your newly created subdirectory to the Makefile.am file in the module's parent directory. So, if you made a User Module, open 'src/modules/user/Makefile.am' and add the name of your module's directory to the SUBDIRS list.
As a final note: we have reviewed the process of writing a Makefile.am for Strongroom modules here. Note that you will have to update this file whenever you choose to add more libraries or sources to your module.
Your module is now integrated seamlessly into the entire Strongroom package. Because of this, your module will be automatically built or rebuilt with the package itself, with the commands that are familiar from all GNU software packages. You can simply go to the root directory of the complete source tree and run the commands:
./configure
make
If you have created the correct Makefile.am in your module's directory and updated the Makefile.am of the directory above it, the './configure' command will generate all the necessary makefiles and 'make' will build your module (and the rest of Strongroom) if necessary. For subsequent builds, you can just run 'make' from your module's directory. For more details on how to actually load, build and use your module, see the general Strongroom documentation (FIXME: provide link).
In this section, we will have a detailed look at the different modules. Specifically, we will review the tasks, challenges and caveats that are unique for each module type. After reading this section, you should be able to give your skeleton module a good, safe and meaningful implementation. Although it is only necessary to read the subsection specific to your module, it can be helpful to glance over the other modules as well. This will aid your understanding of the Strongroom system as a whole and your module's part in it.
The StrongroomCore class offers methods for each major user-operation that Strongroom can perform (e.g. create a document, get document, change document, etc.). Obviously, users cannot call methods themselves. This is where the User Module comes in. The User Module's main task is to transform a user command into a StrongroomCore call. This is why the User Module has so few pure virtual methods in its base class. The User Module is the module that initiates most of the actions. The other modules are passive. If you write a User Module, you should look at the StrongroomCore API (described in the documentation and in StrongroomCore.h). This is the API that you will be making available to the user. The User Module base class has a pointer to the StrongroomCore. This pointer is automatically attached to the running StrongroomCore when your User Module is loaded.
The User Module can take its input from a user in many ways. For instance, the User Module can be a shell client running in a single thread on the same machine as Strongroom itself. It can also be a multithreaded networking server with secure connections and XML-RPC support. The end user may be a real user, some client software or a software-based agent. Writing a new User Module is essentially the same as giving Strongroom a new user interface. If you want to connect to Strongroom in a new, unsupported way, write a User Module.
When the User Module does a Core call on the user's behalf, it must supply a user ID number to identify the user. Depending on the rights associated with this ID number, the system will either perform or refuse the requested operation. At this point, it should be clear that users may NEVER supply this user ID number themselves! It is the User Module's responsibility to obtain this number from the Core prior to a Core method call. The correct procedure is outlined here.
First, the User Module should obtain a username from the user. Then, the User Module must use this username to obtain a challenge from the Core, using the supplied Core method. This challenge may then be presented to the user. Finally, the User Module must call the Core's login method with the login name, the challenge and the user's response to this challenge. The Core will then return the user's ID number if login is successful. This ID number can safely be used by the User Module to call Core methods on behalf of that user.
The challenge/response mechanism described allows for very special login procedures. Since the response is a block of binary data, it is possible to support fingerprint identification, for example. The actual mechanism used is dictated by the Security Module. This makes it somewhat necessary to know which Security Module is being used when writing a User Module. However, in general, the Security Module (which actually supplies the challenge and evaluates the response) will give a challenge like "Supply password for user 'foo'".
The File Module's main responsibility is to store raw bitstreams. Essentially, the File Module is Strongroom's permanent storage medium. The File Module has methods for the creation and deletion of files and for the reading and writing of data blocks. The File Module does not concern itself with security issues, ownership or version control. Instead, this module just stores flat files and gives out numbers to identify them.
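The login procedure above can be sketched as follows. Every name in this example (LoginCore, UserChannel, authenticate, ToyCore, ScriptedUser) is hypothetical; the real Core methods are declared in StrongroomCore.h and may have different names and signatures. The point is only the order of the steps: obtain a challenge, present it to the user, and hand the response back to the Core.

```cpp
#include <string>

typedef long UserId;

// Hypothetical slice of the Core API, limited to the login steps.
class LoginCore
{
public:
    virtual ~LoginCore() {}
    virtual std::string getChallenge( const std::string &username ) = 0;
    // Returns true and fills in 'id' only when the response is accepted.
    virtual bool login( const std::string &username,
                        const std::string &challenge,
                        const std::string &response,
                        UserId &id ) = 0;
};

// Hypothetical channel to the end user (terminal, socket, ...).
class UserChannel
{
public:
    virtual ~UserChannel() {}
    virtual std::string ask( const std::string &challenge ) = 0;
};

// The ID comes from the Core, never from the user.
bool authenticate( LoginCore &core, UserChannel &user,
                   const std::string &username, UserId &id )
{
    const std::string challenge = core.getChallenge( username );
    const std::string response = user.ask( challenge );
    return core.login( username, challenge, response, id );
}

// Toy implementations, only to make the sketch self-contained.
class ToyCore : public LoginCore
{
public:
    std::string getChallenge( const std::string &username )
    {
        return "Supply password for user '" + username + "'";
    }
    bool login( const std::string &username, const std::string &challenge,
                const std::string &response, UserId &id )
    {
        if ( challenge != getChallenge( username ) ) return false;
        if ( username != "foo" || response != "secret" ) return false;
        id = 42;
        return true;
    }
};

class ScriptedUser : public UserChannel
{
public:
    explicit ScriptedUser( const std::string &answer ) : answer_( answer ) {}
    std::string ask( const std::string & ) { return answer_; }

private:
    std::string answer_;
};
```

Because authenticate() is the only place where a user ID is produced, a User Module structured this way has no code path in which a client-supplied number could end up in a Core call.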
The File Module definition does not impose any rules as to where and how the files should be stored. This means that you can write a file module that either simply stores files in a local directory or on a remote file server. You can choose to store the files in their original form or in an encrypted form. Yet another interesting and meaningful application would be to store multiple copies of the files in different locations to allow graceful degradation when some of the storage media fail.
The main point of concern when writing a File Module is that of reliability. It is unacceptable if a File Module loses data in the midst of a system crash. On the other hand, it might not be a great problem for your specific application. Therefore, when implementing a File Module, try to think of its intended use and try to ensure that files are actually written to disk when they are closed. Further, remember that the Core may call your module concurrently from different threads. Either ensure thread safe operation or implement a spooling mechanism.
The Metadata Module is responsible for maintaining all information about documents. This information includes both user metadata (the author, title, subject, etc.) and the system level metadata (what files in the File Module are part of this document, what is the MIME-type of those files, etc.). The only exception here is the security metadata (access information) which is stored in the Security Module. All document information is contained in instances of the Document class, which is defined in 'src/include/document.h'. The Metadata Module is responsible for the persistent storage of these objects. Much of the document-specific interaction has already been implemented in the Document class itself. Be sure to examine the Document class before you start building a Metadata Module.
When writing a Metadata Module, there is little functionality to add in terms of policy. The reason for this is that the Document objects already offer maximum flexibility. The Document class allows users to add any kind of metadata to a document. It also keeps track of all the revisions and files belonging to a document. The main challenge when implementing a Metadata Module lies in its mechanism. You can write a Metadata Module that handles requests in a clever way, perhaps using extensive caching for multiple requests of the same document. You can also create a Metadata Module that shares its data with other systems in a distributed environment.
The main issue to look out for when writing a Metadata Module is metadata consistency. Caching is all well and good, but it isn't a very good idea to use caches for anything but reading. When the system crashes, it is imperative that any changes in metadata have actually been written back to permanent storage. Remember that the Metadata Module manages the glue to keep all flat files together. If the metadata gets corrupted, potentially millions of files containing valuable information will be reduced to meaningless rubble! When your module destroys valuable data once, nobody will really care that your module is particularly fast. This is something to keep in mind at all times: consistency comes first, efficiency second!
The Documents in a Strongroom system may be created, deleted, read and modified by users. To perform one of these actions, the user needs to have permission. The main task of the Security Module is to arbitrate in this matter. Unless your Security Module is going to allow everybody or nobody to access all documents, it will have to maintain some data on users and their rights regarding documents. For instance, a Security Module might maintain access lists for each document. Such a list contains the IDs of all the users that have access to the file.
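As an illustration of the thread safety requirement, here is a minimal sketch of a store whose operations are serialised with a mutex. The class LockedFileStore and its methods are hypothetical and do not follow the real File Module base class; the data is kept in memory to keep the example short, so it says nothing about flushing to disk. It uses std::mutex, which requires a C++11 compiler; with an older toolchain a pthread mutex would play the same role.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical store that hands out numbers to identify files.
class LockedFileStore
{
public:
    LockedFileStore() : nextId_( 1 ) {}

    long createFile()
    {
        std::lock_guard<std::mutex> lock( mutex_ );
        const long id = nextId_++;
        files_[id] = std::string();
        return id;
    }

    // Appends a block of data; returns false for an unknown file number.
    bool append( long id, const std::string &block )
    {
        std::lock_guard<std::mutex> lock( mutex_ );
        std::map<long, std::string>::iterator it = files_.find( id );
        if ( it == files_.end() ) return false;
        it->second += block;
        return true;
    }

    bool read( long id, std::string &out ) const
    {
        std::lock_guard<std::mutex> lock( mutex_ );
        std::map<long, std::string>::const_iterator it = files_.find( id );
        if ( it == files_.end() ) return false;
        out = it->second;
        return true;
    }

private:
    mutable std::mutex mutex_; // guards nextId_ and files_
    long nextId_;
    std::map<long, std::string> files_;
};
```

A single lock around every operation is the simplest way to stay correct. A real module might use finer-grained locking or the spooling approach mentioned above if contention becomes a problem.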
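One way to combine caching with consistency is a write-through cache: every change is saved to permanent storage first and only then placed in the cache, while reads may be served from memory. The sketch below shows the idea. MetadataStore, WriteThroughCache and ToyStore are hypothetical names, and metadata is represented as a plain string rather than a Document object.

```cpp
#include <map>
#include <string>

// Hypothetical interface to the permanent storage behind the module.
class MetadataStore
{
public:
    virtual ~MetadataStore() {}
    virtual bool save( long id, const std::string &meta ) = 0;
    virtual bool load( long id, std::string &meta ) = 0;
};

class WriteThroughCache
{
public:
    explicit WriteThroughCache( MetadataStore &store ) : store_( store ) {}

    // Writes go to permanent storage first; the cache is only
    // updated once the store has accepted the change.
    bool put( long id, const std::string &meta )
    {
        if ( !store_.save( id, meta ) ) return false;
        cache_[id] = meta;
        return true;
    }

    // Reads are served from the cache when possible.
    bool get( long id, std::string &meta )
    {
        std::map<long, std::string>::const_iterator it = cache_.find( id );
        if ( it != cache_.end() )
        {
            meta = it->second;
            return true;
        }
        if ( !store_.load( id, meta ) ) return false;
        cache_[id] = meta;
        return true;
    }

private:
    MetadataStore &store_;
    std::map<long, std::string> cache_;
};

// Toy backend that counts calls, only to make the sketch self-contained.
class ToyStore : public MetadataStore
{
public:
    ToyStore() : saves( 0 ), loads( 0 ) {}
    bool save( long id, const std::string &meta )
    {
        ++saves;
        data_[id] = meta;
        return true;
    }
    bool load( long id, std::string &meta )
    {
        ++loads;
        std::map<long, std::string>::const_iterator it = data_.find( id );
        if ( it == data_.end() ) return false;
        meta = it->second;
        return true;
    }
    int saves;
    int loads;

private:
    std::map<long, std::string> data_;
};
```

With this arrangement a crash can at worst lose the cache, never a change that put() reported as successful, provided the backend itself writes durably.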
One unique feature of Strongroom is that it allows the Security Module to filter search results performed by a user. This means that, if a user does not have read access to a file, the Security Module can prevent it from being found by the user altogether. The Security Module needs to implement a method to support this filtering of queries. This method is documented in the API reference.
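The filtering step can be pictured with a small sketch. The names ReadAccessList and filterResults are hypothetical and do not correspond to the method in the API reference; the example assumes a simple model in which read access is recorded per user and document.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

typedef long UserId;
typedef long DocumentId;

// Hypothetical access list: the set of (user, document) pairs
// for which read access has been granted.
class ReadAccessList
{
public:
    void grant( UserId user, DocumentId doc )
    {
        allowed_.insert( std::make_pair( user, doc ) );
    }

    bool mayRead( UserId user, DocumentId doc ) const
    {
        return allowed_.count( std::make_pair( user, doc ) ) != 0;
    }

private:
    std::set< std::pair<UserId, DocumentId> > allowed_;
};

// Keeps only the hits the user may read, preserving their order.
std::vector<DocumentId> filterResults( const ReadAccessList &acl,
                                       UserId user,
                                       const std::vector<DocumentId> &hits )
{
    std::vector<DocumentId> visible;
    for ( std::size_t i = 0; i < hits.size(); ++i )
    {
        if ( acl.mayRead( user, hits[i] ) )
        {
            visible.push_back( hits[i] );
        }
    }
    return visible;
}
```

Documents that are filtered out simply do not appear in the result, so the user cannot tell a restricted document from one that does not exist.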
The fact that Strongroom allows for different security models provides a great challenge to developers. At the moment, no full strength implementation exists. This means that all thinkable security models still need to be implemented: an access control list (ACL) model, a group based model, a capability based model, an unlimited access security model, etc.
Aside from the security model, there is much to do in terms of the security mechanism. Many companies use LDAP directories to store their user information. It is possible to write a Security Module that connects to such an LDAP directory for its user administration. Alternatively, it is possible to extend on the current database-driven Security Module. Again, there are many opportunities here.
Apart from the obvious responsibilities of the Security Module, which should inherently be implemented with care and in a robust manner, there are no 'special' pitfalls.
The Search Module is a module that can be queried by users in order to obtain document identifiers. Essentially, the Search Module's task is to provide some structure and accessibility to the great number of virtually anonymous documents in the system. In conventional DMSes, the search engine is not part of the system. Instead, it behaves more or less like a user with unlimited access that indexes files and answers queries by other users. The search engine is often supplied by a different vendor and its query language is often highly specific to its algorithm.
Because this traditional scheme often leads to security compromises (the users can use the search engine to peek into restricted documents), Strongroom approaches the search engine on the user's behalf. To allow for an unlimited variety of query languages, the queries are passed to the Search Module 'as is'. The Search Module can then interpret the query itself (a built in search engine) or pass the query to a third party search engine (external search engine). Eventually, the search results will have to be returned by the Search Module as a list of Document identifiers. This list of query results is always screened by the Security Module before it is returned to the user. This way, a user will never be able to even find any documents to which he or she has no access. It also means that the search engine can be a custom Strongroom search engine or a third party application with a generic API. The search engine does not need to 'know' about Strongroom.
At the time of writing, there are no implementations of Search Modules. The first and foremost challenge is to create a Search Module that allows full text search of the documents contained in the system. This can be done by connecting Strongroom to an existing engine or by writing a new one from scratch. After this, the next obvious step would be to provide more advanced disclosure schemes, like a category system, a thesaurus, etc.
Search Modules have not yet been implemented. Therefore, there are no known caveats and we are interested in hearing your feedback!
Sometimes it is practical and/or necessary to allow administrators to pass special configuration information to your module. Strongroom has a special XML configuration-file mechanism to support this. To get an idea of how these configuration files can be used, let us have a look at Strongroom's network User Module as an example.
The networking User Module (InetUserServer) has special parameters to control the maximum number of simultaneous connections. Obviously, such limits are not common to all User Modules: some User Modules have no mechanism for multiple connections at all, so an upper limit would be pointless. Instead, the author of the InetUserServer decided that there should be a special value, MAX_CONNECT, present in the configuration file, so the system administrator can supply this value.
How, then, can your module get information out of its config file? The answer is very easy. There is one config file for the Core. It is supplied to the Strongroom system on startup. This config file is itself an XML file containing references to the modules that should be loaded and to the config files that belong to those modules. (For detailed instructions on how to edit these files, see the Strongroom manual (FIXME: provide link))
The entire configuration file for each module is now automatically read on startup and converted to a Config object. This object is what your module receives upon construction. As long as the contents of your config file are valid XML, you can extract whatever information you want from it, including the pure XML string, through the Config object's methods. The config object, like all other objects, is fully documented in the online API reference (FIXME: Provide link). It is specified in 'src/include/config.h' and its documentation can be read directly from this file as well.
A thorough reference on all recommendations can be found in the Strongroom report. This document is available on the Strongroom project site.
The Strongroom system is licensed under the GNU LGPL. This means that the entire system is freely available. It also means that you are allowed to modify, copy, redistribute and even sell Strongroom, if and only if you release this new version under the LGPL as well. You are, however, allowed to write modules for Strongroom that are not released under the LGPL. This way, it would be possible to make a custom Strongroom module and sell it in binary form only.
As you will have guessed, the Strongroom Team prefers Free Software licenses. While you, in turn, are completely free to distribute your module under any license you desire, we would like to take this opportunity to encourage you to use the (L)GPL as well. Please consult the GNU website for more information on Free Software and the (L)GPL licenses.