Release 2.0 - 22 November 2010
This document is a guide to the use of the original Ganymede 1.0 GanymedeBuilderTask-based synchronization system.
Most users new to Ganymede will want to avoid this style of implementing synchronization in favor of the newer and more simple XML synchronization system implemented in Ganymede 2.0.
Some users may be running legacy synchronization code from a Ganymede 1.0 installation, however, and others may need the flexibility to run custom logic to assemble their synchronization data.
This guide is for them.
This document should be read after reading the Sync Channel Synchronization Guide which, in addition to describing the new XML sync system, also provides a good conceptual overview of the ways in which synchronization can be carried out.
The original Ganymede synchronization model is based around full state synchronization, and was designed in conjunction with the Ganymede transactional model to support NIS and DNS. Both of those directory services require all related changes to be made all at once before rebuilding their state ("pushing the maps" in NIS terminology, "rebuilding the BIND zone files" in DNS terms).
Ganymede's approach to supporting these services is to implement a two phase building approach, using custom-written subclasses of the arlut.csd.ganymede.server.GanymedeBuilderTask class. When a transaction is committed in Ganymede, the arlut.csd.ganymede.server.GanymedeScheduler class schedules a set of builder tasks for execution as soon as possible. When a builder task runs, it locks the Ganymede database to prevent any new transactions from committing and it executes builderPhase1(). The builderPhase1() method scans the Ganymede database for whatever objects are relevant to the builder task's synchronization duties and writes that data out to disk, using whatever data format is convenient for the purposes of that synchronization channel. When the builder task has completed its scan of the Ganymede database and all of its data has been written out, the builderPhase1() method returns, and the Ganymede database is unlocked so that any pending transactions can proceed to commit.
The builder tasks then runs builderPhase2(). builderPhase2() is designed to execute an external program or script which reads the data written out by builderPhase1() and does whatever is necessary to integrate that data into the target directory services. During the time that builderPhase2() is running, users are free to commit changes into the Ganymede database, but the Ganymede Scheduler is blocked from re-running the same builder task again. This prevents any possibility of the builder task synchronization overlapping itself and overwriting data that a previous external build process may still be working with.
Because the Ganymede Scheduler waits until a builder task has completed running before it relaunches the task, transactions are effectively transmitted in bulk to the target service. If it takes 5 minutes for the external script run by builderPhase2() to complete, potentially hundreds or thousands of changes can be 'saved up' for the builder task to handle when it finishes with the first synchronization. In this way, the Ganymede server can allow users to commit changes at the fastest rate possible, while simultaneously synchronizing data to the target services at the fastest rate the synchronization process will allow.
There are two difficulties with this synchronization model. The first is that by the time a builder task is executed by the Ganymede Scheduler, several minutes might have passed, during which time the Ganymede server will have completely forgotten about what the Ganymede server's data looked like before the transactions were committed. Because of this, the GanymedeBuilderTask model is completely incapable of doing delta-style builds. The best that it can do is to write out everything it knows to be true at the time the builderPhase1() method is run, and to depend on the external build process run by builderPhase2() to do any necessary before-and-after comparisons between the synchronization and the previous target state.
The second difficulty with the GanymedeBuilderTask model is that it depends on a lot of custom code being written in Java and compiled into the Ganymede server. In order to create a custom GanymedeBuilderTask, the adopter must create his own subclass of GanymedeBuilderTask and write all the logic for builderPhase1() and builderPhase2(). Further, if the code in either of these methods needs to be changed, either to fix a bug or to respond to a change made in the Ganymede server's schema, the server would have to be stopped and restarted after compiling and loading the new version of the builder task's code.
Right now, we need to discuss how to actually go about constructing a GanymedeBuilderTask subclass to handle a full state build.
The GanymedeBuilderTask class is responsible for coordinating all of the activities involved in scheduling and executing full state builds. As an adopter and customizer of Ganymede, all you are responsible for is creating a subclass of GanymedeBuilderTask that defines the builderPhase1() and builderPhase2() methods.
The steps involved in creating a GanymedeBuilderTask
subclass and linking it into the server are very similar to those
involved in creating a
DBEditObject subclass. You have to create a .java
file that contains your subclass and
place it under the server's schema/custom_src
directory. You then
compile your code with the build
script. After you are sure that your code compiles successfully,
you must shut down the server, rebuild the custom.jar file with the
buildCustomJar
script, and install
the rebuilt custom.jar
into the appropriate location. You can then start up your server
and your new GanymedeBuilderTask
subclass should be available for use by the Ganymede server.
Available, but not yet configured for actual use.
Once you have updated your server's custom.jar file with your
new builder task, you have to register your new builder task with
the server. You do this by creating a new Task
object in the Ganymede server. The Task
Object type is fully described in the Ganymede Server
Overview.
For a builder task, there are four fields you need to worry about:
Task Name
-- The name of the Task
as seen in the admin console, must be unique in the Ganymede
server.Task Class
-- The fully qualified
class name for your custom GanymedeBuilderTask
subclass, as defined in your custom.jar file.Run On Transaction Commit
-- A
check-box, should be checked to indicate that this task is to be
scheduled for automatic execution.Option Strings
-- A Vector String
field, used to provide optional run-time configuration
information to builder tasks that are programmed to look for
it.
Option Strings
field should all be in the form: key =
value
.
Below is an example of a builder task that has been registered in the Ganymede server. It is configured to be scheduled for execution after transactions are committed, and it has a name and a class.
As soon as the admin hits the Commit
button at the bottom right, the task will be registered in the
server. From this commit on, the task will be scheduled for
execution whenever a transaction is committed.
Now, not every transaction or sequence of transactions committed will necessarily be of interest to any given builder task. To help you deal with this, GanymedeBuilderTask provides a method called baseChanged(), which allows you to test to see whether any changes have been made to a given object type since the last time your builder task was run. You'll see how this works in our example builder task in the next section.
What follows is a slightly simplified version of the UNIXBuilderTask class from the Ganymede userKit. It will illustrate a lot of the principles of writing a full-state builder task with the GanymedeBuilderTask class.
package arlut.csd.ganymede.userKit; import arlut.csd.ganymede.server.*; import arlut.csd.ganymede.common.*; import arlut.csd.Util.FileOps; import java.util.*; import java.text.*; import java.io.*;
First we have our beginning declarations. Generally speaking,
you'll create a separate package for your custom schema classes, and
you'll need to import everything from the
arlut.csd.ganymede.server
and
arlut.csd.ganymede.common
packages.
/** * This class is intended to dump the Ganymede datastore to the * UNIX passwd and group files. */ public class UNIXBuilderTask extends GanymedeBuilderTask { private String buildScript = null; private String passwdFile = null; private String groupFile = null; // --- private boolean backedup = false; /* -- */ /** * <p>The constructor for GanymedeBuilderTask subclasses takes an * Invid for the DBObject in the Ganymede data store that defines * the task. The taskDefObjInvid member variable is comes from the * GanymedeBuilderTask that we are inheriting from. Setting it * in this constructor allows us to use the getOptionValue() method to read * our configuration strings for this task.</p> */ public UNIXBuilderTask(Invid _taskObjInvid) { // set the taskDefObjInvid in GanymedeBuilderTask so // we can look up option strings taskDefObjInvid = _taskObjInvid; }
And here we see the class declaration subclassing GanymedeBuilderTask, member variable declarations, and our constructor, which must take an Invid parameter, and set the taskDefObjInvid variable from the GanymedeBuilderTask class.
/** * This method runs with a dumpLock obtained for the builder task. * * Code run in builderPhase1() can call enumerateObjects() and * baseChanged(). * * @return true if builderPhase1() made changes necessitating the * execution of builderPhase2(). */ public boolean builderPhase1() { PrintWriter out; boolean result = false; /* -- */ // now see what our current option values are passwdFile = getOptionValue("passwdFile"); if (passwdFile == null) { Ganymede.debug("UNIXBuilderTask: error, no passwdFile specified"); } groupFile = getOptionValue("groupFile"); if (groupFile == null) { Ganymede.debug("UNIXBuilderTask: error, no groupFile specified"); } // passwd if (baseChanged(SchemaConstants.UserBase)) { Ganymede.debug("UNIXBuilderTask: Need to build user map"); if (passwdFile != null) { out = null; try { out = openOutFile(passwdFile, "unix"); } catch (IOException ex) { ex.printStackTrace(); Ganymede.debug("UNIXBuilderTask.builderPhase1(): couldn't open " + passwdFile + ": " + ex); } if (out != null) { Enumeration users = enumerateObjects(SchemaConstants.UserBase); while (users.hasMoreElements()) { DBObject user = (DBObject) users.nextElement(); writeUserLine(user, out); } out.close(); } result = true; } } // group if (baseChanged((short) 257)) { Ganymede.debug("UNIXBuilderTask: Need to build group map"); if (groupFile != null) { out = null; try { out = openOutFile(groupFile, "unix"); } catch (IOException ex) { ex.printStackTrace(); Ganymede.debug("UNIXBuilderTask.builderPhase1(): couldn't open " + groupFile + ": " + ex); } if (out != null) { Enumeration groups = enumerateObjects((short) 257); while (groups.hasMoreElements()) { DBObject group = (DBObject) groups.nextElement(); writeGroupLine(group, out); } out.close(); } result = true; } } return result; }
And here we have our builderPhase1() method. The important things to note here are the use of the getOptionValue() method to look for option strings from our Task object, the use of the baseChanged() method to determine whether any objects of a given type have changed since this builder task was last run, the use of the openOutFile() method for opening files for writing, and the use of the enumerateObjects() method for iterating over objects of a given type. Each of these methods take advantage of logic built in to the GanymedeBuilderTask class, and will help your builder task behave properly in the Ganymede environment.
As you may notice above, baseChanged() and enumerateObjects() both take a numeric parameter (a short) to describe what object type the method is to operate on. This is a common idiom in the internal Ganymede DBStore API.
The overall effect of this implementation of builderPhase1() is to
look to see whether any users or groups, as defined in the userKit
schema, have changed since this builder task was last run, and if so,
to write out passwd and group files suitable for integration into a
simple NIS or /etc/passwd
, /etc/group
type environment.
If no users or groups have changed, this builderPhase1() method will return false, which will tell the GanymedeBuilderTask base class not to bother running builderPhase2().
/** * * This method runs after this task's dumpLock has been * relinquished. This method is intended to be used to finish off a * build process by running (probably external) code that does not * require direct access to the database. * * builderPhase2() is only run if builderPhase1() returns true. * */ public boolean builderPhase2() { File file; /* -- */ buildScript = getOptionValue("buildScript"); if (buildScript == null || buildScript.equals("")) { Ganymede.debug("UNIXBuilderTask: no buildScript defined in task object's option string vector"); Ganymede.debug(" not running any external update script."); return false; } file = new File(buildScript); if (file.exists()) { Ganymede.debug("UNIXBuilderTask builderPhase2 running"); try { FileOps.runProcess(buildScript); } catch (IOException ex) { Ganymede.debug("Couldn't exec buildScript (" + buildScript + ") due to IOException: " + ex); } catch (InterruptedException ex) { Ganymede.debug("Failure during exec of buildScript (" + buildScript + "): " + ex); } } else { Ganymede.debug(buildScript + " doesn't exist, not running external UNIX build script"); } Ganymede.debug("UNIXBuilderTask builderPhase2 completed"); return true; }
builderPhase2(),
by contrast, is much simpler. It uses the runProcess()
method to try to run a builder script that is designated in the Task
object's Option Strings
field. Finding and
running the external script is all that builderPhase2() is responsible
for. The external builder script is responsible for making sure that
everything is set up for it to run. The Ganymede server makes no
guarantees as to the working directory the script will be run in, and
no guarantees about what any environment variables will be set to. If
your external builder script needs to worry about things like that,
you should make sure that your external builder script sets all that
up for itself.
The other thing you need to know when writing your external builder script is that it should block until it finishes with its build. The Ganymede Scheduler creates an independent thread for each builder task that runs, and it is expected that that thread will block until your external builder task has completely finished its work. If your external builder script were to put itself into the background and return early, the Ganymede Scheduler would consider the builder task completely finished, and it would feel free to run the task all over again, which might cause your builderPhase1() method to overwrite files that your external builder script is still working with.
// *** // // The following private methods are used to support the UNIX builder logic. // // *** /** * * This method writes out a line to the passwd UNIX source file. * * The lines in this file look like the following. * * broccol:393T6k3e/9/w2:12003:12010:Jonathan Abbey,S321 CSD,3199,8343915:/home/broccol:/bin/tcsh * * @param object An object from the Ganymede user object base * @param writer The destination for this user line * */ private void writeUserLine(DBObject object, PrintWriter writer) { String username; String cryptedPass = null; int uid; int gid; String name; String room; String officePhone; String homePhone; String directory; String shell; PasswordDBField passField; Vector invids; Invid groupInvid; DBObject group; StringBuffer result = new StringBuffer(); /* -- */ username = (String) object.getFieldValueLocal(SchemaConstants.UserUserName); passField = (PasswordDBField) object.getField(SchemaConstants.UserPassword); if (passField != null) { cryptedPass = passField.getUNIXCryptText(); } else { Ganymede.debug("UNIXBuilderTask.writeUserLine(): null password for user " + username); cryptedPass = "**Nopass**"; } uid = ((Integer) object.getFieldValueLocal(userSchema.UID)).intValue(); // get the gid groupInvid = (Invid) object.getFieldValueLocal(userSchema.HOMEGROUP); // home group if (groupInvid == null) { Ganymede.debug("UNIXBuilderTask.writeUserLine(): null gid for user " + username); gid = -1; } else { group = getObject(groupInvid); if (group == null) { Ganymede.debug("UNIXBuilderTask.writeUserLine(): couldn't get reference to home group"); gid = -1; } else { Integer gidInt = (Integer) group.getFieldValueLocal(groupSchema.GID); if (gidInt == null) { Ganymede.debug("UNIXBuilderTask.writeUserLine(): couldn't get gid value"); gid = -1; } else { gid = gidInt.intValue(); } } } name = (String) object.getFieldValueLocal(userSchema.FULLNAME); room = (String) object.getFieldValueLocal(userSchema.ROOM); officePhone = (String) object.getFieldValueLocal(userSchema.OFFICEPHONE); homePhone = (String) object.getFieldValueLocal(userSchema.HOMEPHONE); directory = (String) object.getFieldValueLocal(userSchema.HOMEDIR); shell = (String) object.getFieldValueLocal(userSchema.LOGINSHELL); // now build our output line result.append(username); result.append(":"); result.append(cryptedPass); result.append(":"); result.append(Integer.toString(uid)); result.append(":"); result.append(Integer.toString(gid)); result.append(":"); result.append(name); if (room != null && !room.equals("")) { result.append(","); result.append(room); } if (officePhone != null && !officePhone.equals("")) { result.append(","); result.append(officePhone); } if (homePhone != null && !homePhone.equals("")) { result.append(","); result.append(homePhone); } result.append(":"); result.append(directory); result.append(":"); result.append(shell); if (result.length() > 1024) { Ganymede.debug("UNIXBuilderTask.writeUserLine(): Warning! user " + username + " overflows the UNIX line length!"); } writer.println(result.toString()); } /** * * This method writes out a line to the group UNIX source file. * * The lines in this file look like the following. * * adgacc:ZzZz:4015:hammp,jgeorge,dd,doodle,dhoss,corbett,monk * * @param object An object from the Ganymede user object base * @param writer The destination for this user line * */ private void writeGroupLine(DBObject object, PrintWriter writer) { String groupname; String pass; int gid; Vector users = new Vector(); PasswordDBField passField; Vector invids; Invid userInvid; String userName; StringBuffer result = new StringBuffer(); /* -- */ groupname = (String) object.getFieldValueLocal(groupSchema.GROUPNAME); // currently in the Ganymede schema, group passwords aren't in passfields. pass = (String) object.getFieldValueLocal(groupSchema.PASSWORD); Integer gidInt = (Integer) object.getFieldValueLocal(groupSchema.GID); if (gidInt == null) { Ganymede.debug("Error, couldn't find gid for group " + groupname); return; } gid = gidInt.intValue(); // we currently don't explicitly record the home group.. just take the first group // that the user is in. invids = object.getFieldValuesLocal(groupSchema.USERS); if (invids == null) { // Ganymede.debug("UNIXBuilder.writeUserLine(): null user list for group " + groupname); } else { for (int i = 0; i < invids.size(); i++) { userInvid = (Invid) invids.elementAt(i); userName = getLabel(userInvid); if (userName != null) { users.addElement(userName); } } } // now build our output line result.append(groupname); result.append(":"); result.append(pass); result.append(":"); result.append(Integer.toString(gid)); result.append(":"); for (int i = 0; i < users.size(); i++) { if (i > 0) { result.append(","); } result.append((String) users.elementAt(i)); } if (result.length() > 1024) { Ganymede.debug("UNIXBuilderTask.writeGroupLine(): Warning! group " + groupname + " overflows the UNIX line length!"); } writer.println(result.toString()); }
All of the rest of the class is dedicated to utility helper methods
that are used to handle chores for the builderPhase1()
method. One of the things you will notice about these methods is
that they have been written to use the Ganymede DBObject
class API to access data from objects in the Ganymede data store. You
will also notice that the calls to getFieldValueLocal()
are using numeric identifiers, in this case defined in the userSchema
and groupSchema
interfaces, to retrieve specific
fields.
In general, whenever you use the GanymedeBuilderTask model for handling synchronization, you will have to write Java code that uses low-level details of the internal Ganymede API and that is aware of the specific structure and details of your schema. You're quite correct if you consider the necessity of coding at this level to be a disadvantage of using the GanymedeBuilderTask approach.
The UNIXBuilderTask class is the piece of the GanymedeBuilderTask sync puzzle that is operating inside the Ganymede server. As you can see above, all the builderPhase2() method does is call out to an external builder script, which must read the files written by builderPhase1() and do something productive with them.
What "something productive" means may vary considerably from installation to installation, and that's up for you to determine. Whatever builder script you create will need to keep the following things in mind:
At ARL:UT, our Ganymede builder script is a simple Bourne shell script that sets environment variables and working conditions before running make on a makefile, redirecting make's output to log files for later examination. By creating appropriate makefile dependencies on the various synchronization files written by Ganymede, the make program can make decisions for us as to what pieces of the build need to be redone. If the Ganymede server didn't write a new group file, we don't bother doing any group updates. Building this sort of intelligence into your external builder script may reduce the time it takes to run the average build for your network.
If you want to reduce your average build time still further, you'll need to move to an incremental model for synchronization.