Some (intentional) changes from Sun's submission are noted in the install guide.

Bugs or issues:

The "full resync" part of the protocol involves the primary side firing off a normal kprop (and going back to servicing requests), and the replica side stopping all the incremental propagation stuff and waiting for the kprop. If the connection from the primary never comes in for some reason, the replica side just blocks forever, and never resumes incremental propagation.

The protocol does not currently pass policy database changes; this was an intentional decision on Sun's part. The policy database is only relevant to the primary KDC, and is usually fairly static (aside from refcount updates), but not propagating it does mean that a replica maintained via iprop can't simply be promoted to a primary in disaster recovery or other cases without doing a full propagation or restoring a database from backups.

Shawn made a good suggestion after I started the integration work, which I haven't had a chance to implement: Make the update-log code fit in as a sort of pseudo-database layer via the DAL, being called through the standard DAL methods, and doing its work around calls through to the real database back end, again through the DAL methods. So for example, creating an "iprop+db2" database would create an update log and the real db2 database; storing a principal entry would update the update log as well; etc. At least initially, we wouldn't treat it as a differently-named database; the installation of the hooks would be done by explicitly checking if iprop is enabled, etc.

The "iprop role" is assumed to be either primary or replica. The primary writes a log, and the replica fetches it. But what about a cascade propagation model where A sends to B which sends to C, perhaps because A's bandwidth is highly limited, or B and C are co-located? In such a case, B would want to operate in both modes.
Granted, with iprop the bandwidth issues should be less important, but there may still be reasons one may wish to run in such a configuration.

The propagation of changes does not happen in real time. It's not a "push" protocol; the replicas poll periodically for changes. Perhaps a future revision of the protocol could address that.

kadmin/cli/kadmin.c call to kadm5_init_iprop - is this needed in a client-side program? Should it be done in libkadm5srv instead as part of the existing kadm5_init* so that database-accessing applications that don't get updated at the source level will automatically start changing the update log as needed?

Locking: Currently DAL exports the DB locking interface to the caller; we want to slip the iprop code in between -- run it plus the DB update operation with the DB lock held, whether or not the caller grabbed the lock. (Does the caller always grab the lock before making changes?) Currently we're using a file lock on the update log itself; this will be independent of whether the DB back end implements locking (which may be a good thing or a bad thing, depending).

Various logging calls with odd format strings like "" should be fixed.

Why are different principal names used, when incremental propagation requires that normal kprop (which uses host principals) be possible anyway?

Why is this tied to kadmind, aside from (a) wanting to prevent other db changes, which locking protocols should deal with anyway, (b) existing acl code, (c) existing server process?

The incremental propagation protocol requires an ACL entry on the primary, listing the replica. Since the full-resync part uses normal kprop, the replica also has to have an ACL entry for the primary. If this is missing, I suspect the behavior will be that every two minutes, the primary side will (at the prompting of the replica) dump out the database and attempt a full propagation.
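
As a rough illustration of the locking idea raised in the "Locking" note above (run the update-log step plus the DB update with one lock held, whether or not the caller grabbed it), here is a minimal sketch in C. It is not the kdb5.c code. The names locked_update, set_file_lock, update_step_fn and demo_step are hypothetical, the lock is a plain POSIX fcntl() advisory lock on the update log's file descriptor, and doing the log step before the DB step is an assumption made only for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical callback type: one step of an update (log append or DB store).
 * Returns 0 on success, nonzero on failure. */
typedef int (*update_step_fn)(void *arg);

/* Take (F_WRLCK) or release (F_UNLCK) an advisory lock covering the whole
 * file.  F_SETLKW blocks until the lock can be granted. */
static int set_file_lock(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Run the log step and then the DB step with the update-log lock held.
 * The DB step is skipped if the log step fails.  The log-then-DB order is
 * an assumption of this sketch. */
int locked_update(int ulog_fd, update_step_fn log_step,
                  update_step_fn db_step, void *arg)
{
    int ret;

    assert(log_step != NULL && db_step != NULL);

    if (set_file_lock(ulog_fd, F_WRLCK) != 0)
        return -1;

    ret = log_step(arg);
    if (ret == 0)
        ret = db_step(arg);

    /* Always try to unlock; report an unlock failure only if the steps
     * themselves succeeded. */
    if (set_file_lock(ulog_fd, F_UNLCK) != 0 && ret == 0)
        ret = -1;
    return ret;
}

/* Hypothetical stand-in for a real step: bumps a counter and succeeds. */
int demo_step(void *arg)
{
    ++*(int *)arg;
    return 0;
}
```

A caller would pass the real log-append and DB-store routines in place of demo_step. Note that fcntl() locks belong to the process, so a wrapper like this serializes separate processes but does nothing to protect multiple threads within one process.
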

Possible optimizations: If an existing dump file has a recent enough serial number, just send it, without dumping again? Use just one dump file instead of one per replica?

Requiring normal kprop means the replica still can't be behind a NAT or firewall without special configuration. The incremental parts can work in such a configuration, so long as outgoing TCP connections are allowed.

Still limited to IPv4 because of limitations in MIT's version of the RPC code. (This could be fixed for kprop, if IPv6 sites want to do full propagation only. Doing incremental propagation over IPv6 will take work on the RPC library, and probably introduce backwards-incompatible ABI changes.)

Overflow checks for ulogentries times block size?

If the file can't be made the size indicated by ulogentries, should we truncate or error out? If we error out, this could blow out when resizing the log because of a too-large log entry.

The kprop invocation doesn't specify a realm name, so it'll only work for the default realm. No clean way to specify a port number, either.

Would it be overkill to come up with a way to configure host+port for kpropd on the primary? Preferably in a way that'd support cascading propagations.

The kadmind process, when it needs to run kprop, extracts the replica host name from the client principal name. It assumes that the principal name will be of the form foo/hostname@REALM, and looks specifically for the "/" and "@" to chop up the string form of the name. If looking up that name won't give a working IPv4 address for the replica, kprop will fail (and kpropd will keep waiting, incremental updates will stop, etc).

When mapping between file offsets and structure addresses, we should be careful about alignment. We're probably okay on current platforms, but if we break log-format compatibility with Sun at some point, use the chance to make the kdb_ent_header_t offsets be more strictly aligned in the file. (16 or 32 bytes?)

Not thread safe!
The kdb5.c code will get a lock on the update log file while making changes, but the lock is per-process. Currently there are no processes I know of that use multiple threads and change the database. (There's the Novell patch to make the KDC multithreaded, but the kdc-kdb-update option doesn't currently compile.)

Logging in kpropd is poor to useless. If there are any problems, run it in debug mode ("-d"). You'll still lose all output from the invocation of kdb5_util dump and kprop run out of kadmind.

Other man page updates needed: Anything with new -x options.

Comments from lha:

Verify both client and server are demanding privacy from RPC.

Authorization code in check_iprop_rpcsec_auth is weird. Check the realm checking: is it trusting the client realm length?

What will happen if my realm is named "A" and I can get a cross realm (though multihop) to ATHENA.MIT.EDU's iprop server?

Why is the ACL not applied before we get to the functions themselves?
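
To make the realm-length question above concrete, here is a small C sketch of the class of problem it hints at. This is not a statement about what check_iprop_rpcsec_auth actually does; the function names realm_prefix_match and realm_equal are hypothetical and exist only for this illustration. The first function bounds the comparison by the client-supplied length alone, so a client realm "A" matches a server realm "ATHENA.MIT.EDU"; the second requires the lengths to agree before comparing bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Risky pattern: the comparison length comes only from the client, so any
 * client realm that is a prefix of the server realm is accepted. */
int realm_prefix_match(const char *client, size_t client_len,
                       const char *server)
{
    assert(client != NULL && server != NULL);
    return strncmp(client, server, client_len) == 0;
}

/* Safer pattern: the realms match only if they have the same length and
 * the same bytes. */
int realm_equal(const char *client, size_t client_len,
                const char *server, size_t server_len)
{
    assert(client != NULL && server != NULL);
    if (client_len != server_len)
        return 0;
    return memcmp(client, server, server_len) == 0;
}
```

With a client realm of "A" and a server realm of "ATHENA.MIT.EDU", realm_prefix_match reports a match while realm_equal does not; whether the real authorization code has this shape is exactly what the comment asks someone to check.
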