Some (intentional) changes from Sun's submission are noted in the
install guide.

Bugs or issues:

The "full resync" part of the protocol involves the primary side
firing off a normal kprop (and going back to servicing requests), and
the replica side stopping all the incremental propagation stuff and
waiting for the kprop.  If the connection from the primary never comes
in for some reason, the replica side just blocks forever, and never
resumes incremental propagation.

The protocol does not currently pass policy database changes; this was
an intentional decision on Sun's part.  The policy database is only
relevant to the primary KDC, and is usually fairly static (aside from
refcount updates), but not propagating it does mean that a replica
maintained via iprop can't simply be promoted to a primary in disaster
recovery or other cases without doing a full propagation or restoring
a database from backups.

Shawn had a good suggestion after I started the integration work, and
which I haven't had a chance to implement: Make the update-log code
fit in as a sort of pseudo-database layer via the DAL, being called
through the standard DAL methods, and doing its work around calls
through to the real database back end again through the DAL methods.
So for example, creating a "iprop+db2" database would create an update
log and the real db2 database; storing a principal entry would update
the update log as well; etc.  At least initially, we wouldn't treat it
as a differently-named database; the installation of the hooks would
be done by explicitly checking if iprop is enabled, etc.

The "iprop role" is assumed to be either primary or replica.  The
primary writes a log, and the replica fetches it.  But what about a
cascade propagation model where A sends to B which sends to C, perhaps
because A's bandwidth is highly limited, or B and C are co-located?
In such a case, B would want to operate in both modes.  Granted, with
iprop the bandwidth issues should be less important, but there may
still be reasons one may wish to run in such a configuration.

The propagation of changes does not happen in real time.  It's not a
"push" protocol; the replicas poll periodically for changes.  Perhaps
a future revision of the protocol could address that.

kadmin/cli/kadmin.c call to kadm5_init_iprop - is this needed in
client-side program? Should it be done in libkadm5srv instead as part
of the existing kadm5_init* so that database-accessing applications
that don't get updated at the source level will automatically start
changing the update log as needed?

Locking: Currently DAL exports the DB locking interface to the caller;
we want to slip the iprop code in between -- run it plus the DB update
operation with the DB lock held, whether or not the caller grabbed the
lock.  (Does the caller always grab the lock before making changes?)
Currently we're using a file lock on the update log itself; this will
be independent of whether the DB back end implements locking (which
may be a good thing or a bad thing, depending).

Various logging calls with odd format strings like "<null>" should be
fixed.

Why are different principal names used, when incremental propagation
requires that normal kprop (which uses host principals) be possible
anyways?

Why is this tied to kadmind, aside from (a) wanting to prevent other
db changes, which locking protocols should deal with anyways, (b)
existing acl code, (c) existing server process?

The incremental propagation protocol requires an ACL entry on the
primary, listing the replica.  Since the full-resync part uses normal
kprop, the replica also has to have an ACL entry for the primary.  If
this is missing, I suspect the behavior will be that every two
minutes, the primary side will (at the prompting of the replica) dump
out the database and attempt a full propagation.

Possible optimizations: If an existing dump file has a recent enough
serial number, just send it, without dumping again?  Use just one dump
file instead of one per replica?

Requiring normal kprop means the replica still can't be behind a NAT
or firewall without special configuration.  The incremental parts can
work in such a configuration, so long as outgoing TCP connections are
allowed.

Still limited to IPv4 because of limitations in MIT's version of the
RPC code.  (This could be fixed for kprop, if IPv6 sites want to do
full propagation only.  Doing incremental propagation over IPv6 will
take work on the RPC library, and probably introduce
backwards-incompatible ABI changes.)

Overflow checks for ulogentries times block size?

If file can't be made the size indicated by ulogentries, should we
truncate or error out?  If we error out, this could blow out when
resizing the log because of a too-large log entry.

The kprop invocation doesn't specify a realm name, so it'll only work
for the default realm.  No clean way to specify a port number, either.
Would it be overkill to come up with a way to configure host+port for
kpropd on the primary?  Preferably in a way that'd support cascading
propagations.

The kadmind process, when it needs to run kprop, extracts the replica
host name from the client principal name.  It assumes that the
principal name will be of the form foo/hostname@REALM, and looks
specifically for the "/" and "@" to chop up the string form of the
name.  If looking up that name won't give a working IPv4 address for
the replica, kprop will fail (and kpropd will keep waiting,
incremental updates will stop, etc).

Mapping between file offsets and structure addresses, we should be
careful about alignment.  We're probably okay on current platforms,
but if we break log-format compatibility with Sun at some point, use
the chance to make the kdb_ent_header_t offsets be more strictly
aligned in the file.  (16 or 32 bytes?)

Not thread safe!  The kdb5.c code will get a lock on the update log
file while making changes, but the lock is per-process.  Currently
there are no processes I know of that use multiple threads and change
the database.  (There's the Novell patch to make the KDC
multithreaded, but the kdc-kdb-update option doesn't currently
compile.)

Logging in kpropd is poor to useless.  If there are any problems, run
it in debug mode ("-d").  You'll still lose all output from the
invocation of kdb5_util dump and kprop run out of kadmind.

Other man page updates needed: Anything with new -x options.

Comments from lha:

Verify both client and server are demanding privacy from RPC.

Authorization code in check_iprop_rpcsec_auth is weird.  Check realm
checking, is it trusting the client realm length?

What will happen if my realm is named "A" and I can get a cross realm
(though multihop) to ATHENA.MIT.EDU's iprop server?

Why is the ACL not applied before we get to the functions themselves?