Stef Walter [2017-02-08 8:36 +0100]:
>> It still seems that for the purpose of programmatically pre-seeding machines it makes sense to be able to put a host key right into a machines/foo.json file, as otherwise we'd have the same "concurrent racy access" problem as for machines.json itself.
> In large and properly set up systems/clusters, all the known hosts will be made globally available. Either via a shared drop-in file ... or a better example of this is how FreeIPA uses the following configuration option to look up known hosts from the directory:
> ProxyCommand /usr/bin/sss_ssh_knownhostsproxy -p %p %h
> So I would be cautious here about having another way to distribute SSH known hosts. /etc/ssh/known_hosts is a well-known format and there's lots of tooling for populating it.
If our interface for pre-populating hosts is a CLI only, then using /etc/ssh/known_hosts is sufficient indeed -- but then we also don't need to split machines.json either, as that tool could just as well change machines.json inline (and do locking, etc) in the same way as known_hosts is updated.
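For illustration, a minimal sketch of what such a locked inline update could look like. The lock file name (`machines.json.lock`), the function name, and the `address` key in the usage note are assumptions made for the example, not anything Cockpit ships; the lock only helps if every writer uses it.

```python
import fcntl
import json
import os
import tempfile


def update_machines(path, name, entry):
    """Add or replace one machine entry in a JSON file under an exclusive lock.

    Hypothetical helper: cooperating writers must all take the same lock.
    """
    lock_path = path + ".lock"  # assumed lock file location
    with open(lock_path, "w") as lock:
        # Blocks until any other cooperating writer has released the lock.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            with open(path) as f:
                machines = json.load(f)
        except FileNotFoundError:
            machines = {}
        machines[name] = entry
        # Write to a temporary file in the same directory, then rename it
        # over the original so readers never see a half-written file.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(machines, f, indent=2)
        os.replace(tmp, path)
    # The lock is released when the lock file is closed.
    return machines
```

A caller would run something like `update_machines("/tmp/machines.json", "foo", {"address": "10.0.0.1"})`; two such calls running concurrently are serialized by the lock instead of overwriting each other.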
I thought the goal was to provide a drop-in dir where machines snippets can be added -- but then we should also offer providing the host keys as well. Otherwise...
> Note that in the corner case, even if the dashboard machines were prepopulated and the SSH known hosts were missing, the user will be prompted in Cockpit with the fingerprint when accessing the machine.
... you'll get this, and this feels like an incomplete solution. In practice, nobody will actually verify the fingerprints. How is it only a corner case? It will happen every time machines.json gets updated and the first time someone connects to that new machine. This will also affect e.g. VMs that get dynamically added to the list (if someone uses that feature for this), so this won't only be a "do it once, ever" effect.
> Lastly, if we do need to add this feature later (i.e. an additional host key field in the machines/foo.json file) we can certainly add it without backwards compatibility problems.
Sure, this doesn't need to be in the first PR, but I think we need to keep it in mind for designing this: It seems of questionable utility to me to split machines.json without also splitting known_hosts (or even just merging the two).
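To make the "merge the two" idea concrete, here is a rough sketch of a loader for a drop-in directory whose snippets may carry a host key. The `hostKey` field name, the `address` key, and the function name are hypothetical; nothing in this thread fixes the snippet format.

```python
import glob
import json
import os


def load_machines(dirpath):
    """Merge all *.json snippets in dirpath; later files override earlier ones.

    Returns (machines, known_hosts_lines). The optional "hostKey" field is a
    hypothetical name for a key in "keytype base64-key" form.
    """
    machines = {}
    host_keys = {}
    for snippet in sorted(glob.glob(os.path.join(dirpath, "*.json"))):
        with open(snippet) as f:
            data = json.load(f)
        for host, entry in data.items():
            entry = dict(entry)  # copy, so the parsed snippet is not mutated
            key = entry.pop("hostKey", None)
            if key:
                host_keys[host] = key
            machines[host] = entry
    # One known_hosts-style line per host: "<host> <keytype> <base64-key>"
    known_hosts_lines = [f"{host} {key}" for host, key in sorted(host_keys.items())]
    return machines, known_hosts_lines
```

With this layout a single snippet file is enough to pre-seed both the machine entry and its key, so adding a host does not require touching a second, shared file.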
Thanks,
Martin