On Wed, 1 Oct 2003, Ingo Molnar wrote:
<snip>
it's not the HD that is keeping up things - it's the non-overlap of CDROM and HD IO that hurts. We use the CDROM, then we use the HD to install the rpm, then we use the CDROM again, etc. - instead of using them in parallel and cutting latencies into half.
Not that it would be impossible to do, but any approach that interleaves cdrom and hd i/o is going to have to take into account the switching of cdroms. Also, and correct me if I am wrong, but all the rpms are installed as one transaction using librpm. Basically, a transaction of all the rpms that you need is built, and then run. Anaconda passes librpm a callback so that anaconda can report status and switch cdroms for rpm (sick but it's true (-;). In order to interleave the cdrom access (which is mainly for reading rpms) I think you would somehow have to drill something into rpm to allow for this, as it is ultimately the one reading from the cdrom and then installing to the system. If you followed the pattern of what is done for the cdrom change, you would have to add another callback entry point, called before the rpm is opened for reading, that would pass a file handle back to rpm (this is what is done to allow for cdrom changing). This is of course convoluted, and unless it was somehow made optional, so that it only occurred in the anaconda environment (i.e. anaconda turned this "feature" on), it would break everything that uses librpm to install packages.
The other approach would be to add threading to rpm. Now I know Jeff Johnson has been looking into doing that in order to do parallel installs of packages, but I don't think he was thinking of having a separate thread to read an rpm and another to write it out to the disk (simplified, I know).
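To make the "one thread reads, another writes" idea concrete, here is a rough sketch in Python. It is not how rpm or anaconda work today, and it does not use librpm at all: install_overlapped, open_package and write_chunk are names made up for illustration. open_package stands in for the kind of callback that hands a file handle back (and is where a cdrom change could be prompted for), and write_chunk stands in for unpacking to the hard disk.

```python
import queue
import threading

CHUNK = 64 * 1024  # bytes pulled from the source medium per read


def install_overlapped(names, open_package, write_chunk, depth=4):
    """Read packages in a helper thread while the calling thread writes them.

    open_package(name) -> binary file-like object (hypothetical callback).
    write_chunk(name, data) -> None (hypothetical stand-in for the disk side).
    """
    # Bounded queue: the reader may run at most `depth` chunks ahead.
    chunks = queue.Queue(maxsize=depth)
    errors = []

    def reader():
        try:
            for name in names:
                with open_package(name) as fh:
                    while True:
                        data = fh.read(CHUNK)
                        if not data:
                            break
                        chunks.put((name, data))
        except Exception as exc:  # reported to the caller after the loop
            errors.append(exc)
        finally:
            chunks.put(None)  # sentinel: nothing more to read

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    # Writer side: consume chunks while the reader is already fetching more.
    while True:
        item = chunks.get()
        if item is None:
            break
        write_chunk(*item)

    worker.join()
    if errors:
        raise errors[0]
```

With real devices the gain would come from the cdrom read of the next chunk overlapping the disk write of the current one. Note that this only covers the data path; scriptlets, ordering and the transaction itself are exactly the parts that make doing this inside rpm hard.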
Anyway, all I am really trying to express is that the road to what you want would ultimately produce convolutions that make rpm or anaconda, or both, harder to support for very limited gains. Probably, Jeff's work towards parallel installs will actually give you some of what you want, as the psm (package state machine) threads would most of the time be interleaving cdrom and disk i/o, though not on purpose. The path that Jeff is taking has its own set of gotchas, mainly centered around the fact that scriptlets that modify things on the system need to employ some sort of locking mechanism to make sure no two scriptlets touch the same file at the same time. This, though, is not a problem for rpm to solve, as scriptlets are opaque (as they should be), but for the designers of rpms to solve.
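As an illustration of the kind of locking a scriptlet author would have to do, here is a minimal sketch in Python using an advisory flock on a side file. The helper names (locked, append_line) and the ".lock" suffix are assumptions made up for this example; real scriptlets are usually shell, and nothing here is an rpm convention. It only works if every scriptlet touching the file agrees to take the same lock.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(path):
    """Hold an exclusive advisory lock on path + '.lock' for the with-block."""
    fd = os.open(path + ".lock", os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until no one else holds it
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def append_line(path, line):
    """Append one line to a shared file while holding the lock."""
    with locked(path):
        with open(path, "a") as fh:
            fh.write(line + "\n")
```

Two scriptlets calling append_line on the same file would then take turns rather than interleave their writes. The lock is advisory only, so a scriptlet that skips it can still stomp on the file.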
Well, I have probably said too much and not enough, but I hope this sheds at least some light.
Cheers...james
ext2/ext3 only speeds up the HD access (by a very small amount). Also, ext2/ext3 mostly differs in CPU overhead not IO overhead - and the install-to-hd process is mostly limited by IO latencies (disk seeks).
Ingo
-- fedora-test-list mailing list fedora-test-list@redhat.com http://www.redhat.com/mailman/listinfo/fedora-test-list