While configuring a new RAID I ran iostat to check that it was idle. It was, and there was no io showing at all.
I then mounted it on a new mount point that no process was using. I started hearing knocks from the PC case, and touching the disks revealed that they all had activity 1-2 times a second, concurrently, leading to louder than usual noise.
Here is what "iostat 60" is now showing on a totally idle system (this is a very typical entry):
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.21    0.00    0.19    1.36    0.00   98.24

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.73         0.93         4.07         56        244
sdb               5.87         8.53        20.78        512       1247
sdd               6.67        10.93        23.18        656       1391
sdf               8.50        40.27        52.52       2416       3151
sde               6.20         4.40        16.65        264        999
sdh               5.97        10.13        22.38        608       1343
sdg               7.77        38.00        50.25       2280       3015
sdc               5.87         8.53        20.78        512       1247
md127             1.80         0.00        40.27          0       2416
sda is the root fs (ext4). md127 (ext4) is a RAID6 of 7 disks sd[b-h]1.
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
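For reference, drive power management can usually be inspected and tuned with hdparm, assuming the drives support APM (here /dev/sdb stands for any of the member disks):

$ sudo hdparm -C /dev/sdb       # report the current power state
$ sudo hdparm -B /dev/sdb       # query the APM level, if supported
$ sudo hdparm -B 128 /dev/sdb   # lowest-power setting that does not permit spin-down

APM levels 1-127 permit spin-down, while 128-254 do not.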
This is an up-to-date, recent install of f28 (this is a test system, so not customized).
TIA
On 11/3/18 10:19 PM, Eyal Lebedinsky wrote:
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
I'm assuming since it's a new RAID that you haven't created files on it yet, or at least not many. Try running "lsof +D /mnt/point" to see if there is any process looking at it. Then try running "inotifywait -rm /mnt/point" and let it run for a little while to see if you can catch some process accessing the fs. You will need to install "inotify-tools" to get that program.
On 4/11/18 4:44 pm, Samuel Sieb wrote:
On 11/3/18 10:19 PM, Eyal Lebedinsky wrote:
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
I'm assuming since it's a new RAID that you haven't created files on it yet, or at least not many. Try running "lsof +D /mnt/point" to see if there is any process looking at it. Then try running "inotifywait -rm /mnt/point" and let it run for a little while to see if you can catch some process accessing the fs. You will need to install "inotify-tools" to get that program.
Sure, should have said a bit more: The array was resync'ed, then (much) data was copied in. This was a few days ago.
$ sudo inotifywait -rm /new-raid
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
-> nothing listed for a few minutes
'lsof' finds nothing relevant.
'lsof +D' is likely to take too long:
$ sudo find /new-raid/ -type d | wc -l
101078
$ sudo find /new-raid/ -type f | wc -l
1445375
It is running now...
On 4/11/18 5:32 pm, Eyal Lebedinsky wrote:
On 4/11/18 4:44 pm, Samuel Sieb wrote:
On 11/3/18 10:19 PM, Eyal Lebedinsky wrote:
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
I'm assuming since it's a new RAID that you haven't created files on it yet, or at least not many. Try running "lsof +D /mnt/point" to see if there is any process looking at it. Then try running "inotifywait -rm /mnt/point" and let it run for a little while to see if you can catch some process accessing the fs. You will need to install "inotify-tools" to get that program.
Sure, should have said a bit more: The array was resync'ed, then (much) data was copied in. This was a few days ago.
$ sudo inotifywait -rm /new-raid
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
-> nothing listed for a few minutes
'lsof' finds nothing relevant.
'lsof +D' is likely to take too long:
$ sudo find /new-raid/ -type d | wc -l
101078
$ sudo find /new-raid/ -type f | wc -l
1445375
It is running now...
Finished, listed nothing.
On Sun, 2018-11-04 at 17:32 +1100, Eyal Lebedinsky wrote:
On 4/11/18 4:44 pm, Samuel Sieb wrote:
On 11/3/18 10:19 PM, Eyal Lebedinsky wrote:
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
I'm assuming since it's a new RAID that you haven't created files on it yet, or at least not many. Try running "lsof +D /mnt/point" to see if there is any process looking at it. Then try running "inotifywait -rm /mnt/point" and let it run for a little while to see if you can catch some process accessing the fs. You will need to install "inotify-tools" to get that program.
Sure, should have said a bit more: The array was resync'ed, then (much) data was copied in. This was a few days ago.
Did the sync finish? What is the kernel status of the array?
cat /proc/mdstat
On 4/11/18 6:39 pm, Berend De Schouwer wrote:
On Sun, 2018-11-04 at 17:32 +1100, Eyal Lebedinsky wrote:
On 4/11/18 4:44 pm, Samuel Sieb wrote:
On 11/3/18 10:19 PM, Eyal Lebedinsky wrote:
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
I'm assuming since it's a new RAID that you haven't created files on it yet, or at least not many. Try running "lsof +D /mnt/point" to see if there is any process looking at it. Then try running "inotifywait -rm /mnt/point" and let it run for a little while to see if you can catch some process accessing the fs. You will need to install "inotify-tools" to get that program.
Sure, should have said a bit more: The array was resync'ed, then (much) data was copied in. This was a few days ago.
Did the sync finish? What is the kernel status of the array?
cat /proc/mdstat
Yes, as I mentioned, the sync finished (28/Oct), then the copy finished (30/Oct). I acquired a new case and installed the array there (using an old mobo), testing how the case ventilation performs. I am waiting for a new mobo/CPU/mem.
It is during this quiet period that I noticed the io issue which got me wondering.
Since this happens only when the array is mounted, and I do not see any files being touched, I wondered if this is some ext4 internal housekeeping. Can this be related to the size of the fs?

Filesystem      Size  Used Avail Use% Mounted on
/dev/md127       55T   17T   38T  32% /new-raid
$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdc1[1] sdb1[0] sdf1[4] sdh1[6] sde1[3] sdd1[2] sdg1[5]
      58593761280 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]
      bitmap: 1/88 pages [4KB], 65536KB chunk
unused devices: <none>
On Sun, 2018-11-04 at 18:57 +1100, Eyal Lebedinsky wrote:
On 4/11/18 6:39 pm, Berend De Schouwer wrote:
On Sun, 2018-11-04 at 17:32 +1100, Eyal Lebedinsky wrote:
On 4/11/18 4:44 pm, Samuel Sieb wrote:
On 11/3/18 10:19 PM, Eyal Lebedinsky wrote:
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
I'm assuming since it's a new RAID that you haven't created files on it yet, or at least not many. Try running "lsof +D /mnt/point" to see if there is any process looking at it. Then try running "inotifywait -rm /mnt/point" and let it run for a little while to see if you can catch some process accessing the fs. You will need to install "inotify-tools" to get that program.
Sure, should have said a bit more: The array was resync'ed, then (much) data was copied in. This was a few days ago.
Did the sync finish? What is the kernel status of the array?
cat /proc/mdstat
Yes, as I mentioned, the sync finished (28/Oct), then the copy finished (30/Oct). I acquired a new case and installed the array there (using an old mobo), testing how the case ventilation performs. I am waiting for a new mobo/CPU/mem.
It is during this quiet period that I noticed the io issue which got me wondering.
Since this happens only when the array is mounted, and I do not see any files being touched, I wondered if this is some ext4 internal housekeeping. Can this be related to the size of the fs?
Maybe. Maybe the journal. Maybe a runaway sync().
You can play with mount options like 'noatime.' Note that some mount options might cause data corruption. Look in /proc/mounts for the currently used options. See if there's something different to /.
Your original mail showed more activity on /dev/sdb .. sdh than on /dev/md127, so it might be raid housekeeping, or an ext4/raid barrier.
/dev/md127 showed only write access. Is that typical too?
The shortest way to know if it's ext4 is to re-format as xfs or btrfs. I don't suggest you do that lightly.
On 4/11/18 11:11 pm, Berend De Schouwer wrote:
On Sun, 2018-11-04 at 18:57 +1100, Eyal Lebedinsky wrote:
On 4/11/18 6:39 pm, Berend De Schouwer wrote:
On Sun, 2018-11-04 at 17:32 +1100, Eyal Lebedinsky wrote:
On 4/11/18 4:44 pm, Samuel Sieb wrote:
On 11/3/18 10:19 PM, Eyal Lebedinsky wrote:
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
I'm assuming since it's a new RAID that you haven't created files on it yet, or at least not many. Try running "lsof +D /mnt/point" to see if there is any process looking at it. Then try running "inotifywait -rm /mnt/point" and let it run for a little while to see if you can catch some process accessing the fs. You will need to install "inotify-tools" to get that program.
Sure, should have said a bit more: The array was resync'ed, then (much) data was copied in. This was a few days ago.
Did the sync finish? What is the kernel status of the array?
cat /proc/mdstat
Yes, as I mentioned, the sync finished (28/Oct), then the copy finished (30/Oct). I acquired a new case and installed the array there (using an old mobo), testing how the case ventilation performs. I am waiting for a new mobo/CPU/mem.
It is during this quiet period that I noticed the io issue which got me wondering.
Since this happens only when the array is mounted, and I do not see any files being touched, I wondered if this is some ext4 internal housekeeping. Can this be related to the size of the fs?
Maybe. Maybe the journal. Maybe a runaway sync().
The machine was rebooted numerous times, same thing.
You can play with mount options like 'noatime.' Note that some mount options might cause data corruption. Look in /proc/mounts for the currently used options. See if there's something different to /.
Same options:
/dev/sda2 on / type ext4 (rw,relatime)
/dev/md127 on /new-raid type ext4 (rw,relatime,stripe=640)
'noatime' shows similar activity to 'relatime'. A read-only mount stops this activity.
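For reference, such remount tests can be done without unmounting, along these lines (a sketch, using the same mount point as above):

$ sudo mount -o remount,noatime /new-raid   # write activity continues as before
$ sudo mount -o remount,ro /new-raid        # the periodic writes stop
$ sudo mount -o remount,rw /new-raid        # writes resume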
Your original mail showed more activity on /dev/sdb .. sdh than on /dev/md127, so it might be raid housekeeping, or an ext4/raid barrier.
This is OK. Looking at one entry in 'iostat 100':
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.01    0.00    0.12    0.64    0.00   99.24

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.01         0.08         0.00          8          0
sdc               5.14         5.12        16.23        512       1623
sdg               6.50        26.36        37.47       2636       3747
sdh               5.86         7.80        18.91        780       1891
sdf               7.64        35.76        46.87       3576       4687
sde               6.64        19.76        30.87       1976       3087
sdd               5.14         5.12        16.23        512       1623
sdb               5.28         7.36        18.47        736       1847
md127             1.59         0.00        35.76          0       3576
For RAID6, a 'write' generates more activity than for a plain device: even a small 'write' leads to a read/modify/write of whole stripes. Note that there are no 'read' operations on the array.
What I see is a periodic 'write' of about 20KB to md127, probably to the fs. This rate is very constant.
If I had to guess, I would say this is some ext4 internal activity in a control area (not a file in the fs).
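As a rough sketch of the amplification involved: in the minimal read-modify-write case, updating a single data block on RAID6 touches three member disks twice each:

  read  old data block, old P, old Q   -> 3 block reads
  write new data block, new P, new Q   -> 3 block writes

So one small logical write to md127 becomes roughly six member-disk operations, which matches the member disks showing 'read's while md127 itself shows only 'write's.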
/dev/md127 showed only write access. Is that typical too?
The shortest way to know if it's ext4 is to re-format as xfs or btrfs. I don't suggest you do that lightly.
On Sun, 4 Nov 2018 16:19:45 +1100 Eyal Lebedinsky wrote:
I then mounted it on a new mount point that no process was using. I started hearing knocks from the PC case, and touching the disks revealed that they all had activity 1-2 times a second, concurrently, leading to louder than usual noise.
If it is an ext4 filesystem, then a newly formatted ext4 gets background activity building some sort of internal data structures. It eventually stops once it makes it all the way through. Possibly other filesystems do something similar, but I know ext4 does.
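A quick way to check whether this background initialization is still running, assuming it is the culprit, is to look for the kernel thread that performs it:

$ ps ax | grep '[e]xt4lazyinit'

If an '[ext4lazyinit]' line shows up, the filesystem is still zeroing its inode tables in the background.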
On 4/11/18 11:49 pm, Tom Horsley wrote:
On Sun, 4 Nov 2018 16:19:45 +1100 Eyal Lebedinsky wrote:
I then mounted it on a new mount point that no process was using. I started hearing knocks from the PC case, and touching the disks revealed that they all had activity 1-2 times a second, concurrently, leading to louder than usual noise.
If it is an ext4 filesystem, then a newly formatted ext4 gets background activity building some sort of internal data structures. It eventually stops once it makes it all the way through. Possibly other filesystems do something similar, but I know ext4 does.
Interesting. The array was mounted for many hours, but maybe not long enough.

Filesystem      Size  Used Avail Use% Mounted on
/dev/md127       55T   17T   38T  32% /new-raid

Seeing only 'write's suggests that it may be busy initializing some internal tables? I will leave it mounted overnight (or longer). Naturally, once commissioned it will stay mounted for years.
On 4/11/18 4:19 pm, Eyal Lebedinsky wrote:
While configuring a new RAID I ran iostat to check that it was idle. It was, and there was no io showing at all.
I then mounted it on a new mount point that no process was using. I started hearing knocks from the PC case, and touching the disks revealed that they all had activity 1-2 times a second, concurrently, leading to louder than usual noise.
Here is what "iostat 60" is now showing on a totally idle system (this is a very typical entry):
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.21    0.00    0.19    1.36    0.00   98.24

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.73         0.93         4.07         56        244
sdb               5.87         8.53        20.78        512       1247
sdd               6.67        10.93        23.18        656       1391
sdf               8.50        40.27        52.52       2416       3151
sde               6.20         4.40        16.65        264        999
sdh               5.97        10.13        22.38        608       1343
sdg               7.77        38.00        50.25       2280       3015
sdc               5.87         8.53        20.78        512       1247
md127             1.80         0.00        40.27          0       2416
sda is the root fs (ext4). md127 (ext4) is a RAID6 of 7 disks sd[b-h]1.
What is this io, and can it be stopped? I want to allow the disks to enter low power mode (not spin down) when idle.
This is an up-to-date, recent install of f28 (this is a test system, so not customized).
TIA
A summary of what I learnt so far:
I was pointed at the lazy init feature of ext4 as the culprit (read the thread). I was not aware of this feature, so there is a silver lining to this cloudy issue.
After some searching I now see these two options in the mkfs.ext4 man page: lazy_itable_init and lazy_journal_init.
Furthermore, I read about it here https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Lazy_Block_Group_Ini...
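For a future mkfs one could disable the lazy init up front (at the cost of a much slower format, since the inode tables and journal are then zeroed immediately), e.g.:

$ sudo mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/md127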
One can see the activity using iotop:
$ iotop -oP
Total DISK READ :       0.00 B/s | Total DISK WRITE :       0.00 B/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:     103.76 K/s
  PID  PRIO  USER     DISK READ   DISK WRITE  SWAPIN     IO>    COMMAND
 1872  be/3  root      0.00 B/s    0.00 B/s   0.00 %   3.50 %   [jbd2/md127-8]
 1874  be/4  root      0.00 B/s    0.00 B/s   0.00 %   1.11 %   [ext4lazyinit]
However, it seems that there is no way to see how far the lazy init has progressed or how much data remains to be written.
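One rough way to estimate progress, assuming dumpe2fs reports the per-group ITABLE_ZEROED flag for groups whose inode table has already been zeroed, is to compare the flagged groups against the total:

$ sudo dumpe2fs /dev/md127 2>/dev/null | grep -c 'ITABLE_ZEROED'
$ sudo dumpe2fs /dev/md127 2>/dev/null | grep -c '^Group '

When the two counts match, the lazy init should be done.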
My other observation is that the RAID6 write amplification probably has a large effect if the init process is writing non-sequential single blocks.
HTH
On 4/11/18 4:19 pm, Eyal Lebedinsky wrote:
[trimmed]
However, it seems that there is no way to see how far the lazy init has progressed or how much data remains to be written.
To close this thread, the lazy init is now finished and the disk activity ceased.
Still, it would have taken many days to complete, so after more searching I found that adding the mount option 'init_itable=0' (the default is 10) will remove the intentional slowing. It did, and once remounted, 'ext4lazyinit' put the pedal to the metal and the tps went from about 1.33 to around 140.
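For the record, the remount itself is a one-liner; init_itable=0 tells the ext4lazyinit thread not to pause between block groups (with the default of 10 it waits ten times as long as the previous group took to zero):

$ sudo mount -o remount,init_itable=0 /new-raid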