Performance discussion
by William Brown
Hi all,
After our catch up, we were discussing performance matters. I decided to start on this while waiting for some of my tickets to be reviewed and to see what's going on.
These tests were carried out on a virtual machine configured with 6 CPUs for the "search 6" (s6) runs and 12 CPUs for the "search 12" (s12) runs. Both configurations had access to 8GB of RAM.
The host hardware is an i7 2.2GHz with 6 cores (12 threads) and 32GB of RAM, with NVMe storage.
The rows are the CPUs available to the VM, and the columns are the number of threads set in nsslapd-threadnumber. No other variables were changed. The database has 6000 users and 4000 groups. The instance was restarted before each test. The search was a randomised uid equality search returning a single result. I included the 6 and 12 thread columns to match the VM and host specs, rather than just the traditional base 2 sequence we usually see.
I've attached a screenshot of the results, and I have some initial thoughts on them. What's interesting is our initial 1 thread performance and how steeply it ramps up towards 4 threads. That said, it's not a linear increase. Per thread on s6 we go from ~3800 to ~2500 ops per second, and a similar ratio exists in s12. What is stark is that after t4 we immediately see a per thread *decline* despite the greater amount of available compute resources. This indicates that poor locking and thread coordination are causing a rapid decline in performance. This was true on both s6 and s12. The decline intensifies rapidly once we exceed the CPUs available (s6 between t6 and t12), but performance still declines even when we do have the hardware threads available in s12.
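To make the "not linear" point concrete, here is a small sketch of the arithmetic. It assumes the ~3800 and ~2500 figures are the s6 per thread rates at t1 and t4 respectively. The helper name is made up for illustration and the numbers are the rounded values quoted above, not fresh measurements.

```python
def per_thread_efficiency(base_per_thread, n_per_thread):
    """Fraction of the single-thread per-thread rate retained at n threads."""
    return n_per_thread / base_per_thread

# Approximate s6 figures quoted above (ops/sec per thread); assumed to be t1 and t4.
S6_T1_PER_THREAD = 3800
S6_T4_PER_THREAD = 2500

# Aggregate throughput at 4 threads, and how it compares to ideal linear scaling.
total_t4 = S6_T4_PER_THREAD * 4
efficiency_t4 = per_thread_efficiency(S6_T1_PER_THREAD, S6_T4_PER_THREAD)
speedup_t4 = total_t4 / S6_T1_PER_THREAD

print(f"t4 aggregate: ~{total_t4} ops/sec")
print(f"t4 per-thread efficiency: {efficiency_t4:.2f}")
print(f"t4 speedup over t1: {speedup_t4:.2f}x (ideal would be 4.00x)")
```

Under those assumptions, four threads deliver roughly two thirds of the per thread rate of one thread, so the aggregate gain is well short of 4x even before the decline after t4.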
I will perform some testing between t1 and t6 versions to see if I can isolate which functions are having a growth in time consumption.
For now, an early recommendation is that we alter our default CPU auto-tuning. Currently we use a curve which starts at 16 threads for 1 to 4 cores and then tapers off until 512 cores map to 512 threads. However, in almost all of these autotuned settings the thread count is greater than the core count. The graph indicates that this choice only hurts our performance rather than improving it. I suggest we change our thread autotuning to a 1 to 1 ratio of threads to cores to prevent excessive contention on lock resources.
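As a rough illustration of the 1 to 1 proposal, a sketch in Python follows. The function name is hypothetical and this is not the server's autotuning code; it only shows the intended mapping. Note that `os.cpu_count()` reports logical CPUs, which may differ from physical cores on hosts with SMT.

```python
import os

def proposed_thread_count(cores=None):
    """Hypothetical 1:1 autotune: one worker thread per available CPU.

    If no core count is given, fall back to what the OS reports
    (logical CPUs), and never return fewer than one thread.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    return max(1, cores)

# The two VM configurations from the tests above.
for cpus in (6, 12):
    print(f"{cpus} CPUs -> nsslapd-threadnumber {proposed_thread_count(cpus)}")
```

With this mapping the s6 VM would run 6 threads and the s12 VM 12 threads, rather than a thread count above the CPU count.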
Thanks, more to come once I set up this profiling on a real machine so I can generate flamegraphs.
—
Sincerely,
William Brown
Senior Software Engineer, 389 Directory Server
SUSE Labs
Couple of troubles around using dsconf
by Matus Honek
Hello folks,
Context: My setup is a running dscontainer with exported /data. While
developing (outside of the container) I am trying to run `dsconf
ldapi://%2fpath%2fto%2fdscontainers%2fsocket security get`.
Issue 1: I get an IndexError exception:
File "/home/mhonek/src/ds/up/src/lib389/lib389/_mapped_object.py",
line 158, in display
How should we handle the case where there are no results to display,
and fix it properly so that nothing else blows up later? Don't know...
Issue 2: Tracing back, I found out that I had autobound as non-root
(non-0 UID). Understandable, but still surprising. So I tried to
override this by providing `-D` and `-w` explicitly to dsconf. No
change, it still autobinds. It turns out that autobind takes
precedence over simple bind in DirSrv.open; this comes from
[implementation].
Possible solution: Instead of `elif can_autobind(): ... else:
simple_bind` do `elif self.binddn is not None: simple_bind elif
can_autobind(): ...`. Worked for me. Would this blow up some use-case?
Don't know...
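To spell out the reordering, here is a standalone sketch of the decision logic only. It is not the lib389 code: the function name, the string labels and the `None` fallback are placeholders, and whatever branch precedes the `elif` in DirSrv.open is left out.

```python
def choose_bind_method(binddn, can_autobind):
    """Hypothetical bind selection with explicit credentials taking priority.

    binddn:       bind DN supplied by the caller, or None if not given
    can_autobind: whether an LDAPI autobind is possible for this connection
    """
    if binddn is not None:
        # An explicitly provided DN wins, even when autobind would work.
        return "simple"
    if can_autobind:
        return "autobind"
    # Placeholder: what should happen with neither option is not covered here.
    return None

print(choose_bind_method("cn=Directory Manager", True))  # explicit DN given
print(choose_bind_method(None, True))                    # no DN, LDAPI socket
```

The open question from above still applies: any caller that relies on autobind while also having a bind DN set would change behaviour under this ordering.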
Sub-issue 2a: Given I was able to autobind as a non-root UID, the
wording of a log message [autobind-log] looks off. The word "root"
probably shouldn't be there?
Somewhat troubling 1: At the time of running open in the autobind
branch in DirSrv.open [autobind], the value of `self.bindpw` is
literally "password" even though neither `-D` nor `-w` was provided on
the command line for dsconf. I believe there are reasons (besides
"because the code is written so") why this is so, but I would like to
be enlightened here.
[implementation] https://pagure.io/389-ds-base/c/e07e489
[autobind-log] https://pagure.io/389-ds-base/blob/6d70cbe/f/src/lib389/lib389/__init__.p...
[autobind] https://pagure.io/389-ds-base/blob/6d70cbe/f/src/lib389/lib389/__init__.p...
Please share your ideas: where I went wrong and what we could go ahead with.
Thanks,
Matus
--
Matúš Honěk
Software Engineer
Red Hat Czech