Gentoo Forums
Gentoo much slower than identical hardware with Ubuntu

hertfelder
n00b
Joined: 13 May 2016
Posts: 5

Posted: Fri May 13, 2016 1:04 pm    Post subject: Gentoo much slower than identical hardware with Ubuntu

Dear all,

I have found that my PC running 4.4.6-gentoo is considerably slower (by ~50%) when running our CFD codes than a colleague's PC, which has exactly the same hardware but runs Ubuntu. The slowdown is most dramatic when a large number of grid cells is used, i.e. when the arrays are very large and memory is accessed heavily. I verified that there are no hardware issues on my machine by booting from an Ubuntu Live CD and running the test there; I got the same fast results as my colleague.

I suspect that the problem is due to my kernel config. However, I don't know where to start looking and I also don't know what I should be looking for. Maybe somebody can help me here.

kernel config: https://bpaste.net/show/d2e8adf56bed
lshw: https://bpaste.net/show/ceea4a8e0156
emerge --info: https://bpaste.net/show/e6843577ddfa

Thanks,

Marius

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 44153
Location: 56N 3W

Posted: Fri May 13, 2016 3:58 pm

hertfelder,

Welcome to Gentoo. This looks a bit odd. It's the list of all the CPU frequency governors your kernel knows about.
Code:
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_COMMON=y
# CONFIG_CPU_FREQ_STAT is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set


The
Code:
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
says that, by default, the kernel relies on userspace to control the CPU frequency, so the question is what userspace is actually doing, if anything.

Run the Gentoo/Ubuntu compare again and look at the CPU clock speed in both during the test.

Read the kernel's context-sensitive help on the various governors. You can switch CPU governors at runtime by poking about in /sys (the cpufreq directory of each CPU under /sys/devices/system/cpu/).
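
A minimal sketch for watching the governor and clock speed during the test, assuming the usual cpufreq sysfs layout (/sys/devices/system/cpu/cpuN/cpufreq/; on current kernels these controls live in sysfs rather than /proc). The function name read_cpufreq and its root parameter are hypothetical, chosen only for this illustration.

```python
from pathlib import Path

# Files of interest in each cpuN/cpufreq/ directory.
FIELDS = ("scaling_driver", "scaling_governor", "scaling_cur_freq")

def read_cpufreq(root="/sys/devices/system/cpu"):
    """Return {cpu_name: {field: value}} for every CPU exposing cpufreq."""
    result = {}
    for cpufreq_dir in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        info = {}
        for field in FIELDS:
            path = cpufreq_dir / field
            if path.is_file():
                info[field] = path.read_text().strip()
        result[cpufreq_dir.parent.name] = info
    return result

if __name__ == "__main__":
    for cpu, info in read_cpufreq().items():
        # scaling_cur_freq is reported in kHz
        print(cpu, info)
```

Running this repeatedly on both systems while the CFD test is active should show whether the clock speeds differ. Switching governors amounts to writing a governor name such as performance into scaling_governor as root.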
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

chithanh
Developer
Joined: 05 Aug 2006
Posts: 2152
Location: Berlin, Germany

Posted: Fri May 13, 2016 4:25 pm

Modern Intel CPUs perform better with intel_pstate than with the other frequency scaling drivers. Enable CONFIG_X86_INTEL_PSTATE and choose the performance governor.

1clue
Advocate
Joined: 05 Feb 2006
Posts: 2562

Posted: Fri May 13, 2016 5:41 pm

CONFIG_CPU_FREQ_GOV_ONDEMAND seems to work fine for me on an 8-core Atom C2758. And CONFIG_X86_INTEL_PSTATE, of course.

YukiteruAmano
n00b
Joined: 31 May 2015
Posts: 29
Location: Venezuela

Posted: Sat May 14, 2016 12:13 am

Enable this option in your kernel: CONFIG_X86_INTEL_PSTATE.

On recent Intel CPUs, intel_pstate is the best scaling driver; with the userspace governor you will have a lot of performance problems.

Switch off the other scaling drivers.
_________________
God in His Heaven, all is well on Earth

hertfelder
n00b
Joined: 13 May 2016
Posts: 5

Posted: Wed May 18, 2016 1:09 pm

Thanks a lot for your replies!

I see your point concerning the CPU frequency control. I messed that up when configuring the kernel for the first time.
There is no need for userspace control in my case. Therefore, as advised, I switched to intel_pstate and switched off
the other drivers. This does improve CPU performance: my test cases now run ~10% faster.
However, they are still ~50% slower than on the identical setup with Ubuntu. I have the feeling that this is somehow
related to memory access, since the CPU itself is now as fast as on the other system (e.g. in the sysbench cpu test). So, do you
guys have any advice or ideas on where to look for additional misconfigured kernel options?

Thanks!

chithanh
Developer
Joined: 05 Aug 2006
Posts: 2152
Location: Berlin, Germany

Posted: Wed May 18, 2016 2:13 pm

You are using INTEL_PSTATE with the performance governor, yes?

If you are on a NUMA system, then you may want to enable CONFIG_NUMA_BALANCING. There is also transparent hugepage support (CONFIG_TRANSPARENT_HUGEPAGE), which benefits only certain use cases and/or software specially written for it, but it is worth a try in your case, I guess.
I don't see anything else obviously wrong with the kernel config. Some things are quite unusual, like the missing CONFIG_FHANDLE, but they should not impact performance. You could disable CONFIG_DEBUG_KERNEL or CONFIG_KPROBES, but that would give only small performance gains.
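
Once transparent hugepage support is built in, the active mode can be read from /sys/kernel/mm/transparent_hugepage/enabled, where the kernel marks the current choice with square brackets. The following is only a sketch of how one might check it; the helper names parse_thp_mode and current_thp_mode are hypothetical.

```python
import re
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")

def parse_thp_mode(text):
    """Return the active mode from a line like 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None

def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the file cannot be read
    (for example because the kernel was built without THP support)."""
    try:
        return parse_thp_mode(Path(path).read_text())
    except OSError:
        return None

if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

In "always" mode the kernel tries to use huge pages for all anonymous memory; in "madvise" mode only for regions where the program asks for it.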

What you could do is start from the Ubuntu kernel config and then work your way towards your current one, checking at which point the performance drops.
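
To get a list of candidates for that comparison, the two .config files can be diffed option by option. Below is a rough sketch under one stated assumption: an option that is absent from a file is treated like one that is "not set". The function names are hypothetical; the kernel source tree also ships scripts/diffconfig for the same job.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set".
SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")

def parse_config(text):
    """Map each option name to its value; unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = SET_RE.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            options[m.group(1)] = "n"
    return options

def diff_configs(ours, theirs):
    """Return {option: (our_value, their_value)} for every differing option.

    Assumption: an option missing from one file counts as 'n'.
    """
    a, b = parse_config(ours), parse_config(theirs)
    return {
        name: (a.get(name, "n"), b.get(name, "n"))
        for name in sorted(set(a) | set(b))
        if a.get(name, "n") != b.get(name, "n")
    }

if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
            for name, (mine, other) in diff_configs(f1.read(), f2.read()).items():
                print(f"{name}: {mine} -> {other}")
```

Called with the Gentoo config as the first argument and the Ubuntu config as the second, it prints one line per option whose value differs.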

hertfelder
n00b
Joined: 13 May 2016
Posts: 5

Posted: Thu May 19, 2016 11:09 am

Thanks chithanh, the transparent hugepages support did the trick :) My machine is now 10-15% faster than the identical
Ubuntu machine. Do you have a link or some advice on where I can read up on this hugepages stuff? I am quite sure that neither of the
programs I used for testing intentionally exploits this feature, so I would be interested to know why they benefit so much from
the hugepages support.

For reference, I am now using INTEL_PSTATE with the performance governor, have enabled NUMA_BALANCING and FHANDLE, and have disabled
DEBUG_KERNEL and KPROBES.
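
Part of the answer is in the name: with transparent hugepages in "always" mode the kernel backs large anonymous mappings (big heap arrays, for instance) with huge pages by itself, so the program does not have to ask for them. How much memory is currently backed this way shows up in the AnonHugePages line of /proc/meminfo. A small sketch for reading that value; the function name is hypothetical.

```python
def anon_hugepages_kib(meminfo_text):
    """Return the AnonHugePages value in kB from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("AnonHugePages:"):
            return int(line.split()[1])
    return None

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print("AnonHugePages (kB):", anon_hugepages_kib(f.read()))
    except OSError:
        print("/proc/meminfo not available")
```

If the number grows while a test case is running, the arrays are most likely being placed in huge pages.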

Thanks again!

chithanh
Developer
Joined: 05 Aug 2006
Posts: 2152
Location: Berlin, Germany

Posted: Thu May 19, 2016 11:49 am

Here is an LWN article about transparent hugepages: https://lwn.net/Articles/423584/
But I don't have any deep knowledge on the subject, so probably you would have to tap someone else's expertise if you have questions. :)

pilla
Administrator
Joined: 07 Aug 2002
Posts: 7694
Location: Pelotas, BR

Posted: Thu May 19, 2016 1:21 pm

There are two different kinds of memory addresses: logical and physical.

Logical addresses are what processes see. When paging is used as the memory management technique, the logical address space is just a range of contiguous addresses divided into pages (usually all of the same size). Any logical address can thus be split into two parts: a page number and an offset within that page. This is done transparently to the process, so you don't have to worry about it.

Physical addresses are used to address the physical RAM. With paging, physical memory is divided into frames of the same size as the logical pages, so a physical address is also split into two parts: a frame number and an offset.

Now we have to map logical addresses to physical ones in order to actually access the memory. That means mapping pages to frames. A page of a process may be mapped to any frame anywhere in physical memory. Hence, a logically contiguous process may have its pages scattered all around the memory.

Notice that the offset needs no mapping. That part of the address is just copied from the logical to the physical address, where it forms the least significant bits.
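
The split and the mapping can be sketched in a few lines, assuming 4 KiB pages and a toy page table held in a dict. The page-to-frame numbers are made up, and a real page table is a hardware-defined structure rather than a Python dict.

```python
PAGE_SIZE = 4096                           # assumed: 4 KiB pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12 bits for 4 KiB

def split(address):
    """Split an address into (page or frame number, offset)."""
    return address >> OFFSET_BITS, address & (PAGE_SIZE - 1)

def translate(logical, page_table):
    """Map a logical address to a physical one via a toy page table."""
    page, offset = split(logical)
    frame = page_table[page]               # a KeyError stands in for a page fault
    return (frame << OFFSET_BITS) | offset

if __name__ == "__main__":
    page_table = {0: 7, 1: 3, 2: 42}       # made-up page -> frame mapping
    logical = 0x1ABC                       # page 1, offset 0xABC
    print(hex(translate(logical, page_table)))   # prints 0x3abc
```

Only the upper bits change (page 1 becomes frame 3); the offset 0xABC is carried over untouched.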

How is that mapping done? The implementation details vary, but basically there is a page table (usually one per process) in which each entry holds the frame number for a given page (among other bits that are not important for us now). This table resides in memory, of course, which makes it slow to use: it has to be consulted on every memory access. So if a memory access takes 100 ns, the extra access to the page table doubles that time for every load or store, and even for every instruction fetch (if the instruction isn't already in cache). And this is not even the worst case, as operating systems use multi-level page tables that require additional accesses.

Then there is the TLB (Translation Look-aside Buffer), basically a small, fully associative cache for page table entries. It resides inside your processor, so it is fast, but it is also small. Every time a lookup misses in the TLB, the memory management unit has to fetch the entry from the page table.
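
The cost can be put into numbers with the usual textbook estimate of the average access time. This is a simplified model: the hit ratios below are illustrative rather than measured, the TLB lookup is assumed to be free by default, and data caches that may hold page table entries are ignored.

```python
def effective_access_ns(hit_ratio, mem_ns=100.0, tlb_ns=0.0, levels=1):
    """Average time per memory access with a TLB in front of the page table.

    A TLB hit costs one memory access; a miss additionally walks
    `levels` page-table levels, each costing one more memory access.
    """
    hit_cost = tlb_ns + mem_ns
    miss_cost = tlb_ns + (levels + 1) * mem_ns
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost

if __name__ == "__main__":
    for h in (0.0, 0.80, 0.99):
        print(f"hit ratio {h:.2f}: {effective_access_ns(h):.0f} ns")
```

With the 100 ns figure from above and a single-level table, no TLB hits at all gives 200 ns per access, an 80% hit ratio about 120 ns, and a 99% hit ratio about 101 ns. Passing levels=4 shows how much worse a miss gets with a multi-level table.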

Going back to pages and frames: the smaller they are, the less memory is wasted by processes that don't use their pages' entire capacity. But smaller pages mean more entries in the page table and, for many programs, more TLB misses. Hence, increasing the page size reduces the number of TLB misses and thus improves performance. If you have enough memory to afford the waste this may cause where pages are too big for some processes, it may be a good option for your system.
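
A back-of-the-envelope comparison makes the effect of the page size visible. It assumes x86-64 with 4 KiB base pages and 2 MiB huge pages; the 8 GiB working set and the 1024-entry TLB are hypothetical numbers, not measurements from the machines in this thread.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

def pages_needed(nbytes, page_size):
    """Number of pages required to hold nbytes (rounded up)."""
    return -(-nbytes // page_size)

def tlb_reach(entries, page_size):
    """Bytes of memory that can be addressed without a TLB miss."""
    return entries * page_size

if __name__ == "__main__":
    array_bytes = 8 * GIB        # hypothetical CFD working set
    tlb_entries = 1024           # hypothetical TLB size, not a measured value
    for name, size in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB)):
        print(f"{name} pages: {pages_needed(array_bytes, size)} pages, "
              f"TLB reach {tlb_reach(tlb_entries, size) // MIB} MiB")
```

Under these assumptions the same arrays need 512 times fewer pages with 2 MiB pages, and the same TLB covers 512 times more memory, which would fit the observation that the slowdown was worst for very large arrays.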

There are other issues, such as having to move pages to and from disk. Large pages with little useful content still have to be read from and written to disk in full by the virtual memory system, which may be inefficient in these cases.
_________________
"I'm just very selective about the reality I choose to accept." -- Calvin

hertfelder
n00b
Joined: 13 May 2016
Posts: 5

Posted: Fri May 20, 2016 11:44 am

Thanks for the link, chithanh, and the explanation pilla! I think I am starting to get a feeling for it.
This might be quite interesting for our applications; I will have a look if I can play around with the
options (page size,...) a little.

Marius

pilla
Administrator
Joined: 07 Aug 2002
Posts: 7694
Location: Pelotas, BR

Posted: Fri May 20, 2016 12:19 pm

hertfelder wrote:
Thanks for the link, chithanh, and the explanation pilla! I think I am starting to get a feeling for it.
This might be quite interesting for our applications; I will have a look if I can play around with the
options (page size,...) a little.

Marius


Any good textbook on operating systems will give you many good insights into the basics. You can skip directly to "Memory Management" and "Virtual Memory" for the specifics of memory that you are interested in. I favour books authored by Silberschatz.

Patterson & Hennessy's Computer Organization and Design: The Hardware/Software Interface is a good read if you want to know more about memories and caches, but it goes quite deep into computer architecture. It is probably not interesting if you have not taken an introductory course in computer architecture first.
_________________
"I'm just very selective about the reality I choose to accept." -- Calvin

hertfelder
n00b
Joined: 13 May 2016
Posts: 5

Posted: Wed May 25, 2016 1:02 pm

Thanks for the suggestions, pilla!

I checked our library's catalogue and found Operating System Concepts by Silberschatz. This looks like a good read!

pilla
Administrator
Joined: 07 Aug 2002
Posts: 7694
Location: Pelotas, BR

Posted: Wed May 25, 2016 1:46 pm

hertfelder wrote:
Thanks for the suggestions, pilla!

I checked our library's catalogue and found Operating System Concepts by Silberschatz. This looks like a good read!


You are welcome. If you have questions, I might be able to help. I lecture on operating systems.
_________________
"I'm just very selective about the reality I choose to accept." -- Calvin

All times are GMT