Building a stable server for Linux compiling
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Building a stable server for Linux compiling
First, I apologize in advance. I am in no way attempting to be offensive. I appreciate everyone’s time and advice. I personally do not have an Intel or AMD preference. I use what works for me.
It is my opinion that when you compile software on extremely stable platforms, the builds have a higher success rate. Despite all the internet hype, AMD desktop computer design is a huge disappointment to me, for my purposes. It is fast at the expense of stability. And for most users, gamers included, I am sure the small compromise in stability makes little to no difference.
I recently built an AMD Ryzen 5950X system (16 cores, 32 threads) and now regret it. I used the most advanced motherboard I could find and the best air cooling solution that would still fit in a 4U server case. I have tested several different kernels, including 5.16. However, 5.14 runs as stable as I can make it. I also have to run Windows, which now crashes with the 5950X. It has a suspend-to-memory issue. There are also virtualization problems, and Wine instability that I do not have on any of my other systems.
My goal is to compile different kinds of programs and packages, including kernels. To do this, I have considered building a very stable server to compile on. AMD Epyc is just not as stable as Intel Xeon, although Epyc is less expensive. Some of the new Xeon CPUs are very expensive, priced at $20,000 US each. I cannot afford two of these CPUs. Instead I am considering the purchase of two Xeon 8081 at $5,000 each. These will be loaded onto an X11DPI-N: 28 cores, 56 threads, 2.5 GHz each, times two. I have been advised to wait another year for 10nm to come down in price, or for 7nm, which is around the corner.
Your thoughts, opinions, experience on this?
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 1 time in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
Re: Building a stable server for Linux compiling
Michael_Hathaway wrote: ⤴Sun Dec 05, 2021 5:37 pm It is my opinion that when you compile software on extreme stable platforms, they have a higher success rate upon completion.
Basically, no, other of course than in the sense of the system not crashing during the compile. A compiled program is a bag of bytes: if, on an otherwise identical system with the same compiler and settings, you found even a single-bit difference between a compile on system 1 and a compile on system 2 that has just a different CPU, you'd have found a compiler bug.
I do not otherwise have many well-founded opinions on stability of server-type AMD vs Intel systems. 4U would seem big enough for sufficient cooling but I'd explicitly monitor that; a 5950X would tend to be coupled with high-end liquid cooling certainly.
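The bit-for-bit claim above is something you can check yourself. Below is a minimal Python sketch that hashes two build outputs and reports whether they are identical. The function names and the example paths are made up for illustration, and it assumes the build itself is reproducible: embedded timestamps or build paths can also produce differences that have nothing to do with the CPU.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(path_a, path_b):
    """True if the two build artifacts are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, `builds_match("machine1/vmlinux", "machine2/vmlinux")` (hypothetical paths) would compare the same kernel built on two machines.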
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Re: Building a stable server for Linux compiling
The 5950X has a TDP of 105W, unlike the Threadrippers, which are 280W. Nevertheless, I have the largest Noctua heat sink and high-pressure fans I could fit in a 4U. It was very unstable for the first 48 hours, but has calmed down after several heat cycles. I am thinking that I should replace the RAM with ECC and just use this RAM in a different system. My core temps are typically 69-71C at a forced full load.
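For anyone who wants to log temperatures during a long compile rather than eyeball them, here is a rough Python sketch that reads the kernel's hwmon sysfs files, which report temperatures in millidegrees Celsius. The function name is made up for illustration, and which chips and labels show up (for example `k10temp` on Ryzen) depends on the drivers loaded on your system.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {(chip, label): degrees_C} for every temp*_input under root."""
    temps = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name
        for input_file in sorted(chip_dir.glob("temp*_input")):
            # temp1_input may have a matching temp1_label; fall back to the file name.
            label_file = input_file.with_name(input_file.name.replace("_input", "_label"))
            label = label_file.read_text().strip() if label_file.exists() else input_file.name
            try:
                millidegrees = int(input_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor not readable right now; skip it
            temps[(chip, label)] = millidegrees / 1000.0
    return temps
```

Calling `read_hwmon_temps()` in a loop with a sleep, and writing the result to a file, would give a simple temperature log to line up against crash times.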
Re: Building a stable server for Linux compiling
Temps look good then. While ECC is a good idea in and of itself, I would say it shouldn't be necessary; if the issue is memory related, you're more likely looking at memory that is not very compatible. Ryzen is (or was...) indeed more sensitive, and you'd buy RAM from e.g. G.Skill that specifically mentions good Ryzen support.
In any case, not much else relevant to add, other than maybe noting that both Linus Torvalds and Greg KH, citizens 1 and 2 of Linux so to speak, use an AMD Threadripper 3970X to compile kernels all day long...
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Re: Building a stable server for Linux compiling
rene, thank you for taking the time to answer; your advice is appreciated and welcome.
Yes, Linus uses the Threadripper 3970X; he also uses ECC registered memory. But he doesn't actually compile the final released kernel on that machine. They have an enterprise server that compiles it for them. I found 64GB of ECC that is compatible with my motherboard for $355 USD. I am going to try that and see if it improves my system stability.
Re: Building a stable server for Linux compiling
That's somewhat oddly put. Linus doesn't release a compiled kernel; he integrates code and recompiles the kernel to test said code/integration all day long and, yes, simply on his own machine. Of course there are many automated build systems/farms that in turn test what he releases (and that test things before he's even seen them, in the so-called "-next" tree), but note that a kernel compile as such on hardware like that takes under a minute; it is very much a thing any kernel developer does "at home", like any user would.
Re: Building a stable server for Linux compiling
rene wrote: ⤴Sun Dec 05, 2021 5:56 pm
Michael_Hathaway wrote: ⤴Sun Dec 05, 2021 5:37 pm It is my opinion that when you compile software on extreme stable platforms, they have a higher success rate upon completion.
Basically, no, other of course than in the sense of the system not crashing during the compile. ..
Agree 100%. Of course, earlier attempts at compiling and building may have destabilized the machine.
For every complex problem there is an answer that is clear, simple, and wrong - H. L. Mencken
- DisturbedDragon
- Level 5
- Posts: 574
- Joined: Mon Oct 29, 2012 6:29 pm
- Location: Texas
Re: Building a stable server for Linux compiling
Just throwing this out there. Your sig shows " Deb12+Mint20.3 packages - 5.14.0-4". Maybe this is part of the stability issue? I mention it only because the system in my sig runs current Mint Cinnamon, current Fedora Cinnamon, openSUSE and other distros (my boot menu is ridiculous) and all are rock solid. Among many other things, I compile quite a bit and have not experienced any issues.
Not to say there is not an issue, just not sure the issue is hardware related.
AMD Ryzen 9 5950X 16C/32T | MSI MPG x570 Gaming Plus | 2TB Mushkin Pilot-E NVMe | 1TB Crucial P1 NVMe | 2x 2TB Inland Gen4 NVMe | 32GB Trident Z DDR4 3600 | Nvidia RTX4090 | Fedora 39 Cinnamon | Linux Mint 21.3 Cinnamon | Kernel 5.15.x lowlatency
- Portreve
- Level 13
- Posts: 4870
- Joined: Mon Apr 18, 2011 12:03 am
- Location: Within 20,004 km of YOU!
- Contact:
Re: Building a stable server for Linux compiling
I can't really offer comment on a compiling system, but I can tell you that my preferred server OS is Debian because it will outlive the hardware it runs on.
I've used it in exactly that capacity on two very dated (at that point, much less today) computers: a PowerMacintosh G3/266 Desktop, and a Mac mini G4 1.42 GHz. I ran the PowerMac G3 like that for about 3 1/2 years, and the Mac mini for almost 5 (IIRC) and what attracted me to running Debian, apart from the fact they (used to) make ports for everything including your kitchen toaster, was its reputation for stability. I can tell you for a fact those machines ran, one after the other, for the time stated above, and never crashed. Ever.
Flying this flag in support of freedom 🇺🇦
Recommended keyboard layout: English (intl., with AltGR dead keys)
Podcasts: Linux Unplugged, Destination Linux
Also check out Thor Hartmannsson's Linux Tips YouTube Channel
Re: Building a stable server for Linux compiling
FWIW, for anyone who actually knows about programming, the stability of the underlying system is the least of your concerns. The stability of YOUR CODE is going to be the problem. Programming is actually platform agnostic.
For every complex problem there is an answer that is clear, simple, and wrong - H. L. Mencken
Re: Building a stable server for Linux compiling
Muddying the thread but I remember you said this before and that I commented on it before: you may wish to make precise what you mean by that statement, because any non-trivial/non-internal code tends to be anything but platform agnostic.
-
- Level 4
- Posts: 270
- Joined: Sat Dec 19, 2020 8:53 am
Re: Building a stable server for Linux compiling
My experience with Ryzen is a mixed bag. Most of the issues are BIOS (UEFI) related, and usually flashing the newest version resolved them. I had a customer with a Windows Server that would lock up or BSOD under heavy load, and guess what? BIOS issues.
My now-broken laptop wouldn't even boot any Linux; a BIOS update fixed it.
Terminal - zsh wrote: ╭─legacy@forums.linuxmint.com
╰─➜ _
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Re: Building a stable server for Linux compiling
DisturbedDragon wrote: ⤴Mon Dec 06, 2021 7:46 pm Just throwing this out there. Your sig shows " Deb12+Mint20.3 packages - 5.14.0-4". Maybe this is part of the stability issue? I mention it only because the system in my sig runs current Mint Cinnamon, current Fedora Cinnamon, openSUSE and other distros (my boot menu is ridiculous) and all are rock solid. Among many other things, I compile quite a bit and have not experienced any issues.
Not to say there is not an issue, just not sure the issue is hardware related.
Interesting that you caught that. The Debian 12 install is just for playing around on; someone asked me if it was possible to load Mint 20.3 packages into Debian. I will obviously reload to Debian 11. I use Mint as a tool to help new users who wish to come to Linux and learn. Windows is where I am seeing the majority of my crashes on the 5950X, and that is a problem because I use that computer for my professional work in AutoCAD, SolidWorks and Mastercam. After talking with several people in the professional server realm, it was recommended that I upgrade to unbuffered ECC memory on my machine. I purchased 64 GB of 3200, but I think I am going to return this kit and get the 128 GB one. Hopefully this helps my stability issue. I have upgraded all the firmware I could find. Wendell from Level1Techs says that it could be any number of issues, even DC power routed too close to the memory.
There are different levels of stability from different viewpoints, I guess. I have just an entry-level workstation computer, which seems to work great for the money spent. Obviously an Epyc or Xeon system is far more expensive. But with that expense comes a more stable platform and other things like better memory bandwidth and support. I looked into the newer Threadrippers, and at $4000 for a CPU, I would rather buy a 64-core Epyc for $5000, as it also comes with the motherboard at that price point. You also have the scalability to run two CPUs if needed.
Having better code does help keep your compile from crashing. We found this out yesterday during a live stream, when Chris Titus was trying to compile Debian sid into a live ISO. I don't even know if it is possible to do that. I think you have to build a bullseye ISO and then change the sources. I actually wrote tutorials on this, but they were all deleted by xenopeek.
- DisturbedDragon
- Level 5
- Posts: 574
- Joined: Mon Oct 29, 2012 6:29 pm
- Location: Texas
Re: Building a stable server for Linux compiling
Michael_Hathaway wrote: ⤴Wed Dec 08, 2021 6:14 am Interestingly you caught that. The Debian 12 is just for playing around on, someone asked me if it where possible to load Mint 20.3 packages into Debian. I will obviously reload to Debian 11. I use Mint as a tool to help new users who wish to come to Linux and learn. Windows is where I am seeing the majority of my crashes on the 5950X and that is a problem because I use that computer for my professional work in Autocad, Solidworks and Mastercam. After talking with several people in the professional server realm, it was recommended that I upgrade to ECC unbuffered memory on my machine. I purchased 64Gb of 3200, but I think I am going to return this kit and get the 128. Hopefully this helps my stability issue. I have upgraded all firmware I could find. Wendel from LV1 tech says that it could be any number of issues, even DC power routed too close to the memory.
Given that the issue persists across operating systems/distros, and notably during higher workloads for the most part: what kind of power are you pushing on this hardware? It could very well be a power supply issue. I had a server (dual 16-core Opterons w/256GB ECC) that kept crashing randomly a little over a year ago. I thought it was the CPU, board, RAM or a failing drive. Funnily enough, I did test the PSU and it tested out fine. I ran memtest, swapped the CPU, checked the board with a magnifying glass looking for issues. Nothing but crash, crash, crash, reboot. I swapped out the power supply just for giggles. The system has been up for a year now...
Michael_Hathaway wrote: ⤴Wed Dec 08, 2021 6:14 am There are different levels of stability from different viewpoints I guess.
Perhaps... Program crashes are unacceptable for me, much less system crashes.
AMD Ryzen 9 5950X 16C/32T | MSI MPG x570 Gaming Plus | 2TB Mushkin Pilot-E NVMe | 1TB Crucial P1 NVMe | 2x 2TB Inland Gen4 NVMe | 32GB Trident Z DDR4 3600 | Nvidia RTX4090 | Fedora 39 Cinnamon | Linux Mint 21.3 Cinnamon | Kernel 5.15.x lowlatency
Re: Building a stable server for Linux compiling
That seems odd to me, but then again I AM an AMD fanboy. I am running a gen 1 Ryzen 1700 and it has been superb. At the beginning I did have a lot of issues with the Gigabyte motherboard (their tech support is horrible), but once I changed to an ASUS Crosshair VI Hero, it has been rock solid. From your description I would suspect either motherboard or RAM issues. If I could afford one, I would absolutely get a Threadripper 32-core system. Do I need one? Absolutely not. Do I WANT one? Yes. Please keep us posted on the issues; this thread is interesting.
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Re: Building a stable server for Linux compiling
DisturbedDragon wrote: ⤴Wed Dec 08, 2021 7:46 am Given the issues persists across operating systems/distros and notably during higher workloads for the most part; what kind of power are you pushing on this hardware? Could very well be a power supply issue. I had a server (dual 16 core Opterons w/256GB ECC) that kept crashing randomly a little over a year ago. Thought it was CPU, board, RAM or failing drive. Funnily I did test the PSU and it tested out fine. Ran memtest, swapped CPU, checked board with a magnifying glass looking for issues, Nothing but crash, crash, crash, reboot. I swapped out the power supply just for giggles. System has been up for a year now...
That is a good point you made. Yes, an Opteron system crashing like that is very odd. I am used to working on dual Xeon workstations, so you see my reference point in building this system and having it crash. Most of the crashing happens either when suspending to RAM or when I have large workloads in RAM. This is what led me to believe it is either an incompatibility with my RAM or the pathways to that RAM on the motherboard.
I will check the power supply. I purchased an EVGA 80 Plus Platinum 850W power supply. This is overkill for the amount of power that is needed, but it might be faulty. I recently tested the Precision Boost Overdrive (PBO) setting to see if I had enough cooling. Forced full load resulted in a CPU temperature of 91C. I decided to turn PBO off. For those not familiar with it, PBO allows single-core boost to be sustained for longer periods by allowing more voltage and current to go to the CPU. The reason I bring this up is that the higher current sent to the processor did not change the amount of instability that I observed, only the CPU temperature.
With AMD, the processors are basically set to their maximum stable performance out of the box. Any changes to this will boost performance at the cost of stability. I want stability over performance, so I will leave this in its out-of-the-box, stock form.
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Re: Building a stable server for Linux compiling
MurphCID wrote: ⤴Wed Dec 08, 2021 8:07 am That seems odd to me, but then again I AM an AMD fanboy. I am running a gen 1 Ryzen 1700 and it has been superb. Now at the beginning I did have lot of issues with the Gigabyte motherboard (their tech support is horrible), once I changed to an ASUS Crosshair VI Hero, it has been rock solid. From your description I would suspect either motherboard or ram issues. If I could afford one, I would absolutely get a threadripper 32 core system. Do I need one? Absolutely not. Do I WANT one? Yes. Please keep us posted on the issues, this thread is interesting.
I am thinking that you are correct, and it is motherboard and memory also. Your Crosshair VI Hero is an X370 chipset and I needed the X570. I did try to buy the Asus X570 board, but it is not available, and even the used ones are selling for $750.
At this point, I have ordered a new Aorus Xtreme motherboard, which I found new for $600, and 128 GB of ECC 3200. Not all X570 boards will run ECC, so I am limited. The Aorus Xtreme uses heat piping to cool the X570, whereas the Master I have now has a gigantic heatsink. I also ordered a new rack case, in which I plan to hole-saw two additional 120mm fan mounts on the port side. I will have a total of three high-pressure 120mm Noctua inlet fans and one 80mm.
As for Threadripper, in my opinion it is a foolish purchase. The difference between the 5950X (16 cores/32 threads, $700) and the 3970X (32 cores/64 threads, $3000) is basically thread count. A Chromium compile on the 5950X takes exactly 10 minutes longer than on the 3970X; that is not twice as long, so there are huge diminishing returns here. And even if you could justify spending $2300 more, why would you, when you can buy an Epyc for about the same money and have 8 memory channels? Yes, the Threadripper has a higher clock speed, but it doesn't compile any faster than the Epyc because it is limited to 4 memory channels. So the real difference is that you get a faster single-core speed on the Threadripper at the cost of a lot more power consumption and less stability. But you can awe all your buddies by saying you have a Threadripper, except for the ones that have Epyc.
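One rough way to reason about the diminishing returns from doubling the core count is Amdahl's law: if only a fraction p of the build runs in parallel, the speedup on n cores is 1 / ((1 - p) + p / n). The Python sketch below is only an illustration. The parallel fraction of 0.95 is a made-up value, not a measurement of a Chromium build, and the model ignores clock speed and memory channels entirely.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup over one core when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def gain_from_doubling(parallel_fraction, cores):
    """How much faster 2*cores is than cores under the same model."""
    return amdahl_speedup(parallel_fraction, 2 * cores) / amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    p = 0.95  # hypothetical parallel fraction, for illustration only
    for n in (16, 32, 64):
        print(f"{n:3d} cores: {amdahl_speedup(p, n):.2f}x over one core")
    print(f"16 -> 32 cores: {gain_from_doubling(p, 16):.2f}x faster")
```

Under this model, going from 16 to 32 cores gives well under a 2x improvement whenever any part of the build is serial, which is the same shape as the compile times described above.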
Re: Building a stable server for Linux compiling
Michael_Hathaway wrote: ⤴Thu Dec 09, 2021 6:09 am
MurphCID wrote: ⤴Wed Dec 08, 2021 8:07 am That seems odd to me, but then again I AM an AMD fanboy. I am running a gen 1 Ryzen 1700 and it has been superb. Now at the beginning I did have lot of issues with the Gigabyte motherboard (their tech support is horrible), once I changed to an ASUS Crosshair VI Hero, it has been rock solid. From your description I would suspect either motherboard or ram issues. If I could afford one, I would absolutely get a threadripper 32 core system. Do I need one? Absolutely not. Do I WANT one? Yes. Please keep us posted on the issues, this thread is interesting.
I am thinking that you are correct, and it is motherboard and memory also. Your Crosshair VI Hero is a X370 chipset and I needed the X570. I did try to buy the Asus X570 board, but it is not available, and even the used ones are selling for $750.
At this point, I ordered a new Aorus Xtreme motherboard which I found new for $600 and I have ordered 128Gb of ECC 3200. Not all X570 boards will run ECC so I am limited. The Aorus Xtreme uses heat piping to cool the X570, vs the Master I have now has a gigantic heatsink. I also ordered a new rack case which I plan to hole-saw two additional 120mm fan mounts on the port side. I will have a total of 3 - high pressure 120mm Noctua inlet fans and 1 80mm.
I had the top-of-the-line Gigabyte X370, since I only buy top-of-the-line motherboards (fewer issues normally, plus lots of ports), and the Gigabyte was awful. I was constantly having to install the latest BIOS they were churning out to fix issues, until it finally died one day and corrupted my NVMe drive. I got the ASUS and have never looked back. Also, when the Gigabyte died it took a RAM stick with it. Gigabyte insisted it was my fault (I do not overclock, etc.; I run stock). Then after they "fixed" it they would never tell me what was wrong. I will never purchase another Gigabyte product. Also, once I went to G.Skill Trident Z RAM at 2666 MHz, I have not had a single issue. I never run RAM overclocked. I always run the max supported.
My ASUS system has been running stable since July 2017. I was an early adopter. The funny thing is that the 1700 is more than enough processor for me; I truly do not need any more than what I have. I will build a retirement system sometime early next year, and I am still working out the specifications. Most likely either an 8- or 12-core Ryzen 5000 series with 32 or 64 GB of RAM.
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Re: Building a stable server for Linux compiling
I have built systems from just about every manufacturer there is. You are going to run into issues no matter the brand name.
I think you have a good plan for your retirement system. The 5900X is a better CPU for most people than the 5950X. It's half the cost, runs cooler and is nearly the same speed. Threadripper/Epyc is a peripherals game, as AMD has made Threadripper hardware more attractive. With Xeon, however, you have a much larger selection of motherboards, for example, but at a higher cost. The new 10nm and 7nm Xeons are looking promising. I almost purchased a W3375 (38 cores/76 threads, 4 GHz) 10nm Xeon with 8-channel memory. But I was told to wait for Intel to release its new DDR5 for Xeon. Penguin Computing was able to put 7,626 cores into a single rack using Xeon 9200 CPUs (Relion XO1122eAP server). Competition is a good thing, but I think that AMD is poking the bear.
- Michael_Hathaway
- Level 4
- Posts: 313
- Joined: Sat Oct 09, 2021 2:27 am
- Location: Shebang, USA
- Contact:
Re: Building a stable server for Linux compiling
Only half of my memory arrived today. I installed 64 GB of ECC 3200, and everything is working solidly now at 3211 MHz. I did not have to adjust any settings, though the motherboard did take extra time to POST, like all servers seem to do. There are no XMP profiles on ECC. CPU temperature went down 3 degrees under full load, and idle is down 7 degrees.
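To confirm that ECC is actually reporting under Linux, and not just that the modules are installed, one option is to look at the kernel's EDAC counters in sysfs. The Python sketch below is a minimal example. The function name is made up, and it assumes an EDAC driver for the platform is loaded (on AMD that is typically `amd64_edac`); if no `mc*` directories exist, the function returns an empty dict, which suggests ECC reporting is not active.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"corrected": n, "uncorrected": n}} from EDAC sysfs."""
    counts = {}
    for mc_dir in sorted(Path(root).glob("mc[0-9]*")):
        try:
            corrected = int((mc_dir / "ce_count").read_text().strip())
            uncorrected = int((mc_dir / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # controller present but counters not readable
        counts[mc_dir.name] = {"corrected": corrected, "uncorrected": uncorrected}
    return counts
```

A corrected count that keeps climbing under load would point at marginal memory or memory settings, while a count that stays at zero is consistent with the stable behaviour described above.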