Instant thermal optimization

lightrush@lemmy.ca · edit-2 27 days ago

While true for the component itself, there’s a material difference for any caps surrounding it. Sure, the chipset would work fine at 40, 50, or 70°C. However, as a rule of thumb, an electrolytic capacitor’s lifespan is halved with every 10°C temperature increase. From a brief search it seems solid caps also crap out much faster at higher temps but can outlast electrolytics at lower temps. This is a consideration for a long-lifespan system. The one in my case is expected to operate till 2032 or beyond.

I don’t think other components degrade in any significant fashion whether they run at 40 or 60°C.
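To put rough numbers on the 10°C rule of thumb, here is a minimal sketch. The 5000-hour rating at 105°C is an assumed example value, not something read off this board's capacitors, and the helper name is made up for illustration.

```python
def estimated_cap_life_hours(rated_hours: float,
                             rated_temp_c: float,
                             actual_temp_c: float) -> float:
    """Rule-of-thumb estimate: life doubles for every 10 °C below the rated temperature."""
    return rated_hours * 2 ** ((rated_temp_c - actual_temp_c) / 10)


# Assumed datasheet rating for illustration only: 5000 h at 105 °C.
RATED_HOURS = 5000
RATED_TEMP_C = 105

for temp_c in (40, 50, 60, 70):
    hours = estimated_cap_life_hours(RATED_HOURS, RATED_TEMP_C, temp_c)
    print(f"{temp_c} °C: ~{hours / 8760:.0f} years of continuous operation")
```

Whatever the actual rating is, the ratio is what matters: under this rule a cap sitting at 60°C lasts about a quarter as long as the same cap at 40°C.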

lightrush@lemmy.ca · 27 days ago

Unfortunately I didn’t take before/after measurements but this thick plastic sheet cannot be good for the chipset thermals. 🥲

lightrush@lemmy.ca · 28 days ago

Instant thermal optimization

lightrush@lemmy.ca · edit-2 2 months ago

It is. I just wish it wasn’t this expensive. Will have to live with it for a while. 😅

lightrush@lemmy.ca · 2 months ago

So generally Pegatron. :D I used to buy GB because it was made in Taiwan when ASUS became Pegatron and went to China. Their quality decreased. GB used to put high quality components on their boards in comparison. But now GB is also made somewhere in the PRC. I’ve no idea where MSI are in terms of quality. We used to make fun of them using the worst capacitors back in the 90s/00s. Looking at their Newegg reviews, their 1-star ratings seem to be a lower proportion than those of the Pegatron brands and GB. Maybe they’re nicer these days? The X570 replacement I got for this machine is an ASUS - “TUF” 🙄

lightrush@lemmy.ca · 2 months ago

Signature Plastics G20

They feel a bit like a mix between DSA and laptop keycaps.

lightrush@lemmy.ca · 2 months ago

What do you buy?

lightrush@lemmy.ca · edit-2 2 months ago

I think I found the source of the liquid @abcdqfr@lemmy.world. The thermal pad under the VRM heatsink has begun to liquefy into an oily substance. This substance appears to have seeped to the underside of the board through the vias around the VRM and discolored along the way.

Some rubbing with isopropyl alcohol and it’s almost gone:

Perhaps there’s still life left in this board if used with an older chip.

lightrush@lemmy.ca · 2 months ago

I think the board has reached the end of the road. 😅

lightrush@lemmy.ca · edit-2 2 months ago

Hard to say. She’s been in 24/7 service since 2017. Never had stability issues and I’ve tested it with Prime95 plenty of times upon upgrades. Last week I ran a Llama model and the computer froze hard. Even holding the power button wouldn’t turn it off. Did the PSU power flip, came back up. Prime95 stable. Llama -> rip. Perhaps it’s been cooked for a while and only trips under this workload. She’s an old board, a Gigabyte with B350 running a 5950X (for a couple of years), so it’s not super surprising that the power section has been a bit overused. 😅 Replacing with an X570 as we speak.

lightrush@lemmy.ca · 2 months ago

Funny enough, I can’t detect the smell from hell. Could be COVID.

lightrush@lemmy.ca · 2 months ago

Does this under VRM mean high FPS?

lightrush@lemmy.ca · edit-2 2 months ago

Iiinteresting. I’m on the larger AB350-Gaming 3 and it’s got REV: 1.0 printed on it. No problems with the 5950X so far. 🤐 Either sheer luck or there could have been updated units before they officially changed the rev marking.

lightrush@lemmy.ca · edit-2 2 months ago

On paper it should support it. I’m assuming it’s the ASRock AB350M. With a certain BIOS version of course. What’s wrong with it?

lightrush@lemmy.ca · edit-2 2 months ago

B350 isn’t a very fast chipset to begin with

For sure.

I’m willing to bet the CPU in such a motherboard isn’t exactly current-gen either.

Reasonable bet, but it’s a Ryzen 9 5950X with 64GB of RAM. I’m pretty proud of how far I’ve managed to stretch this board. 😆 At this point I’m waiting for blown caps, but the case temp is pretty low so it may end up trucking along for a surprisingly long time.

Are you sure you’re even running at PCIe 3.0 speeds too?

So given the CPU, it should be PCIe 3.0, but that doesn’t remove any of the queues/scheduling suspicions for the chipset.

I’m now replicating data out of this pool and the read load looks perfectly balanced. Bandwidth’s fine too. I think I have no choice but to benchmark the disks individually outside of ZFS once I’m done with this operation in order to figure out whether any show problems. If not, they’ll go in the spares bin.

lightrush@lemmy.ca · 2 months ago

I put the low IOPS disk in a good USB 3 enclosure, hooked to an on-CPU USB controller. Now things are flipped:

                                        capacity     operations     bandwidth 
pool                                  alloc   free   read  write   read  write
------------------------------------  -----  -----  -----  -----  -----  -----
storage-volume-backup                 12.6T  3.74T      0    563      0   293M
  mirror-0                            12.6T  3.74T      0    563      0   293M
    wwn-0x5000c500e8736faf                -      -      0    406      0   146M
    wwn-0x5000c500e8737337                -      -      0    156      0   146M

You might be right about the link problem.

Looking at the B350 diagram, the whole chipset is hooked via a PCIe 3.0 x4 link to the CPU. The other pool (the source) is hooked via a USB controller on the chipset. The SATA controller is also on the chipset so it also shares the chipset-CPU link. I’m pretty sure I’m also using all the PCIe links the chipset provides for SSDs. So that’s about 4GB/s total for the whole chipset. Now I’m probably not saturating the whole link in this particular workload, but perhaps there might be another related bottleneck.
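The 4GB/s figure can be sanity-checked with a quick back-of-the-envelope calculation. This is a sketch that only accounts for PCIe 3.0's 8 GT/s per-lane rate and 128b/130b line coding; packet and protocol overhead are ignored, so real throughput would be somewhat lower. The 293M write figure is taken from the iostat output above and treated as MiB/s.

```python
GT_PER_S_PER_LANE = 8.0   # PCIe 3.0 raw signalling rate per lane
ENCODING = 128 / 130      # 128b/130b line coding used by PCIe 3.0


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Usable bandwidth in GB/s after line coding, before protocol overhead."""
    bits_per_s = GT_PER_S_PER_LANE * 1e9 * ENCODING * lanes
    return bits_per_s / 8 / 1e9


link_gb_s = pcie3_bandwidth_gb_s(4)

# Pool-level write bandwidth from the iostat output (293M, read as MiB/s).
pool_write_gb_s = 293 * 1024**2 / 1e9

print(f"x4 link: ~{link_gb_s:.2f} GB/s")
print(f"pool writes use ~{pool_write_gb_s / link_gb_s:.0%} of it")
```

The backup writes alone are a small fraction of the link, which fits the hunch that raw link saturation is not the whole story. The source pool's reads and any other chipset traffic share the same link and are not counted here.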

lightrush@lemmy.ca · 2 months ago

Turns out the on-CPU SATA controller isn’t available when the NVMe slot is used. 🫢 Swapped SATA ports, no diff. Put the low IOPS disk in a good USB 3 enclosure, hooked to an on-CPU USB controller. Now things are flipped:

                                        capacity     operations     bandwidth 
pool                                  alloc   free   read  write   read  write
------------------------------------  -----  -----  -----  -----  -----  -----
storage-volume-backup                 12.6T  3.74T      0    563      0   293M
  mirror-0                            12.6T  3.74T      0    563      0   293M
    wwn-0x5000c500e8736faf                -      -      0    406      0   146M
    wwn-0x5000c500e8737337                -      -      0    156      0   146M
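One thing the numbers above allow is an estimate of the average write size per disk, since both legs show the same bandwidth but different operation counts. A minimal sketch that parses the pasted lines follows. It assumes the seven-column layout shown here and that the K/M/G suffixes are binary multiples; the function names are made up for illustration.

```python
IOSTAT = """\
storage-volume-backup                 12.6T  3.74T      0    563      0   293M
  mirror-0                            12.6T  3.74T      0    563      0   293M
    wwn-0x5000c500e8736faf                -      -      0    406      0   146M
    wwn-0x5000c500e8737337                -      -      0    156      0   146M
"""

SUFFIX = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(token: str) -> float:
    """Convert values such as '146M' or '563' to a plain number."""
    if token[-1] in SUFFIX:
        return float(token[:-1]) * SUFFIX[token[-1]]
    return float(token)


def leaf_write_stats(text: str) -> dict:
    """Map each leaf device to (write ops/s, write bytes/s)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Leaf devices are the rows that show "-" for alloc and free.
        if len(fields) == 7 and fields[1] == "-":
            stats[fields[0]] = (parse_size(fields[4]), parse_size(fields[6]))
    return stats


for name, (ops, bandwidth) in leaf_write_stats(IOSTAT).items():
    print(f"{name}: {ops:.0f} write ops/s, ~{bandwidth / ops / 1024:.0f} KiB per write")
```

Equal bandwidth with fewer operations means larger writes on that leg. That could point to the I/O being merged differently along one path rather than one disk being slower, though this sample alone does not prove it.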

lightrush@lemmy.ca · edit-2 2 months ago

Interesting. SMART looks pristine on both drives. Brand new drives - Exos X22. Doesn’t mean there isn’t an impending problem of course. I might try shuffling the links to see if that changes the behaviour, as the other comment suggested. Both are currently hooked to an AMD B350 chipset SATA controller. There are two ports that should be hooked to the on-CPU SATA controller. I imagine the two SATA controllers don’t share bandwidth. I’ll try putting one disk on the on-CPU controller.

lightrush@lemmy.ca · edit-2 2 months ago

Mirror seeing half the write IOPS on one disk compared to the other, is this normal?

lightrush@lemmy.ca · 4 months ago

Don’t let be called a hypocrite - give $5. 😆

lightrush@lemmy.ca · edit-2 6 months ago

- Lenovo ThinkCentre / Dell OptiPlex USFF machine like the M710q
- Secondary NVMe or SATA SSD for a RAID1 mirror
  - Use LVMRAID for this. It uses mdraid underneath but it’s easier to manage
- External USB disks for storage
  - WD Elements generally work well when well ventilated
  - OWC Mercury Elite Pro Quad has a very well implemented USB path and has been problem-free in my testing
- Debian / Ubuntu LTS
- ZFS for the disk storage
- Backups may require a second copy or similar of this setup so keep that in mind when thinking about the storage space and cost

Here’s a visual inspiration:

lightrush@lemmy.ca · 7 months ago

Yes, yes I would use ZFS if I had only one file on my disk.

lightrush@lemmy.ca · 7 months ago

OK, I think it may have to do with the odd number of data drives. If I create a raidz2 with 4 of the 5 disks, even with ashift=12 and recordsize=128K, the performance in sequential single-thread reads is stellar. What’s not clear is why this doesn’t affect the 4x 8TB-drive raidz1, or at least not as much.
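To see how a 128K record spreads over the data drives in each layout, here is a rough sketch of the raidz allocation arithmetic as it is commonly described for OpenZFS: data sectors, plus parity sectors per row, padded to a multiple of parity + 1. Treat the rule itself as an assumption on my part. It only shows the layout and says nothing directly about read performance.

```python
import math


def raidz_sectors(record_bytes: int, disks: int, parity: int, ashift: int = 12):
    """Return (data sectors, rows, total allocated sectors) for one record."""
    sector = 1 << ashift
    data = math.ceil(record_bytes / sector)
    data_disks = disks - parity
    rows = math.ceil(data / data_disks)   # the last row may be only partly filled
    total = data + parity * rows
    # Assumed rule: allocations are padded to a multiple of (parity + 1) sectors.
    total = math.ceil(total / (parity + 1)) * (parity + 1)
    return data, rows, total


RECORD = 128 * 1024

for label, disks, parity in (("5-disk raidz2", 5, 2),
                             ("4-disk raidz2", 4, 2),
                             ("4-disk raidz1", 4, 1)):
    data, rows, total = raidz_sectors(RECORD, disks, parity)
    even = data % (disks - parity) == 0
    print(f"{label}: {data} data sectors over {disks - parity} data disks, "
          f"{rows} rows, {total} sectors allocated, even split: {even}")
```

With three data disks the 32 data sectors do not divide evenly, while with two they do. The 4-disk raidz1 also has three data disks, so this arithmetic alone does not explain why that pool behaves better.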