My Asus Vivobook K571GT, dual-booting Ubuntu 20.04, has recently started shutting down due to high temperature (reaching 99°C+). These temperatures are reached only when the laptop is plugged in.
The BIOS is updated to the latest version and Ubuntu is on the latest kernel. I've read that this might be caused by an improperly installed NVIDIA driver, so I tried several driver versions (460, 470 and 495). I also tried disabling NVIDIA altogether and running only on the integrated GPU. All gave the same result: when plugged in, the temperature spikes from a respectable 40-45°C to 95°C in a second, without much CPU load (for example, running apt update pushes the CPU temperature to 90°C+). If I don't stop what I'm doing, or a command is running and I can't stop it in time, the CPU hits the 100°C mark, which triggers the shutdown. Interestingly, if I unplug while I'm getting a high-temperature warning, the temperature drops back to 45-50°C in a second.
Has anyone experienced something similar? The only explanation I can think of for the rapid CPU temperature spike when plugged in but not on battery is that the CPU is somehow getting "overclocked". I'm not sure how to verify this or, if it is happening, how to prevent it. Or could it be a hardware issue, like the AC adapter providing too much power?
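One possible way to check the "overclocked on AC" theory is to log the power state, the hottest thermal zone and the fastest core side by side, then compare a run on battery with a run on mains. The sketch below assumes the standard Linux sysfs layout (power_supply, thermal_zone and cpufreq directories); the function names and the SYSFS_ROOT override are made up for this example, and power supply names differ between machines.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT is a hypothetical override so the functions can be
# pointed at a fake tree; on a real system it stays at /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print 1 if any power supply reports online=1 (mains connected), else 0.
ac_online() {
    for f in "$SYSFS_ROOT"/class/power_supply/*/online; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f" 2>/dev/null)" = "1" ]; then
            echo 1
            return
        fi
    done
    echo 0
}

# Print the largest value found in the files given as arguments, divided by 1000.
max_div_1000() {
    max=0
    for f in "$@"; do
        [ -r "$f" ] || continue
        v=$(cat "$f" 2>/dev/null) || continue
        case "$v" in ''|*[!0-9]*) continue ;; esac
        if [ "$v" -gt "$max" ]; then max=$v; fi
    done
    echo $((max / 1000))
}

# Hottest thermal zone in degrees Celsius (sysfs reports millidegrees).
max_temp_c() {
    max_div_1000 "$SYSFS_ROOT"/class/thermal/thermal_zone*/temp
}

# Fastest core in MHz (scaling_cur_freq is reported in kHz).
max_freq_mhz() {
    max_div_1000 "$SYSFS_ROOT"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq
}

# One log line: time, AC state, hottest zone, fastest core.
sample() {
    printf '%s ac=%s temp=%sC freq=%sMHz\n' \
        "$(date +%T)" "$(ac_online)" "$(max_temp_c)" "$(max_freq_mhz)"
}

sample
```

To watch it live, call sample in a loop (for example `while sleep 1; do sample; done`) while plugging and unplugging the charger. If the frequency sits far above the 2.60GHz base clock only on AC, that would point at turbo boost rather than a true overclock; with intel_pstate, `echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo` should disable turbo until the next reboot, which makes for a quick experiment.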
Edit
grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver
/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu10/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu11/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu1/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu2/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu3/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu4/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu5/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu6/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu7/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu8/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu9/cpufreq/scaling_driver:intel_pstate
grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu10/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu11/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu1/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu2/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu3/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu4/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu5/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu6/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu7/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu8/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu9/cpufreq/scaling_governor:powersave
grep "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
cat /sys/devices/system/cpu/intel_pstate/no_turbo
0
Edit
ps auxc | grep -i therm
root 167 0.0 0.0 0 0 ? I< 10:18 0:00 acpi_thermal_pm
root 1049 0.0 0.0 128808 9456 ? Ssl 10:18 0:00 thermald
sudo dmidecode -s bios-version
X571GT.311
ls -al /etc/thermald
total 28
drwxr-xr-x 2 root root 4096 Sep 8 13:48 .
drwxr-xr-x 148 root root 12288 Nov 2 12:01 ..
-rw-r--r-- 1 root root 4605 Jan 14 2019 thermal-conf.xml
-rw-r--r-- 1 root root 508 Jan 14 2019 thermal-cpu-cdev-order.xml
The laptop is just a year or two old. The latest BIOS update was released just a couple of weeks ago.
cat /etc/thermald/thermal-conf.xml
<?xml version="1.0"?>
<!--
use "man thermal-conf.xml" for details
-->
<!-- BEGIN -->
<ThermalConfiguration>
  <Platform>
    <Name>Generic X86 Laptop Device</Name>
    <ProductName>EXAMPLE_SYSTEM</ProductName>
    <Preference>QUIET</Preference>
    <ThermalSensors>
      <ThermalSensor>
        <Type>TSKN</Type>
        <AsyncCapable>1</AsyncCapable>
      </ThermalSensor>
    </ThermalSensors>
    <ThermalZones>
      <ThermalZone>
        <Type>SKIN</Type>
        <TripPoints>
          <TripPoint>
            <SensorType>TSKN</SensorType>
            <Temperature>55000</Temperature>
            <type>passive</type>
            <ControlType>SEQUENTIAL</ControlType>
            <CoolingDevice>
              <index>1</index>
              <type>rapl_controller</type>
              <influence> 100 </influence>
              <SamplingPeriod> 16 </SamplingPeriod>
            </CoolingDevice>
            <CoolingDevice>
              <index>2</index>
              <type>intel_powerclamp</type>
              <influence> 100 </influence>
              <SamplingPeriod> 12 </SamplingPeriod>
            </CoolingDevice>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
<!-- Thermal configuration example only -->
  <Platform>
    <Name>Example Platform Name</Name>
    <!--UUID is optional, if present this will be matched -->
    <!-- Both product name and UUID can contain wild card "*", which matches any platform -->
    <UUID>Example UUID</UUID>
    <ProductName>Example Product Name</ProductName>
    <Preference>QUIET</Preference>
    <ThermalSensors>
      <ThermalSensor>
        <!-- New Sensor with a type and path -->
        <Type>example_sensor_1</Type>
        <Path>/some_path</Path>
        <AsyncCapable>0</AsyncCapable>
      </ThermalSensor>
      <ThermalSensor>
        <!-- Already present in thermal sysfs, enable this or add/change config
             For example, here we are indicating that sensor can do async events
             to avoid polling -->
        <Type>example_thermal_sysfs_sensor</Type>
        <!-- If async capable, then we don't need to poll -->
        <AsyncCapable>1</AsyncCapable>
      </ThermalSensor>
      <ThermalSensor>
        <!-- Examle of a virtual sensor. This sensor depends on other real sensor
             or virtual sensor. E.g. here the temp will be
             temp of example_sensor_1 * 0.5 + 10 -->
        <Type>example_virtual_sensor</Type>
        <Virtual>1</Virtual>
        <SensorLink>
          <SensorType>example_sensor_1</SensorType>
          <Multiplier> 0.5 </Multiplier>
          <Offset> 10 </Offset>
        </SensorLink>
      </ThermalSensor>
    </ThermalSensors>
    <ThermalZones>
      <ThermalZone>
        <Type>Example Zone type</Type>
        <TripPoints>
          <TripPoint>
            <SensorType>example_sensor_1</SensorType>
            <!-- Temperature at which to take action -->
            <Temperature> 75000 </Temperature>
            <!-- max/passive/active
                 If a MAX type is specified, then daemon will use PID control
                 to aggresively throttle to avoid reaching this temp. -->
            <type>max</type>
            <!-- SEQUENTIAL | PARALLEL
                 When a trip point temp is violated, then number of cooling device
                 can be activated. If control type is SEQUENTIAL then
                 It will exhaust first cooling device before trying next. -->
            <ControlType>SEQUENTIAL</ControlType>
            <CoolingDevice>
              <index>1</index>
              <type>example_cooling_device</type>
              <!-- Influence will be used order cooling devices.
                   First cooling device will be used, which has highest influence.
              -->
              <influence> 100 </influence>
              <!-- Delay in using this cdev, this takes some time too actually cool a zone -->
              <SamplingPeriod> 12 </SamplingPeriod>
            </CoolingDevice>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
    <CoolingDevices>
      <CoolingDevice>
        <!-- Cooling device can be specified by a type and optionally a sysfs path
             If the type already present in thermal sysfs no need of a path.
             Compensation can use min/max and step size to increasing cool the system.
             Debounce period can be used to force a waiting period for action -->
        <Type>example_cooling_device</Type>
        <MinState>0</MinState>
        <IncDecStep>10</IncDecStep>
        <ReadBack> 0 </ReadBack>
        <MaxState>50</MaxState>
        <DebouncePeriod>5000</DebouncePeriod>
        <!-- If there are no PID parameter compensation increase step wise and
             exponentaially if single step is not able to change trend.
             Alternatively a PID parameters can be specified then next step
             will use PID calculation using provided PID constants. -->>
        <PidControl>
          <kp>0.001</kp>
          <kd>0.0001</kd>
          <ki>0.0001</ki>
        </PidControl>
      </CoolingDevice>
    </CoolingDevices>
  </Platform>
</ThermalConfiguration>
<!-- END -->
top
top - 13:16:27 up 1:37, 1 user, load average: 0.85, 1.32, 1.11
Tasks: 487 total, 2 running, 484 sleeping, 1 stopped, 0 zombie
%Cpu(s): 5.1 us, 2.0 sy, 1.5 ni, 90.6 id, 0.1 wa, 0.0 hi, 0.7 si, 0.0 st
GiB Mem : 15.5 total, 4.5 free, 5.0 used, 5.9 buff/cache
GiB Swap: 2.0 total, 2.0 free, 0.0 used. 10.1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
35883 root 39 19 84636 68132 12616 R 19.8 0.4 0:00.60 apt-check
4842 haleks 20 0 4487900 483220 120988 S 2.6 3.0 1:49.49 gnome-shell
7291 haleks 20 0 923372 60172 45804 S 2.3 0.4 1:34.25 psensor
32705 haleks 20 0 24.5g 130676 77652 S 2.3 0.8 0:14.20 brave
975 message+ 20 0 40380 34872 4068 S 1.0 0.2 0:31.14 dbus-daemon
1002 root 20 0 2332860 32620 16456 S 1.0 0.2 0:05.98 snapd
4555 haleks 20 0 24.7g 147872 79744 S 1.0 0.9 1:10.25 Xorg
5229 haleks 20 0 2258744 131912 45796 S 1.0 0.8 1:16.97 keybase
35782 root 20 0 287276 16044 14104 S 1.0 0.1 0:00.03 packagekitd
663 root -51 0 0 0 0 S 0.7 0.0 0:38.09 irq/152-nvidia
21473 haleks 20 0 819496 53768 39012 S 0.7 0.3 0:07.86 gnome-terminal-
32564 haleks 20 0 16.6g 410380 190120 S 0.7 2.5 0:42.65 brave
32596 haleks 20 0 16.6g 182632 87372 S 0.7 1.1 0:47.20 brave
34076 root 20 0 25368 13280 7900 S 0.7 0.1 0:00.16 apt
357 root 19 -1 68944 30764 29000 S 0.3 0.2 0:01.12 systemd-journal
387 root 20 0 24164 7796 4236 S 0.3 0.0 0:02.20 systemd-udevd
517 root -51 0 0 0 0 S 0.3 0.0 0:00.73 irq/148-iwlwifi
992 root 20 0 235188 10276 6928 S 0.3 0.1 0:02.17 polkitd
1065 root 20 0 716580 12360 9072 S 0.3 0.1 0:01.60 canonical-livep
1349 gdm 20 0 317300 9004 7968 S 0.3 0.1 0:00.28 goa-identity-se
1864 root 20 0 2432052 150584 31964 S 0.3 0.9 0:07.40 lxd
4545 haleks 20 0 8748 5860 4012 S 0.3 0.0 0:01.37 dbus-daemon
5448 haleks 20 0 2370936 172572 33964 S 0.3 1.1 0:27.26 kbfsfuse
7473 haleks 20 0 503408 143448 66476 S 0.3 0.9 0:35.84 Keybase
7575 haleks 20 0 463344 40076 32528 S 0.3 0.2 0:00.39 update-notifier
10111 haleks 20 0 582224 166968 80480 S 0.3 1.0 0:37.21 gitkraken
32662 haleks 20 0 24.4g 121680 81520 S 0.3 0.7 0:03.68 brave
35783 root 20 0 24164 5228 1652 S 0.3 0.0 0:00.01 systemd-udevd
35784 root 20 0 24164 5228 1652 S 0.3 0.0 0:00.01 systemd-udevd
35786 root 20 0 24164 5228 1652 S 0.3 0.0 0:00.01 systemd-udevd
1 root 20 0 168176 12092 8296 S 0.0 0.1 0:08.88 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H-kblockd
9 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
10 root 20 0 0 0 0 S 0.0 0.0 0:00.11 ksoftirqd/0
11 root 20 0 0 0 0 I 0.0 0.0 0:09.66 rcu_sched
12 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/0
13 root -51 0 0 0 0 S 0.0 0.0 0:00.00 idle_inject/0
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1
16 root -51 0 0 0 0 S 0.0 0.0 0:00.00 idle_inject/1
17 root rt 0 0 0 0 S 0.0 0.0 0:00.18 migration/1
18 root 20 0 0 0 0 S 0.0 0.0 0:00.06 ksoftirqd/1
1 Answer
Your /etc/thermald/thermal-conf.xml is incorrect. It's two example files tacked together.
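A quick way to see this for yourself is to list the ProductName of every Platform section in the file. The snippet below is only a sketch: product_names is a helper invented for this example, and CONF simply defaults to the path from the question. For the file posted above it should print EXAMPLE_SYSTEM and Example Product Name, both placeholders from the shipped examples, which suggests neither section is written for this laptop.

```shell
#!/bin/sh
# CONF defaults to the path from the question; override it to inspect another file.
CONF="${CONF:-/etc/thermald/thermal-conf.xml}"

# Print the text of every <ProductName> element in the given file, one per line.
# (Hypothetical helper; plain grep/sed, no XML parser involved.)
product_names() {
    grep -o '<ProductName>[^<]*</ProductName>' "$1" | sed -e 's/<[^>]*>//g'
}

if [ -r "$CONF" ]; then
    product_names "$CONF"
fi
```

You can compare the result with what the firmware reports via `sudo dmidecode -s system-product-name`.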
Try this somewhat generic .xml file shown below.
Note: You may end up customizing the following line (the value is in millidegrees Celsius, so 60000 means 60°C):
<Temperature>60000</Temperature>
Then restart thermald with:
sudo systemctl restart thermald
<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>Override CPU default passive</Name>
    <ProductName>*</ProductName>
    <Preference>QUIET</Preference>
    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>
            <Temperature>60000</Temperature>
            <type>passive</type>
          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
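If you would rather not overwrite the existing file blindly, something along these lines keeps a backup first. It is a sketch under assumptions: install_conf is a made-up helper, the .bak suffix and the ./thermal-conf.xml file name are arbitrary, and the xmllint check only runs if that tool happens to be installed (on Ubuntu it comes with libxml2-utils).

```shell
#!/bin/sh
# install_conf NEW DEST
# Copy NEW over DEST, keeping the previous DEST as DEST.bak.
# (Hypothetical helper for illustration; run it as root for files under /etc.)
install_conf() {
    new=$1
    dest=$2
    # Optional well-formedness check, skipped when xmllint is not available.
    if command -v xmllint >/dev/null 2>&1; then
        xmllint --noout "$new" || return 1
    fi
    # Keep a copy of the current file so the change is easy to revert.
    if [ -f "$dest" ]; then
        cp -p "$dest" "$dest.bak" || return 1
    fi
    cp "$new" "$dest"
}

# Typical use, assuming the XML above was saved as ./thermal-conf.xml:
#   install_conf ./thermal-conf.xml /etc/thermald/thermal-conf.xml \
#       && systemctl restart thermald
#   journalctl -u thermald -b --no-pager | tail -n 20
```

The journalctl line is there to check that thermald started and read the new configuration after the restart.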