ELevate 0.21.0 #124
Dropping upgrade paths related to the following releases: 8.6, 8.9, 9.0, 9.3. See the previous commit for more info. While dropping these releases, I realized that the current structure of tests is not suitable for such operations, because the current test/job definitions have been chained. So e.g. tests for 8.10 -> 9.4 depended on 8.9 -> 9.3, which depended on 8.8 -> 8.6, etc... Going even deeper, IPU 8->9 definitions have been based on 7 -> 8 definitions. So I updated the structure, separating tests for IPU 7 -> 8 and 8 -> 9 and also the deps between all upgrade paths. Now, particular tests can inherit one of the *abstract* job definitions, so dropping or removing tests for an upgrade path does not affect other tests. Also fixed some incorrect definitions in tests, like a fixed label for `beaker-minimal-88to92` (orig "8.6to9.2"). Update welcome-PR bot msg to reflect changes in upgrade paths. Jira: OAMG-10451 (cherry picked from commit b875ae2)
yield from cannot be used until we require python3.3 or greater (cherry picked from commit db8a0cf)
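As a minimal sketch of the constraint above (the function name is hypothetical and not taken from the patch), a generator that would delegate with `yield from` can be written with an explicit loop so that it is also valid on Python 2:

```python
def flatten(groups):
    """Yield every item from every iterable in ``groups``.

    For plain iteration this behaves like ``yield from group``, but it is
    also valid syntax on Python 2, where ``yield from`` does not exist.
    """
    for group in groups:
        for item in group:
            yield item


print(list(flatten([[1, 2], [3], []])))
```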
(cherry picked from commit 214ed9b)
The original solution always expected a ``key: val`` pair on each line. However, it was not anticipated that val could actually be an empty string, which would lead to a situation where the following line is interpreted as the value. The new solution updates the parsing of the output on RHEL 7 and newly calls ``lscpu -J`` on RHEL 8+ to obtain data in the JSON format, which removes all possible parsing problems on our side. Fixes oamg#1182 (cherry picked from commit 050620e)
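The two parsing strategies described above could look roughly like the sketch below. It is not the actor's code: the function names are hypothetical, the sample outputs are made up, and the JSON layout (a top-level `lscpu` list of `field`/`data` entries) is an assumption about what `lscpu -J` prints; some util-linux versions nest further entries that this sketch ignores.

```python
import json


def parse_lscpu_text(output):
    """Parse ``key: value`` lines; an empty value stays an empty string."""
    result = {}
    for line in output.splitlines():
        if ':' not in line:
            continue  # not a key/value line, never treat it as a value
        key, _sep, value = line.partition(':')
        result[key.strip()] = value.strip()
    return result


def parse_lscpu_json(output):
    """Parse the assumed ``lscpu -J`` layout into a flat dict."""
    data = json.loads(output)
    return {
        entry['field'].rstrip(':'): entry.get('data') or ''
        for entry in data['lscpu']
    }


# Made-up sample outputs, only for illustration.
TEXT = "Architecture:  x86_64\nFlags:\nModel name:  Example CPU\n"
JSON = '{"lscpu": [{"field": "Architecture:", "data": "x86_64"}, {"field": "Flags:", "data": ""}]}'

print(parse_lscpu_text(TEXT))
print(parse_lscpu_json(JSON))
```

Splitting each line on the first colon means a key with an empty value can no longer swallow the next line.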
* Move actor's process to its library
* Add check for run in process
* Create file with short output of 'udevadm info -e' for testing purposes
* Add unit tests for actor

Jira: OAMG-1277 (cherry picked from commit b65ef94)
Jira: OAMG-10367 (cherry picked from commit 3066cad)
The Package class has custom __hash__ and __eq__ methods in order to achieve a straightforward presentation via set manipulation. However, this causes problems, e.g., when applying split events. For example: Applying the event Split(in={(A, repo1)}, out={(A, repo2), (B, repo2)}) to the package state {(A, repo1), (B, repo1)} results in the following: {(A, repo1), (B, repo1)} --apply--> {(A, repo2), (B, repo1)} which is undesired as repo1 is a source system repository. Such a package will get reported to the user as potentially removed during the upgrade. This patch addresses this unwanted behavior. (cherry picked from commit 8207078)
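The effect can be reproduced with a small sketch. It assumes that the custom `__hash__` and `__eq__` look at the package name only, which the commit message does not state explicitly; the class and function names are hypothetical, and the "fixed" variant is just one possible way to get the desired result, not necessarily what the patch does.

```python
class NameOnlyPackage(object):
    """Package that hashes and compares by name only (assumed behaviour)."""

    def __init__(self, name, repo):
        self.name = name
        self.repo = repo

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return self.name == other.name


def apply_split(state, in_pkgs, out_pkgs):
    """Naive event application: drop the inputs, then add the outputs."""
    return (state - in_pkgs) | out_pkgs


def as_pairs(packages):
    return sorted((pkg.name, pkg.repo) for pkg in packages)


state = {NameOnlyPackage('A', 'repo1'), NameOnlyPackage('B', 'repo1')}
split_in = {NameOnlyPackage('A', 'repo1')}
split_out = {NameOnlyPackage('A', 'repo2'), NameOnlyPackage('B', 'repo2')}

# (B, repo1) "equals" (B, repo2), so the union keeps the stale repo1 entry.
naive_result = as_pairs(apply_split(state, split_in, split_out))


def apply_split_by_name(state, in_pkgs, out_pkgs):
    """Hypothetical fix on (name, repo) tuples: outputs replace same-named packages."""
    out_names = {name for name, _repo in out_pkgs}
    kept = {(name, repo) for name, repo in state - in_pkgs if name not in out_names}
    return kept | out_pkgs


fixed_result = sorted(apply_split_by_name(
    {('A', 'repo1'), ('B', 'repo1')},
    {('A', 'repo1')},
    {('A', 'repo2'), ('B', 'repo2')},
))

print(naive_result)
print(fixed_result)
```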
Previously, when the upgrade failed in the initramfs, the file /sysroot/root/tmp_leapp_py3/.leapp_upgrade_failed was generated, and upon detecting this file leapp triggered an emergency shell. This caused the original failure to be hidden from the customer. With this commit, we no longer crash immediately upon detecting the file but rather continue and "wait" for the underlying issue and error to emerge. RHEL-24148 (cherry picked from commit 1f8b8f3)
The "A reboot is required to continue. Please reboot your system." message is printed before the reports summary and thus is easily overlooked by users. This patch adds a second such message after the report summary to improve this. Jira: RHEL-22736 (cherry picked from commit 1fb7e78)
Adds a new actor checking for whether any of the GRUB devices have the old GRUB Legacy installed. If any of such devices is detected, the upgrade is inhibited. The GRUB Legacy is detected by searching for the string 'GRUB version 0.94' in `file -s` of the device. (cherry picked from commit 8fe2a2d)
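The detection step can be pictured with the following sketch. It only covers the string search on already collected `file -s` output; the function name and the sample outputs are hypothetical, and running `file -s` itself is left out.

```python
LEGACY_GRUB_MARKER = 'GRUB version 0.94'


def find_legacy_grub_devices(file_outputs):
    """Return devices whose ``file -s`` output mentions GRUB Legacy.

    ``file_outputs`` maps a device path to the text printed by
    ``file -s <device>`` (collected elsewhere).
    """
    return sorted(
        device for device, output in file_outputs.items()
        if LEGACY_GRUB_MARKER in output
    )


# Made-up outputs, only for illustration.
outputs = {
    '/dev/sda': '/dev/sda: x86 boot sector; GRand Unified Bootloader, GRUB version 0.94',
    '/dev/sdb': '/dev/sdb: DOS/MBR boot sector',
}
legacy = find_legacy_grub_devices(outputs)
print(legacy)  # a non-empty list would mean the upgrade is inhibited
```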
Originally we tried to map by default repositories from particular channels on the source system to their equivalents on the target system. IOW:
* eus -> eus
* aus -> aus
* e4s -> e4s
* "ga" -> "ga"
...
However, it has been revealed that this logic should not apply to minor releases for which these non-ga (premium) repositories do not exist. So doing an upgrade e.g. to 8.9, 8.10, or 9.3, for which specific eus, etc. repositories are not defined, leads to a 404 error. After discussing this in depth between stakeholders, it has been decided to drop this logic and always target "ga" repositories unless leapp is executed with instructions to choose a different channel (using envars or the --channel option). It is still possible to mistakenly require e.g. the "eus" channel for a target release for which the related repositories are not defined, e.g.:
> leapp upgrade --channel eus --target 8.10
In such a case, the previous errors (404 Not found) can be hit again, but it will not happen by default. In this case, we expect and request people to understand what they want when they use the option. @pirat89 : Updated commit msg jira: RHEL-24720 (cherry picked from commit a4e3906)
The new message informs the user what will happen (they will boot into the old RHEL's kernel) so they understand why they need to run the remediation. jira: https://issues.redhat.com/browse/RHEL-29683 (cherry picked from commit 6d05575)
Check that the first partition starts at least at 1MiB (2048 sectors of 512 bytes), as too small first-partition offsets lead to failures when doing grub2-install. The limit (1MiB) has been chosen as it is a common value set by the disk formatting tools nowadays. jira: https://issues.redhat.com/browse/RHEL-3341 (cherry picked from commit ea6cd79)
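A minimal sketch of the offset check, assuming the start of the first partition is known in sectors; the names and the default sector size of 512 bytes are assumptions, not taken from the actor. With 512-byte sectors, 1 MiB corresponds to sector 2048.

```python
SAFE_OFFSET_BYTES = 1024 * 1024  # 1 MiB


def first_partition_offset_ok(start_sector, sector_size=512):
    """Return True when the first partition starts at 1 MiB or later."""
    return start_sector * sector_size >= SAFE_OFFSET_BYTES


print(first_partition_offset_ok(2048))  # starts exactly at 1 MiB
print(first_partition_offset_ok(63))    # old-style layout, gap too small
```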
This is an extension of the previous commit. The original problem that we are trying to resolve is to be sure the embedding area (MBR gap) has the expected size. This is irrelevant when a GPT partition table is used on a device. The fdisk output format is different for a GPT disk label, which breaks the parsing, resulting in an empty list of partitions in the related GRUBDevicePartitionLayout msg. For now, let's skip producing msgs for "GPT devices". As a seatbelt, ignore processing of messages with an empty partitions field, expecting that such a device does not contain MBR. We want to prevent false positive inhibitors (and FP blocking errors). We expect that the total number of machines with a small embedding area is very small, so even if we missed something (which is not expected to the best of our knowledge) it's still a good trade-off, as the major goal is to reduce the number of machines that have problems with the in-place upgrade. The solution can be updated in the future if there is a reason for it. (cherry picked from commit 683176d)
RHEL-21891 (cherry picked from commit 8d84c02)
(cherry picked from commit f154c65)
(cherry picked from commit 6dc1621)
Mitigation of an error where instead of no argument an "empty argument" was passed to `lscpu` lscpu '' vs. lscpu (cherry picked from commit a5bd254)
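One plausible way such an "empty argument" arises, and how to avoid it, is sketched below. This is an assumption about the shape of the bug, not the actual patch; the function names are hypothetical.

```python
def build_lscpu_cmd_buggy(output_json=False):
    """Always emits a second element, so the plain call becomes ``lscpu ''``."""
    return ['lscpu', '-J' if output_json else '']


def build_lscpu_cmd(output_json=False):
    """Append the option only when it is wanted, so the plain call is ``lscpu``."""
    cmd = ['lscpu']
    if output_json:
        cmd.append('-J')
    return cmd


print(build_lscpu_cmd_buggy())
print(build_lscpu_cmd())
```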
…#1210) fixes upgrade warnings: leapp.workflow.Applications.transition_systemd_services_states: Actor is trying to produce a message of type "<class 'leapp.reporting.Report'>" without mentioning it explicitely in the actor's "produces" tuple. The message will be ignored (cherry picked from commit 5e51626)
For basic sanity verification in upstream, tests tagged 'tier0' are now used instead of those tagged 'sanity'. RHELMISC-3211 (cherry picked from commit 346b741)
Some users haven't read the upgrade documentation and are not aware that the actual upgrade is performed after the reboot. As they wait just for the ssh connection, they think that something is wrong and sometimes reboot the machine, interrupting the entire process and in some cases leaving the machine broken. Adding info that they need console access in case they want to watch the upgrade progress. Jira: https://issues.redhat.com/browse/RHEL-27231 (cherry picked from commit 0d90412)
…d kernels. (oamg#1193) On some upgrades, any kernel commandline args that we were adding were added to the default kernel, but once the user installed a new kernel, those args were not propagated to the new kernel. This was happening on S390x and on RHEL7=>8 upgrades. To fix this, we add the kernel commandline args to both the default kernel and to the defaults for all kernels. On S390x and upgrades to RHEL9 or greater, this is done by placing the kernel cmdline arguments into the /etc/kernel/cmdline file. On upgrades to RHEL <= 8 for all architectures other than S390x, this is done by having grub2-editenv modify the /boot/grub2/grubenv file. Jira: RHEL-26840, OAMG-10424 (cherry picked from commit 2b27cfb)
Previous solution created the file without adding the newline in the end of the file. The original solution worked, but to stay on the safe side, adding the expected new line. Jira: RHEL-26840, OAMG-10424 (cherry picked from commit 0869ab1)
…EL10 The previous DDDD.json file contains some invalid entries, mainly for detection of supported CPU families and models. This set corrects the current known issues. Also the file contains crafted data for EL10. Note that EL10 data does not have to necessarily reflect the reality right now and changes are expected to be coming frequently. Jira: RHEL-34185 (cherry picked from commit 9d0925f)
The new data contains up-to-date inputs from RHEL engineering. Also, the sorting has been improved to hopefully enable us to better track further possible changes in the data set. (cherry picked from commit cb7b77f)
This allows the re-use of the code in the el8toel9 upgrade. (cherry picked from commit 88126ef)
This adds the el8toel9 specific fact scanner/generator for Satellite upgrades. The result of this actor is what drives the actual upgrade actions. (cherry picked from commit 0f70dbf)
When migrating to a new OS, we REINDEX all databases. pulp_ansible ships with an own collation (using ICU), which needs a version refresh after the REINDEX has been done. (cherry picked from commit 720bb13)
We used to just delete the symlinks in /etc/systemd, but with the new systemd actors this doesn't work anymore as they will restore the pre-delete state because they by default aim at having source and target systems match in terms of services. By using SystemdServicesTasks we can explicitly turn those services off and inform all interested parties about this. (cherry picked from commit bad2fb2)
…ation This problem is typical for SAN + FC when the storage sometimes needs more time for the initialisation. Implemented a try-sleep loop: retry the activation of the storage + /usr mounting every 15s. The loop can be repeated 10 times, so the total time is 150s right now for the activation. Note that this is not a proper solution for the storage initialisation; however, we have discovered some obstacles in the bootup process that prevent us from doing it as correctly as we would like to. Given the limited time, we are going to deliver this solution, which should improve the experience and should be safe, not causing regressions in already working functionality. We expect to provide a better solution for newer upgrade paths in the future (IPU 8->9 and newer). jira: https://issues.redhat.com/browse/RHEL-3344 (cherry picked from commit 64e2c58)
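The try-sleep loop can be illustrated in Python as below. This is only a sketch of the pattern with the numbers from the message (10 attempts, 15s apart); the real change lives in the upgrade boot environment, and the function name and the injectable `sleep` parameter are hypothetical.

```python
import time


def retry_activation(activate, attempts=10, delay=15, sleep=time.sleep):
    """Call ``activate`` until it returns True, sleeping after each failure.

    Returns the number of the successful attempt, or None when all
    attempts failed (10 x 15s = 150s of waiting with the defaults).
    """
    for attempt in range(1, attempts + 1):
        if activate():
            return attempt
        sleep(delay)
    return None


# Demonstration with a fake sleep, so nothing actually waits.
waited = []
results = iter([False, False, True])
succeeded_on = retry_activation(lambda: next(results), sleep=waited.append)

never_waited = []
gave_up = retry_activation(lambda: False, sleep=never_waited.append)

print(succeeded_on, waited)
print(gave_up, sum(never_waited))
```

Passing `sleep` as a parameter keeps the loop testable without real delays.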
The RPM DB has been moved from /var/lib/rpm to /usr/lib/sysimage/rpm in RHEL 10. Apply the change and create a symlink in the original path to the new one as expected. Also rebuild the RPM DB to ensure it's compatible with the new RPM version. Previously the RPM DB was being rebuilt during IPU 8 -> 9. However, after discussion with RPM SMEs it has been decided that this is actually very reasonable to do always. So applying it for any upgrade path but IPU 7 -> 8 (let's not change this one so much when it's kind of finished). Co-authored-by: Matej Matuska <[email protected]> (cherry picked from commit 10bda01)
No keys will be obsoleted, however it's expected to define the list. (cherry picked from commit ba58aa9)
(cherry picked from commit 4778d96)
(cherry picked from commit 1176441)
The `onerror` argument of shutil.rmtree is deprecated and replaced by `onexc` since Python 3.12. It accepts callback with the same arguments except for the last one which is now a subclass of `BaseException`. Our callback `libraries.utils.report_and_ignore_shutil_rmtree_error` is thus modified accordingly. (cherry picked from commit 3f7377b)
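A version-aware call could look like the sketch below. The handler here is a stand-in written for this example (it is not the project's `report_and_ignore_shutil_rmtree_error`), and it accepts both callback shapes: an exception instance for `onexc` and an `exc_info` tuple for `onerror`.

```python
import os
import shutil
import sys
import tempfile

IGNORED = []  # (path, exception class name) pairs collected by the handler


def record_and_ignore(func, path, exc):
    """Handle both ``onexc`` (exception) and ``onerror`` (exc_info tuple)."""
    error = exc[1] if isinstance(exc, tuple) else exc
    IGNORED.append((path, type(error).__name__))


def rmtree_ignoring_errors(path):
    """Use ``onexc`` on Python 3.12+, fall back to ``onerror`` before that."""
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=record_and_ignore)
    else:
        shutil.rmtree(path, onerror=record_and_ignore)


# Removing a path that does not exist triggers the handler instead of raising.
missing = os.path.join(tempfile.mkdtemp(), 'missing')
rmtree_ignoring_errors(missing)
print(IGNORED)
```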
Originally the actor used distutils.version.LooseVersion to detect the newest version of the installed kernel package inside the target userspace container. But the distutils python module has become deprecated and we should not use it anymore in Python 3.12+. However, we do not expect to see multiple versions of kernel present inside the target userspace container, as the container is (always) created from scratch during the IPU process. We consider the presence of multiple kernels to be a sign of an error, which could also negatively affect additional actions later. Updated the solution, raising an error if multiple kernels are detected inside the container. As we expect one kernel only, there is no need to compare versions of particular possible kernel packages. Note there are now just these cases when this could happen: * a third party package is required to be installed inside the container * an explicit requirement to install particular version of container is made by a custom actor (cherry picked from commit 236483d)
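The "exactly one kernel, otherwise fail" rule can be sketched as follows. The exception class, the function name, and the package strings are hypothetical and only illustrate the idea; the actor reports the problem through leapp's own error mechanisms.

```python
class MultipleKernelsError(Exception):
    """Raised when more than one kernel is found (hypothetical error type)."""


def get_single_kernel(kernel_packages):
    """Return the only kernel package, without any version comparison."""
    unique = sorted(set(kernel_packages))
    if not unique:
        raise LookupError('no kernel package found')
    if len(unique) > 1:
        raise MultipleKernelsError(
            'expected one kernel, found: {}'.format(', '.join(unique))
        )
    return unique[0]


only = get_single_kernel(['kernel-core-1.0-1.el9.x86_64'])

try:
    get_single_kernel(['kernel-core-1.0-1.el9.x86_64', 'kernel-core-1.0-2.el9.x86_64'])
    multiple_detected = False
except MultipleKernelsError:
    multiple_detected = True

print(only, multiple_detected)
```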
On my server, leapp preupgrade fails with the following error:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py", line 74, in _do_run
    actor_instance.run(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/leapp/actors/__init__.py", line 289, in run
    self.process(*args)
  File "/usr/share/leapp-repository/repositories/system_upgrade/el7toel8/actors/scangrubdevpartitionlayout/actor.py", line 18, in process
    scan_layout_lib.scan_grub_device_partition_layout()
  File "/usr/share/leapp-repository/repositories/system_upgrade/el7toel8/actors/scangrubdevpartitionlayout/libraries/scan_layout.py", line 91, in scan_grub_device_partition_layout
    dev_info = get_partition_layout(device)
  File "/usr/share/leapp-repository/repositories/system_upgrade/el7toel8/actors/scangrubdevpartitionlayout/libraries/scan_layout.py", line 79, in get_partition_layout
    part_start = int(part_info[2]) if len(part_info) == len(part_all_attrs) else int(part_info[1])
ValueError: invalid literal for int() with base 10: '*'

This is caused by the following line:

/dev/sda1 * 2048 1026047 512000 fd Linux raid autodetect

I have my server on EL7 with / using a Linux RAID, so len(part_info) != len(part_all_attrs), which is why it tries to convert '*' to int. (cherry picked from commit 7cd469b)
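A sketch of a more robust way to read the start sector from such a line: instead of comparing the number of fields, look at whether the second field is the boot flag `*`. The function name is hypothetical and this is not necessarily how the patch fixes it; the second sample line is made up.

```python
def partition_start(fdisk_line):
    """Return the start sector from one partition line of ``fdisk -l``.

    The boot flag ``*`` is an optional second column, so the start sector
    is the third field when the flag is present and the second otherwise.
    """
    fields = fdisk_line.split()
    start_index = 2 if fields[1] == '*' else 1
    return int(fields[start_index])


raid_line = '/dev/sda1 * 2048 1026047 512000 fd Linux raid autodetect'
plain_line = '/dev/sda2 1026048 41943039 20458496 8e Linux LVM'

print(partition_start(raid_line))
print(partition_start(plain_line))
```

The multi-word type name ("Linux raid autodetect") changes the field count, which is why counting fields is fragile here.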
Previously, if the upgrade was inhibited during TargetTransactionFactsCollectionPhase, usually because we could not create (for whatever reason) the target userspace container, the actor checking rpm gpg keys failed with the `Could not check for valid GPG keys` error. This has confused many users as they couldn't know that this is caused by the problem described in an inhibitor that is listed below this error. As it's certain that the upgrade cannot continue when the target userspace container has not been created (the TargetUserSpaceInfo msg is missing), we consider it safe to stop the gpg check here silently, with just a warning msg instead of raising the error, as this check is important only in case we could actually upgrade. All other possible raised errors are preserved. jira: https://issues.redhat.com/browse/RHEL-30573 Signed-off-by: Petr Stodulka <[email protected]> Signed-off-by: Jakub Jelen <[email protected]> (cherry picked from commit caff3ac)
This issue could cause false positive reports when the user has the configuration options such as "Subsystem sftp" defined in included file only. Resolves: RHEL-33902 Signed-off-by: Jakub Jelen <[email protected]> Co-Authored-By: Michal Hecko <[email protected]> do not use filesystem during tests (cherry picked from commit 998b774)
We'd like to read and eventually check NetworkManager configuration on upgrade from RHEL 9 to RHEL 10. (cherry picked from commit 0087f4e)
We'd like to read and eventually check NetworkManager configuration on upgrade from RHEL 9 to RHEL 10. (cherry picked from commit f9dc5cb)
RHEL 10 is going to no longer support configuring NetworkManager to use the dhclient DHCP client (in fact, dhclient will not be present at all). Let's add an actor to deal with this sort of configuration. In particular, make sure the users review their configuration, in case they configured NM this way intentionally and actually rely on some dhclient configuration or behavior. Jira: https://issues.redhat.com/browse/RHEL-46975 (cherry picked from commit 95ce086)
The run function was mocked instead of `leapp.libraries.common.grub.get_grub_devices`, making the test fail when run directly on the host, not in containers, because grub2-probe failed there as expected. Left a couple of TODOs to improve error handling in the grub library. Jira: OAMG-11647 (cherry picked from commit d62efb2)
The setup_target_rhui_access_if_needed() function is not covered by tests and makes tests fail outside of containers because it calls commands (api.run) which are not mocked. Jira: OAMG-11647 (cherry picked from commit a8c5664)
During the RHEL 9.5 to RHEL 10.0 Leapp preupgrade process, the check_ipa_server actor crashes due to a KeyError while trying to find the URL for the 9-to-10 migration guide. Add a URL for the key '9' and make the code more robust by defaulting to "TBD" if the key is not found in the dictionary storing version/url-of_migration guide. Fixes: https://issues.redhat.com/browse/RHEL-50829 Co-authored-by: Vojtech Sokol <[email protected]> Signed-off-by: Florence Blanc-Renaud <[email protected]> (cherry picked from commit 896fe43)
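The defaulting described above amounts to replacing a plain dictionary lookup with `dict.get`. In this sketch the dictionary name, the function name, and the URLs are placeholders, not the values used by the actor.

```python
# Placeholder URLs; the real actor stores links to the migration guides.
MIGRATION_GUIDES = {
    '7': 'https://example.com/migration-guide-7-to-8',
    '8': 'https://example.com/migration-guide-8-to-9',
    '9': 'https://example.com/migration-guide-9-to-10',
}


def migration_guide_url(source_major_version):
    """Return the guide URL, or "TBD" instead of raising KeyError."""
    return MIGRATION_GUIDES.get(source_major_version, 'TBD')


print(migration_guide_url('9'))
print(migration_guide_url('10'))
```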
Reverting commit 60f500e. The original commit only worked around the root cause: leaked file descriptors in the leapp stdlib when using the `run` function. Dropping the change in the actor as it is not needed anymore. relates: oamg/leapp#880 (cherry picked from commit 24700ee)
(cherry picked from commit 4514e83)
Due to an incompatibility of the RHEL 8 bootloader with newer kernel versions on RHEL 9 since version 9.5, the upgrade cannot be performed, as the old bootloader is used to load the new kernel during the upgrade. JIRA: [41193](https://issues.redhat.com/browse/RHEL-41193) [52993](https://issues.redhat.com/browse/RHEL-52993) (cherry picked from commit 16fb443)
Provided data streams bumped to 3.1 only (cherry picked from commit a757c6d)
Actors: * check_leftover_packages * remove_leftover_packages * report_leftover_packages Changes: * Move actors code from el7toel8 to common * Refactor some actors * Put their processes into library * Create unit tests for actors Jira: OAMG-1254 (cherry picked from commit 9e8ac26)
…rtleftoverpackages/libraries/reportleftoverpackages.py (cherry picked from commit 6861231)
This way we still get unit test output even if the linters fail. (cherry picked from commit 7426f22)
The experimental "live mode" feature, which allows booting into a squashfs image of the target userspace and running leapp via a service, builds on several models added by this commit. The most important one is the LiveModeConfig model, which stores the user-defined configuration of the feature. Jira ref: RHEL-45280 (cherry picked from commit fb5a815)
Modify core actors to support upgrades with "live mode". Whereas live mode implies a new, separate code path for generating the live image initramfs, the changes introduced in the add_upgrade_boot_entry actor interfere deeply with the old implementation. Kernel cmdline arguments for the created boot entry are now manipulated uniformly, avoiding ad-hoc string formatting. It is also possible to remove kernel cmdline args from the entry. Addition of arguments precedes removal, i.e., if arg=value should be added and also removed, it will be removed. The root cmdline parameter is modified separately, due to a bug in grubby. Jira ref: RHEL-45280 (cherry picked from commit b807c27)
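The ordering rule (additions first, removals second) can be illustrated with a small sketch that works on plain strings. The function name and the list-of-strings representation are assumptions made for this example; the actor has its own structures for cmdline arguments.

```python
def apply_cmdline_changes(current, to_add, to_remove):
    """Add arguments first, then remove, so removal wins on conflict."""
    result = list(current)
    for arg in to_add:
        if arg not in result:
            result.append(arg)
    return [arg for arg in result if arg not in to_remove]


entry_args = apply_cmdline_changes(
    current=['ro', 'quiet'],
    to_add=['enforcing=0', 'debug'],
    to_remove=['quiet', 'debug'],  # 'debug' is both added and removed
)
print(' '.join(entry_args))
```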
Add actors that scan the new configuration file devel-livemode.ini, informing the rest of the actor collective about the configuration. Based on this configuration, additional packages are requested to be installed into the target userspace. The target userspace is also modified to contain services that execute leapp based on kernel cmdline. For a full list of modifications, see models/livemode.py added in a previous commit. The feature can be enabled by setting LEAPP_UNSUPPORTED=1 together with LEAPP_DEVEL_ENABLE_LIVE_MODE=1. Note, that the squashfs-tools package must be installed (otherwise an error will be raised). The live mode feature is currently tested only for x86_64, and, therefore, attempting to use this feature on a different architecture will be prohibited by the implementation. Jira ref: RHEL-45280 (cherry picked from commit d253c21)
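The enabling conditions named above can be summarised in a sketch like the following. Only the two environment variable names and the x86_64 restriction come from the commit message; the function name and the use of `ValueError` are hypothetical, and the squashfs-tools check is omitted.

```python
def live_mode_requested(env, arch):
    """Return True when both switches are set; refuse other architectures."""
    requested = (
        env.get('LEAPP_UNSUPPORTED') == '1'
        and env.get('LEAPP_DEVEL_ENABLE_LIVE_MODE') == '1'
    )
    if not requested:
        return False
    if arch != 'x86_64':
        raise ValueError('live mode is tested only on x86_64, got ' + arch)
    return True


both = {'LEAPP_UNSUPPORTED': '1', 'LEAPP_DEVEL_ENABLE_LIVE_MODE': '1'}
print(live_mode_requested(both, 'x86_64'))
print(live_mode_requested({'LEAPP_DEVEL_ENABLE_LIVE_MODE': '1'}, 'x86_64'))

try:
    live_mode_requested(both, 'aarch64')
    refused_other_arch = False
except ValueError:
    refused_other_arch = True
```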
## Packaging
- Start building for EL 9 in the upstream repository on COPR (oamg#1169)

## Upgrade handling
### Fixes
- Add missing RHUI GCP config info for RHEL for SAP (oamg#1253)
- Fix creation of the post upgrade report about changes in states of systemd services (oamg#1210)
- Fix detection of valid sshd config with internal-sftp subsystem in Leapp (oamg#1212)
- Fix evaluation of PES data (oamg#1194)
- Fix failing "update-ca-trust" command caused by missing util-linux package (oamg#1169)
- Fix handling of versions in RHUI configuration for ELS and SAP upgrades (oamg#1240)
- Fix the parsing of the lscpu output (oamg#1184, oamg#1208)
- Fix the upgrade of systems using RHUI on AWS after changes in RHUI client package (oamg#1178)
- Fix upgrade on aarch64 via RHUI on AWS (oamg#1240)
- Handle a false positive GPG check error when TargetUserSpaceInfo is missing (oamg#1269)
- Target by default always "GA" channel repositories unless a different channel is specified for the leapp execution (oamg#1205)
- Update the default kernel cmdline (oamg#1193, oamg#1216)
- Update the device driver deprecation data, fixing invalid fields for some AMD CPUs (oamg#1211)
- Wait for the storage initialization when /usr is on separate file system - covering SAN (oamg#1218, oamg#1219)
- [IPU 7 -> 8] Drop enforced tomcat removal for satellite when upgrading to RHEL 8.10 (oamg#1243)
- [IPU 7 -> 8] Fix detection of bootable device on RAID (oamg#1260)
- [IPU 8 -> 9] Inhibit the upgrade to RHEL 9.5 on ARM architecture due to incompatibility of the RHEL 8 bootloader and RHEL 9.5 kernel (oamg#1270)

### Enhancements
- [IPU 8 -> 9] Introduce upgrade path 8.10 -> 9.5 (oamg#1245, oamg#1246)
- Update leapp data files (oamg#1280)
- Apply solutions for leftover rpms for all major upgrade paths - including experimental actors (oamg#1199)
- Do not terminate the upgrade dracut module execution anymore if /sysroot/root/tmp_leapp_py3/.leapp_upgrade_failed exists (oamg#1197)
- Improve set_systemd_services_states logging (oamg#1213)
- Include leapp command execution and defined leapp envars inside leapp.db (oamg#1152)
- Introduce experimental upgrades in 'live' mode for the testing (oamg#1248)
- Load obsoleted GPG keys from gpg-signatures.json file instead of hardcoding them (oamg#1241)
- Several minor improvements in messages printed in console output (oamg#1173, oamg#1214, oamg#1274)
- Several minor improvements in report and error messages (oamg#1207, oamg#1217, oamg#1234, oamg#1235, oamg#1242)
- Sort lists in dnf-plugin-data for easier overview (oamg#1231)
- [IPU 7 -> 8] Allow upgrade of content from ELS repositories (oamg#1198)
- [IPU 7 -> 8] Inhibit the upgrade when Legacy GRUB is detected (oamg#1206)
- [IPU 7 -> 8] Inhibit the upgrade when embedding area is small to prevent failed bootloader update (oamg#1195)
- [IPU 8 -> 9] Enable EL 8 > 9 upgrades on Alibaba cloud (oamg#1249)
- [IPU 8 -> 9] Enable EL 8 to 9 upgrade of Satellite/Foreman server (oamg#1181)
- [IPU 9 -> 10] Introduced number of changes to enable IPU 9 -> 10 for testing (oamg#1169)
- [IPU 9 -> 10] Prevent upgrading if NetworkManager is configured with dhcp=dhclient (oamg#1268)
- [IPU 9 -> 10] Update URLs in reports to reflect the next planned major upgrade path (oamg#1169, oamg#1273)

## Additional changes interesting for devels
- drop unused `packager` field from gpg-signatures.json (oamg#1233)
- [IPU 9 -> 10] make system_upgrade/common leapp repo Python 3.12 compatible
- [IPU 9 -> 10] introduced system_upgrade/el9toel10 leapp repo

(cherry picked from commit 03c257b)
Thank you for contributing to the Leapp project! Please note that every PR needs to comply with the Leapp Guidelines and must pass all tests in order to be mergeable.
Packit will automatically schedule regression tests for this PR's build and latest upstream leapp build. If you need a different version of leapp from PR#42, use It is possible to schedule specific on-demand tests as well. Currently 2 test sets are supported,
[Deprecated] To launch on-demand regression testing public members of oamg organization can leave the following comment:
Please open ticket in case you experience technical problem with the CI. (RH internal only) Note: In case there are problems with tests not being triggered automatically on new PR/commit or pending for a long time, please contact leapp-infra.
Bump VERSION_FORMAT to 1.2.1 (in resolution of cherry-picking f154c65)
Release 0.21.0 (oamg#1282)