This article explains how to use two IBM Power Systems features to reduce downtime from maintenance, migrations and unplanned outages. Let’s start by reviewing some important PowerVM features that give customers more flexibility to keep their LPARs up and running through a variety of data center changes.
What is Live Partition Mobility (LPM)?
LPM allows you to move an LPAR, complete with running applications, from one Power Systems server to another without an application outage. LPM can help you:
- Perform scheduled hardware/firmware maintenance that requires a server outage (server evacuation) without taking applications down
- Rearrange LPARs between servers to address requirements for more processing power or memory (workload rebalancing)
- Migrate LPARs from old hardware to new hardware as part of a hardware refresh (e.g., POWER7 -> POWER8, POWER8 -> POWER9)
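To make the server evacuation and workload rebalancing cases concrete, here is a minimal planning sketch in Python. It is only an illustration: the `Lpar` and `Server` classes, their fields, and the first-fit placement rule are assumptions made for this example, not how the HMC or any IBM tool chooses targets. It simply checks whether each LPAR's processor and memory needs fit on some target server before any moves are attempted.

```python
from dataclasses import dataclass


@dataclass
class Lpar:
    """Hypothetical view of an LPAR's resource needs."""
    name: str
    cpu: float      # processor units required
    mem_gb: int     # memory required, in GB


@dataclass
class Server:
    """Hypothetical view of a target server's spare capacity."""
    name: str
    free_cpu: float
    free_mem_gb: int


def plan_evacuation(lpars, targets):
    """Assign each LPAR to the first target with enough free capacity.

    Largest LPARs (by memory, then CPU) are placed first so that they
    are not squeezed out by smaller ones. Returns a (plan, unplaced)
    pair: plan maps LPAR name to target server name, and unplaced
    lists the LPARs that did not fit anywhere.
    """
    plan = {}
    unplaced = []
    ordered = sorted(lpars, key=lambda l: (l.mem_gb, l.cpu), reverse=True)
    for lpar in ordered:
        for server in targets:
            if server.free_cpu >= lpar.cpu and server.free_mem_gb >= lpar.mem_gb:
                # Reserve the capacity so later LPARs see what is left.
                server.free_cpu -= lpar.cpu
                server.free_mem_gb -= lpar.mem_gb
                plan[lpar.name] = server.name
                break
        else:
            # No target had room for this LPAR.
            unplaced.append(lpar.name)
    return plan, unplaced
```

Any LPAR that ends up in `unplaced` signals that the remaining servers do not have enough spare capacity to take the whole workload, which is worth knowing before the maintenance window starts.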
LPM has been available on the Power platform since 2008. IBM i supported LPM starting in 2012 with IBM i 7.1 Technology Refresh 4. Dawn wrote about IBM i support of LPM in the blog, Move my i.
What is Simplified Remote Restart (SRR)?
When a server crashes, the partitions on that server also crash and you have to wait for the IBM service representative to come on-site and repair the server.
SRR allows you to move and restart an LPAR within minutes after a Power Systems server has crashed. (Before the IBM Sales team gets after me, your server should rarely, if ever, crash, but if it does, you want to be ready.) Since the server has crashed, LPM can’t be used to move the LPAR. SRR uses many of the same underlying principles as LPM so that it can rebuild the LPAR on another system.
You can think of SRR as “LPM for when a server has crashed.” Any LPAR that can be LPMed when a server is up and running can be a candidate for SRR when a server crashes.
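The relationship between the two features can be summed up as a small decision rule. The Python sketch below is purely illustrative: the `ServerState` enum, the function name and the `lpar_is_lpm_capable` flag are made up for this example, and it deliberately ignores any additional setup an LPAR may need before SRR can be used.

```python
from enum import Enum


class ServerState(Enum):
    RUNNING = "running"
    CRASHED = "crashed"


def choose_move_method(source_state, lpar_is_lpm_capable):
    """Pick the feature that applies to an LPAR on a given source server.

    Simplified rule taken from the text above:
      - source server running -> LPM (no application outage)
      - source server crashed -> SRR (restart the LPAR elsewhere)
    An LPAR that cannot be moved with LPM is treated here as not being
    an SRR candidate either, which is an assumption of this sketch.
    """
    if not lpar_is_lpm_capable:
        return "none"
    if source_state is ServerState.RUNNING:
        return "LPM"
    return "SRR"
```

For example, `choose_move_method(ServerState.CRASHED, True)` returns `"SRR"`, while the same LPAR on a running server would be moved with LPM.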
SRR has been a feature on POWER8 servers since 2014, including support for IBM i.
So why am I writing about these features in this i Can blog?
While LPM has been around for almost a decade and SRR is relatively new, both of these technologies are product differentiators for the Power Systems platform.
And we are still finding clients that aren’t using this technology. Many clients point to LPM as being a game changer for their environment because LPARs no longer need downtime for hardware and/or firmware maintenance or for VIOS updates. The clients just LPM the LPARs from the frame under maintenance to other frames. When the maintenance is done, they LPM the LPARs back.
Similarly, LPARs can now move to new frames as part of a hardware refresh, which speeds up the migration process and allows the older frames to be decommissioned sooner.
SRR offers a different benefit. Before POWER8 and this feature, if your Power Systems server crashed, the LPARs on that server were down too, with no way to restart them elsewhere. As more and more clients have two and three shifts of application developers and quality assurance staff using the servers, any downtime of non-production LPARs keeps these groups from getting their work done.
Another reason we are talking about these features now is that there is a tool from IBM that helps clients do LPM and SRR much more quickly than traditional methods: the PowerVM LPM/SRR Automation Tool. When you need to do maintenance and want to use LPM to move all the LPARs, this tool can do that in just a few clicks in the GUI and then, after the maintenance is done, return all the LPARs with a few more clicks.
The tool also supports SRR: a few clicks restart the LPARs on another server after a crash, and a few more return them after the server is repaired.
This tool is geared toward both novice and experienced admins. Next week, part two will review the features of this tool.
See the presentation PowerVM LPM and SRR Automation Tool for the latest information.
This week’s blog was written by Bob Foster, and is the first in a two-part series. Bob is a member of the IBM Lab Services Power Systems Delivery Practice. Bob has worked with PowerVM technologies for a number of years, and has focused on live partition mobility and simplified remote restart, including developing a Lab Services offering to assist clients with simplifying their use of LPM and SRR.
This blog post was edited to fix broken links on November 11, 2020.
This blog post was originally published on IBMSystemsMag.com and is reproduced here by permission of IBM Systems Media.