
Computing in space presents unique challenges from both environmental hazards and resource constraints. I’m hoping that many of you were able to attend the 2024 NonStop TBC in Monterey, California, and hear the keynote from Dr. Mark Fernandez, who heads up HPE’s ‘computing in space’ program. HPE, and Mark specifically, is working with NASA to identify and solve space computing issues. HPE has put a few systems on the International Space Station and has achieved some excellent results.
Let’s consider some of the environmental issues that must be dealt with. Space computers are exposed to high levels of cosmic radiation, which can cause significant errors in data processing. High-energy particles can flip memory bits, leading to data corruption or system failures. This necessitates the development of radiation-hardened (rad-hard) systems that can withstand these conditions. Also, the microgravity environment of space complicates standard cooling methods used on Earth: without gravity, warm air doesn’t rise, so the natural convection that passive heat sinks rely on simply doesn’t occur, and airflow must be forced everywhere. Spacecraft must endure extreme temperature fluctuations, which can affect the performance and reliability of onboard computing systems. Computers must also be designed to withstand severe shock and vibration during launch. These physical stresses can damage sensitive electronic components.
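To make the bit-flip problem concrete, here is a minimal sketch of triple modular redundancy (TMR), one classic mitigation: keep three copies of each value and take a majority vote on every read, so a single radiation-induced flip is outvoted. This is purely illustrative, not how HPE’s Spaceborne systems are actually hardened.

```python
# A minimal sketch of triple modular redundancy (TMR): store three
# copies of every value and take a bitwise majority vote on each read,
# so one radiation-induced bit flip cannot change the result.
# Illustrative only; real rad-hard designs do this in hardware.

def tmr_write(value: int) -> list[int]:
    """Store three independent copies of a value."""
    return [value, value, value]

def tmr_read(copies: list[int]) -> int:
    """Bitwise majority vote: a bit is set in the result only if it is
    set in at least two of the three copies."""
    a, b, c = copies
    return (a & b) | (a & c) | (b & c)

# Simulate a cosmic ray flipping one bit in one copy.
copies = tmr_write(0b1010_1100)
copies[1] ^= 0b0001_0000                # single-event upset in copy 1
assert tmr_read(copies) == 0b1010_1100  # the vote masks the flip
```

Real systems more often use error-correcting (ECC) memory, which gives similar protection at far less overhead, but the voting idea is the same.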
The resource issues are just as daunting. Space missions impose strict limits on power consumption and cooling. The absence of conventional cooling systems means that designs must be highly energy-efficient, which can limit computational capacity. Spacecraft also have stringent size and weight constraints, making it challenging to incorporate advanced computing hardware without exceeding those limits; this caps the complexity of the systems that can be deployed in space. Finally, the bandwidth available for communication between spacecraft and Earth is limited, complicating data transfer and real-time decision-making. As a result, onboard computers must handle more data processing autonomously rather than relying on ground control.
In his talk, Mark demonstrated the requirements and the success of edge computing. With a system on the space station, many of the analyses that previously had to be sent down to ground control can be performed right there on the station. For example, whenever a spacewalk is performed, the space suit must be examined closely for wear; any slight defect could prove fatal to the astronaut. Before the ProLiant DL360 was on the space station, detailed photos of the space suit had to be transmitted to the ground station and evaluated, which took 24-48 hours. With the DL360 processing the images locally, the job is done in minutes, quite a savings.
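The arithmetic behind that savings is easy to sketch. The figures below (photo count, image size, downlink rate, onboard analysis time) are illustrative assumptions of mine, not actual ISS mission parameters, but they show why the downlink, not the computation, is the bottleneck.

```python
# Back-of-envelope comparison of downlinking suit photos vs. analyzing
# them on board. All numbers are illustrative assumptions, not real
# ISS figures.

NUM_PHOTOS = 400             # assumed photos per suit inspection
PHOTO_MB = 25                # assumed size of one high-res image, MB
DOWNLINK_MBPS = 5            # assumed share of the downlink, megabits/s
LOCAL_SECS_PER_PHOTO = 0.5   # assumed onboard analysis time per image

# MB -> megabits (x8), divided by link rate, converted to hours.
downlink_hours = (NUM_PHOTOS * PHOTO_MB * 8) / DOWNLINK_MBPS / 3600
local_minutes = NUM_PHOTOS * LOCAL_SECS_PER_PHOTO / 60

print(f"Transmit to ground: ~{downlink_hours:.1f} hours of link time, "
      f"before queueing and human review (historically 24-48 hours)")
print(f"Process on board:   ~{local_minutes:.1f} minutes")
```

Even with generous assumptions about the link, the transmit path loses badly once scheduling of the shared downlink and ground-side review are added on top of the raw transfer time.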
At NonStop TBC 2024, many asked if we were booting NonStop on the space station. At present, we cannot, due to the lack of a fabric and a second node: there is a single DL360 and a single Edgeline 8000, and to boot NonStop we need a second system and a connecting fabric. Keith Moore and I were working with development to see if we could find a way around this, but NonStop is very strict in that regard (as we’d expect). We hope to boot a virtual NonStop on some future mission, when we might have a dual-processor Edgeline with an internal fabric. We’ll see.
NonStop seems like a perfect operating system for space. Repair time for a down system could be weeks, months, or even years. Standard fault tolerance on most platforms requires a full backup system. NonStop’s unique design is really an N+1 arrangement in which the spare components are all built into the existing system, so a NonStop would have far fewer components and take up much less of the precious space and weight budget on a spacecraft. It would also require less power and cooling than a traditional clustered solution. Hopefully, we will see NonStop in space before too long.
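A rough way to see the component-count argument is to compare a 2N design (a full standby system) with an N+1 design (one built-in spare per component type). The part counts below are illustrative assumptions, not a NonStop bill of materials.

```python
# Rough comparison of 2N (duplicate everything) vs. N+1 (one built-in
# spare per component type). Counts are illustrative assumptions only.

components = {"cpu": 4, "power_supply": 2, "fan": 6, "disk": 4}

two_n = {name: 2 * n for name, n in components.items()}
n_plus_1 = {name: n + 1 for name, n in components.items()}

print(f"{'part':<14}{'2N':>4}{'N+1':>6}")
for name in components:
    print(f"{name:<14}{two_n[name]:>4}{n_plus_1[name]:>6}")
print(f"{'total':<14}{sum(two_n.values()):>4}{sum(n_plus_1.values()):>6}")
```

With these assumed counts, the N+1 box carries 20 parts where a duplicated cluster carries 32, and on a spacecraft every extra part is launch mass, power draw, and one more thing that can fail.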