Having been in the industry for four decades allows you to see both trends and hype from the global IT marketing machines. In this article I am glancing back over my career but also looking forward and challenging some of the ways solution architecture makes decisions and why NonStop is still out ahead.
I have just completed 30 years with NonStop, my first and last employment with a computer vendor. In that time I have seen client/server, distributed computing, web services, business intelligence, virtualisation, cloud and of course FinTech all do the rounds. Tandem Computers was acquired by Compaq Computer, who then purchased Digital Equipment Corporation, merged with Hewlett Packard Inc and purchased Electronic Data Systems. HP then decided to demerge ITIL and Security Software, Enterprise Services and importantly the PC and Printer business to focus on the enterprise for data driven, cloud enabled, intelligent edge computing as HPE.
Mainframe to Distributed Processing
If I look back to my introduction to corporate IT then this was not with Tandem NonStop but as an IBM Mainframe PL/1 programmer, the de facto hardware standard of the age. Solutions were initially batch-based but then moved on to COBOL CICS front end processing to a DL1 database with overnight batch processing for manufacturing environments. I was also an early adopter of on-line transaction processing using the distributed IBM 8100 system, programming in 3790 Assembler. Data was collected in the evening and transferred to the overnight batch ERP runs on the mainframe. Apart from the technical skills there were many learning experiences here, including multi-user applications, coding with care due to assemble times, the challenges of assembler debugging, and transferring data from application code that uses upper and lower case to code that expects all upper case characters. The IBM mainframe could be partitioned into regions to run different classes of work but issues in one region (often CICS) meant the whole machine had to be re-IPL’d: a classic availability issue. Already we see that some of the “new paradigm” themes of later global IT marketing are variations on existing computer capabilities.
PCs, Ad-hoc Reporting and Business Intelligence
In the mid-80s I changed my role in corporate IT to create an Information Centre for a large vehicle manufacturer. The Information Centre was about word processing, PC-based computing, ad-hoc report building and also the creation of graphs and reporting for information analysis. Yes, this was undoubtedly the forerunner of Business Intelligence, where a combination of PC solutions and mainframe graphics and report processing would be used on the transactional ERM and engineering data held within the company. I went to one of the earliest Oracle Corporation seminars, which was pitched solely to CFOs as the business analysis reporting solution that did not need IT. As a department we extensively used IBM’s Application System for graphical reporting and ad-hoc data analysis. The problems were the challenge of getting data into the analysis solutions, the cost of compute and particularly the cost of storage. This was way before Tandem started their Decision Support System initiative. As hardware costs have dropped, software has been used to exploit cheaper compute and storage to provide more solutions – we start to see a theme evolve.
NonStop – ahead of its time?
It was after these roles that I discovered “religion” in the form of Tandem NonStop. When I took the role I had never been anywhere near a NonStop. There was a strange “passion” from senior colleagues and also customers.
One of my earliest engagements was with a large SQL/MP implementation and it was here I quickly learned about designing for the architecture and database dos and don’ts. SQL RDBMS was still relatively new for transaction processing and there were a number of large opportunities being pursued across various geographies. What we saw in benchmarks was that parallelism beats a large, costly SMP system on performance and price and you get fault tolerance thrown in as well. This is a lesson which needs repeating because the value is often forgotten as clever IT marketing from vendors naturally downplays their shortfalls and compromises. In addition the 90s saw a movement to take transactional data capture and validation off to intelligent PCs as the Client/Server model took hold. This was driven by the power of the PC not only to validate data but to render graphical interfaces far more competitively than the main server environments could manage. Tandem NonStop, with Pathway/TS, already had a client-server architecture with screen COBOL and serverclasses.
Breaking this out into a PC client front end to new or existing serverclasses was relatively straightforward using the RSC interface. As we moved through the 90s and the Internet started to gain ground we saw the emergence of thin client front ends using Java servlets. Pathway iTS was a way Tandem NonStop could support different client interfaces, all to the same serverclass back end, while still taking advantage of the way processes were distributed for performance and availability on the NonStop platform.
Client Server and Web Services Architectural Exploitation
As usual in IT, the pressure from Finance coming down into the IT organisation caused architects and development shops to try and move major portions of not just the client but also the server components to the cheaper hardware. Load balancing and replication allowed some scalability and availability to be built into the solutions using lower cost hardware, but there were complex connectivity and database replica problems to solve and single points of failure within the overall architecture. Availability was of a “failover” type, meaning it wasn’t transparent and there was a delay whilst transactions were backed out, database consistency re-established and connections moved to the backup infrastructure. On smaller low volume environments this is less of an issue, but on high volume, 24×7 mission critical environments the business is impacted. Meanwhile NonStop exceeded 25 years of delivering consistent, scalable, fault tolerant reliability with none of the deployment complexities seen in the architectures with low cost hardware infrastructures.
As the new millennium emerged, so did the exploitation of the internet. Tiered architectures with web servers in a DMZ, application tiers and database tiers became the norm. The adoption of Java and frameworks accelerated the re-use of application components and a movement to being able to port application code easily across different platforms. Web Services and Service Oriented Architectures became the new paradigm but along with this came increasingly complex connectivity, security vulnerability points and replication for high availability of stateful component tiers and persisting data to databases.
NonStop once again used the learning of Pathway/TS to provide Web Services and SOA, allowing clients on or off platform whilst creating the middleware layer to provide fault tolerance and scalability for these application architectures. Java frameworks and later Java Enterprise were also supported, enabling common development tools to be used, while the application transparently inherited scalability and availability. Additions such as JI (Java Infrastructure) and JEnscribe allowed simple things like a socket call to be transparently mapped to a TS/MP pathsend serverclass invocation and gave access to Enscribe files from Java. Where stateful information is required within an application, it can be held in a fault tolerant in-memory cache using NSIMC.
Reduced complexity, simpler deployment and consistent service levels are among the benefits of using NonStop, and applications written with standard tools transparently gain fault tolerance and scalability. When the capability is examined closely, NonStop is primarily a software product bringing tremendous value to application developers and the business.
If you consider an available and scalable web application on NonStop then the simplicity stands out. Not only does the middleware, which includes a deep port of Apache Tomcat, use the NonStop architecture to scale, but the process pairs and persistence make the solution available, retaining the NonStop design of no single point of failure. If a CPU should fail then the system, application and database continue to run and the engineering action is simply to replace an electrically isolated Blade, which is the NonStop logical CPU.
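The process pair idea mentioned above can be illustrated with a very small sketch. This is not NonStop code: the class names, the dictionary used as checkpointed state and the `cpu_failed` call are all hypothetical, and the real mechanism is considerably richer. It only shows the principle that a backup on a different CPU already holds the state when the primary is lost, so nothing has to be backed out and restarted.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """One member of a hypothetical process pair; `state` is the checkpointed data."""
    cpu: int
    state: dict = field(default_factory=dict)
    alive: bool = True


class ProcessPair:
    """The primary does the work and checkpoints each change to a backup on another CPU."""

    def __init__(self, primary_cpu: int, backup_cpu: int):
        if primary_cpu == backup_cpu:
            raise ValueError("primary and backup must run on different CPUs")
        self.primary = Process(primary_cpu)
        self.backup = Process(backup_cpu)

    def update(self, key: str, value: int) -> None:
        # Checkpoint to the backup first, then apply locally, so the backup
        # never lags behind work the primary has already applied.
        self.backup.state[key] = value
        self.primary.state[key] = value

    def cpu_failed(self, cpu: int) -> None:
        # Takeover: if the primary's CPU is lost, the backup becomes the
        # primary and carries on from the last checkpoint.
        if self.primary.cpu == cpu:
            self.primary.alive = False
            self.primary, self.backup = self.backup, self.primary

    def read(self, key: str) -> int:
        return self.primary.state[key]


pair = ProcessPair(primary_cpu=0, backup_cpu=1)
pair.update("balance", 100)
pair.cpu_failed(0)             # CPU 0 is lost
print(pair.read("balance"))    # served by the former backup on CPU 1
```

The caller of `read` does not need to know that a takeover happened, which is the contrast with a non-transparent failover.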
Benefits of NonStop on x86 – flexibility, choice and dynamic capacity
As NonStop has evolved into an architecture which can be deployed on Intel x86-based NSX hardware, a number of evolutionary benefits have been enjoyed. These include core licensing, which enables the logical CPUs to be run in different core configurations via an on-line license key. The enterprise configurations are 2, 4 or 6 cores per CPU and this gives significant increases in CPU capacity without undertaking any application or database balancing reconfiguration. This has been taken one step further with the introduction of NonStop Dynamic Capacity (NSDC) licensing, where capacity can be temporarily increased by an additional 2 cores per logical CPU using an NSDC license. This allows the occasional high peak workload to be handled without the permanent cost of a higher license for additional cores. In addition InfiniBand brings far greater bandwidth, allowing x86-based NonStop systems to use “all flash” storage with much higher IO capability than HDDs. NSX also allows 10Gb Ethernet, which is becoming standard in most data centres, and by using VLAN and Multiple Providers the IP CLIM ports can be configured to take advantage of this higher bandwidth. NSX also brings with it deployment alternatives for NonStop, using either a traditional “converged” physical rack-mount option or deployment in a private cloud as a Virtualised NonStop under VMware or OpenStack KVM hypervisors. Flexibility and choice are the new norm for NonStop.
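As a rough illustration of the licensing arithmetic, the sketch below counts licensed cores for a system before and during an NSDC uplift. The function name and the 4-CPU system are hypothetical, and the result is a core count only; it says nothing about how throughput scales with cores or about which combinations a given license actually permits.

```python
ENTERPRISE_CORES_PER_CPU = (2, 4, 6)   # configurations named in the text
NSDC_EXTRA_CORES = 2                   # temporary uplift per logical CPU


def licensed_cores(logical_cpus: int, cores_per_cpu: int, nsdc_active: bool = False) -> int:
    """Total licensed cores across the system (a count, not a performance figure)."""
    if cores_per_cpu not in ENTERPRISE_CORES_PER_CPU:
        raise ValueError(f"cores_per_cpu must be one of {ENTERPRISE_CORES_PER_CPU}")
    extra = NSDC_EXTRA_CORES if nsdc_active else 0
    return logical_cpus * (cores_per_cpu + extra)


# Hypothetical 4-CPU system licensed at 2 cores per CPU.
base = licensed_cores(4, 2)
peak = licensed_cores(4, 2, nsdc_active=True)
print(base, peak)  # 8 16
```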
Database Persistence with Scale
Persisting data through a database onto storage has been key to ensuring integrity of computer-based applications since their introduction. As computerised back-end accounting systems were overtaken by on-line transaction processing as the norm, data integrity became more essential, as did making data more available with less off-line housekeeping. Consistency between transactions and database updates was required in case of failure so that the database of record could have its integrity maintained. The movement towards thick and internet-based client server architectures served to increase the potential concurrent transaction load which could be generated, all requiring database consistency and data integrity. In an era where virtualization and cloud deployment options are being exploited for faster time to market and the ability to increase compute resources almost dynamically, the pressures on databases to scale, be permanently available and also have data integrity keep on growing. The rise of analytics and business intelligence has also led to a rethink of which database is used for what.
In many respects there are now two major strands to the database architectures, one around structured data which can include transaction processing and business intelligence and the other is around unstructured data for big data and analytics. The former is dominated by SQL and SMP architectures and the latter by columnar designs and MPP.
Merchant databases have been addressing the issue of database availability by using clustering. This involves a cluster-aware DBMS and software with replica, shared storage and an enterprise database licensing scheme and support. Though this brings about additional resilience at the database tier it is not transparent to the application and typically takes some minutes or longer whilst the failover is completed and updates are backed out to regain database consistency. The issue of scalability with SMP merchant databases is more complex; parallel options have been introduced as an extra license and support charge. Scaling up has relied on more powerful SMP systems but with their system interconnects and NUMA architectures these are expensive and cannot be easily changed without downtime. In a bid to try and add additional scalability and availability then real application clusters have been introduced. These are complex designs and rely on distributed locking as the way to provide data consistency across the nodes in the cluster. The licensing and support is a further step-up from cluster-aware Enterprise Licensing. The issue with real application clusters and distributed lock management is that the nodes in the cluster have to flush around their cache for the locking strategies and this becomes the bottleneck leading to a realistic limit of four nodes in most cases. Therefore in the SMP world of merchant databases a compromise between availability and scalability has to be made and the software and architectures differ making it more difficult to change solutions without some application design work. Real Application Cluster represents only around 5% of Oracle production deployments. It is true, however, that these merchant databases have worked hard to ease the deployment in a cloud environment such that templates and scripts make it easier for a database instance to be deployed for development and proof of concepts.
The NonStop SQL database is a structured database that is both fault tolerant and scalable out of the box. It takes advantage of the underlying NonStop MPP architecture and has parallelism built in. It also uses process pair technology in the disc processes to provide fault tolerance, so that scalability does not have to be had at the expense of availability.
The architecture of NonStop SQL is that the disk access manager has all the locks for the data on the disk it is responsible for, therefore there is no requirement for a distributed lock manager and the associated performance penalty that it brings.
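The point about lock ownership can be sketched in a few lines. In this simplified model each partition of a table is served by exactly one disk process, which keeps the lock table for its own rows, so a lock request goes to a single owner and no cluster-wide lock manager is consulted. The class names, the key-range routing and the `$DATA1`/`$DATA2` names are assumptions made for illustration, not a description of NonStop SQL internals.

```python
import bisect


class DiskProcess:
    """Hypothetical stand-in for a disk access manager that owns one partition."""

    def __init__(self, name: str):
        self.name = name
        self.locks = {}                   # row key -> transaction holding the lock

    def lock(self, key: str, txn: str) -> bool:
        holder = self.locks.setdefault(key, txn)
        return holder == txn              # False means another transaction holds it

    def unlock_all(self, txn: str) -> None:
        self.locks = {k: t for k, t in self.locks.items() if t != txn}


class PartitionedTable:
    """Routes each key to exactly one disk process by key range."""

    def __init__(self, boundaries: list, disk_processes: list):
        # boundaries[i] is the first key served by disk_processes[i + 1]
        if len(disk_processes) != len(boundaries) + 1:
            raise ValueError("need one more disk process than boundaries")
        self.boundaries = boundaries
        self.disk_processes = disk_processes

    def owner(self, key: str) -> DiskProcess:
        return self.disk_processes[bisect.bisect_right(self.boundaries, key)]

    def lock(self, key: str, txn: str) -> bool:
        # The decision is made locally by the single owner of the key.
        return self.owner(key).lock(key, txn)


table = PartitionedTable(["M"], [DiskProcess("$DATA1"), DiskProcess("$DATA2")])
print(table.lock("ADAMS", "T1"))   # True: T1 now holds the row on $DATA1
print(table.lock("ADAMS", "T2"))   # False: conflict detected by $DATA1 alone
print(table.lock("SMITH", "T2"))   # True: a different owner, $DATA2
```

Adding a partition in this model adds another independent lock owner rather than another participant in a shared locking protocol.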
Disk Processes can be added or moved on-line with the latest x86-based NonStop OS, meaning that a database can be scaled along with additional compute on-line. Although NonStop SQL is a structured database it has the ability to support a mixed workload environment due to the way the data access manager to the disk is implemented. This means transactional work and business intelligence could be run on the same system, if required, keeping the response time critical work at a higher priority. The system frequently checks to see if there is higher priority work sitting on the ready queue. There is one license and support charge covering all NonStop SQL features and this, combined with the fact that application architectures do not have to be changed to gain scalability and availability, reduces complexity and cost of ownership.
The benefits of providing an Intel x86 centred deployment for HPE NonStop have already been outlined. However, Virtualised NonStop and private cloud have also brought about further innovations to the NonStop SQL database. There were some key goals for the Database Services enhancements in NonStop SQL and these are applicable whether a converged or virtualised NonStop deployment is being used.
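The mixed workload behaviour described above, where long-running work gives way whenever higher priority work is waiting, can be sketched with a simple priority queue. This is an illustrative model only: the `ReadyQueue` class, the priority values and the idea of a fixed "unit" of work are assumptions, not the actual NonStop scheduling algorithm.

```python
import heapq
import itertools


class ReadyQueue:
    """Hypothetical ready queue: larger priority runs first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def submit(self, priority: int, name: str, units: int) -> None:
        heapq.heappush(self._heap, (-priority, next(self._seq), name, units))

    def run(self, arrivals=None):
        """Run one unit of work at a time, rechecking the queue after each unit."""
        arrivals = dict(arrivals or {})   # step number -> (priority, name, units)
        trace = []
        step = 0
        while self._heap or arrivals:
            if step in arrivals:
                self.submit(*arrivals.pop(step))
            if self._heap:
                neg_prio, _, name, units = heapq.heappop(self._heap)
                trace.append(name)
                if units > 1:
                    # Unfinished work goes back on the queue, so anything of
                    # higher priority that has arrived will run before it.
                    heapq.heappush(self._heap, (neg_prio, next(self._seq), name, units - 1))
            step += 1
        return trace


queue = ReadyQueue()
queue.submit(priority=10, name="bi_query", units=4)
trace = queue.run(arrivals={2: (200, "oltp_txn", 1)})
print(trace)  # ['bi_query', 'bi_query', 'oltp_txn', 'bi_query', 'bi_query']
```

The long query still completes, but the response time critical transaction waits for at most one unit of lower priority work.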
- Self Service for rapid provisioning of database instances with all the attributes of connectors, security and storage allocation
- Agile deployment with refactoring capabilities through Liquibase support (e.g. schema version management)
- Use of both NonStop and standardised DBA tools for deployment, management and analytics (e.g. SQLXpress, NSDA and DB Visualizer)
- Multi-tenancy taking advantage of the scalability, parallelism and fault tolerance (multiple database instances with their own isolation, security and resources but with a single license and support fee saving up to 50% of the cost of ownership)
- Compatibility Layer (assisting application ports where the previous database access was not using a full ANSI SQL standard)
- Third party tools like Ispirer SQLWays code conversion
- Analytics partners such as STRIIM for continuous ingestion of data into big data analytics (as well as open source tools access such as Kafka and ELK for log consolidation, dashboards and AI)
Reflecting upon the changes in IT since I entered the industry, in many respects there has been evolution rather than revolution and this has been driven primarily by the reduction in size and cost of hardware whilst increasing the underlying performance. This miniaturisation, performance and dramatic fall in cost has allowed software developers to exploit computerisation for new solutions and drive the use of client interfaces ever more into public touch points. Time share and partitioning were solutions seen over 40 years ago and in some senses can be seen as the forerunners of cloud and virtualisation.
As client solutions and computerisation have become ubiquitous, the need to compute 24×7 with ever increasing concurrent workloads is being seen. Time to market and time to value are driving the agility aspect of development and the various languages and frameworks being used. Given that not all the initiatives succeed, repurposing hardware is another driver, which is where virtualisation and cloud play key parts.
The challenge all this fast paced software development and deployment faces is that often these initiatives start on the lowest common denominator. Clever architectures and “plumbing” various tiers together can provide some ability to support higher availability and scalability as the applications become adopted in a commodity “one size fits all” approach. This comes at a high cost of complexity, requiring enterprise licensing for higher level capability, software components that are cluster aware and re-architecting of applications to overcome roadblocks. The many tiered architecture also introduces potential points of security vulnerability. As mission criticality for true 24×7 and the ability to scale easily and potentially on-line become more important, the complexity of using the lowest common denominator hardware solutions becomes more costly and more of an inhibitor. This is particularly true of the database tier.
HPE NonStop has a long history of proven ability to support some of the highest demands of scalability and availability. The application components are tightly coupled with data integrity for transaction processing and client facing touch points. The application architecture was ahead of its time, being built in a client/server modular way and exploiting the parallelism of the underlying MPP hardware. This has served well as IT moved towards thick client and then the internet whilst still providing a fault tolerant software stack, which has transparent take-over by a backup process rather than the non-transparent failover seen in HA clustering (whether using traditional or modern development languages). This is as true of the NonStop database as it is of the underlying TS/MP application tier and of course all these can be part of a single system image, reducing complexity and points of security vulnerability.
The recent NonStop X x86-based innovations of deployment as a virtualised system in a private cloud, core and dynamic capacity licensing, on-line disc process migration (impacting the data access managers) and database services bring with them all the fault tolerance and parallel scalability of the NonStop heritage. Together with the cost benefits of multiple isolated multi-tenant database instances, all within one license and support cost, this radically reduces NonStop SQL cost of ownership compared to merchant solutions and avoids costly application architecture redesign, since you do not have to compromise between scalability and availability. What is more, the lack of complexity provides consistency of service levels, and the “out of the box” scalability and fault tolerance avoid the need for complex and costly re-integration and performance re-testing when software revisions change. Typical tools DBAs use such as DB Visualizer can be used with NonStop SQL, minimising retraining, and analytics from NSDA help with managing and tuning queries.
The benefits of simplicity in allowing developers, DBAs and systems managers to focus on innovation rather than maintenance cannot be overstated. HPE NonStop, as a fault tolerant MPP system, is as relevant to highly available and scalable transaction processing today as it ever has been and could be said to have always been ahead of its time.
Article written by Iain Liston-Brown August 2019 with acknowledgements to Moore Ewing and Justin Simonds