Designing and Implementing Robust Solutions
No matter where you live, you have most likely heard about the Iowa Caucus fiasco here in the USA this week. This is not going to be a political post, but what happened is something I have seen too many times in all my years working both on the dev and IT sides.
There are numerous topics that could be discussed in greater detail – the failure of redundancy (the availability guy in me is screaming to talk about this), the lack of testing (my QA background is also yelling loudly), but all of this points to a larger culprit: designing the solution properly to have the performance, availability, reliability, scalability, and security required. Do not forget the solution has to be usable by end users and easily maintained, not to mention tested. The word “robust” is a good single word to sum it all up. According to this 9to5Mac post that has a good, concise summary of what is known to date (and a link to a bigger New York Times article), most of that didn’t happen or just flat out failed. Then there is accounting for data inconsistencies, which, according to the New York Times, is a coding problem in the app. ArsTechnica also did a good writeup.
The Iowa Democratic Party issued a statement. It highlights ALL of what I said above, but especially around testing. The fact that the data collected was right but the reporting was incorrect is damning. As a data person, that makes me angry. Did the reports actually meet requirements, and did they test with real (or at least simulated real) data? So many things …
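To make the "did they test the reports" question concrete, here is a minimal sketch of a reconciliation check: recompute the totals straight from the raw rows and compare them with what the report claims. Everything in it is an assumption for illustration (the function names `recompute_totals` and `reconcile`, the row layout, and the sample numbers); it says nothing about how the actual app was built.

```python
from collections import Counter


def recompute_totals(raw_rows):
    """Sum counts per category directly from raw (site, category, count) rows."""
    totals = Counter()
    for _site, category, count in raw_rows:
        totals[category] += count
    return dict(totals)


def reconcile(raw_rows, reported_totals):
    """Return {category: (expected, reported)} for every total that disagrees."""
    expected = recompute_totals(raw_rows)
    mismatches = {}
    for category in sorted(set(expected) | set(reported_totals)):
        exp = expected.get(category, 0)
        rep = reported_totals.get(category, 0)
        if exp != rep:
            mismatches[category] = (exp, rep)
    return mismatches


# Hypothetical sample data: the raw rows are correct, one report is not.
raw_rows = [("S1", "A", 10), ("S1", "B", 7), ("S2", "A", 5), ("S2", "B", 9)]
good_report = {"A": 15, "B": 16}
bad_report = {"A": 15, "B": 12}

print(reconcile(raw_rows, good_report))
print(reconcile(raw_rows, bad_report))
```

A check like this is cheap to run against production-like data before go-live, and an empty result is the only acceptable one.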
You may ascertain this is not my first rodeo. Whether you buy a packaged application, roll your own, or customize an off-the-shelf package, what worked then may not work now. A solution built for a few users behaves differently at scale or under a very different workload. How you handle 100 rows is very different from how you handle millions or billions. That is not necessarily a failure of design, because many applications were designed with what was known at the time. Sometimes things grow organically. Other times some aspects are overlooked with the “we’ll deal with that later” mentality. It is always better to get it right (or as close to it as possible …) the first time around. Retrofitting, or worse, replacing existing solutions is expensive in every sense of the word and may cause outages along the way. Hire smart people (FTEs or consultants) to help you. It is not a sign of weakness to admit you may not know what you don’t know and you need assistance. What does right look like? It would include, but is not limited to:
- Proper application design
- Proper schema design
- Accounting for RTOs and RPOs for all components
- Testing using production (or production-like) data, masked if necessary – functional, at scale (“works on my machine with no load” is not testing at scale …), and full business continuity
- Securing everything but not hampering functionality and scalability while doing so (if possible)
- A monitoring and overall management strategy that works including backup and restore
- Taking any lessons learned and driving them into improvements
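As one small illustration of the RPO and monitoring items in the list above, the sketch below flags any component whose most recent backup is older than its recovery point objective, or that has no backup at all. The component names, RPO values, timestamps, and the `rpo_violations` function are all hypothetical; real targets come from the business, and real backup times come from your backup tooling.

```python
from datetime import datetime, timedelta


def rpo_violations(last_backup_times, rpo_by_component, now):
    """Return {component: backup_age} for components outside their RPO.

    A component with no recorded backup maps to None.
    """
    violations = {}
    for component, rpo in rpo_by_component.items():
        last = last_backup_times.get(component)
        if last is None:
            violations[component] = None      # never backed up
        elif now - last > rpo:
            violations[component] = now - last  # backup is too old
    return violations


# Hypothetical targets and observations, for illustration only.
now = datetime(2020, 2, 7, 12, 0)
rpo_by_component = {
    "database": timedelta(minutes=15),
    "file_share": timedelta(hours=24),
    "app_config": timedelta(hours=24),
}
last_backup_times = {
    "database": datetime(2020, 2, 7, 11, 50),   # 10 minutes old: fine
    "file_share": datetime(2020, 2, 5, 9, 0),   # 51 hours old: too old
    # "app_config" has no backup recorded at all
}

for component, age in rpo_violations(last_backup_times, rpo_by_component, now).items():
    print(component, "no backup found" if age is None else f"backup is {age} old")
```

This covers only the RPO side. Whether a restore finishes inside the RTO can only be proven by actually restoring and timing it.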
All of this is valid whether you are fully on premises (physical or virtual), in a public cloud, or using a hybrid which bridges both. Choice of technology or platform does not matter. All of the fundamental principles are the same. Crap in, crap out. The cloud will not fix your poorly designed application, nor will throwing more resources at it often resolve all – if any – issues. At some point you need to take a holistic approach; touching one thing has tentacles elsewhere. It is rare that this is not the case.
The right design is always an end-to-end approach, no matter if it is an existing implementation or something new. The application matters just as much as the backend. Over the years, we’ve helped customers achieve their performance, availability, and security goals by helping them implement solutions that work. SQLHA has worked with some of the largest systems on the planet. If you are not a large company, bank, or government, that does not mean your solution needs any less robustness or aspects of mission criticality. We do this with companies of ALL sizes.
If you want to avoid scenarios like the Iowa Caucus, contact us today. We’ve been there, done that and will roll up our sleeves to get you where you need to go.