
Sunday, July 1, 2012

ENSM Products are Commodities?

On your quest to put in network and systems management capabilities, you have to factor in several explicit and implicit considerations related to your end goals. It's easy to go to your framework vendor of choice, break out the Bill of Materials spreadsheet, sit down with the salesperson, and walk through the elements you need for your environment. But that exercise may be filled with hidden challenges, and some challenges are harder to overcome than others once you have signed the check.

Don't forget, these products don't magically install and run themselves. They take care and feeding. Some more than others. And the more complex the product, the more complex it is to figure out what happened when something goes awry.

Sounds so easy! After all, all of these products are commodities. And buying from a single vendor gives you a single point of support... and blame. In effect, a single "throat to choke".  NOTHING could be further from the truth!

Most of the big vendors' product frameworks are aggregations and conglomerations of acquired products, some overlapping, assembled into what looks like a somewhat unified solution. In many cases, it is only after you buy the product framework that you discover things like different products shipping different portals that don't effectively integrate with each other. Or you may find the northbound interface of one product is a kludge that only loosely fits the two products together. Or two products require competing Java versions.

Some vendors' product suites have become more and more complex as new releases are GAed. In many cases, these new levels of complexity have a profound impact on your ability to install, administer, or diagnose issues as they arise.

First up - Where are your requirements? Do you know the numbers and types of elements in your environment? What about the applications? How do these apply to Service Level Agreements? Do you have varying levels of maintenance and support for the components in your environment?

Do you know who the users will be? Have you defined your support model? Which groups need access to which elements of information? Have you prepared a proposed workflow of how users, managers, and even customers are going to interact with the new capabilities?

Who is going to take care of the management systems and applications? Have you aligned your organization to be successful in deployment? Do you have the skill sets? Do you have adequate skills coverage?

Have you defined the event flow? What about performance reporting needs and distribution? And ad hoc reporting needs? Have you defined any baseline thresholds?

Do you have SNMP access? What about ICMP? SSH? Have you considered the implications of management traffic across your security zones?
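
Those last questions are worth testing before any product lands. Here is a minimal pre-flight sketch, run from a prospective polling server, that checks ICMP and SNMP reachability across your security zones. The target addresses and community string are placeholders, and it assumes the standard ping and Net-SNMP snmpget command-line tools are installed:

```python
#!/usr/bin/env python3
"""Pre-flight check for management access: ICMP and SNMP reachability.

A minimal sketch -- the hosts and community string below are placeholders;
it shells out to the standard `ping` and Net-SNMP `snmpget` CLI tools.
"""
import subprocess

TARGETS = ["10.0.0.1", "10.0.0.2"]      # hypothetical device addresses
COMMUNITY = "public"                    # replace with your read community
SYS_DESCR = "1.3.6.1.2.1.1.1.0"         # sysDescr.0 -- a safe test OID

def icmp_ok(host: str) -> bool:
    """One ICMP echo; returns True if the device answered."""
    return subprocess.run(["ping", "-c", "1", "-W", "2", host],
                          capture_output=True).returncode == 0

def snmp_ok(host: str) -> bool:
    """One SNMP v2c GET of sysDescr; returns True on a valid response."""
    return subprocess.run(["snmpget", "-v2c", "-c", COMMUNITY,
                           "-t", "2", "-r", "1", host, SYS_DESCR],
                          capture_output=True).returncode == 0

if __name__ == "__main__":
    for host in TARGETS:
        print(f"{host}: ICMP={'ok' if icmp_ok(host) else 'BLOCKED'} "
              f"SNMP={'ok' if snmp_ok(host) else 'BLOCKED'}")
```

If either check comes back BLOCKED across a security zone, you have a firewall conversation to schedule before the Sales person's install estimate means anything.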

Product Choices


While there is a plethora of choices available to you, many buyers do not want to go through the hassle of doing due diligence. But be forewarned: failure to do due diligence can wreak mayhem in your environment. I know, the big guns say "our product works at your competitors'," but does it really? Do you know? And is your competition really that undifferentiated from you? (That may not be a good thing!)

When you go through product selection, you need to understand the support needed to administer the new management applications. Do you need specialists just to install it? What about training? Are you going to need other resources like Business Intelligence Analysts, Web Developers, Database Administrators, Script Developers, or even additional Analysts or Engineers?

Here are some signs you may experience:

If the product takes longer than a couple of days to install and integrate, here's your sign.

If two or more products in your big vendor product suite need a significant amount of customization to work together, here's your sign.

If the installation document for the product deviates from the actual installation, here's your sign.

If you discover during the installation that you actually have to install additional products, here's your sign.

If you end up realizing that the recommended hardware specs are either overkill or under-specced, here's your sign.

If you end up having to deal with libraries and utilities that are not included or resolved by the product installation, here's your sign.

Missed it by THAT much!

If you find yourself opening support tickets in the middle of the installation, here's your sign.

If you find that the product breaks your security model AFTER you do the installation, here's your sign.

If it takes vendor-specific Engineering to install the product, here's your sign.

If you cannot see value in the first day after the installation of a product, here's your sign.

If you find that you need to restructure and build out your support team AFTER the installation, here's your sign.

Systems Management


Systems Management brings whole new challenges to your environment.  Some of the things you need to evaluate up front are:


  • Agent deployment - Level of difficulty? OS coverage? Consistent data across agents? Manual, automatic, or distributable?
  • Agent-less - Browser specific? Adequate coverage? Full transactions? Handles redirection?
  • Agent run time - Resource utilization, memory footprint, stability, security.
  • Data collection - Pull or push model? Resiliency? Effect on run-time resources?
  • External restrictions - Java versions? Perl versions? Python versions?
  • Adequate application coverage?
  • Thresholds - Level of difficulty? Binary only, or degrees of utilization/capacity/performance? Stateful? Dynamic thresholds? Northbound traps already defined, or do you have to define your own? (See the sketch after this list.)
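
To make the thresholds item concrete, here is a rough sketch of what "degrees" and "stateful" look like in practice: severity bands instead of a single binary trip point, and an event only on a state change rather than on every poll. The levels and metric name are illustrative, not from any particular product:

```python
"""Sketch of a stateful, multi-level threshold check -- the kind of
behavior worth probing for during evaluation. Levels are illustrative."""

LEVELS = [(90.0, "critical"), (75.0, "major"), (60.0, "warning")]

def severity(value: float) -> str:
    """Map a utilization percentage to a severity degree."""
    for threshold, name in LEVELS:
        if value >= threshold:
            return name
    return "clear"

class StatefulThreshold:
    """Emits an event only when the severity *changes*, instead of
    re-alarming on every poll cycle."""
    def __init__(self):
        self.state = "clear"

    def evaluate(self, value: float):
        new_state = severity(value)
        if new_state != self.state:
            event = f"cpu_util {self.state} -> {new_state} at {value:.1f}%"
            self.state = new_state
            return event        # hand this to your northbound interface
        return None             # no state change, no event

if __name__ == "__main__":
    t = StatefulThreshold()
    for sample in [42, 65, 68, 91, 88, 40]:
        evt = t.evaluate(sample)
        if evt:
            print(evt)
```

A product that can only do the binary version of this, or that re-alarms on every poll, pushes that logic onto you and your event console.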

Summary


Enterprise Management does not have to be that difficult.   There are products out there that work very well for what they do and are easy to deploy and maintain. For example, go do an OpenNMS installation.  Even though OpenNMS runs on just about any platform (a testament to their developer community and product maturity), you go to their wiki page http://www.opennms.org/documentation/installguide.html , pick out your platform of choice, and follow the procedure.  Most of the time, you are looking at maybe an hour. In an hour, you're starting discovery and picking up inventory to monitor and manage.

Solarwinds isn't too bad either. Nice, clean install on Windows.

Splunk is awesome and up and running in no time. http://www.splunk.com/

Hyperic HQ wasn't a bad installation either. Pretty simple. However, it is time sensitive on the agents. Kind of thick (I think it's the Struts), Java-wise. http://www.hyperic.com/

eG Innovations is cake. One agent everywhere for OS and applications. Handles VMware, Xen, and others. And the UI is straightforward. A ton of value across both system and application monitoring and performance. http://www.eginnovations.com/

Appliance based solutions take a bit more time in the planning phase up front but take the sting out of installation.  Some of these include:

http://www.sevone.com/ (SevOne does offer a software download for evaluation)
http://www.sciencelogic.com/
http://www.loglogic.com/ (They also offer a virtual appliance download)

One solution I dig is Tavve ZoneRanger for solving those access issues like UDP/SNMP across firewalls, SSH access across a firewall, etc., without having to run through proxies upon proxies, and still maintain consistent auditing and logging. It deploys as an appliance or virtual appliance. http://www.tavve.com/

Another option you may consider is hosted applications. ServiceNow is easy to deploy because it is a hosted solution. http://www.servicenow.com/

Monday, April 9, 2012

Product Quality Dilemma

All too often, we have products that we have bought, put in production, and attempted to work through the shortcomings and obfuscated abnormalities prevalent in so many products. (I call these product "isms," and I use the term to describe specific product behaviors or personalities.) Over this long-lived life cycle, changes, additions, deprecations, and behavior shifts accumulate. Whether it's fixing bugs or rewriting functions as upgrades or enhancements, things happen.

All too often, developers tend to think of their creation in a way that may be significantly different from the deployed environments it goes into. It's easy to get stuck in microcosms and walled-off development environments. Sometimes you miss the urgency of need, the importance of the functionality, or the sense of mandate around the business.

With performance management products, it's all too easy to just gather everything and produce reports ad nauseam. With an overwhelming level of output, it's easy to get caught up in the flash, glitz, and glamour of fancy graphs, pie charts, bar charts... even Ishikawa diagrams!

All this is a distraction from what the end user really, really NEEDS. I'll take a shot at outlining some basic requirements pertinent to all performance management products.

1. Don't keep trying to collect on broken access mechanisms.

Many performance applications continue to collect, or attempt to collect, even when they haven't had any valid data in several hours or days. It's crazy, as all of the errors just get in the way of valid data. And some applications will continue to generate reports even though no data has been collected! Why?

SNMP authentication failures are a HUGE clue that your app is wasting resources on something simple. Listening for ICMP Source Quench messages will tell you if you're hammering end devices.
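
As an illustration of the behavior I'm arguing for, here is a rough sketch of a collector that quarantines a target after repeated failures instead of hammering it forever. The failure threshold and retry interval are arbitrary assumptions:

```python
"""Sketch of collector backoff: stop polling a broken target instead of
flooding it (and your logs). Thresholds here are illustrative."""
import time

MAX_FAILURES = 5      # consecutive failures before we quarantine
RETRY_AFTER = 3600    # seconds before giving a quarantined target another try

class Target:
    def __init__(self, host: str):
        self.host = host
        self.failures = 0
        self.quarantined_until = 0.0

def record_result(t: Target, ok: bool) -> None:
    """Call after every poll attempt."""
    if ok:
        t.failures = 0
        return
    t.failures += 1
    if t.failures >= MAX_FAILURES:
        # Stop wasting poller cycles; log once and revisit later.
        t.quarantined_until = time.time() + RETRY_AFTER
        print(f"{t.host}: quarantined after {t.failures} straight failures")

def should_poll(t: Target) -> bool:
    """The poll scheduler skips quarantined targets until the retry time."""
    return time.time() >= t.quarantined_until
```

The point is not this particular scheme; it's that the product should do *something* other than collect errors forever and report on nothing.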

2. Migrate away from mass produced reports in favor of providing information.

If no one is looking at the reports, you are wasting cycles, hardware, and personnel time on results that are meaningless.

3. If you can't create new reports without code, it's too complicated.

All too often, products want to put glue code or even programming environments / IDEs in front of your reporting. Isn't it a stretch to assume that a developer will be the reporting person? Most of the time it's someone more business-process oriented.

4. Data and indexes should be documented and manageable. If you have to BYODBA (Bring Your Own DBA), the wares vendor hasn't done their homework.

How many times have we loaded up a big performance management application, only to find out we have to do a significant amount of work tuning the data and the database parameters just to get the app to generate reports on time?

And you end up having to dig through the logs to figure out what works and what doesn't.

If you know what goes into the database, why not put in indexes, checks and balances, and even recommended functions for when expansion occurs?

In some instances, databases used by performance management applications are geared toward polling and collection rather than the reporting of information. In many cases, one needs to build data derivatives of multiple elements in order to facilitate information presentation. For example, a simple dynamic thresholding mechanism is to take a sample of a series of values and derive an average, root mean square, and standard deviation.
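
Here is a minimal sketch of that derivation; the window size and the three-sigma band are illustrative choices, not a prescription:

```python
"""Dynamic thresholding sketch: keep a window of samples, derive the mean,
RMS, and standard deviation, and flag values that stray from the baseline."""
from collections import deque
from math import sqrt
from statistics import mean, pstdev

WINDOW = 96  # e.g. one day of 15-minute samples

class Baseline:
    def __init__(self):
        self.samples = deque(maxlen=WINDOW)

    def add(self, value: float) -> bool:
        """Store the sample; return True if it breached the dynamic band."""
        breach = False
        if len(self.samples) == self.samples.maxlen:
            avg = mean(self.samples)
            sd = pstdev(self.samples)
            breach = abs(value - avg) > 3 * sd   # three-sigma band
        self.samples.append(value)
        return breach

    def rms(self) -> float:
        """Root mean square of the current window."""
        return sqrt(mean(x * x for x in self.samples)) if self.samples else 0.0

if __name__ == "__main__":
    b = Baseline()
    for v in range(200):
        b.add(50 + (v % 5))          # steady traffic builds the baseline...
    print("spike flagged:", b.add(500))  # ...then a spike breaches the band
```

If the application's schema makes this kind of derivative table painful to build and refresh, that tells you the database was designed for the poller, not for you.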

If a reporting person has to do more than one join to get to their data elements, your data needs to be better organized, normalized, and accessible via either derivative tables or a view. Complex data access mechanisms tend to alienate BI and Performance/Capacity Engineers. They would rather work the data than work your system.
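
To illustrate the one-join rule, here is a sketch using SQLite that pre-joins polling-oriented tables into a single reporting view; all table and column names are hypothetical:

```python
"""Sketch: pre-join the polling tables into a flat reporting view so the
report writer never has to navigate the schema."""
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device (device_id INTEGER PRIMARY KEY, name TEXT, site TEXT);
    CREATE TABLE iface  (iface_id INTEGER PRIMARY KEY, device_id INTEGER, ifname TEXT);
    CREATE TABLE sample (iface_id INTEGER, ts TEXT, util_pct REAL);

    INSERT INTO device VALUES (1, 'core-rtr-01', 'DC-East');
    INSERT INTO iface  VALUES (10, 1, 'Gi0/0');
    INSERT INTO sample VALUES (10, '2012-07-01T12:00', 87.5);

    -- The reporting view: everything a report writer needs, zero joins on their side.
    CREATE VIEW v_interface_util AS
    SELECT d.name AS device, d.site, i.ifname, s.ts, s.util_pct
    FROM sample s
    JOIN iface  i ON i.iface_id  = s.iface_id
    JOIN device d ON d.device_id = i.device_id;
""")

# The report writer's world is now one flat SELECT:
for row in conn.execute("SELECT * FROM v_interface_util WHERE util_pct > 80"):
    print(row)   # ('core-rtr-01', 'DC-East', 'Gi0/0', '2012-07-01T12:00', 87.5)
```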

5. If the algorithm is too complex to explain without a PhD, it is neither usable nor trustworthy.

There are a couple of applications that use patented algorithms to extrapolate bandwidth, capacity, or effective usage. If you haven't simplified the explanation of how it works, you're going to alienate a large portion of your operations base.

6. If an algorithm or method is held as SECRET, it works just until something breaks or is suspect. Then your problem is a SECRET too!

Secrets are BAD. Cisco publishes all of its bugs online specifically because it eliminates the perception that they are keeping something from the customer.

If one remembers Concord's eHealth Health Index...  In the earlier days, it was SECRET SQUIRREL SAUCE. Many an Engineer got a bad review or lost their job because of the arrogance of not publishing the elements that made up the Health Index.

7. Be prepared to handle BI types of access: bulk transfers, ODBC and Excel/Access replication, ETL tool access, etc.

If Engineers are REALLY using your data, they want to use it in their own applications, their own analysis work, and their own business activities. The more useful your data is, the more embedded and valuable your application is. Provide shared tables, timed transfers, transformations, and data dumps.
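
As one example of a timed transfer, here is a sketch that dumps a reporting table to CSV for Excel, Access, or an ETL tool to pick up; the database path, table name, and schedule are hypothetical:

```python
"""Sketch of a timed bulk dump: one reporting table out to CSV."""
import csv
import sqlite3

def dump_table(db_path: str, table: str, out_path: str) -> int:
    """Dump one table to CSV with a header row; returns the row count."""
    conn = sqlite3.connect(db_path)
    cur = conn.execute(f"SELECT * FROM {table}")  # table name is trusted here
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(col[0] for col in cur.description)
        rows = cur.fetchall()
        writer.writerows(rows)
    conn.close()
    return len(rows)

# Run from cron (or Task Scheduler) for the "timed" part, e.g.:
#   15 * * * * /usr/bin/python3 dump.py
```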

8. Reports are not just a graph on a splash page or a table of data. To Operations personnel, a report means text and formatting around the graphs, charts, tables, and data, relating the operational aspects of the environment to the illustrations.

9. In many cases, you need to transfer data in a transformed state from one reporting system to another. Without ETL tools, your reporting solution kind of misses the mark.

Think about this... You have configuration data, and you need this data in a multitude of applications: Netcool, your CMDB, your Operational Data Store, your discovery tools, your ticketing system, your performance management system. And it seems that each of these consumers may want text, XML, databases of various forms and flavors, even HTML. How do you get the data transformed from one place to another?
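
For illustration, here is a sketch of that transform step: the same configuration records emitted in the shapes different consumers expect, with CSV, JSON, and XML standing in for your CMDB, ODS, and ticketing loads. All names and records are made up:

```python
"""Sketch of the transform step: one set of configuration records,
three output shapes for three hypothetical consumers."""
import csv
import io
import json
import xml.etree.ElementTree as ET

DEVICES = [  # pretend this came from your discovery tool
    {"name": "core-rtr-01", "ip": "10.0.0.1", "model": "ASR-9001"},
    {"name": "dist-sw-07",  "ip": "10.0.1.7", "model": "EX4300"},
]

def to_csv(records) -> str:
    """CSV for, say, a CMDB bulk load."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def to_xml(records) -> str:
    """XML for, say, a ticketing system import."""
    root = ET.Element("devices")
    for rec in records:
        ET.SubElement(root, "device", rec)  # record fields become attributes
    return ET.tostring(root, encoding="unicode")

print(to_csv(DEVICES))
print(json.dumps(DEVICES, indent=2))        # JSON for the ODS
print(to_xml(DEVICES))
```

A reporting product with real ETL hooks does this for you; without them, every one of these transforms becomes a one-off script someone has to maintain.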

10. If you cannot store, archive, and activate polling, collection, threshold, and reporting configurations accurately, you will drive away customization.

As soon as a data source becomes difficult to work with, it gets in the way of progress. In essence, what happens is that when a data source becomes difficult to access, it quits being used beyond its own internal function. When this occurs, you start seeing separation and duplication of data.

The definitions of the data can also morph over time. When this occurs and the data is shared, you can correct it pretty quickly. When data is isolated, many times the problem just continues until it's a major ordeal to correct. Reconciliation can be rough when there are a significant number of discrepancies.

Last but not least - if you develop an application and you move the configuration from test/QA to production and it does not stay EXACTLY the same, YOUR APPLICATION is JUNK. It's dangerous, haphazard, incredibly short-sighted, and should be avoided at all costs. Recently, I had a dear friend develop, test, and validate a performance management application upgrade. After a month in test and QA running many different use case validations, it was put into production. Upon placement into production, the application overrode the pared-down configurations with defaults, polled EVERYTHING, and caused major outages and major consternation for the business. In fact, heads rolled. The business lost customers. People were terminated. And a lot of manpower was expended trying to fix the issues.

In no uncertain terms, I will never let my friends and customers be caught by this product.

Thursday, May 20, 2010

Product Evaluations...

I consider product competition as a good thing. It keeps everyone working to be the best in breed, deliver the best and most cost effective solution to the customer, and drives the value proposition.

In fact, in product evaluations I like to pit vendors' products against each other so that my end customer gets the best and most cost-effective solution. For example, I use capabilities that may not have been in the original requirements to further refine the customer's requirements. If they run across something that makes their life better, why not leverage that in my product evaluations? In the end, I get a much more effective solution, and my customer gets the best product for them.

When choosing between using internal resources to develop a capability and using an outside, best-of-breed solution, a danger exists: if you grade the internally developed product on a curve, you take away competition and ultimately the competitive leadership associated with a best-of-breed product implementation.

It is too easy to start minimizing requirements to the bare necessities and to further segregate these requirements into phases. When you do, you lose the benefit of competition, and you lose the edge you get when you tell the vendors to bring the best they have.

It's akin to looking at the problem space and asking what is the bare minimum needed to do this, versus asking what is the best solution for this problem set. Two completely different approaches.

If you evaluate on bare minimums, you get bare minimums. You will always be behind the technology curve in that you will never consider new approaches, capabilities, or technology in your evaluation. And your customer is always left wanting.

It becomes even more dangerous when you evaluate internally developed product versus COTS: if you apply the minimum-curve grading only to the internally developed product, the end customer gets only bare-minimum capabilities within the development window. No new capabilities. No new technology. No new functionality.

It is not a fair and balanced evaluation anyway if you only apply bare minimums. I want the BEST solution for my customer. Bare minimums are not the BEST for my customer. They are best for the development team, because now they don't have to be the best. They can slow down innovation through development processes. And the customer suffers.

If you're using developers in house, it is an ABSOLUTE WASTE of company resources and money to develop commodity software that does not provide clear business discriminators. Free is not a business discriminator, in that FREE doesn't deliver any new capabilities - capabilities that commodity software doesn't already have.

Inherently, there are two mindsets that evolve: you either take away from the customer or empower them. A Gatekeeper or a Provider.

If you do bare minimums, you take away capabilities that the customer wants, simply because they fall outside the bare minimum.

If you evaluate on Best of Breed, you ultimately bring capabilities to them.