Wednesday, May 26, 2010

Tactical Integration Decisions...

Inherently, I am a pre-cognitive Engineer. I think about and operate in a future realm. It is the way I work best and achieve success.

Recently I became aware of a situation where a commercial product replaced an older, in-house developed product. This new product has functionality and capabilities well beyond those of the "function" it replaced.

During the integration effort, it was realized that a bit of event / threshold customization was needed on the SNMP Traps front in order to get Fault Management capabilities into the integration.

In an effort to take a short cut, it was determined that they would adapt the new commercial product to the functionality and limitations of the previous "function". This is bad for several reasons:

1. You limit the capabilities of the new product going forward to those functions that were in the previous function. No new capabilities.

2. You taint the event model of the commercial product with that of the legacy function. Now all event customizations have to deal with old OIDs and varbinds (see the sketch after this list).

3. You limit the supportability and upgrade-ability of the new product. Now, every patch, upgrade, and enhancement must be transitioned back to the legacy methodology.

4. It defies common sense. How can you "let go" of the past when you readily limit yourself to the past?

5. You assume that this product cannot provide any new value to the customer or infrastructure.
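To make point 2 concrete, here's roughly the kind of shim that decision forces you to write and carry forever. This is my own sketch, not the actual integration - every OID and field name below is invented:

    # Hypothetical sketch of the anti-pattern in point 2: forcing the new product's
    # richer traps back into the legacy function's OIDs and varbind layout.
    # Every OID and field name here is made up for illustration.

    LEGACY_TRAP_OID = "1.3.6.1.4.1.99999.1.0.1"      # the old in-house trap
    NEW_TO_LEGACY_VARBINDS = {
        "deviceName": "1.3.6.1.4.1.99999.1.1.1",     # legacy varbind slots
        "alarmText":  "1.3.6.1.4.1.99999.1.1.2",
        "severity":   "1.3.6.1.4.1.99999.1.1.3",
    }

    def downgrade_trap(new_trap: dict) -> dict:
        """Strip the new product's event down to what the legacy function understood."""
        legacy = {"trapOID": LEGACY_TRAP_OID, "varbinds": {}}
        for field, legacy_oid in NEW_TO_LEGACY_VARBINDS.items():
            legacy["varbinds"][legacy_oid] = new_trap.get(field, "")
        # Everything else the new product can tell us (root-cause hints, thresholds,
        # correlation keys) is silently dropped -- which is exactly the problem.
        return legacy

Every patch and upgrade to the new product now has to be re-verified against this shim, which is point 3 in a nutshell.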

You can either do things right the first time or you can create a whole new level of work that has to be done over and over. People that walk around backwards are resigned to the past. They make better historians than Engineers or Developers. How do you know where to go if you can't see anything but your feet or the end of your nose?

Saturday, May 22, 2010

Support Model Woes

I find it ironic that folks claim to understand ITIL Management processes yet do not understand the levels of support model.

Most support organizations have multiple tiers of support. For example, there is usually a Level 1, which is the initial interface toward the customer. Level 2 is usually triage or technical feet on the street. Level 3 is usually Engineering or Development. In some cases, Level 4 is used to denote on-site vendor support or third-party support.

In organizations where Level 1 does dispatch only or doesn't follow problems through with the customer, the customer ends up owning the problem and following it through to solution. What does this say to customers?

- They are technically equal to or better than Level 1.
- They become accustomed to automatic escalation.
- They lack confidence in the service provider.
- They look for specific Engineers versus following process.
- They build organizations specifically to follow and track problems through to resolution.

If your desks do dispatch only, event management systems are only used to present the initial event leading up to a problem. What a way to render Netcool useless! Netcool is designed to display the active things that are happening in your environment. If all you ever do is dispatch, why do you need Netcool? Just generate a ticket straight away. No need to display it.

What? Afraid of rendering your multi-million dollar investment useless? Why leave it in a dysfunctional state of semi-uselessness when you could save a significant amount of money getting rid of it? Just think, every trap can become an object that gets processed like a "traplet" of sorts.
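Taken to its logical end, a dispatch-only desk reduces the entire event layer to something like this. Tongue firmly in cheek, and the ticketing call is a made-up stand-in, but that's all the value being extracted:

    import itertools

    _ticket_counter = itertools.count(1)

    def create_ticket(summary: str, severity: int) -> str:
        """Stand-in for a real ticketing API call (hypothetical)."""
        return f"INC{next(_ticket_counter):07d} sev{severity} {summary}"

    def on_trap(trap: dict) -> str:
        """Dispatch-only handling: trap in, ticket out, no event console involved."""
        summary = f"{trap.get('node', 'unknown')}: {trap.get('alarmText', 'event received')}"
        return create_ticket(summary, severity=trap.get("severity", 3))

    print(on_trap({"node": "rtr-01", "alarmText": "linkDown ge-0/0/1", "severity": 2}))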

One of the first things that crops up is that events have a tendency to be dumb. Somewhere along the way, somebody had the bright idea to put an agent into production that is "lightweight" - meaning no intelligence or correlation built in. It's so easy to do little or nothing, isn't it? Besides, it's someone else's problem to deal with the intelligence. In the vendor's case, agent functionality is many times an afterthought.

Your model is all wrong. And until the model is corrected, you will never realize the potential ROI of what the systems can provide. You cannot evolve because you have to attempt to retrofit everything back to the same broken model. And when you automate, you automate the broken as well.

Here's the way it works normally. You have 3 or 4 levels of support.

Level 1 is considered first line and they perform the initial customer engagement, diagnostics and triage, and initiate workflow. Level 1 initiates and engages additional levels of support, tracking things through to completion. In effect, Level 1 owns the incident / problem management process but also provides customer engagement and fulfillment.

Level 2 is specialized support for various areas like network, hosting, or application support. They are engaged through Level 1 personnel and are matrixed to report to Level 1, problem by problem, such that they empower Level 1 to keep the customer informed of status and timelines, set expectations, and answer questions.

Level 3 is engaged when the problem goes beyond the technical capabilities of Levels 1 and 2, or requires project, capital expenditure, architecture, and planning support.

Level 4 is reserved for Vendor support or consulting support and engagement.

A Level 0 is used to describe automation and correlation performed before workflow is enacted.

When you break down your workflow into these levels, you can start to optimize and realize ROI by reducing the cost of maintenance actions across the board. By establishing goals to solve 60-70% of all incidents at Level 1, Level 2-4 involvement helps to drive knowledge and understanding downward to Level 1 folks while better utilizing Level 2-4 folks.

In order to implement these levels of support, you have to organize and define your support organization accordingly. Define its roles and responsibilities, set expectations, and work towards success. Netcool, as an Event Management platform, needs to be aligned to the support model. Things that ingress and egress tickets need to be updated in Netcool. Workflow that occurs needs to update Netcool so that personnel have awareness of what is going on.
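To make the model concrete, here's a minimal sketch of the levels and the ownership rule. It's only my illustration of the workflow described above - the class and names are hypothetical:

    from dataclasses import dataclass, field

    # The support model described above. Level 0 is automation/correlation before
    # workflow; Level 1 owns the incident end to end; Levels 2-4 are engaged as
    # needed and report back through Level 1.
    LEVELS = {
        0: "Automation / correlation before workflow",
        1: "First line: customer engagement, triage, owns the incident",
        2: "Specialized support (network, hosting, application)",
        3: "Engineering / Development, architecture and planning",
        4: "Vendor or third-party support",
    }

    @dataclass
    class Incident:
        summary: str
        owner_level: int = 1                                  # Level 1 always owns it
        engaged: list = field(default_factory=lambda: [1])
        history: list = field(default_factory=list)

        def engage(self, level: int, reason: str):
            """Pull in a higher level; ownership stays with Level 1."""
            self.engaged.append(level)
            self.history.append(f"Level {level} ({LEVELS[level]}): {reason}")

    inc = Incident("Customer reports intermittent packet loss")
    inc.engage(2, "network triage beyond first-line diagnostics")
    inc.engage(4, "vendor case opened for suspected line card fault")
    print(inc.owner_level, inc.engaged)                       # 1 [1, 2, 4]

Anything that ingresses or egresses a ticket at any of these levels would also write back to Netcool, so the console reflects the state of the workflow.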

Politically based Engineering Organizations

When an organization is resistant to change and evolution, it can in many cases be attributed to weak technical leadership. This weakness can often be traced to the politicization of Engineering and Development organizations.

Some of the warning signs include:

- Political assassinations to protect the current system
- Senior personnel and very experienced people intimidate the management structure and are discounted
- Inability to deliver
- An unwillingness to question existing process or function.


People that function at a very technical level find places where politics is prevalent a difficult place to work.

- Management doesn't want to know what's wrong with their product.
- They don't want to change.
- They shun and avoid senior technical folks, and experience is discounted.

Political tips and tricks

- Put up processes and negotiations to stop progress.
- Randomly changing processes.
- Sandbagging of information.
- When something is brought to the attention of the Manager, the technical person's motivations are called into question. What VALUE does the person bring to the Company?
- The person's looks or mannerisms are called into question.
- The person's heritage or background is called into question.
- The person's ability to be part of a team is called into question.
- Diversions are put in place to stop progress at any cost.

General Rules of Thumb

+ Politically oriented Supervisors kill technical organizations.
+ Image becomes more important than capability.
+ Politicians cannot fail or admit failure. Therefore, risks are avoided.
+ Plausible deniability is prevalent in politics.
+ Blame-metrics is prevalent. "You said..."

Given strong ENGINEERING leadership, technical folks will grow very quickly. Consequently, their product will become better and better; as they learn new techniques and share them, everyone gets smarter. True Engineers have a willingness to help others, solve problems, and do great things. A little autonomy and simple recognition and you're off to the races.

Politicians are more suited to Sales jobs. Sales is about making the numbers at whatever cost. In fact, Sales people will do just about anything to close a deal. They are more apt to help themselves than to help others unless it also helps themselves. Engineers need to know that their managers have their backs. Sales folks have allegiance to the sale and themselves... A bad combination for Engineers.

One of the best and most visionary Vice Presidents I've ever worked under, John Schanz, once told us in a group meeting that he wanted us to fail. Failure is the ability to discover what doesn't work. So, fail only once. And be willing to share your lessons, good and bad. And dialog is good. Keep asking new questions.


In Operations, Sales people can be very dangerous. They have the "instant gratification" mentality in many cases. Sign the PO, stand up the product, and the problem is solved in their minds. They lack the understanding that true integration comes at a personal level with each and every user. This level of integration is hard to achieve. And when things don't work or are not accepted, folks are quick to blame someone or some vendor.

The best Engineers, Developers, and Architects are the ones that have come up through the ranks. They have worked in a NOC or Operations Center. They have fielded customer calls and problems. They understand how to span from the known to the unknown even when the customer is right there. And they learn to think on their feet.

Thursday, May 20, 2010

Architecture and Vision

One of the key aspects of OSS/ENMS Architecture is that you need to build a model, realizable within 2 major product release cycles, of where you'd like to be systems-, applications-, and process-wise. This usually equates to an 18-24 month "what it needs to look like" sort of vision.

Why?

- It provides a reference model to set goals and objectives.
- It aligns development and integration teams toward a common roadmap
- It empowers risk assessment and mitigation in context.
- It empowers planning as well as execution.

What happens if you don't?

- You have a lot of "cowboy" and rogue development efforts.
- Capabilities are built in one organization that kill capabilities and performance in other products.
- Movement forward is next to impossible in that when something doesn't directly agree with the stakeholder's localized vision, process pops up to block progress.
- You create an environment where you HAVE to manage to the Green.
- Any flaws or shortcomings in localized capabilities result in fierce political maneuvering.

What are the warning signs?

- Self directed design and development
- Products that are deployed in multiple versions.
- You have COTS products in production that are EOLed or are no longer supported.
- Seemingly simple changes turn out to require significant development efforts.
- You are developing commodity product using in house staff.
- You have teams with an Us vs. Them mentality.

Benefits?

- Products start to become "Blockified". Things become easier to change, adapt, and modify.
- Development to Product to Support to Sales becomes aligned. Same goals.
- Elimination of a lot of the weird permutations. No more weird products that are marsupials and have duck bills.
- The collective organizational intelligence goes up. Better teamwork.
- Migration away from "Manage to the Green" towards a Teamwork driven model.
- Better communication throughout. No more "Secret Squirrel" groups.

OSS ought to own a Product Catalog, data warehouse, and CMS. Not a hundred different applications. OSS Support should own the apps, and the users should own the configurations and the data, as these users need to be empowered to use the tools and systems as they see fit.

Every release of capability should present changes to the product catalog. New capabilities, new functions, and even the loss of functionality need to be kept in sync with the product teams. If I ran product development, I'd want to be a lurker on the Change Advisory Board and I'd want my list published and kept up to date at all times. Add a new capability? OSS had BETTER inform the product teams.

INUG Activities...

Over the past few weeks, I've made a couple of observations regarding INUG traffic...

Jim Popovitch, a stalwart in the Netcool community, left IBM and the Netcool product behind to go to Monolith! What the hell does that say?

There was some discussion of architectural problems with Netcool and after that - CRICKETS. Interaction on the list by guys like Rob Cowart, Victor Havard, Jim - Silence. Even Heath Newburn's posts are very short.

There is a storm brewing. Somebody SBDed the party. And you can smell, I mean tell. The SBD was licensing models. EVERYONE is checking their shoes and double-checking their implementations. While wares vendors push license "true ups" as a way to drive adoption and have customers pay later, on the user side it is seen as Career Limiting, as it is very difficult to justify your existence when you have to go back to the well for an unplanned budget item at the end of the year.

Something is brewing because talk is very limited.

Product Evaluations...

I consider product competition as a good thing. It keeps everyone working to be the best in breed, deliver the best and most cost effective solution to the customer, and drives the value proposition.

In fact, in product evaluations I like to pit vendors' products against each other so that my end customer gets the best and most cost-effective solution. For example, I use capabilities that may not have been in the original requirements to further the customer's capability refinement. If they run across something that makes their life better, why not leverage that in my product evaluations? In the end, I get a much more effective solution and my customer gets the best product for them.

When faced with choosing between internal resources to develop a capability and an outside, best-of-breed solution, danger exists in that if you grade the internally developed product on a curve, you take away competition and ultimately the competitive leadership associated with a Best of Breed product implementation.

It is too easy to start to minimize requirements to the bare necessities and to further segregate these requirements into phases. When you do, you lose the benefit of competition and you lose the edge you get when you tell the vendors to bring the best they have.

It's akin to looking at the problem space and asking what is the bare minimum needed to do this, versus asking what is the best solution for this problem set. Two completely different approaches.

If you evaluate on bare minimums, you get bare minimums. You will always be behind the technology curve in that you will never consider new approaches, capabilities, or technology in your evaluation. And your customer is always left wanting.

It becomes even more dangerous when you evaluate internally developed product versus COTS in that, if you apply the minimum curve gradient to only the internally developed product, the end customer only gets bare minimum capabilities within the development window. No new capabilities. No new technology. No new functionality.

It is not a fair and balanced evaluation anyway if you only apply bare minimums to evaluations. I want the BEST solution for my customer. Bare minimums are not the BEST for my customer. They are best for the development team because now, they don't have to be the best. They can slow down innovation through development processes. And the customer suffers.

If you're using developers in house, it is an ABSOLUTE WASTE of company resources and money to develop commodity software that does not provide clear business discriminators. Free is not a business discriminator, in that FREE doesn't deliver any new capabilities - capabilities that commodity software doesn't already have.

Inherently, there are two mindsets that evolve. You take away or you empower the customer. A Gatekeeper or a Provider.

If you do bare minimums, capabilities that the customer wants get taken away simply because they are not part of the bare minimum.

If you evaluate on Best of Breed, you ultimately bring capabilities to them.

Tuesday, May 18, 2010

IT Managed Services and Waffle House

OK.

So now you're thinking - What the hell does Waffle House have to do with IT Managed Services? Give me a minute and let me 'splain it a bit!

When you go into a Waffle House, immediately you get greeted at the door. Good morning! Welcome to Waffle House! If the place is full, you may have a door corps person to seat you in waiting chairs and fetch you coffee or juice while you're waiting.

When you get to the table, the Waitress sets up your silverware, ensures you have something to drink, and asks if you have decided on what you'd like.

When they get your order, they call in to the cook what you'd like to eat.

A few minutes later, food arrives, drinks get refilled, and things are taken care of.

Pretty straightforward, don't you think? One needs to look at the behaviors and mannerisms of successful customer service representatives to see what behaviors are needed in IT Managed Services.

1. The Customer is acknowledged and engaged at the earliest possible moment.

2. Even if there are no open tables, work begins to establish a connection and a level of trust.

3. The CSR establishes a dialog and works to further the trust and connection. To the customer, they are made to feel like they are the most important customer in the place. (Focus, eye contact. Setting expectations. Assisting where necessary.)

4. They call in the order to the cook. First, meats are pulled as they take longer to cook. Next, if you watch closely, the cook lays out the order using a plate marking system. The cook then prepares the food according to the plate markings.

5. Food is delivered. Any open ends are closed. (drinks)

6. Customer finishes. Customer is engaged again to address any additional needs.

7. Customer pays out. Satisfied.

All too often, we get IT CSRs that don't readily engage customers. As soon as they assess the problem, they escalate to someone else. This someone else then calls the customer back, reiterates the situation, then begins work. When they cannot find the problem, or something like a supply action or technician dispatch needs to occur, the customer gets terminated and re-initiated by the new person in the process.

In Waffle House, if you got waited on by a couple of different waitresses and the cook, how inefficient would that be? How confusing would that be as a customer? Can you imagine the labor costs to support 3 different wait staff even in slow periods? How long would Waffle House stay in business?

Regarding the system of calling and marking out product... This system is a process that's taught to EVERY COOK and Wait staff person in EVERY Waffle House EVERYWHERE. The process is tried, vetted, optimized, and implemented. And the follow-through is taken care of by YOUR Wait person. The one that's focused on YOU, the Customer.

I wish I could take every Service provider person and manager and put them through 90 days of Waffle House Boot Camp. Learn how to be customer focused. Learn urgency of need, customer engagement, trust, and services fulfillment. If you could thrive at Waffle House and take with you the customer lessons, customer service in an IT environment should be Cake.

And it is living proof that workflow can be very effective even at a very rudimentary level.

Tuesday, May 11, 2010

Blocking Data

Do yourself a favor.

Take a 10,000-line list of names, etc. and put it in a Database. Now, put that same list in a single file.

Crank up a term window with a SQL client in one and a command line in another. Pick a pattern that gets a few names from the list.

Now in the SQL Window, do a SELECT * from Names WHERE name LIKE 'pattern';

In a second window, do a grep 'pattern' on the list file.

Now hit return on each - simultaneously if possible. Which one did you see results first?

The grep, right!!!!

SQL is Blocking code. The access blocks until it returns with data. If you front end an application with code that doesn't do callbacks right and doesn't handle the blocking, what happens to your UI? IT BLOCKS!

Blocking UIs are what? That's right! JUNK!
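If you have to front-end a blocking store, at least get the query off the UI thread and hand the rows back through a callback. Here's a minimal sketch, using sqlite3 purely as a stand-in for whatever database actually sits behind the application:

    import sqlite3
    import threading

    # One-time demo setup: the same Names list the experiment above uses.
    conn = sqlite3.connect("names.db")
    conn.execute("CREATE TABLE IF NOT EXISTS Names (name TEXT)")
    conn.executemany("INSERT INTO Names VALUES (?)", [("smith",), ("smythe",), ("jones",)])
    conn.commit()
    conn.close()

    def query_async(db_path: str, pattern: str, callback):
        """Run the blocking SELECT on a worker thread; hand the rows to a callback."""
        def worker():
            db = sqlite3.connect(db_path)
            try:
                rows = db.execute(
                    "SELECT name FROM Names WHERE name LIKE ?", (f"%{pattern}%",)
                ).fetchall()
            finally:
                db.close()
            callback(rows)      # a real UI would marshal this back to the UI thread

        t = threading.Thread(target=worker)
        t.start()
        return t

    t = query_async("names.db", "smith", lambda rows: print(f"{len(rows)} matches"))
    print("UI thread is still free to repaint and take input.")
    t.join()

The query still blocks - it just doesn't block the UI.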

Sunday, May 9, 2010

ENMS Products and Strategies

Cloud Computing is changing the rules of how IT and ENMS products are done. With Cloud Computing making resources configurable and adaptable very quickly, applications are also changing to fit the paradigm of running in a virtual machine instance. We may even see apps deployed with their own VMs as a package.

This also changes the rules of how Service providers deliver. In the past, you had weeks to get the order out, set up the hardware, and do the integration. In the near future, all Service providers and application wares vendors will be pushed to Respond and Deliver at the speed of Cloud.

It changes a couple of things real quickly:

1. He who responds and delivers first will win.
2. Relationships don't deliver anything. Best of Breed is back. No excuse for the blame game.
3. If it takes longer for you to fix a problem than it does to replace, GAME OVER!
4. If your application takes too long to install or doesn't fit in the Cloud paradigm, it may become obsolete SOON.
5. In instances where hardware is required to be tuned to the application, appliances will sell before buying your own hardware.
6. In a Cloud environment, your Java JRE uses resources. Therefore it COSTS. Lighter is better. Look for more Perl, Python, PHP and Ruby. And Javascript seems to be a language now!

Some of the differentiators in the Cloud:

1. Integrated system and unit tests with the application.
2. Inline knowledge base.
3. Migration from tool to a workflow strategy.
4. High agility. Configurability. Customizeability.
5. Hyper-scaling technology like memcached and distributed databases incorporated.
6. Integrated products on an appliance or VM instance.
7. Lightweight and Agile Web 2.0/3.0 UIs.

SNMP MIBs and Data and Information Models

Recently, I was having a discussion about SNMP MIB data and organization and its application into a Federated CMDB and thought it might provoke a bit of thought going forward.

When you compile a MIB for a management application, what you do is organize the definitions and the objects according to name and OID, in a way that's searchable and applicable to performing polling, OID-to-text interpretation, variable interpretation, and definitions. In effect, your compiled MIB turns out to be every possible SNMP object you could poll in your enterprise.
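As a toy illustration (mine, not any particular MIB compiler's output), the useful core of a compiled MIB is a searchable map between names and OIDs plus enough metadata to drive polling and interpret varbinds. The objects below are standard MIB-II; real compilers keep far more:

    # The useful core of a compiled MIB: name<->OID lookup plus syntax/enum metadata.
    COMPILED_MIB = {
        "1.3.6.1.2.1.1.2": {"name": "sysObjectID", "syntax": "OBJECT IDENTIFIER"},
        "1.3.6.1.2.1.1.3": {"name": "sysUpTime", "syntax": "TimeTicks"},
        "1.3.6.1.2.1.2.2.1.2": {"name": "ifDescr", "syntax": "DisplayString"},
        "1.3.6.1.2.1.2.2.1.8": {
            "name": "ifOperStatus",
            "syntax": "INTEGER",
            "enums": {1: "up", 2: "down", 3: "testing"},
        },
    }
    NAME_TO_OID = {entry["name"]: oid for oid, entry in COMPILED_MIB.items()}

    def interpret(oid: str, value):
        """Turn a raw varbind into name = value text using the compiled definitions."""
        entry, instance = COMPILED_MIB.get(oid), ""
        if entry is None:
            base, _, instance = oid.rpartition(".")   # table column: strip the row instance
            entry = COMPILED_MIB.get(base)
        if entry is None:
            return f"{oid} = {value}"
        text = entry.get("enums", {}).get(value, value)
        return f"{entry['name']}{'.' + instance if instance else ''} = {text}"

    print(interpret("1.3.6.1.2.1.2.2.1.8.3", 2))      # -> ifOperStatus.3 = down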

This "Global Tree" has to be broken down logically with each device / agent. When you do this, you build an information model related to each managed object. In breaking this down further, there are branches that are persistent for every node of that type and there are branches that are only populated / instanced if that capability is present.

For example, on a Router that has only LAN-type interfaces, you'd see MIB branches for Ethernet-like interfaces but not DSX or ATM. These transitional branches are dependent upon configuration and presence of CIs underneath the Node CI - associated with the CIs corresponding to these functions.

From a CMDB Federation standpoint, a CI element has a source of truth from the node itself, using the instance provided via the MIB, and the methods via the MIB branch and attributes. But a MIB goes even further to identify keys on rows in tables, enumerations, data types, and descriptive definitions. A MIB element can even link together multiple MIB Objects based on relationships or inheritance.

In essence, I like the organization of NerveCenter Property Groups and Properties:

Property Groups are initially organized by MIB and they include every branch in that MIB. These initial Property Groups are assigned to individual Nodes via a mapping of the system.sysObjectID to Property Group. The significance of the Property Group is that it contains a list of the MIB branches applicable to a given node.

These Property Groups are very powerful in that they are how Polls, traps, and other trigger generators are contained and applied according to the end node behavior. For example, you could have a model that uses two different MIB branches via poll definitions, but depending on the node and its property group assignment, only the polls applicable to the node's property group are applied. Yet it is done with a single model definition.

The power behind property groups was that you could add custom properties to property groups and apply these new Property Groups on the fly. So, you could use a custom property to group a specific set of nodes together.

I have set up 3 distinct property groups in NerveCenter corresponding to 3 different polling-interval SLAs and used a common model to poll at three different rates, dependent upon the custom properties Cisco_basic, Cisco_advanced, and Cisco_premium, at 2 minutes, 1 minute, or 20 seconds respectively.

I used the same trigger name for all three poll definitions but set the property to only be applicable to Cisco_basic, Cisco_advanced, or Cisco_premium respectively.
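A rough sketch of the idea follows. This is not NerveCenter's actual API, just my shorthand for the concept: one poll definition, one trigger name, applied at a different rate depending on which property group a node carries. Node names are invented:

    # Concept sketch only -- not NerveCenter's API. One poll, three SLA tiers.
    POLL_INTERVALS = {          # seconds, per the SLA tiers above
        "Cisco_basic": 120,
        "Cisco_advanced": 60,
        "Cisco_premium": 20,
    }

    # In NerveCenter the initial assignment keys off system.sysObjectID; here the
    # node-to-property-group mapping is just stated directly (names invented).
    NODE_PROPERTY_GROUP = {
        "core-rtr-01": "Cisco_premium",
        "dist-rtr-07": "Cisco_advanced",
        "branch-rtr-42": "Cisco_basic",
    }

    POLL_DEFINITION = {
        "trigger": "highIfUtil",                # same trigger name for every tier
        "oid": "1.3.6.1.2.1.2.2.1.10",          # ifInOctets
        "applies_to": set(POLL_INTERVALS),      # property groups this poll targets
    }

    def polls_for(node: str):
        """Yield (trigger, oid, interval) for the polls applicable to this node."""
        group = NODE_PROPERTY_GROUP.get(node)
        if group in POLL_DEFINITION["applies_to"]:
            yield POLL_DEFINITION["trigger"], POLL_DEFINITION["oid"], POLL_INTERVALS[group]

    print(list(polls_for("core-rtr-01")))       # [('highIfUtil', '1.3.6.1.2.1.2.2.1.10', 20)]

Change a node's property group and the same model simply polls it at the new rate - no new poll definitions.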

What Property Groups do is enable you to set up and maintain a specific MIB tree for a given node type. Taking this a bit further, in reality, every node has its own MIB tree. Some of the tree is standard for every node of the same type while other branches are optional or capability-specific. This tree actually corresponds to the information model for any given node.

Seems kinda superfluous at this point. If you have an information model, you have a model of data elements and the method to retrieve that data. You also have associative data elements and relational data elements. What's missing?

Associated with these CIs related to capabilities like a DSX interface or an ATM interface, or even something as mundane as an ATA Disk drive, are elements of information like technical specifications and documentation, process information, warranty and maintenance information... even mundane elements like configuration notes.

So, when you're building your information model, the CMDB is only a small portion of the overall information system. But it can be used to meta- or cross-reference other data elements and help to turn these into a cohesive information model.

This information model ties well into fault management, performance management, and even ITIL Incident, Problem and change management. But you have to think of the whole as an Information model to make things work effectively.

Wouldn't it be awesome if you could manage and respond by service versus just a node? When you get an event on a node, do you enrich the event to provide the service or do you wait until a ticket is open? If the problem was presented as a service issue, you could work it as a service issue. For example, if you know the service lineage or pathing, you can start to overlay elements of information that empower you to start to put together a more cohesive awareness of your service.

Let's say you have a simple 3-tier Web-enabled application that consists of a Web Server, an Application Server, and a Database. On the periphery, you have network components, firewalls, switches, etc. How valuable is just the lineage? Now, if I can overlay information elements on this ontology, it comes alive. For example, show me a graph of CPU performance on everything in the lineage. Add in memory and IO utilization. If I can overlay response times for application transactions, the picture becomes indispensable as an aid to situation awareness.
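As a small sketch of what I mean (component names and numbers invented), overlaying even a single metric on the lineage immediately points you at the hot spot:

    # Invented 3-tier lineage and CPU samples; the point is the overlay, not the data.
    SERVICE_LINEAGE = {
        "web-portal": ["fw-edge", "sw-core", "web-01", "app-01", "db-01"],
    }
    CPU_UTIL = {"fw-edge": 22, "sw-core": 35, "web-01": 48, "app-01": 81, "db-01": 64}

    def overlay(service: str, metric: dict):
        """Return (component, value) pairs along the service path, worst first."""
        path = SERVICE_LINEAGE.get(service, [])
        return sorted(((c, metric.get(c, 0)) for c in path), key=lambda cv: cv[1], reverse=True)

    for component, cpu in overlay("web-portal", CPU_UTIL):
        print(f"{component:12s} cpu={cpu}%")    # app-01 floats to the top at 81%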

Looking at things from a different perspective, what if I could overlay network errors? Or disk errors? Overlaying other, seemingly irrelevant elements of information, like ticket activities or the amount of support time spent on each component in the service lineage, takes raw data and empowers you to present it in a service context.

On a BSM front, what if I could overlay the transaction rate with CPU, IO, Disk, or even application memory size or CPU usage? Scorecards start becoming a bit more relevant it would seem.

In Summary

SNMP MIB data is vital not only to the technical aspects of polling and traps but also to conversion and linkage to technical information. SNMP is a source of truth for a significant number of CIs, and the MIB definitions tell you how the information is presented.

But all this needs to be part of an Information plan. How do I take this data, derive new data and information, and present it to folks when they need it the most?

BSM assumes that you have the data you need organized in a database that can enable you to present scorecards and service trees. Many applications go through very complex gyrations on SQL queries in an attempt to pull the data out. When the data isn't there or it isn't fully baked, BSM vendors may tend to stub up the data to show that the application works. This gets the sale but the customer ends up finding that the BSM application isn't as much out of the box as the vendor said it was.

These systems depend on Data and information. Work needs to be done to align and index data sources toward being usable. For example, if you commonly use inner and outer joins in queries, you haven't addressed making your data accessible. If it takes elements from 2 tables to do selects on others, you need to work on your data model.

Monday, May 3, 2010

Event Processing...

I ran across this blog post by Tim Bass on Complex Event Processing dubbed "Orwellian Event Processing" and it struck a nerve.

The rules we build into products like Netcool Omnibus, HP Openview, and others are all based on simple if-then-else logic. Yet, through fear that somebody may do something bad with an event loop, recursive processing is shunned.

In his blog, he describes his use of Bayesian Belief Networks to learn versus the static if-then-else logic.

Because BBNs learn patterns through evidence and through cause and effect, the application of BBNs in common event classification and correlation systems makes total sense. And the better you get at classification, the better you get at dealing with uncertainty.

In its simplest form, a BBN outputs a ratio of occurrences ranging from 1 to -1, where 1 means 100 percent certainty that an element in a pattern occurs and -1 means 100 percent certainty that an element in a pattern never occurs.

The interesting part is that statistically, a BBN will recognize patterns that a human cannot. We naturally filter out things that are obfuscated or don't appear relevant.

What if I could have a BBN build my rules files for Netcool based upon statistical analysis of the raw event data? What would it look like as compared to current rule sets? Could I set up and establish patterns that lead up to an event horizon? Could I also understand the cause and effect of an event? What would that do to the event presentation?
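I haven't built that, but even the first step is interesting: mine the raw event history for how often one event type follows another inside a short window. That's the kind of evidence a BBN would learn from. A minimal sketch with invented event data:

    from collections import Counter
    from itertools import combinations

    # (timestamp in seconds, event type) pulled from a raw event archive (invented data)
    HISTORY = [
        (0, "linkDown"), (5, "ospfNbrLoss"), (8, "bgpPeerDown"),
        (600, "linkDown"), (604, "ospfNbrLoss"),
        (1200, "fanFail"), (3600, "linkDown"), (3607, "bgpPeerDown"),
    ]
    WINDOW = 30     # seconds

    singles, pairs = Counter(), Counter()
    for _, ev in HISTORY:
        singles[ev] += 1
    for (t1, a), (t2, b) in combinations(HISTORY, 2):   # HISTORY is time-ordered
        if 0 < t2 - t1 <= WINDOW:
            pairs[(a, b)] += 1

    for (a, b), n in pairs.most_common():
        print(f"P({b} within {WINDOW}s of {a}) ~ {n / singles[a]:.2f}")

Ratios like these are exactly the sort of thing that could seed generated rules, or at least surface patterns a human rules writer would never notice.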

Does this not open up the thought that events are presented in patterns? How could I use that to drive up the accuracy of event presentation?

Sunday, May 2, 2010

Cloud Computing - Notes from a guy behind the curtain

The latest buzz is Cloud computing. When I attended a CloudCamp hosted here in St. Louis, it became rather obvious that the term Cloud computing has an almost unlimited supply of definitions depending upon which Marketing Dweeb you talk to. It can range from hosted MS Exchange to hosted VMs to applications and even hosted services. I'm really not in a position to say what Cloud is or isn't and, in fact, I don't believe there's any way to win that argument. Cloud computing is a marketing perception that is rapidly morphing into whatever marketing types deem necessary to sell something. Right or wrong, the spin doctors own the term and it is whatever they think will sell.

In my own perception, Cloud Computing is a process by which applications, services, and infrastructure are delivered to a customer in a rapid manner and empowers the customer to pay for what they use in small, finite increments. Cloud Computing incorporates a lot of technologies and process to make this happen. Technology like Virtualization, configuration management databases, hardware, and software.

What used to take days or weeks to deliver now takes minutes. MINUTES. What does this mean? It takes longer to secure an S Corp and set up a corresponding Tax ID than it does to set up and deliver a new company's web access to customers. And not just local, down-home Main Street customers - you are in business, competing at a GLOBAL LEVEL, in MINUTES. And you pay as you go!

Sounds tasty, huh? Now here's a kink. How the heck do you manage this? I know the big Four Management companies say a relationship is more important than Best of Breed. I've heard it in numerous presentations and conversations. If you are in a position to sit still in business, this may be OK for you. Are you so secure in your market space that you do not fear competition to the point where you would sit idle?

Their products reflect this same lack of concern in that it is the same old stuff. It hasn't evolved much - it takes forever to get up and running and it takes months to be productive. For example, IBM Tivoli ITNM/IP. Takes at least a week of planning just to get ready to install it - IF you have hardware. Next, you need another week and consulting to get things cranking on a discovery for the first time. Takes weeks to integrate in your environment dealing with community string issues, network access, and even dealing with discovery elements.

Dealing with LDAP and network views is another nightmare altogether.

The UI is clunky, slow, and non-interactive. Way too slow to be used interactively as a diagnostic tool. At least you could work with NNM in earlier versions to get some sort of speed. (Well, with the Web UI, you had HP - slower than molasses - and then when you needed something that worked you bought Edge Technologies or Onion Peel.) In the ITNM/IP design, somebody in their infinite wisdom decided to store the map objects and map instances in a binary blob field in MySQL. At least if you had the coordinates you could FIX the topoviz maps or even display them in something a bit faster and more Web 2.0 - like Flash / Flex. (Hardware is CHEAP!)

And how do you apply this product to a cloud infrastructure? If you can only discover once every few days, I think you're gonna miss a few customers setting up new infrastructures, not to mention any corresponding VMotion events that occur when things fail or load balance. How do you even discover and display the virtual network infrastructure alongside the real network infrastructure?

Even if you wanted to use it like TADDM, Tideway, or DDMi, the underlying database is not architected right. It doesn't allow you to map out the relationships between entities enough to make it viable. Even if you did a custom Discovery agent and plugged in NMap - (Hey! Everybody uses it!) you cannot fit the data correctly. And it isn't even close to the CIM schema.

And every time you want some additional functionality like performance data integration, it's a new check and a new ball game. They sort of attempt to address this by enabling short-term polling via the UI. Huge Fail. How do you look at data from yesterday? DOH!

ITNM/IP + Cloud == Shelfware.

If we are expected to respond at the speed of Cloud, there is a HUGE Pile of Compost consisting of management technology of the past that just isn't going to make it. These products take too much support, take too many resources to maintain, and they hold back innovation. The cost just doesn't justify the integration. Even the products we considered untouchable. Many have been architected in a way that paints them into a corner. Once you evolve a tool kit into a solution, you have to take care not to close up integration capabilities along the way.

They take too long to install, take a huge level of effort to keep running, and the yearly maintenance costs can be rather daunting. The Cloud methodology kind of changes the rules a bit. In the cloud, it's SaaS. You sign up for management. You pay for what you get. If you don't like it or want something else, presto changeo - an hour later, you're on a new plan! AND you pay as you go. No more HUGE budget outlays, planning, and negotiation cycles. No more "True-Ups" at the end of the year that kill your upward mobility and career.

BMC - Bring More Cash
EMC - EXtreme Monetary Concerns
IBM - I've Been Mugged!
HP - Huge PriceTag!
CA - Cash Advanced!

Think about Remedy. Huge Cash outlay up front. Takes a long time to get up and running. Takes even longer to get into production. Hard to change over time. And everything custom becomes an ordeal at upgrade time.

They are counting on you to not be able to move. That you have become so political and process bound, you couldn't replace it if you wanted to. In fact, in the late 80s and early 90s, there was the notion that the applications that ran on mainframes could never be moved off of those huge platforms. I remember working on the transition from Space Station Freedom to International Space Station Information systems. The old MacDac folks kept telling us there was no way we could move to open systems. Especially on time and under budget. 9 months later, 2 ES9000 Duals and a bunch of Vaxes were repurposed. 28 Applications migrated. Reduced support head count from over 300 to 70. And it was done with less than half of the cost of software maintenance for a year. New software costs were ~15% of what they were before. And we had a lot more users. And it was faster too! NASA. ESA. CSA. NASDA. RSA. All customers.

Bert Beals, Mark Spooner, and Tim Forrester are among my list of folks that had a profound effect on my career in that they taught me through example to keep it simple and that NOTHING is impossible. And to keep asking "And then what?"

And while not every app fits in a VM, there is a growing catalog of appliance based applications that make total sense. You get to optimize the hardware according to the application and its data. That first couple of months of planning, sizing, and procurement - DONE.

And some apps thrive on the Cloud virtualization. If you need a data warehouse or are looking to make sense of your data footprint, check out Greenplum. Distributed Database BASED on VMs! You plug in resources as VMs as you grow and change!

And the line between the network and the systems and the applications and the users - disappearing quickly. This presents an ever-increasing data challenge: being able to discover and use all these relationships to deliver better services to customers.

Cloud Computing is bringing that revolution and reinvention cycle back into focus in the IT industry. It is a culling event as it will cull out the non-producers and change the customer engagement rules. Best of Breed is back! And with a Vengeance!