I Take Back Every Joke I Ever Made About Cleveland

May 9th, 2008

Yesterday, I spent about four hours (it was only supposed to be two hours) talking to a user group in Cleveland called E-STORM.  I got the chance to try out some material that I will be presenting at Storage Decisions in Chicago next week.

The feedback was tremendous and very helpful.  In the audience were IT management and IT practitioners who were quite vocal about their points of agreement and disagreement with the content I was presenting.  Some vendor sponsors of the group commented that they thought the presentations were right on point, while others (I learned through back channels) dismissed some of what I had to say as one man’s opinion — which is exactly how I presented it.

To summarize, I observed that a strategy going forward for IT came down to three things:

  1. Start purpose-building infrastructure so that resources and services map to what your business processes and applications really need, rather than what equipment vendors want to sell you.  I said that such an initiative needed to respect hardware refresh cycles (progressing like waves on a beach rather than rip-and-replace) and would likely take years to complete, but it had to start now.  This is infrastructure right-sizing that maximizes value to the business and may require parting with certain vendors or at least re-thinking the popular dictum about managing a few vendors rather than technology and architecture, which has become the idiotic advice of Gartner (probably owing to its vendor-leaning business model).  In short, stop buying more features and functions than you use and stop buying one-size-fits-most technologies that don’t fit your needs very well.
  2. For the first thing to work, you need a solid web-services based management story.  Applications, hosting platforms and storage need to be integrated from a management perspective if we are going to be able to purpose build and reduce OPEX costs.  I noted that Xiotech’s web services based ICON Manager has shown the way.
  3. The third component is data management.  This is really the core of the strategy.  We need to classify data at point of origin and expose it to appropriate resources and services based on policies derived from business process (retention, deletion, criticality, nonrepudiability, preservation) requirements and application (access, performance, security, scalability) requirements.  The retention tier is probably the best place to begin, since it is the low-hanging fruit for technologies like intelligent archive, but we also need to get real about capture storage.
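To make the third point a little more concrete, here is a minimal sketch of policy-driven classification at point of origin.  Everything in it is an assumption made for illustration: the data class names, the policy fields, the thresholds and the tier names are hypothetical and do not describe any product mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    retention_years: int  # business-process requirement
    critical: bool        # business-process requirement
    min_iops: int         # application requirement


# Hypothetical data classes and policies, invented for illustration.
POLICIES = {
    "financial_record": Policy(retention_years=7, critical=True, min_iops=500),
    "email": Policy(retention_years=3, critical=False, min_iops=50),
    "scratch": Policy(retention_years=0, critical=False, min_iops=0),
}


def choose_tier(policy: Policy) -> str:
    """Map a policy to a storage tier (tier names are hypothetical)."""
    if policy.retention_years == 0:
        return "no-retention"
    if policy.critical or policy.min_iops >= 500:
        return "capture"    # fast primary storage
    return "retention"      # archive-class storage


def classify(data_class: str) -> str:
    """Classify data when it is created; unknown classes default to retention."""
    policy = POLICIES.get(data_class)
    if policy is None:
        return "retention"
    return choose_tier(policy)


print(classify("financial_record"))
print(classify("email"))
```

The point of the sketch is only that the tier falls out of the policy, and the policy falls out of business and application requirements, rather than out of whatever array happens to have free space.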

While I see nothing but common sense in this proposal, I recognize that it is viewed as radical by the vendor community, which makes its nut from selling more than the customer needs — and typically at a significant mark-up.  We discussed these points too.

I wish there was an E-STORM in every city.  This group enables and empowers IT professionals to share their problems, their experiences with products, and their operational insights at every meeting.  The organization also goes to bat for its own:  I was told that, following a discussion in which one member talked about his problems with vendor XYZ’s equipment, the vendor went to the person’s manager and tried to have the person fired.  Representatives of the person went to his boss and explained that the forum provided by the group is for the benefit of all and represents about the only hedge there is today against vendor bullshit.  Closing down this avenue for free and unfettered information exchange would have a chilling effect on the ability of IT professionals to learn from one another and to formulate criteria for evaluating the value and utility of vendor products.  In the end, the person was allowed to return to the group and there hasn’t been any issue since.

Another thing I like about this group is their embrace of the broadest number of technology options and subjects. Mainframers and open systems folk share ideas and alternatives without exclusionary bias. 

Moreover, while the majority of the members are IT practitioners, many participants are business managers.  They routinely engage in the kind of mutually beneficial dialog that we should all be having to heal the front office/back office rift that plagues too many companies.

If you want to feel good about your IT career again, to be reinvigorated and reaffirmed in your choice to become a member of that much-dissed sect of technologists, then E-STORM may be a group you want to join.

Great Trip

May 9th, 2008

The CDW/TechTarget event in Pittsburgh was extremely well attended and went well.  The panel discussion included EMC and I found myself biting my tongue a bit as the fellow repping the company spent about five minutes extolling the virtues of Centera, a product that still seems to have issues.  The rules of engagement at this event precluded me from “badgering the witness,” as I am somewhat inclined to do, I guess.  So, I had to chill out while he described the product in rather utopian terms and explained that it did not have a closed API for taking data off the platform.  All you need to do is buy Disk Xtender, another EMC product, and you can make a backup, migrate data, etc.

You know, it would really help me sleep better at night if someone knowledgeable at EMC would take time to answer the unanswered questions posted here about the Centera platform.  I know that Chuck Hollis, one of their bloggers, reads this blog because he cites it when it suits him — if only to make an ad hominem attack on me.  Surely they know that customers are informing me of their problems with Centera, which I dutifully report here, and that I am willing to entertain the notion that some of these problem reports are possibly linked to older software versions since resolved or even to good old fashioned user error.  However, their total silence in response to our questions makes me wonder if they are hiding something. 

Questions are not opinions, Chuck.  Opinions are formed when data is received from users, follow-up questions keyed to the data go unanswered by the vendor, and possible explanations for both the problem reports and vendor silence are considered within the context of our understanding of CAS generally and of the marketing practices of the vendor community.  If I maintain a negative-to-hostile opinion of Centera (though not necessarily of all EMC products or services), it is for this reason.  The questions raised in the earlier blog post are hanging out there like a sore thumb.  In conversations with many IT folk, it is clear that these questions are on the minds of potential consumers, so I would think that they are worthy of response.  Even Network Appliance, with whom I have had my share of differences, has done a superb job of compartmentalizing their disdain of some of the views advanced by this blog and responding in a very complete way to questions we pose.  Why not EMC?

There are those who have told me that EMC sells to the Front Office, not to IT, speculating that the former usually lack the technology acumen to render an intelligent tech decision, while the latter wouldn’t touch products like Centera with a ten foot pole.  This would certainly explain why EMC sees “no upside” to discussing their products here.  Front Office folk probably have better things to do than to seek a better understanding of the products on which they are spending their money, right?  So, they probably don’t read this blog.

However, I have to believe that in some cases, senior management decisions regarding technology are being advised by IT.  While I earlier cited a Deloitte study to the contrary (that is, presenting the case that Front Office decisions are lacking IT guidance), I am told that PWC just released a report showing the opposite:  that in many cases, the opinions and views of IT are considered in Front Office decision making around technology.  In case EMC is interested in picking up some business in accounts where IT does share its views and concerns with Senior Management, wouldn’t it be advantageous to clear the air on these Centera issues? 

For your convenience again, here are the questions, some of which have been addressed by third parties in the message thread around the original post.

  1. We have statements from users that the loss of a drive or node in a Centera may result in protracted (multi day, week or month) rebuild times for the index and data.  If said reports are “dated” as you claim, what have you changed in the product that rectifies these earlier deficits?
  2. We have statements from users that after a certain number of objects have been ingested into Centera, header hash collisions are seen that prevent access to two or more objects that have the same hash value.  If said reports are “dated” as you claim, what have you changed in the product that prevents such hash collisions from occurring?  On the chance that header hash collisions do occur, how can objects with the same hash value be accessed?
  3. We have statements from users that writing data to a Centera is a straightforward affair. (It is just another disk storage target.)  However, some complain of ingestion rates.  What is the rated write speed of a Centera and what are the gating factors on performance as the platform is scaled horizontally and vertically?
  4. We have statements from users that migrating data out of a Centera and into another platform is challenging because the API is closed and only certain products, such as Legato software, can be used to pull data out of the platform.  Is this true?  Does Centera sport an open API for ingestion, but a closed API for extraction?  How do I pull data off the platform if I end of life the product in my shop and want to use a different repository (not EMC)?
  5. We have statements from users that Centera has been represented as “certified compliant.”  Since there are no regulatory or legal entities to our knowledge that have certified the “compliance” of any hardware/software platform with respect to data retention regulations or laws, how can EMC make this claim?
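As background to question 2, the odds of an accidental collision can be estimated with the standard birthday-bound approximation.  The sketch below assumes a well-distributed hash; the 128-bit width and the object counts are assumptions chosen for illustration, not statements about how Centera actually computes or stores its addresses.

```python
import math


def collision_probability(num_objects: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).

    Assumes hash values are uniformly distributed, which is an idealization.
    """
    pairs = num_objects * (num_objects - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2 ** hash_bits)


# Hypothetical object counts against an assumed 128-bit hash.
for n in (10**6, 10**9, 10**12):
    print(n, collision_probability(n, 128))
```

If the hash really is wide and well distributed, accidental collisions at these object counts should be vanishingly rare under this model, which is why reports of collisions in the field deserve a specific answer rather than silence.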

On to E-STORM in Cleveland…

Traveling

May 5th, 2008

Flying to Pittsburgh tonight to do a talk for TechTarget and CDW tomorrow.  Then, off to Cleveland (with a side trip to NYC) for a presentation to the E-STORM User Group on Thursday.  Hope to see some of you out there for these events, both focused on Disaster Recovery and Business Continuity.

Next week is Storage Decisions Chicago.

Ripped Off? Interesting Resource

May 2nd, 2008

I had a credit card charge pop up on my statement today for something that I can’t remember buying.  I Googled the name of the vendor and found it on Rip-Off Report.  I hadn’t heard of this site before but it is pretty good.  I found many comments there from fellow consumers who had also been tagged with a credit card purchase from this vendor.

If you see what you think is a bad charge on your credit card, start your investigation here.

(I would like to see a similar site to provide consumers with a place to report storage vendors who have ripped them off.  Oh wait.  That’s DrunkenData.)

Do You Know Where Your Data Is?

May 2nd, 2008

What about your kids?  Or your car? Or…

Just got an email describing a USB device with a GPS tracker — that’s what I call persistent data.  You can find this thing through Google Earth.  (Just don’t rely on Google maps to give you driving directions to the location of the transponder.)

Word to Dave’s Gadgets:  I think this would be more useful if there was nothing on the outside of the USB key to identify it as a GPS tracker.  If it looked like every other now-ubiquitous USB storage device, the guys who steal your laptop, gym bag, or child wouldn’t immediately throw it out of the case upon discovery.

DataCore Garnering Big Bucks

May 2nd, 2008

To all the analysts who counted DataCore Software out back in the late 1990s, it must have come as a big hot poker in the eye when the Ft. Lauderdale, FL company garnered a big investment last week from a couple of respectable venture caps.  George, Ziya and Bettye (who actually runs the place) are smiling like Cheshire cats.

DataCore has been scoring a lot of wins over the past couple of years, having mostly bought out its investors when the Dotcom bubble burst in the 2001 timeframe.  They slimmed down their staff and headquarters space and kept their noses to the grindstone creating storage virtualization technology for FC, then iSCSI, then anything else you might have in play.  From our tests, their stuff works very well and pretty much always has.

I like these guys and wish them well with their R&D, which we will dutifully cover here as additional product features and functions are released.

I chuckle when I think of all the Pay-Per-View analysts sucking up to the once-dissed DataCore in the hopes of getting a piece of the latest funding.  (I hope Ziya gets to handle them himself, as he is such a diplomat.)

VTL: Another Questionnaire

May 2nd, 2008

Right after de-dupe, the second technology that analysts are claiming will make gazillionaires out of storage vendors in 2008 is the virtual tape library (VTL).  I want to hear from vendors and users about the appeal of this technology and understand some of the claims that the VTL guys are making about their wares.  Again, the ground rules are that responses to questionnaires will be posted here, without comment from me, to be referenced in a future article — either in ESJ.com or InfoStor or InformationWeek.

Here are questions for vendors:

1.  Please identify yourself, your company, and your VTL products. 

2.  VTL was originally conceived as a pre-staging area for writing to tape media.  You would buffer your backup jobs to disk until you had enough data to completely fill a high capacity media cartridge so you would not be wasting media.  (Some OS’s were notorious for writing a trickle of data to a cartridge, then returning the cartridge to the shelf of the library, so utilization efficiency was quite low.)  Now, we are being told that VTL does many things, most having nothing to do with backup stacking and tape media utilization improvement.  What problems do you see your VTL products addressing?

3.  Some vendors argue that VTL fixes the backup window issue (too little time to complete backups within operational windows) for users.  This implies that writing backups to disk is faster than writing backups to tape — which seems erroneous since streaming data to tape is much faster than copying data to disk.  Is this your claim?  If so, are you claiming that VTL writes are faster than tape writes?  Please explain.

4.  Related to the above, some vendors tout their ability to present a large number of “virtual” tape drives as the mechanism for expediting backup operations.  There are more “target” drives for backup writes in the VTL, hence more data can be written to the virtual library within a set period of time than can be written to a limited number of physical drives in an actual library.  Is this your claim?  If so, how many ports are physically available to handle writes to multiple virtual drive targets in your product?  If physical ports are limited, how can writes be faster?

5.  Some vendors argue that VTL provides a “Tier 2 repository” for data – that backing up data first to the VTL will provide a cache of files and datasets that can be restored quickly if an accidental deletion or corruption event occurs in the data stored on “Tier 1.”  Is this your claim?  If so,

a.  How is data, which has presumably been encapsulated into a backup stream, restorable without first un-encapsulating it from the backup file? 

b.  Is this (selective restore) a function of the backup software that the company uses, or of the VTL engine itself?

6.  Is de-duplication an integral part of your VTL?  If so, please describe the process by which data is deduplicated in your product (is it first written to the VTL then subjected to de-dupe or is it de-duped en route to the VTL repository?).

7.  Does your VTL index data or otherwise facilitate e-discovery?  Does it provide any capabilities for satisfying litigation hold requirements?

8.  Does your VTL require the purchase of branded hardware or is it software that can be used with any storage?

9.  Are operational windows for backup impacted by the failure of companies to define backup datasets properly?  In other words, are they backing up junk data with important data that really requires protection?  Can you estimate how much time your customers could cut out of their backup operations if they segregated the important data from the junk before running a backup?  Could doing so eliminate the need for a VTL?

10.  Are operational windows for backup impacted by the failure of backup tools to size jobs correctly?  Could companies complete backups within their operational windows if they divided the jobs by data size, network bandwidth availability, etc., grouping similarly sized jobs together and running backups as a series of batch jobs as opposed to a single superstream that unravels as shorter jobs complete?  If efforts were made to configure backup jobs as batch operations (each batch including only targets with similar data volumes), could we prevent shoe shining and back hitching and work within operational windows without the need for a VTL?

11.  Does your VTL provide encryption services?  Does it do key management? 

12.  Please provide links to any materials available via the web that will describe in detail your VTL product, deployment options and theory of operation.

Thanks in advance for your responses.
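The arithmetic behind vendor question 4 can be sketched in a few lines.  This is a deliberately simplified model with made-up figures: the stream and port speeds are assumptions, and it ignores back-end disk limits, protocol overhead and de-dupe processing.

```python
def aggregate_write_rate(virtual_drives: int, per_stream_mb_s: float,
                         physical_ports: int, port_mb_s: float) -> float:
    """Upper bound on ingest: streams cannot exceed what the ports carry."""
    offered = virtual_drives * per_stream_mb_s  # what the backup streams could push
    ceiling = physical_ports * port_mb_s        # what the physical ports can accept
    return min(offered, ceiling)


# Hypothetical figures: 80 MB/s per backup stream, 4 ports at 400 MB/s each.
print(aggregate_write_rate(8, 80.0, 4, 400.0))
print(aggregate_write_rate(64, 80.0, 4, 400.0))
```

In this model, adding virtual drives helps only until the offered load reaches the port ceiling.  Past that point, more “targets” do not mean more data written per hour, which is exactly what the question asks vendors to explain.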

Now, for consumers:

(You can elect to remain anonymous in your posts here, but we would appreciate an email letting us know that you are actually a user and not a vendor avatar.  At your request, your name and company affiliation will be kept in confidence.)

  1. Are you currently using a VTL or considering the use of a VTL?
  2. What problem(s) are you seeking to solve with VTL technology?
  3. What products are you considering and what do you consider to be the key selection criteria?
  4. If you have tested different products, which do you like (or not) and why?
  5. Is de-duplication a function you are seeking in a VTL?  Should it be integrated with the product or added as a separate piece of software/appliance/et al?
  6. Do you believe that a VTL will replace tape in your shop?
  7. Do you believe that backup writes to a VTL are faster than writes to a tape library?  Why?
  8. Do you segregate junk data from backup data sets before you perform a backup?
  9. Do you make any effort to define backup jobs as a collection of batch jobs or do you let your backup software define job size?
  10. Any other comments you wish to make about VTL?

Thanks in advance for your response.  Rather than posting responses directly, you can opt to email me your responses here.
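For readers wondering what “a collection of batch jobs” might look like in practice, here is one rough sketch of grouping backup targets by similar data volume.  The job names, sizes and the 2x similarity ratio are all hypothetical, and this is not how any particular backup product sizes its jobs.

```python
def batch_by_size(job_sizes_gb: dict[str, float],
                  max_ratio: float = 2.0) -> list[list[str]]:
    """Group jobs so that no job in a batch is less than 1/max_ratio
    the size of the largest job in that batch."""
    batches: list[list[str]] = []
    current: list[str] = []
    largest = 0.0
    ordered = sorted(job_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        # Start a new batch when this job is much smaller than the batch leader.
        if current and size * max_ratio < largest:
            batches.append(current)
            current = []
        if not current:
            largest = size
        current.append(name)
    if current:
        batches.append(current)
    return batches


# Hypothetical backup targets and sizes in GB.
jobs = {"db": 900, "mail": 800, "files": 300, "web": 200, "logs": 20}
print(batch_by_size(jobs))
```

Jobs in the same batch should finish at roughly the same time, so the drives stay streaming instead of idling while one oversized job drags on.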

Dark Storage

May 2nd, 2008

Monosphere will shortly be announcing a big upgrade.  They briefed me on it yesterday and I will likely write about it on ESJ.com for next week.  In the meantime, in keeping with International Data Awareness Month, I want to put forth a term that you will be hearing about in the press around the Monosphere Storage Horizon release next week.  The term is Dark Storage.

I am not sure whether Monosphere came up with this term, but I like it.  Dark Storage refers to storage that is unmapped, unclaimed or unassigned.  Monosphere’s explanation offers an example of how it happens:  “You configure two LUNs.  You give them to an applications guy.  They use one but not the other and forget that it is even out there.  File systems usage reporting ignores the second LUN, because it isn’t mapped, and makes misleading reports about capacity availability and usage.  The problem is made worse with a server virtualization layer.  That is dark storage.”

According to the company, between 15 and 40 percent of the capacity in the corporate storage infrastructures that they have inspected with their software can be characterized as dark storage.
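As a back-of-the-envelope illustration of the two-LUN story, the sketch below computes the dark share of an inventory.  The `Lun` record and its `mapped` flag are hypothetical stand-ins for whatever an inventory tool reports; this is not a description of how Monosphere’s software works.

```python
from dataclasses import dataclass


@dataclass
class Lun:
    name: str
    capacity_gb: float
    mapped: bool  # True if a host or file system actually has it mapped


def dark_fraction(luns: list[Lun]) -> float:
    """Share of configured capacity that is unmapped, unclaimed or unassigned."""
    total = sum(lun.capacity_gb for lun in luns)
    if total == 0:
        return 0.0
    dark = sum(lun.capacity_gb for lun in luns if not lun.mapped)
    return dark / total


# Hypothetical inventory: two LUNs handed over, only one ever mapped.
inventory = [Lun("lun0", 500.0, True), Lun("lun1", 500.0, False)]
print(dark_fraction(inventory))
```

File system reporting would see only `lun0` here and call the capacity fully accounted for, while half of what was configured sits idle.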

Could you be sitting on capacity that you didn’t know you had?

By Decree: May is International Data Awareness Month

May 1st, 2008

It’s here.  May 1.  The beginning of International Data Awareness Month. And the celebration is just heating up.

Three years ago, in May 2005, we launched a blog called DrunkenData.com.  To our way of thinking, “Drunken Data” meant several things.

First, it was a reference to the mess that we all confront in our storage infrastructure:  our data is so un-managed as to appear “drunken.” 

This has profound implications for our ability to find anything, whether when doing work for the business or when seeking to comply with regulations on data preservation, retention, deletion, or protection. 

Metaphorically, our storage is a huge junk drawer — a fact that impacts business productivity, risk reduction initiatives, cost-containment strategy, green IT operations, and top line growth generally.

Second, Drunken Data referred to the paucity of actionable information about storage infrastructure.  In our view, SMI-S was a bust when it was conceived, held hostage to vendor machinations and their desire to lock in consumers and lock out competition by avoiding any sort of effective management scheme.  SRM guys continued to have to beg or buy access to storage gear APIs if they wanted to help consolidate management processes.  Not a lot has changed.

Pursuing this thread, we found that capacity oversubscription with underutilization was rampant. The way that storage was packaged, we had to buy a lot more than we needed.  And given the paucity of management tools, we used what we bought very inefficiently.

Third, Drunken Data was a reference to the prevarication and obfuscation that substitutes for good information to guide the choices made by storage decision makers. 

To quote a former VP at Bank of America, we expected the vendors to lie about their products: their lips moved after all.  Their claims too often strained credulity and referenced no objectively verifiable standards.  (They don’t seem to want any standards, anyway.)  Marketecture was being substituted for architecture and SNIA caught a lot of flak from this blog for being the grand mouthpiece for vendor bullshit.

We made fun of (and in some cases out and out attacked) analysts, first as the paid spokespersons of vendors, then as perpetrators of the biggest extortion racket since the 1930s mobsters and the 1980s junk bond peddlers.  We think the guys who created the current mess in sub-prime lending may have taken their cues from the storage analyst community.

We even published internal emails around trade press “best of class awards” to show how the silliness had become institutionalized bullshit.  (That one cost me a writing gig.)

Along the way, we took some vendor claims to task.  We showed the contradictions both in what a vendor was saying from one day to the next and between what the vendor was promising and what the actual results were.

Sometimes we had the feeling that all vendor marketeers belonged to a club — like a bizarre version of alcoholics anonymous in which everyone continues to drink, they just don’t tell each other their names.  No, wait.  That would be SNIA.

We were at times caustic in our rants, but mostly just sarcastic or satirical.  Going with the flow – whether to virtualization or thin provisioning or de-duplication or any other technological silver bullet du jour — was not our thing.  Maybe our mother held us too much or not enough when we were kids.  “Go along to get along” didn’t register with us and we are not on the Christmas card lists of too many three-letter vendors as a result.  We watched sadly as many good folks, burgeoning analysts, eventually knuckled under to economic pressures and began sucking up to the very vendors that they had originally sought to critique.  We don’t even allow advertising on this blog.

Surprisingly, even to us, traffic spiked.  We get between 120,000 and 180,000 visitors per month today.  We aren’t registered with Digg or Technorati or any of the other blogging aggregators.  We would never belong to a club that would have us as a member.  We just wanted to say our piece, hear what fellow users had to say, and try to do our part to set the record straight — or at least to ask questions aloud and in public.

There have been some technologies we have liked and embraced, mostly from smaller companies with big ideas.  Most of these have either been swallowed up by large vendors who have put them to sleep, or have been marginalized in what some might regard as a conspiracy of big vendor marketing and the complicity of consumers who have swallowed whole the marketecture of the three letter guys and become fanboys for life.  I for one never thought it desirable to tattoo a vendor’s brand on my body, not even Harley Davidson, but most certainly not EMC, IBM, Network Appliance, HDS and the rest.

We have also seen the rise of the vendor blogger — an interesting tactic of the vendor community, we suppose, to fight fire with fire.  Some of these blogs, usually the ones by CTOs and engineers, provide insights into the thinking that has gone into a product.  Mostly, however, vendor blogs have become founts of disinformation about the blogger’s own products and his competitor’s wares, a springboard for attacking those users and commenters who have unfavorable views of the company’s products or services, and in general an effort to put lipstick on a pig in the face of a growing open exchange of information.  China has nothing on these guys from the standpoint of bending the truth to fit an agenda.

It’s enough to drive you to drink.

Anyway, we’re three years old today and still plodding along.  Despite assertions that we have “broken faith” with the vendor community and should not expect any cooperation from them in the periodic questionnaires we publish here, there seems to be no shortage of companies lining up with responses.  That some three-letter guys do not avail themselves of this opportunity to extol the virtues of their “solutions” actually seems to work against them in the long run:  they are notable by their absence.

There is a lot of new development coming down the pike.  We are retrofitting our existing “web empire” with new technologies that will improve their value to the consumer and we are adding new forums.  Stay tuned.

And thanks for reading!

 

 

 

ESJ Publishes My Thoughts on IBM/Diligent, Plus De-Dupe Questionnaire Getting Some Responses

April 30th, 2008

ESJ.com has just published my column on the acquisition of Diligent by IBM and what it means in the de-dupe market.  Have a read if you want.

Meanwhile, responses are coming in to my questionnaire on de-dupe.  Bill Andrews of ExaGrid and Larry Freeman of NetApp were first across the transom.  Yesterday, a fellow from DataDomain said his company’s response would be forthcoming shortly.  Looks like the dialog is beginning.

I would like to see all of the de-dupers get on board with this.  I think some of the responses received already throw down the gauntlet and begin to expose issues that the vendors are seeking to leverage as differentiators.

Quantum’s lawsuits alleging de-dupe patent infringement don’t seem to be having the chilling effect that I feared.