IT admins hate this one trick: 'Having something look like it’s on storage, when it is not'

An argument about how to solve the same technical problem has sprung up between two rival startups, each with plenty of reason to say the other's tech is not up to scratch. But they raise some interesting issues: how to solve slow access to moved files, where to store metadata, and more. How best to archive files yet preserve …

  1. Aitor 1

    Run in the background

    They claim not to be disruptive because they run in the background.

    This sounds all well and good, but it doesn't quite hold up. If they make requests to the NAS, they will impact performance.

    Now, they could be doing that only while load is low... but that means the NAS is underused... so they will have to be very tightly integrated with the NAS in order to prevent degradation... and if they also do continuous deduplication and backup, it starts getting very complicated.

    1. komprise

      Re: Run in the background

      Aitor 1, Komprise uses the spare cycles on the NAS and typically runs at under 3.5% of the load on the NAS because of its adaptive architecture. We are in several multi-petabyte environments with nearly a hundred thousand file shares across multiple servers and heterogeneous NAS environments, and customers have never had to set QoS policies for Komprise. Most environments have some spare cycles, but finding them manually and managing them is hard. Running non-disruptively in the background takes advantage of the spare cycles without human intervention.
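
      To make the idea concrete, here is a rough sketch of load-adaptive background scanning in general (a generic illustration, not our actual implementation; the latency probe, budget and back-off policy below are made-up assumptions):

      # Hypothetical sketch of "use spare cycles, back off under load".
      # Not Komprise code; the thresholds and the probe are illustrative only.
      import os
      import time

      LATENCY_BUDGET_S = 0.005   # assumed acceptable per-op latency on the share
      MAX_SLEEP_S = 5.0          # upper bound on the pause between batches

      def probe_latency(path):
          """Measure how long a single metadata operation takes right now."""
          start = time.monotonic()
          os.stat(path)
          return time.monotonic() - start

      def scan_share(root):
          """Walk a share, pausing longer whenever the filer looks busy."""
          backoff = 0.0
          for dirpath, dirnames, filenames in os.walk(root):
              for name in filenames:
                  yield os.path.join(dirpath, name)
              # If the filer answers slowly, assume real users need it and
              # widen the pause; if it answers quickly, shrink the pause.
              if probe_latency(dirpath) > LATENCY_BUDGET_S:
                  backoff = min(MAX_SLEEP_S, backoff * 2 or 0.1)
              else:
                  backoff = backoff / 2
              time.sleep(backoff)

      # Usage: for path in scan_share("/mnt/share1"): analyze(path)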

  2. Mark 110

    Great article - thanks

    I learnt lots. Really good piece.

    Komprise seems like a really good product. Their argument (why would you buy a really good hot storage platform just to stick a point of failure in front of it?) makes a lot of sense.

    1. Anonymous Coward

      Re: Great article - thanks

      Not only is it a potential point of failure but, like storage virtualization appliances, it becomes a lock-in: they own the keys and the only path to accessing your data, meaning it can be very difficult to extricate yourself later.

      Maybe not a great position to be in, especially if relying on a startup.

      1. Peter2 Silver badge

        Re: Great article - thanks

        Forgive the innocently ignorant question (I've been working at SME scale and so haven't dealt with enterprise-scale problems for a decade) but can't you just restore from backups to move your data elsewhere if push comes to shove?

        1. Anonymous Coward

          Re: Great article - thanks

          The point with these appliances is you don't need to do the restore. You just access your data as you would normally. Restoring from backup takes time, especially if you don't have control over the process and need to raise a ticket etc.

          I assume that's what you were getting at...

          1. Peter2 Silver badge

            Re: Great article - thanks

            You're saying that you're locked in, leaving the vendor as the only party able to access your data, making it difficult for you to extricate yourself later (i.e. when replacing the vendor's equipment)?

            I can't see why, as you should be able to simply buy adequate storage space elsewhere and then restore your backups to the new storage. Or do you either let them control your backups, or not do backups at all?

            1. Naselus

              Re: Great article - thanks

              "Or do you either let them control your backups or not do backups?"

              Backups on some storage equipment are done in proprietary formats (usually with fiendishly clever checksum maths), and so you can't just restore onto a different device. Not such a problem with NetApp, who aren't likely to go belly up anytime soon, but with some of the smaller storage appliance players it could be very risky, especially since the margins are razor thin.

        2. Wayland

          Re: Great article - thanks

          Peter2, that's a good point. When you say "can't you just restore from backup" you are talking about the sysadmin (your role), not the user. Yes, I would expect you can just dump everything to other storage. It would take a while. The problem would be if you wanted to do so because something broke and you were fed up with your current system and supplier. Your new supplier may well not have the tech to help.

      2. komprise

        Re: Great article - thanks

        You bring up an excellent point about why customers shy away from storage virtualization or network virtualization solutions that front all the data. Komprise creates no lock-in: all the metadata is stored at the sources or the targets, and Komprise itself is mostly stateless. It can be removed from the picture at any point, and all you have lost is some aggregate analytics.

  3. Pascal Monett Silver badge

    "any time you rely on humans/users to do something it never works"

    On that point, I have to agree.

    I cannot count the number of companies I have worked in or consulted for that struggle to get users to archive mail. The mailbox is the number one critical application in many companies, and network disks are always on the verge of overflowing.

    In order to get things under control, often the only solution is to arbitrarily impose a cutoff date and archive anything older than that. Then you get grumbling in the ranks, although curiously most of the time the impact on actual work is minimal.

    1. Anonymous Coward

      Re: "any time you rely on humans/users to do something it never works"

      I've bitched about this one before. I used to work for IBM and with their Lotus Notes crapware they set quotas, which were invariably too small. Yet they seemed to actively encourage employees to have email signatures full of pointless graphics which were often several MB in size. The application offered no means of stripping out these graphics unless you edited each email individually and re-saved, then re-compressed the mailbox.

      It only took a few email chains, with some knobhead with one of these signatures regularly contributing, to push you over your quota.

  4. l8gravely

    Been there, done this for both styles of migration. They all suck

    I manage enterprise NFS storage and over the past 15+ years I've worked with various tools and solutions to the problem of moving old files from expensive storage to cheaper storage, either disk or tape. And it all sucks sucks sucks.

    If you do backups to tape, then suddenly you need tight integration between the storage and backup vendors; otherwise restores (and people only care about restores!) become horribly, horribly painful to do.

    We tested out a product from Acopia, which had an appliance that sat in front of our NFS storage. It would automatically move data between backend storage tiers. Really, really slick and quite neat. Then when we implemented it we ran into problems. 1. With too many files, it just fell over. Oops. We had single volumes with 10TB of data and 30 million files. Yeah, my users don't clean up for shit. 2. Backups sucked. If you backed up through the Acopia box, then you were bottlenecked there. If you backed up directly from the NFS storage (either NFS mounts or NDMP) then your data was scattered across multiple volumes/tapes. Restores, especially single-file restores, were a nightmare.

    We then tried out a system where our backup vendor (CommVault) hooked into the NetApp NFS filer and used FPolicy to stub out files that were archived to cheaper disk and then to tape. This worked... backups were consistent and easy to restore since CommVault handled all the issues for us.

    But god help you if a file got stubbed out and then the link between them got broken, or you ran into vendor bugs on either end. It also turned into a nightmare, though less of one, I admit. And it didn't scale well, and took up a ton of processing and babying from the admins.

    I'd really like to get a filesystem (POSIX compliant please!) that could do data placement on a per-directory basis to different backend block storage volumes. So you could have some fast storage which everything gets written to by default, and then slower storage for long term archiving.
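
    The usual workaround today is a cron-style script that does the placement after the fact; a rough sketch of that is below (the paths, the age threshold and the leave-a-symlink-behind approach are all assumptions for illustration, and it has exactly the stub/link fragility complained about elsewhere in this thread):

    # Cron-style approximation of per-directory tiering, not the in-filesystem
    # placement asked for above; paths and the age threshold are made up.
    import os
    import shutil
    import time

    FAST_TIER = "/mnt/fast/projects"      # hypothetical hot volume
    SLOW_TIER = "/mnt/archive/projects"   # hypothetical capacity volume
    MAX_AGE_DAYS = 365

    def tier_old_files(fast_root=FAST_TIER, slow_root=SLOW_TIER,
                       max_age_days=MAX_AGE_DAYS):
        cutoff = time.time() - max_age_days * 86400
        for dirpath, dirnames, filenames in os.walk(fast_root):
            for name in filenames:
                src = os.path.join(dirpath, name)
                if os.path.islink(src):
                    continue                  # already tiered, or a user's own link
                if os.stat(src).st_atime >= cutoff:
                    continue                  # still warm, leave it on fast storage
                dst = os.path.join(slow_root, os.path.relpath(src, fast_root))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)         # copies then unlinks across filesystems
                os.symlink(dst, src)          # leave a link so existing paths still work

    if __name__ == "__main__":
        tier_old_files()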

    Again, the key is to make restores *trivial* and *painless*. Otherwise it's not worth the hassle. And it needs to be transparent to the end users, without random stubs or broken links showing up. Of course, being able to do NDMP from the block storage to tape would be awesome, but hard to do.

    Replication, snapshots, cloning. All great things to have. Hard to do well. It's not simple.

    Just scanning a 10TB volume with 30 million files takes time; the metadata is huge. I use 'duc' (https://github.com/zevv/duc) to build up reports that my users can use via a web page to find directory trees with large file sizes across multiple volumes, so they can target their cleanups. And so I can yell at them as well, with data to back me up.

    Running 'du -sk | sort -n' on large directory trees is just painful. Again, having a filesystem which could keep this data reasonably up to date for quick and easy queries would be wonderful as well. No need to keep it completely consistent. Do updates in the background, or even drop them if the system is busy. That's what late-night scanning during quiet times is for.
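
    The sort of thing I mean, sketched in Python: one slow background pass builds a per-directory size index, and queries against the cached index are then instant. duc does this properly; the index location here is just a made-up example.

    # One expensive bottom-up scan builds the index; queries never touch the filesystem.
    import json
    import os

    INDEX_PATH = "/var/tmp/dir_sizes.json"   # hypothetical place to keep the index

    def build_index(root):
        """Total bytes per directory, with children rolled up into their parents."""
        sizes = {}
        for dirpath, dirnames, filenames in os.walk(root, topdown=False):
            total = sum(os.lstat(os.path.join(dirpath, f)).st_size for f in filenames)
            total += sum(sizes.get(os.path.join(dirpath, d), 0) for d in dirnames)
            sizes[dirpath] = total
        with open(INDEX_PATH, "w") as fh:
            json.dump(sizes, fh)
        return sizes

    def biggest_trees(n=20):
        """Cheap query against the cached index; no filesystem access needed."""
        with open(INDEX_PATH) as fh:
            sizes = json.load(fh)
        return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:n]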

    It's not a simple problem, and it's nice to see various people trying to solve the issue, but ... been there. Tried it. Gave up. It's just too painful.

    1. komprise

      Re: Been there, done this for both styles of migration. They all suck

      l8gravely,

      You concisely summarize a lot of the issues this industry has faced, but a large part of the problem is that the architectures you describe were built 15+ years ago, when a terabyte sounded big and client-server designs were prevalent, and there has been little to no innovation in this area for over a decade. I would encourage you to look at solutions one more time, as things have vastly improved. Komprise, for instance, does many of the things you yourself highlight as solutions in your comment, and solves the other issues you ran into with scalability:

      i) Your comment: "Do updates in the background, or even drop them if the system is busy. That's what late-night scanning during quiet times is for."

      This is exactly what Komprise does: it adaptively analyzes and manages data in the background without getting in front of the metadata or data paths. It throttles down as needed to be non-intrusive.

      ii) Your comment: "The key is to make restores *trivial* and *painless*. Otherwise it's not worth the hassle. And it needs to be transparent to the end users, without random stubs or broken links showing up."

      Our point exactly. This is why we don't do what you said some of the prior solutions did (e.g. use proprietary interfaces to the storage systems, or use stubs that are proprietary and create a single point of failure). Instead, we use standard file system protocols, and our links are dynamic, so they are not a single point of failure. With us, your data can move multiple times and the link does not change. The link can be deleted by a user accidentally and we can replace it without losing any state. You can move from one file system to another without having to port over stubs or links. There are no databases to manage.
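
      Conceptually, the difference between a static stub and a dynamic link looks something like the toy sketch below (this is not our implementation; the file-ID resolver and the paths are assumptions purely for illustration):

      # A static stub bakes the archived path into the stub itself, so moving
      # the archived copy breaks every stub that points at it.
      def read_via_static_stub(stub_path):
          with open(stub_path) as fh:
              archived_path = fh.read().strip()      # e.g. "/archive/vol7/file.bin"
          with open(archived_path, "rb") as data:
              return data.read()

      # A dynamic link carries only a stable identifier; the current location is
      # resolved at access time, so data can move without rewriting any links.
      LOCATIONS = {"file-0001": "/archive/vol7/file.bin"}   # hypothetical resolver

      def read_via_dynamic_link(link_path):
          with open(link_path) as fh:
              file_id = fh.read().strip()            # e.g. "file-0001"
          with open(LOCATIONS[file_id], "rb") as data:
              return data.read()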

      iii) You mentioned traditional solutions did not scale well, either for stub access or for scanning.

      The traditional architecture is client-server: a central database holds the metadata, and this becomes a scaling bottleneck. As you point out, today data routinely spans millions to billions of files scattered across different shares. This is why a distributed architecture is required to manage today's scale of data. Komprise uses a lightweight, virtual-machine-based distributed architecture: you simply add Komprise Observer virtual machines into an environment and they dynamically rebalance and distribute the analysis and data management workload across themselves. Just as Google search scales horizontally, Komprise scales without a central database, without any in-memory state that is costly to manage and recreate, and without a master-slave client-server architecture. This approach allows us to scale seamlessly to hundreds of petabytes of data and more, and going from POC to production simply involves adding some more Observers.
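
      As a generic sketch of how work can be spread across a pool of observers without a central database (again, an illustration of the pattern rather than our internals), shares can be assigned to workers by hashing, for example with rendezvous hashing:

      # Each share is owned by whichever worker scores highest for it; adding a
      # worker only reshuffles a fraction of the shares, and no central database
      # or coordinator is needed to agree on the assignment.
      import hashlib

      def owner(share, workers):
          """Pick the worker responsible for a share via rendezvous hashing."""
          def score(worker):
              return hashlib.sha256(f"{worker}:{share}".encode()).hexdigest()
          return max(workers, key=score)

      workers = ["observer-1", "observer-2", "observer-3"]           # hypothetical VMs
      shares = ["//nas1/projects", "//nas2/home", "//nas3/scratch"]  # hypothetical shares

      for share in shares:
          print(share, "->", owner(share, workers))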

    2. kjbarth

      Re: Been there, done this for both styles of migration. They all suck

      Have you looked into the VERITAS storage management products? They have a very fast and fully POSIX file system as well as comprehensive support for replication, snapshots, cloning, and policy-based storage tiering that is flexible down to the directory or file level and is transparent and automatic, based on your settings. The product is vendor agnostic, working on all major UNIX/Linux and Windows server environments and all major DAS, NAS, and cloud suppliers.

      Of course they have a web-based GUI to make management and visibility of all your data straightforward. They also provide direct command-line access to all operations that the GUI can perform, so that you can easily script workflows.

      I am not 100% sure this fits your requirements, but I was just curious if you had looked into their offerings and, if so, I would be interested to hear your conclusions.

    3. Anonymous Coward

      Re: Been there, done this for both styles of migration. They all suck

      Spectrum Scale (GPFS) can do multi-tier storage with integrated backup to Spectrum Protect (TSM).

      If you have billions of files and need to scan them quickly, low-level metadata scans can provide the speed that OS tools just cannot. The metadata should be stored on SSD/flash of course, but the data could be on SSD, SAS, NL-SAS, or an object store like Cloud Object Store (Cleversafe).
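
      As a generic illustration (not Spectrum Scale's actual inode-table scan), the speed-up mostly comes from reading directory entries cheaply and in parallel; a rough sketch:

      # Parallel metadata scan: scandir avoids a separate stat() call just to
      # classify entries, and a thread pool fans the directory walks out so a
      # flash-backed metadata tier can be kept busy.
      import os
      from concurrent.futures import ThreadPoolExecutor

      def scan_dir(path):
          """Return (subdirectories, per-file sizes) for one directory."""
          subdirs, files = [], []
          with os.scandir(path) as it:
              for entry in it:
                  if entry.is_dir(follow_symlinks=False):
                      subdirs.append(entry.path)
                  else:
                      files.append((entry.path, entry.stat(follow_symlinks=False).st_size))
          return subdirs, files

      def parallel_scan(root, workers=16):
          files, pending = [], [root]
          with ThreadPoolExecutor(max_workers=workers) as pool:
              while pending:
                  batch = list(pool.map(scan_dir, pending))
                  pending = [d for subdirs, _ in batch for d in subdirs]
                  files.extend(f for _, sizes in batch for f in sizes)
          return files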

  5. Wayland

    Both are good but different

    From reading this I prefer Komprise, since they don't mess with your storage system except that old, unused files are moved to the capacity storage, leaving a link behind. I expect if you don't want a file moved you can tag it so it stays on primary storage. I expect you could easily add this to an existing live computer system to free up some space.

    Infinite-IO sounds like a much more sophisticated system with greater benefits and greater risks. All storage is behind their system, so it would be much more disruptive to set up on an existing computer system. If building a new one from scratch then it would be a pretty good thing. But as Krishna indicates, if you lose a bit of it you can't just read the files off the drive; you need that proprietary system running to get access to your data.
