Hey blockheads, is an NVMe over fabrics array a SAN?

What is a SAN and is an NVMe over Fabrics (NVMeF) array a SAN or not? Datrium says not. E8 says... maybe. A Storage Area Network (SAN) is, according to Wikipedia, "a network which provides access to consolidated, block level data storage... A SAN does not provide file abstraction, only block-level operations. However, file …

  1. CheesyTheClown

    Who cares?

    NVMe is simply a standard method of accessing block devices over a PCIe fabric. As with SCSI, it's thoroughly unsuited to transmission over distance, and where fabrics are concerned it adopted many of SCSI's worst features. There is nothing particularly wrong with using memory-mapped I/O as a computer interconnect; in fact, it's amazing right up until you try to run it across multiple data centers, at which point NUMA-style technologies, no matter how good they are, basically fall apart. There's also the issue that access control simply is not suited to ASIC implementation, so taking NVMe and adding zoning breaks absolutely everything in NVMe. Shared-memory systems are about the logical mapping of shared memory; that model is horrifying for storage. (There's a sketch of the queue model below.)

    So, in 2017, when we know that any VM will have to perform software-level block translation and encapsulation at some point, why the hell are we trying to simulate NVMe, which lacks any form of intelligence, when we should be focused on reducing our dependence on simulated block storage by mapping file I/O operations across the network more accurately?

    BTW, the instant we added deduplication and compression, block storage over fabric became truly stupid.
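
    To make that memory-mapped queue model concrete, here is a rough sketch (mine, simplified from the NVMe 1.x common command format; the queue, namespace, and buffer values are made up) of what a host hands an NVMe controller: a 64-byte submission queue entry written into shared memory, followed by a plain MMIO doorbell write.

    import struct

    def build_read_sqe(cid: int, nsid: int, prp1: int, slba: int, nblocks: int) -> bytes:
        """Pack a simplified NVMe Read command (opcode 0x02) into a 64-byte SQE."""
        cdw0 = 0x02 | (cid << 16)          # opcode in bits 0-7, command id in bits 16-31
        cdw10 = slba & 0xFFFFFFFF          # starting LBA, low 32 bits
        cdw11 = slba >> 32                 # starting LBA, high 32 bits
        cdw12 = (nblocks - 1) & 0xFFFF     # number of blocks, zero-based
        return struct.pack(
            "<IIQQQQIIIIII",
            cdw0, nsid,
            0,                             # CDW2-3, unused by Read
            0,                             # metadata pointer, unused here
            prp1,                          # PRP1: physical address of the data buffer
            0,                             # PRP2, unused for small transfers
            cdw10, cdw11, cdw12, 0, 0, 0,  # CDW10-15
        )

    sqe = build_read_sqe(cid=7, nsid=1, prp1=0x10000000, slba=0, nblocks=8)
    assert len(sqe) == 64
    # The host copies the SQE into the submission queue, then writes the new tail
    # index to the controller's doorbell register -- an ordinary MMIO store. That
    # shared-memory-plus-doorbell model is exactly what doesn't stretch across
    # data centers.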

  2. bldrco

    Read the spec...

    Sounds like this company didn't read it, particularly the sections on subsystems, shared namespaces and reservations.
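
    Here's a toy model of the reservations idea (my own simplification, not the spec's wording): hosts register keys on a shared namespace, and the holder of a write-exclusive reservation fences out writes from everyone else.

    class SharedNamespace:
        """Toy sketch of NVMe reservations on a shared namespace."""

        def __init__(self):
            self.registrants = set()   # reservation keys registered by hosts
            self.holder = None         # (key, rtype) of the current reservation

        def register(self, key: int) -> None:
            self.registrants.add(key)

        def acquire(self, key: int, rtype: str = "write-exclusive") -> None:
            if key not in self.registrants:
                raise PermissionError("host must register a key first")
            if self.holder is not None and self.holder[0] != key:
                raise PermissionError("reservation conflict")
            self.holder = (key, rtype)

        def write_allowed(self, key: int) -> bool:
            if self.holder is None:
                return True
            held_key, rtype = self.holder
            return key == held_key or rtype != "write-exclusive"

    ns = SharedNamespace()
    ns.register(0xA)   # host A
    ns.register(0xB)   # host B sharing the same namespace
    ns.acquire(0xA)    # A takes a write-exclusive reservation
    assert ns.write_allowed(0xA) and not ns.write_allowed(0xB)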

    1. Anonymous Coward

      Re: Read the spec...

      That's not the point. The point is that some software running somewhere needs to provide erasure coding across devices and data protection services. They are saying those services need to run on the server.
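
      Concretely, "erasure coding across devices" on the host means something like the following toy (not any vendor's code; real systems use Reed-Solomon codes rather than plain XOR parity, and the genuinely hard part, coordinating updates from multiple hosts, is skipped entirely):

      def stripe_with_parity(data: bytes, n_data: int, chunk: int = 4096) -> list:
          """Split data across n_data devices plus one XOR-parity device, RAID-5 style.

          Any single device can fail and the stripe is rebuilt by XOR-ing the
          survivors -- the resiliency a lone fabric-attached SSD lacks.
          """
          stripe = chunk * n_data
          padded = data.ljust(-(-len(data) // stripe) * stripe, b"\0")  # pad to a full stripe
          devices = [bytearray() for _ in range(n_data + 1)]            # last one holds parity
          for s in range(0, len(padded), stripe):
              parity = bytearray(chunk)
              for d in range(n_data):
                  piece = padded[s + d * chunk : s + (d + 1) * chunk]
                  devices[d] += piece
                  for i in range(chunk):
                      parity[i] ^= piece[i]
              devices[n_data] += parity
          return [bytes(dev) for dev in devices]

      shards = stripe_with_parity(b"mission critical" * 1024, n_data=3)
      lost = 1                                  # pretend device 1 died
      rebuilt = bytearray(len(shards[lost]))
      for j, shard in enumerate(shards):
          if j != lost:
              for i, b in enumerate(shard):
                  rebuilt[i] ^= b
      assert bytes(rebuilt) == shards[lost]     # recovered from the survivors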

  3. Anonymous Coward

    better title - company with product looking for a market desperate to get recognition

    Storage startups have it tough right now. They aren't providing much in the way of differentiation that would let them hit escape velocity; flash prices keep falling; there's a glut of storage offerings, from software-defined to purpose-built appliances; and the buying class simply has too many choices from the existing vendors to risk much of anything on a startup. It's all contributing to the end of the road for many of them.

    If you are coming to the market with a box, you should just stop wasting everyone's time now. The world's moving on. There are still a lot of walking dead out there, some not even really walking, just kind of lying there. None of these companies will exist in three years:

    Tegile: ZFS in a box was never a thing

    Tintri: IPO as a down round, yeah good luck with that

    Datrium: if EMC couldn't make Flash/Thunder work, why do you think you can?

    Coho: firing all your staff is the true path to riches

    E8: EMC couldn't sell DSSD, you can't either, and speed without control is a race to the bottom on cost

    Infinidat: only so much legacy XIV you can take out before the well runs dry, IBM's done writing checks

    Violin: bankruptcy, and your IP sold for what, $1.2M? Just stop.

    1. andreleibovici

      Re: better title - company with product looking for a market desperate to get recognition

      ** Andre from Datrium here **

      "Datrium: if EMC couldn't make Flash/Thunder work, why do you think you can?"

      The same was said about HCI back in 2011. Why do you think startups are born and funded, and then either get acquired by behemoths or go through IPOs and become large companies?

      The reasons are simple: focus and execution that big vendors don't have and, more importantly, no dependencies, portfolio, or market to defend.

      We have successfully created a completely disaggregated platform with all the necessary enterprise data services, delivering 5x lower latency than all-flash arrays and 10x more scalability than HCI.

      Most importantly, customers love it, and Datrium is growing faster than all prior unicorns in the infrastructure space.

      Time will tell if we can keep executing. I'm biased, of course, but based on history I would bet on Datrium being one of the big private cloud vendors in a few years, with an innovative platform similar to what Facebook and other hyperscalers use nowadays. The team includes the Data Domain founders (acquired by EMC) and some of the very first engineers from VMware (pre-GSX launch).

      Sometimes you have to see to believe!

  4. Anonymous Coward

    NVMeF does not require RDMA

    "With NVMeF there is a remote direct memory access from a host server to a drive in an NVMeF array"

    I think someone probably needs to read the specs for NVMeF - multiple types of fabric are supported and RDMA is not a prerequisite (Fibre Channel, for instance). Just because Datrium only supports RDMA does not mean it's the only way.

    RDMA might make sense for HPC but does not make sense for enterprise storage. What NVMeF is trying to fix is a different problem from inter-server communication: sharing mission-critical data across an extended network. A low-latency fabric (SAN) allows data to be shared with minimal overhead and a huge reduction in latency compared with regular flash, while also massively decreasing server I/O overhead (thanks to the simplicity of the NVMe stack compared with the SCSI one).
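
    To illustrate (hypothetical addresses and subsystem name below; Python is used purely as notation): the subsystem a host connects to is named by a transport-neutral NQN, and the fabric is just a parameter, which is why RDMA is one option among several.

    # The same NVMe-oF subsystem reachable over different fabrics. Only the
    # transport tuple changes; the NQN naming the storage does not.
    SUBSYS_NQN = "nqn.2017-06.com.example:subsys1"   # hypothetical

    fabric_paths = [
        {"transport": "rdma", "traddr": "192.168.1.10", "trsvcid": "4420"},
        {"transport": "fc",   "traddr": "nn-0x2000000000000001:pn-0x1000000000000001"},
        {"transport": "tcp",  "traddr": "192.168.1.10", "trsvcid": "4420"},
    ]

    for path in fabric_paths:
        print(f"connect {SUBSYS_NQN} via {path['transport']} at {path['traddr']}")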

    1. Yaron Haviv

      Re: NVMeF does not require RDMA

      So if it's over, say, TCP or FC, what makes it different from iSCSI or FCP? Just adding another block protocol to the pile?

      I also fail to see the big value of NVMeOF. EMC Thunder used iSCSI RDMA (iSER) to hit a few million IOPS, 1:1 server:client, five years ago, and at least it had the iSCSI control layer to deal with discovery, failure, multipath..

      We don't need more blocks; we should move up the storage stack to provide files, objects, and databases over native flash + fabric. Is there a single new database today that isn't distributed? So why do we need to distribute the blocks as well? Just to keep the fabrics busy?

      Yaron

  5. J_Metz

    No VMware?

    Datrium's claim of not having VMware support will be news to VMware, considering they've had support and drivers for NVMe for at least two releases (http://www.nvmexpress.org/vmware/).

    The question of whether NVMe-oF (the correct abbreviation, by the way) is a SAN is like asking whether a shipping container is a ship. NVMe (and NVMe-oF) are protocols, and a SAN is an implementation of those protocols.

    Fibre Channel is a prime example of how the protocol itself can be configured as a SAN, as a Point-to-Point, or as an Arbitrated Loop. NVMe over Fabrics falls in that same realm of "implementation strategies" which *could* be a SAN, or it *could* be a DAS device, etc.

    This becomes incredibly important when weighing and measuring the pros and cons of the protocol (NVMe-oF) over the transport mechanism (RDMA-based) versus the architecture type (SAN, NAS, DAS, etc.).

    Disclosure: I am on the Boards of Directors for NVM Express, SNIA, and FCIA, but speak only for myself.

    1. Hugo@Datrium

      Re: No VMware?

      Let me clarify a couple of points.

      1. Datrium customers have been using NVMe with VMware in production for a year and a half, so I certainly agree that it can be a great combination. I look forward to future ESX enhancements for NVMe like hot-swap support.

      2. NVMf is a possible way to connect a host to a traditional array. Such arrays are often informally referred to as SANs even though, technically, a SAN is the Storage Area Network used to connect hosts to arrays, not the arrays themselves. If one does use NVMf to connect to a traditional array, its controller stands between the host and the NVMe storage device, and it will add latency and queuing delays that dilute the low-latency benefit of NVMe SSDs (I put rough numbers on this at the end of this comment). (By the way, though NVMe-oF is the trademarked abbreviation, NVMf has 3x the Google hits and it's shorter.)

      3. My comments were referring to the recent crop of products that offer NVMf connectivity to individual NVMe SSDs without an array controller in between. It is possible for multiple hosts to share data stored in such an NVMf-connected SSD. The problem is that in this set-up there is no device resiliency. If that individual SSD fails, the data on it is lost. Enterprise-grade resiliency is usually provided by RAID implemented in the array controller. Without the array controller, the resiliency must be implemented elsewhere. Conventional volume manager software on the host can treat the NVMf-connected SSDs as if they are local, direct-attached devices and provide RAID resiliency. But, data sharing with resiliency would require a distributed volume manager with RAID or erasure coding to coordinate updates across hosts. I’m not aware of such a distributed volume manager with erasure coding efficiencies in widespread use today. So, VMFS could perhaps run on shared individual devices, but there would be no device fault tolerance, so few would consider this a viable solution.

      4. Datrium has been in production since launch in early 2016 with a server-powered, host-side, distributed stack that provides efficient device resiliency through erasure coding and enables data sharing among the hosts. Its server-powered file system (not available as just a volume manager), accessed via NFS, provides other nice features like compression, deduplication, encryption, and VM data protection through VM-level snapshots and replication.
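
      To put rough numbers on the queuing point in (2): here's a back-of-envelope sketch, where the figures are assumptions for illustration and a simple M/M/1 queue stands in for a real controller.

      # Why a controller in the data path dilutes NVMe's latency advantage.
      # All numbers are assumed for illustration, not measurements.
      device_us     = 10.0    # raw NVMe SSD read latency (assumed)
      controller_us = 100.0   # array-controller software path per I/O (assumed)
      utilization   = 0.7     # how busy the controller is

      # M/M/1 mean waiting time: service_time * rho / (1 - rho)
      queueing_us = controller_us * utilization / (1.0 - utilization)

      total_us = device_us + controller_us + queueing_us
      print(f"device alone: {device_us:.0f} us; through a controller: {total_us:.0f} us")
      # -> device alone: 10 us; through a controller: 343 us at 70% load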

      1. J_Metz

        Re: No VMware?

        Upvoted for the great response.

        I agree with many of the points here, but I do wish to caution Hugo@Datrium on a couple of sleight of hand maneuvers. I would suggest going back and reading the first two paragraphs of the article:

        'A Storage Area Network (SAN) is, according to Wikipedia, "a network which provides access to consolidated, block level data storage... A SAN does not provide file abstraction, only block-level operations. However, file systems built on top of SANs do provide file-level access, and are known as shared-disk file systems."

        'Techopedia says: "A storage area network (SAN) is a secure high-speed data transfer network that provides access to consolidated block-level storage. An SAN makes a network of storage devices accessible to multiple servers."'

        Discussing an array capability in terms of the benefits that a SAN offers is misleading. Conversely, *criticizing* an array because it lacks those SAN features is disingenuous. What's worse, the fact that there are implementation differences within and without the protocol standards means that vendors can (and do) use the protocol as they require to achieve the results they desire.

        For instance, let us take your point (2) above. "If one does not use NVMf [sic] to connect to a traditional array, its controller stands between the host and the NVMe storage device." It seems to me that if you are *not* using NVMe-oF to connect to an array, the question of whether or not NVMe-oF "is a SAN" is moot. There is no definition of SAN (of which I am aware) that requires an array storage controller.

        [Which brings up another point: Part of the problem here is that the word "controller" is used so many times that it becomes difficult to keep these items separated. There is a controller for the FTL, there is a controller for the NVMe subsystem, there can be a controller for the storage array, etc. In SDS implementations, there is a controller for the control plane too. Controllers, controllers everywhere! :)]

        Resiliency in NVMe-oF systems is indeed a major question, and I completely agree with most of the caveats Hugo mentions in (3). Once again, though, we have a "shifting goalposts" argument. "I'm not aware of such a

        - distributed volume manager

        - with erasure coding efficiencies

        - in widespread use today."

        I break this down specifically like this to illustrate that at any point Hugo can point to that last prepositional phrase, "in widespread use today," as the clincher. Then again, given the nascent timeline of NVMe-oF driver development to begin with, claiming *any* NVMe-oF deployment as "widespread use" would be, at best, wild-eyed optimism.

        Nevertheless, the question at hand is one of the technology, not one of the deployment status. I would argue that at the core, the implementation of NVMe-oF as a SAN is indeed possible (technically speaking), and that such features and benefits of SAN deployments can be transferred over to NVMe-oF - even if it happens to be an eventual achievement.

        One note on NVMf vs. NVMe-oF. I agree that NVMf is shorter, and while I prefer the shortened form myself, I've come to learn that many companies using the non-standard nomenclature also have non-standardized NVMe-oF implementations. It's becoming a very quick litmus test for whether or not an implementation is the standard version. So much so that I automatically make the (often correct) presumption that companies using NVMf in their literature are *not* using standards-based NVMe-oF. Again, it's my perception, having worked with and talked to many NVMe-based companies.

  6. Anonymous Job-holder

    Correction to Metz' comment?

    Thanks for a very informative discussion, especially to Metz and Hugo. But...

    Metz quoted Hugo as having written: "If one does not use NVMf [sic] to connect to a traditional array..."

    But that isn't what Hugo wrote. He wrote: 'If one does use NVMf to connect to a traditional array....'

    Presumably Metz misread Hugo, and saw a NOT when there was NOT a NOT present.

    Just trying to help. Thanks again to Metz and Hugo.
