May 16, 2008 (12:05 PM EDT)
LAN And SAN Unite
Read the Original Article at InformationWeek
Convergence is about more than just voice and data. Storage and networking vendors promise that the next-generation network will unite local and storage area networks, virtualizing both over a single network at 20 Gbps, 40 Gbps, or even 100 Gbps. And it isn't just the networks that are coming together. The future SAN also involves convergence of storage with memory. The same principles that originally abstracted storage out of a local server can be applied to RAM, too, while storage targets shift from hard disks to flash memory.
Longtime SAN users might feel a sense of déjà vu. After all, SAN and LAN convergence was an early claim of iSCSI, which promised to make Fibre Channel unnecessary by routing storage traffic over IP networks. While iSCSI is a viable option, it's mostly used in low-end networks that need speeds of 1 Gbps or less. And even here, many users prefer to keep the SAN separate from the LAN, even if both are built from the same commodity Gigabit Ethernet switches.
The new push toward unified networks differs from iSCSI in both its ambition and the resources behind it. Vendors from across the networking and storage industries are collaborating on new standards that aim to unite multiple networks into one. Startups Xsigo and 3Leaf Systems already have shipped proprietary hardware and software aimed at virtual I/O that can converge SAN with LAN, though not yet memory.
The industry isn't talking consolidation just for consolidation's sake. The catalyst is server virtualization, for which either a SAN or network-attached storage is critical as a way to separate workloads from data. Without virtual storage, the flexibility inherent in virtualization is reduced, because data must be moved or replicated whenever a virtual machine is set up or torn down. Decoupling storage from processing makes it easier for workloads to be moved around.
The big disagreement is about what this single network will be. This time around, no one is talking about routing everything over IP; the low-latency and low-overhead requirements of data center networks dictate that the only realistic choice is to use a fully switched fabric, meaning the converged network will be confined to the data center. But will it run over Ethernet, Fibre Channel, or InfiniBand?
Virtualization also brings a greater urgency to network consolidation. A single nonvirtualized server would usually have only two network connections: a network interface card for the LAN and a host bus adapter for the SAN. In a server running multiple VMs, each virtual server needs its own NIC and HBA, which will quickly become unmanageable unless they're virtual. And if the networks need to be virtualized anyway, aggregating them together over a single high-performance transport will make the data center more flexible.
Storage is just the first step. The SAN has been so successful that companies from Cisco Systems to Intel-backed startup 3Leaf see remote memory as the next step, doing to RAM chips what the SAN did for disk drives. "Most servers have enough compute power, but they're constrained by the number of memory slots available," 3Leaf CEO B.V. Jagadeesh says.
How real is this? 3Leaf says its chip is in the test phase, with a version for AMD due to ship before the end of the year and one for Intel a year later. (As with multiple CPUs on the same server, it won't be possible to mix AMD and Intel processors together, as each uses a different proprietary bus.) 3Leaf already ships a box that can virtualize Fibre Channel SANs and Ethernet LANs, as does competitor Xsigo, which is backed by Juniper.
3Leaf calls its box an I/O Server, reflecting that it's built from standard PC components and could be licensed by server vendors; Xsigo calls its product an I/O Director to emphasize that it's a SAN appliance. But architecturally, both work in the same way. Virtual NICs and HBAs within each server (or each VM) tunnel Ethernet and Fibre Channel traffic to the box, where it's linked to a SAN and LAN via physical Ethernet and Fibre Channel. These aren't conventional virtual LANs, as the boxes don't terminate connections or have MAC addresses. As far as switches on the SAN and LAN are concerned, traffic goes straight to the virtual adapter.
At present, both the Xsigo and 3Leaf appliances require InfiniBand for their physical connections to servers. It's an obvious choice: Designed in part for virtual I/O, InfiniBand offers very low overhead and network latency of less than 100 nanoseconds--comparable to that of a PC's local memory bus. Like Fibre Channel, it can scale to 20 Gbps, which is again comparable to the bandwidth of the AMD and Intel chip interconnects.
InfiniBand is also cheaper than alternatives: A 10-Gbps InfiniBand host channel adapter (HCA) costs around $700, compared with at least $1,000 for an Ethernet NIC and more than $2,000 for a Fibre Channel HBA. Switch port prices show similar variation, though some users may not initially need a switch on the server side of the box. While switches are necessary for 3Leaf's planned memory networks, Xsigo's I/O Director has add-on InfiniBand modules that let it connect directly to servers.
Despite InfiniBand's advantages, most in the industry see Ethernet as the long-term future for a single, converged transport. Both Xsigo and 3Leaf plan to support it eventually, as do many larger players. In February, Cisco launched the Nexus 7000, a giant data center switch aimed at consolidating multiple networks into one. Unlike the startups, Cisco isn't even bothering to support InfiniBand, though it says it may add InfiniBand modules if there's enough customer demand.
The most compelling argument for moving to Ethernet is that everyone has it anyway. The persistent trend in networking has been toward increasing dominance of Ethernet over other technologies. Though some users are replacing physical cables with Wi-Fi, that's really an extension of Ethernet, not a replacement for it. What started out as a way of linking PCs together has become a universal connection to the Internet, increasingly used for voice as well as data, and even for WAN services in addition to the LAN.
Several standards and initiatives, collectively known as Data Center Ethernet, aim to improve Ethernet's latency, giving it characteristics similar to Fibre Channel and InfiniBand. There are also plans to increase Ethernet's speed beyond 10 Gbps, though its usual tenfold speed boost probably isn't realistic in the short term.
"It'll be double-digit years before we see 100-Gig Ethernet on the market," says Koby Segal, COO at InfiniBand vendor Voltaire. "It's not just switches, but the whole ecosystem of the cables, connectors, and backplanes."
What It All Means
SERVER VIRTUALIZATION means that storage networks will be more critical than ever: Virtual servers need virtual storage.
NETWORK CONSOLIDATION is the end, I/O virtualization is the means. Uniting SAN and LAN into a single fabric can pay big dividends.
MEMORY NETWORKS are the next step after storage networks, but few apps will really need them for the foreseeable future.
INFINIBAND is currently the only realistic transport for a converged network that unites memory, storage, and Internet traffic, but that will change within a year or two.
100-GBPS ETHERNET is the long-term future, but it could still be a decade or more away. Waiting for it means being left behind.
The technical challenges of reaching 100 Gbps are formidable. For example, the current spec calls for a bit error rate of no more than one in 10^12, which means getting about 1 bit wrong for every 125 GB transmitted. That's acceptable over a low-speed link, but at 100 Gbps it would mean making an error on average every 10 seconds--something that would cause serious problems at the application layer, leading to delays and congestion as dropped packets are retransmitted. Vendors hope to reach a rate of one in 10^15, which would put nearly three hours between errors on average. But like the 100-Gbps speed itself, this is an aspiration, not a guarantee.
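The error-rate arithmetic is easy to check. The short Python sketch below assumes a link that is kept fully loaded and errors that arrive independently at the stated average rate; the function names are illustrative only and don't come from any spec.

```python
def gigabytes_between_errors(one_in_n_bits: int) -> float:
    """Average data sent per bit error, in GB (8 bits per byte, 10^9 bytes per GB)."""
    return one_in_n_bits / 8 / 1e9


def seconds_between_errors(one_in_n_bits: int, link_gbps: float) -> float:
    """Average time between bit errors on a link running flat out at link_gbps."""
    bits_per_second = link_gbps * 1e9
    return one_in_n_bits / bits_per_second


if __name__ == "__main__":
    for exponent in (12, 15):
        n = 10 ** exponent
        secs = seconds_between_errors(n, link_gbps=100)
        print(f"one in 10^{exponent}: {gigabytes_between_errors(n):,.0f} GB "
              f"or {secs:,.0f} s ({secs / 3600:.2f} h) between errors at 100 Gbps")
```

Changing link_gbps to 1 or 10 shows why the same error rate that's tolerable on today's links becomes a problem at 100 Gbps.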
Of course, the same challenges apply when trying to scale any networking technology, and it's no coincidence that 40 Gbps is also the next speed ramp for InfiniBand. The technologies are so similar that Mellanox sells a 10-Gbps InfiniBand HCA that can also act as an Ethernet NIC. At 40 Gbps, both can reuse much technology originally developed for OC-768 WAN links, which run at almost the same speed. There's nothing comparably popular at 100 Gbps.
Of the Data Center Ethernet initiatives, the most important to storage looks like Fibre Channel over Ethernet. Expected to be ready by next year, FCoE is supported by most in the industry, including IBM, Cisco, QLogic, and Brocade, with many vendors already demonstrating proprietary implementations. The intention is that Ethernet switches will be able to speak Fibre Channel, or FC, natively, further blurring the distinction between LAN and SAN.
READY OR NOT
In the very long term, FCoE aims to make converting between Ethernet and FC unnecessary. While the higher-level FC protocol would be retained for its reliability and compatibility with existing storage apps, Ethernet could be used for all storage links. However, that's unlikely any time soon. Storage targets tend to have a slower replacement cycle than network switches, and there's less need to consolidate their links, so most vendors initially plan to use FCoE only for linking to servers. Then FCoE must be converted to physical Fibre Channel, using either a separate appliance or FC modules in an Ethernet switch.
But even when FCoE is ready, Ethernet itself might not be. According to tests conducted by Neterion, the actual performance of many 10-Gbps Ethernet links drops to an average of 4 Gbps once VMware is installed, thanks to the overhead of virtualization. Naturally, the vendor offers its own technology to fix this: extra smarts on Ethernet NICs that are similar in concept to the hardware-assisted virtualization that Intel and AMD have built into CPUs. Until something similar is standard across high-speed Ethernet, people expecting to get full use out of a 10-Gbps pipe may be disappointed.
With InfiniBand also looking to virtualize Fibre Channel, physical FC links to servers look like they may not have much of a future. Still, FC vendors aren't just going to abandon the technology. Though Brocade says it will support FCoE, it's already shipping a Fibre Channel Director intended for network consolidation. "We aim to bring the characteristics of the SAN to other networks within the data center," says Doug Ingraham, senior director of product management at Brocade. His vision is similar to that of Cisco, only it starts with FC instead of Ethernet. And like Cisco, Brocade sees no need to support InfiniBand.
Though storage targets don't have as fast a replacement cycle as virtual servers, they could be in for even greater change once they do. Almost all are based on hard disks, a technology that's already reached its speed limit. Flash memory capacity has grown to the point where it offers a seductive alternative, leading EMC to announce the first high-end storage array based on flash this past January. Its competitors are likely to follow, but flash has important issues that need to be overcome.
The problem with hard disks isn't with data capacities or bus transfer rates; both have increased exponentially while costs have gone down, a trend that's likely to continue. It's with physically moving the drive's heads across the disk surface, a mechanical process that needs to be repeated every time a new file is read or written. The fastest drives take about 4 milliseconds to do this, meaning that they're limited to about 250 operations per second. Moore's Law doesn't apply to moving parts like actuators and drive heads, so this number isn't likely to increase.
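The 250-operations figure follows directly from the seek time. A minimal Python sketch, assuming every operation needs one full head movement and nothing else takes measurable time (a simplification; the function name is made up for illustration):

```python
def max_random_ops_per_second(seek_ms: float) -> float:
    """Upper bound on random operations per second when each one costs a full seek."""
    if seek_ms <= 0:
        raise ValueError("seek time must be positive")
    return 1000.0 / seek_ms  # 1,000 milliseconds in a second


if __name__ == "__main__":
    # 4 ms is the figure quoted for the fastest drives; the others are for comparison.
    for seek_ms in (4.0, 8.0, 12.0):
        print(f"{seek_ms:>4} ms per seek: {max_random_ops_per_second(seek_ms):.0f} ops/s")
```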
Meanwhile, servers are pounded by an exponentially growing number of requests. Rich Internet applications and always-on remote connections make the problem worse, as clients request data in small, frequent chunks that each require the drive to move. Conventional solutions--massive overprovisioning or caching data in RAM--are expensive and power-hungry. Though dynamic RAM uses less power than constantly spinning hard disks of the same capacity, the catastrophic data loss in the event of an outage means that it can't be used alone. Volatile DRAM can only augment hard drives; it can't replace them.
Flash memory doesn't suffer from the same problems as disk drives: It has no moving parts, so the number of operations per second is measured in the thousands. And compared with both hard drives and dynamic RAM, it requires very little power. Its weakness is that it wears out even faster than hard drives: A flash drive can be rewritten at most 100,000 times before it dies--not an issue for consumer electronics or laptops that only access the disk occasionally, but a serious problem for servers whose storage is in almost constant use.
"Flash has limitations, very much so," admits Amyl Ahola, CEO of Pliant Technology, a startup aiming to make flash memory competitive with hard disks in storage targets. "At the enterprise level, it's a nontrivial task to put an architecture together to get lifetimes that are at least as long as a hard drive."
As yet, there's no way to make a flash drive with infinite rewritability. STEC, whose 73-GB and 146-GB flash drives are used in the initial EMC array, gets around the limitation by using a technique called wear leveling. This distributes write operations evenly across the entire drive, so that no part of a chip wears out long before the rest. STEC estimates that this extends the lifetime to about 2 million write cycles.
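The general idea behind wear leveling can be sketched in a few lines of Python. This is a toy model, not STEC's implementation: it assumes the drive keeps an erase count per physical block and sends each write to the least-worn block that isn't holding another address's data. The class and method names are hypothetical, and real controllers also have to relocate rarely changed data, which this sketch ignores.

```python
class WearLeveledFlash:
    """Toy model: redirect every write to the least-worn free physical block."""

    def __init__(self, num_blocks: int) -> None:
        self.erase_counts = [0] * num_blocks  # wear per physical block
        self.mapping = {}                     # logical address -> physical block

    def write(self, logical_address: int) -> int:
        # Blocks holding other addresses' data are off limits; the block that
        # currently holds this address is free to reuse once it is rewritten.
        in_use = {phys for addr, phys in self.mapping.items()
                  if addr != logical_address}
        candidates = [p for p in range(len(self.erase_counts)) if p not in in_use]
        if not candidates:
            raise RuntimeError("no free physical block available")
        target = min(candidates, key=lambda p: self.erase_counts[p])
        self.erase_counts[target] += 1
        self.mapping[logical_address] = target
        return target


if __name__ == "__main__":
    drive = WearLeveledFlash(num_blocks=8)
    for _ in range(800):
        drive.write(logical_address=0)  # hammer a single address
    print(drive.erase_counts)
```

Without the redirection, 800 rewrites of one address would all land on one block; with it, the wear is shared across all eight.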
Pliant has extra error-correction techniques that it says will extend the life of flash drives even further, building an ASIC that will detect and avoid parts of a flash chip that are about to fail. Its products are still in the lab, expected to ship by the end of the year through major storage OEMs. The big downside is price: Pliant estimates that drives will cost around $20 to $30 per gigabyte. For hard disk prices, use the same numbers but change the dollars to cents.
Still, flash memory prices are falling all the time, and the technology could pay for itself. Pliant estimates flash memory eventually will cost around the same as hard drives, once the overprovisioning necessary for high-performance applications is taken into account. The lower number of drives needed for flash also translates into savings in space and maintenance, as well as much lower power consumption.
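To see how overprovisioning narrows the price gap, consider a rough Python sketch. The per-gigabyte prices are midpoints of the ranges quoted above, the 250 operations per second and the 146-GB flash capacity also come from this article, but the 300-GB hard disk, the 5,000 operations per second for flash, and the workload are assumptions chosen purely for illustration. This is not Pliant's model.

```python
import math
from dataclasses import dataclass


@dataclass
class DriveType:
    name: str
    capacity_gb: float
    ops_per_second: float
    dollars_per_gb: float


# Hard disk capacity and flash ops/s are assumed values, not vendor figures.
HARD_DISK = DriveType("hard disk", capacity_gb=300, ops_per_second=250, dollars_per_gb=0.25)
FLASH = DriveType("flash", capacity_gb=146, ops_per_second=5000, dollars_per_gb=25.0)


def drives_needed(drive: DriveType, required_gb: float, required_ops: float) -> int:
    """Enough drives to satisfy both the capacity and the operations-per-second target."""
    for_capacity = math.ceil(required_gb / drive.capacity_gb)
    for_speed = math.ceil(required_ops / drive.ops_per_second)
    return max(for_capacity, for_speed)


def array_cost(drive: DriveType, required_gb: float, required_ops: float) -> float:
    """Purchase cost of the drives alone; power, space, and maintenance are ignored."""
    count = drives_needed(drive, required_gb, required_ops)
    return count * drive.capacity_gb * drive.dollars_per_gb


if __name__ == "__main__":
    # Hypothetical workload: 1 TB of data served at 50,000 operations per second.
    for drive in (HARD_DISK, FLASH):
        count = drives_needed(drive, required_gb=1000, required_ops=50_000)
        cost = array_cost(drive, required_gb=1000, required_ops=50_000)
        print(f"{drive.name}: {count} drives, ${cost:,.0f}")
```

Under these assumptions the hundredfold per-gigabyte gap shrinks to a small multiple, because the hard disk array must be sized for speed rather than capacity. Different workload numbers will move the result in either direction.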
Even if a migration from hard drives to flash memory eventually reduces the size and number of storage targets, there's little sign that they'll be moving back inside servers. The flexibility inherent in virtualization means that storage resources increasingly will be abstracted away from applications, which won't know or care whether their data is stored locally, on a hard disk, or in flash memory. Virtual servers will need virtual storage, with a virtual network in between.
Illustration by Christoph Neiman