Are your NFS filers processing ten million files? Bottlenecks? Our P2P cloud bursting software can help, says IC Manage – Blocks and Files

Analysis: For 17 years, a small Californian company called IC Manage has worked quietly to build a business that meets the unique needs of integrated circuit (IC) and electronic design automation (EDA) providers. For one of its co-founders, Dean Drako, IC Manage was a sideline for many years while he focused on running, and later leaving, another company he founded – Barracuda Networks.

Today, the company believes that Holodeck, the hybrid cloud data management software it developed for EDA companies, is applicable to other industries whose workloads involve tens of millions of files.

Holodeck provides peer-to-peer (P2P) caching, in the public cloud, of file data held on an on-premises NFS filer, and IC Manage says that any on-premises file-based workload that needs to burst into the public cloud can use it. To that end, the company is working with several businesses, including a large Hollywood film studio, on potential film production, special effects, and post-production applications.

Let’s take a look at IC Manage Holodeck and see why it might extend from semiconductor design companies to the general enterprise market.

For EDA companies, IC simulations running on on-premises compute server farms can involve tens of millions of files. The NFS filer – a NetApp or Isilon system, for example – can become a bottleneck and slow file access once more than a million files are involved. IC Manage EVP Shiv Sikand told us he had little respect for NFS file stores: “NFS has terrible latency; it doesn’t scale … I’ve never met a filer that I liked.”

Holodeck speeds up file-access workloads by caching file data in a peer-to-peer network. The cache uses NVMe SSDs in on-premises data centers. A similar cache can also be built in the public cloud so that on-premises workflows can burst to the cloud. In this scenario, the Holodeck in-cloud cache acts as an NFS filer.

The company

IC Manage was founded in 2003 by CEO Dean Drako and Shiv Sikand. The company says it is largely self-funded. Drako founded Barracuda Networks and ran it from 2003 to 2012; it went public in 2013. He describes himself as an investor, and his investments include InfiniteIO, a file metadata acceleration startup.

Drako also runs Eagle Eye Networks, which provides video surveillance management in the public cloud. Together with Sikand he co-founded Drako Motors, which builds the limited-edition four-seat electric supercar, the Drako GTE. Only 25 are being built, priced at $1.25 million apiece.

This means that neither Drako nor Sikand is full time at IC Manage. This isn’t your classic Silicon Valley startup with hungry young entrepreneurs struggling in a garage. But Sikand shrugs this off: “Is Elon Musk part-time at Tesla?”

Holodeck on-premises

Shiv Sikand

In a DeepChip forum post in 2018, Sikand said that IC Manage customers have invested $10 million to $100 million, plus man-hours, in on-premises workflows, EDA tools, scripts, methodologies and compute farms that they know work. They don’t want to disturb or disrupt this structure.

The “big-chip design workflows,” he wrote, are “typically mega-connected NFS environments consisting of 10 million files and easily 100 terabytes.” These can include:

  • EDA vendor tool binaries – with trees that have been installed for years
  • Hundreds of foundry PDKs, third-party IP, and legacy reference data
  • Massive chip design databases with timing, extracted parasitics, fill, etc.
  • A sea of setup and run scripts to link all the data together
  • A nest of configuration files to define the right environment for each project.

Sikand told us, “This is 10 million horribly intertwined on-premise files.”

Holodeck speeds up access to file data partly because it “separates data from entire files and only transfers extents,” Sikand wrote in his forum post. It also accelerates workloads by creating a scalable peer-to-peer cache, using 50 GB to 2 TB of NVMe SSD capacity per server node. These caches sit in front of the bottlenecked NFS filer and enable parallel access to the cached file data. All nodes in the compute grid, both bare metal and virtual machines, share the local data. Additional NVMe flash nodes can be added to scale performance.
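To make the idea of transferring extents rather than whole files concrete, here is a minimal Python sketch. It is not Holodeck code: the 1 MiB extent size, the `ExtentCache` class and the `fetch_extent` callable are all assumptions made for illustration. It only shows how a byte-range read can be served by fetching just the extents it touches and reusing those already cached.

```python
EXTENT_SIZE = 1 << 20  # 1 MiB; an assumed value, not Holodeck's actual extent size


def extents_for_read(offset, length, extent_size=EXTENT_SIZE):
    """Return the indices of the fixed-size extents that a byte-range read touches."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0:
        return []
    first = offset // extent_size
    last = (offset + length - 1) // extent_size
    return list(range(first, last + 1))


class ExtentCache:
    """Toy cache keyed by (path, extent index); only missing extents are fetched."""

    def __init__(self, fetch_extent):
        # fetch_extent is a hypothetical callable: (path, index) -> bytes.
        # In a real system it would read from a peer or from the backing filer.
        self._fetch_extent = fetch_extent
        self._store = {}
        self.fetched = 0  # number of extents pulled from the backing source

    def read(self, path, offset, length):
        chunks = []
        for idx in extents_for_read(offset, length):
            key = (path, idx)
            if key not in self._store:
                self._store[key] = self._fetch_extent(path, idx)
                self.fetched += 1
            chunks.append(self._store[key])
        data = b"".join(chunks)
        start = offset % EXTENT_SIZE  # position of the read inside the first extent
        return data[start:start + length]
```

In this sketch, a 4-byte read that straddles an extent boundary pulls two extents, and a later read that falls inside either of them costs no further transfer, however large the file is.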

Holodeck peer-to-peer caching fabric

According to IC Manage, this delivers consistently low latency and high-bandwidth parallel performance, and can scale out to thousands of nodes.

The P2P cache can be set up remotely, and it can also be set up in the public cloud, using a Holodeck gateway, to support bursting of compute workloads.

Sikand told us that people criticize the P2P approach on the grounds that “nobody else is doing this. You must be wrong.” His response to the critics – who are “not the sharpest knives in the box” – is that it works for semiconductor design and manufacturing companies.

Holodeck in the public cloud

The P2P cache approach is used in the public cloud in the same way that Holodeck uses it on-premises. The Holodeck software sets up a cloud gateway system that collects NFS filer information and sends it to a database system called a “tracker”. Public cloud peer nodes in the P2P fabric use the tracker, and compute instances, known as Holodeck clients, access these peer nodes. Both the compute instances and the peer nodes can be scaled.

Holodeck and the hybrid cloud

The tracker database can be provisioned on block storage. The peer nodes use small amounts of ephemeral storage on high-performance NVMe devices, such as those in AWS i3 instances.

The Holodeck client compute nodes access the P2P cache as if it were an NFS filer, and job workflows run in the cloud just as they do on-premises. With Holodeck’s peer-to-peer sharing model, each node can read or write data generated on another node. This sharing model has the same semantics as NFS, which ensures full compatibility with all on-premises workflows.

The upside is that on-premises file data is never stored in the public cloud outside of the caches. There is no second copy of all on-premises files in the cloud, which lowers costs. Holodeck tracks all changes made to the tracker database as a result of cloud compute jobs, and only these changes are sent back on-premises, which keeps cloud egress fees low.

Holodeck supports selective write-back control to prevent users from inadvertently pushing unneeded public cloud data back across the wire to on-premises storage.
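The combination of delta-only return and selective write-back can be illustrated with a small Python sketch. This is an assumption-laden toy, not Holodeck’s interface: the `WriteBackTracker` class, its methods and the glob-style include patterns are all hypothetical. It records what cloud jobs changed, then returns only the changes that match a write-back policy, so egress volume depends on the selected deltas rather than on the size of the data set.

```python
import fnmatch


class WriteBackTracker:
    """Toy model of delta tracking with a selective write-back filter (hypothetical API)."""

    def __init__(self, include_patterns):
        # Glob patterns naming the paths allowed to travel back on-premises.
        self.include_patterns = list(include_patterns)
        self._dirty = {}  # path -> bytes written by cloud jobs

    def record_write(self, path, nbytes):
        # Simplification: bytes are summed, so repeated overwrites are counted twice.
        self._dirty[path] = self._dirty.get(path, 0) + nbytes

    def pending_write_back(self):
        """Changed paths that match the write-back policy, with their byte counts."""
        return {
            path: nbytes
            for path, nbytes in self._dirty.items()
            if any(fnmatch.fnmatchcase(path, pat) for pat in self.include_patterns)
        }

    def egress_bytes(self):
        """Bytes that would cross back to the on-premises site."""
        return sum(self.pending_write_back().values())


tracker = WriteBackTracker(include_patterns=["results/*.rpt"])
tracker.record_write("results/timing.rpt", 4096)
tracker.record_write("scratch/tmp.bin", 10**9)  # scratch data stays in the cloud
```

Here only the 4 KB report is queued for return, while the 1 GB scratch file never generates egress traffic.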

Data coherence between the peer nodes is maintained using the distributed Raft consensus protocol, which can withstand node failures. If a majority of the peer nodes agree on an operation, for example an update to a file metadata value, that becomes the single version of the truth, and a failed node can be bypassed.
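The majority rule behind Raft is simple arithmetic, sketched below in Python. The function names are illustrative only, and the sketch covers just the quorum calculation, not leader election or log replication.

```python
def quorum_size(cluster_size):
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size):
    """How many nodes can fail while a majority can still be formed."""
    return cluster_size - quorum_size(cluster_size)


def is_committed(acks, cluster_size):
    """An update counts as agreed once a majority of nodes has acknowledged it."""
    return acks >= quorum_size(cluster_size)
```

With five peer nodes, three acknowledgements are enough to commit an update and two nodes can fail. A four-node cluster also needs three acknowledgements but tolerates only one failure, which is why odd cluster sizes are the usual choice.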

Competition

Panzura’s Freedom Cloud NAS offering runs on-premises and in the public clouds. It lets files be transferred between distributed EDA systems so that one site can access files held at another. It uses compression and deduplication to reduce the amount of data that has to be sent when transferring files.

Nasuni also offers a NAS in the cloud with local access.

Once an EDA customer has a flash-capable scale-out file system, such as one from Qumulo or WekaIO, the need for Holodeck on-premises is reduced. If that file system also runs in the public cloud, the need for Holodeck in the hybrid cloud is reduced further.

However, because Holodeck sends back only updated data (delta changes) from the cloud, it minimizes cloud egress fees, and this could be valuable. Sikand has this to say about WekaIO: “WekaIO is incredibly fast, but it makes your nose bleed when you pay the bill.”
