Abstract

This document provides a technical overview and design of the Dell ECS software-defined, cloud-scale object storage platform.
Copyright © 2015-2022 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Intel, the Intel logo, the Intel Inside logo and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. Other trademarks may be trademarks of their respective owners. Published in the USA February 2022 H14071.21.
Dell Inc. believes the information in this document is accurate as of its publication date. The information is subject to change without notice.
Executive summary
Scope

This document provides an overview of the Dell ECS object storage platform. It details the ECS design architecture and core components, such as the storage services and data protection mechanisms.
This document focuses primarily on ECS architecture. It does not cover installation, administration, and upgrade procedures for ECS software or hardware. It also does not cover specifics on using and creating applications with the ECS APIs.
Value of ECS
• Reporting, policy- and event-based record retention, and platform hardening for SEC Rule 17a-4(f) compliance, including advanced retention management
• Integration with monitoring and alerting infrastructure (SNMP traps and SYSLOG)
• Enhanced enterprise capabilities (multi-tenancy, capacity monitoring and alerting)
• Small and large file performance
• Seamless Centera migration
The design of ECS is optimized for the following primary use cases:
• Modern Applications - ECS is designed for modern, next-generation application development.
• Secondary Storage - ECS is used as secondary storage to off-load infrequently accessed data from primary storage, while keeping it reasonably accessible.
• Archive - ECS can serve as a cloud tier for archival and long-term retention purposes. Using ECS as an archive tier can significantly reduce primary storage capacities.
• Global Content Repository - Unstructured content repositories containing data such as images and videos are often stored in high cost storage systems making it impossible for businesses to cost-effectively manage massive data growth. ECS enables consolidation of multiple storage systems into a single, globally accessible and efficient content repository.
• Storage for Internet of Things - The Internet of Things (IoT) offers a new revenue opportunity for businesses that can extract value from customer data. ECS offers an efficient IoT architecture for unstructured data collection at massive scale. With no limits on the number of objects, the size of objects, or custom metadata, ECS is the ideal platform to store IoT data. ECS can also streamline some analytic workflows by allowing data to be analyzed directly on the ECS platform without requiring time-consuming extract, transform, and load (ETL) processes. Hadoop clusters can run queries against data stored on ECS through another protocol API, such as S3 or NFS.
ECS Overview and Architecture

This section goes in depth into the ECS architecture and the design of its software and hardware.
• Storage Engine - Core service responsible for storing and retrieving data, managing transactions, and protecting and replicating data locally and between sites.
• Fabric - Clustering service for health, configuration and upgrade management and alerting.
Figure 1.
• Performance monitoring on latency, throughput, and replication progress.
• Diagnostic information, such as node and disk recovery status.
Detailed performance reporting is available in the UI under the Advanced Monitoring folder. The reports are displayed in a dashboard, with filters available to drill into specific namespaces, protocols, or nodes. The following figure shows an example of an S3 protocol performance report:
ECS supports the following event notification servers which can be set using the web UI, API or CLI:
• SNMP (Simple Network Management Protocol) servers
Architecture
Additional capabilities include byte-range updates and rich ACLs.
ECS provides a facility for metadata search for objects using a rich query language. This is a powerful feature of ECS that allows S3 object clients to search for objects within buckets using system and custom metadata. While search is possible using any
metadata, ECS can return queries more quickly by searching on metadata that has been specifically configured to be indexed in a bucket, especially for buckets with billions of objects. Metadata search with tokenization allows the customer to search for objects that have a specific metadata value within an array of metadata values. The method must be chosen when the bucket is created: include the header x-emc-metadata-search-tokens: true as an option in the S3 create-bucket request.
The impact to operations increases as the number of indexed fields increases. The performance impact needs careful consideration when deciding whether to index metadata in a bucket and, if so, how many indexes to maintain.
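As a concrete illustration of the headers described above, the following sketch assembles the extra request headers for an S3 create-bucket call that enables metadata search with tokenization. Only the x-emc-metadata-search-tokens header is taken from the text; the header name used to list the indexed keys, and the key names themselves, are assumptions for illustration.

```python
# Sketch: building the extra headers for an ECS S3 create-bucket request.
# x-emc-metadata-search-tokens comes from the text above; the index-list
# header name and the example keys are illustrative assumptions.

def create_bucket_headers(indexed_keys, tokenized=True):
    """Return the ECS-specific headers to add to a create-bucket request."""
    headers = {}
    if indexed_keys:
        # Hypothetical header naming the metadata keys to index in the bucket.
        headers["x-emc-metadata-search"] = ",".join(indexed_keys)
    if tokenized:
        # From the text: enables tokenized metadata search for the bucket.
        headers["x-emc-metadata-search-tokens"] = "true"
    return headers

hdrs = create_bucket_headers(["ObjectName;String", "x-amz-meta-project;String"])
```

These headers would then be attached to the S3 create-bucket call made by whatever client library is in use.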
For CAS objects, the CAS query API provides a similar ability to search for objects based on the metadata maintained for CAS objects; it does not need to be enabled explicitly.
ECS can store Hadoop file system data. Because ECS is a Hadoop-compatible file system, organizations can create big data repositories on ECS that Hadoop analytics can consume and process. The HDFS data service is compatible with Apache Hadoop 2.7, with support for fine-grained ACLs and extended filesystem attributes. ECS has been validated and tested with Hortonworks (HDP 2.7). ECS also supports services such as YARN, MapReduce, Pig, Hive/HiveServer2, HBase, ZooKeeper, Flume, Spark, and Sqoop.
Privacera implementation with Hadoop S3A
Privacera is a third-party vendor that has implemented a Hadoop client-side agent and integration with Ambari for S3 (AWS and ECS) granular security. Although Privacera supports Cloudera Distribution of Hadoop (CDH), Cloudera (another third-party vendor) does not support Privacera on CDH.
Hadoop S3A security
ECS IAM allows the Hadoop administrator to set up access policies that control access to S3A Hadoop data. Once the access policies are defined, there are two user access options for Hadoop administrators to configure:
• Create IAM roles that attach to the policies
• Configure a cross-trust relationship between an identity provider (AD FS) and ECS that maps AD groups to IAM roles
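To make the first option concrete, the sketch below builds a minimal access policy for S3A Hadoop data, assuming ECS IAM follows the standard AWS IAM JSON policy grammar. The bucket name, path, and statement ID are illustrative, not taken from the source.

```python
import json

# Sketch of an IAM-style policy granting read access to S3A Hadoop data.
# Assumes ECS IAM uses the AWS policy JSON grammar; names are illustrative.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAnalyticsRead",          # illustrative statement ID
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::hadoop-data",            # the bucket itself
                "arn:aws:s3:::hadoop-data/warehouse/*" # objects under a prefix
            ],
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```

A policy like this would then be attached to an IAM role, which Hadoop users assume directly or through the AD FS group-to-role mapping described above.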
For more information about ECS IAM, see ECS IAM.
ECS HDFS client support
Figure 5. ECS serving as name and data nodes for a Hadoop cluster
Other enhancements have also been added in ECS for HDFS.
ECS includes native file support with NFSv3. The main features of the NFSv3 file data service include:
• Multiprotocol access - Access to data using different protocol methods.
NFS exports, permissions, and user-group mappings are created using the web UI or API.
• No design limits on the number of files or directories.
• File write size can be up to 16TB.
• Because NFS files are basically mapped to objects, the NFS data service shares features with the object data service, including:
• Quota management at the bucket level.
Connectors and gateways
Several third-party software products have the capability to access ECS object storage.
• Isilon CloudPools - Policy-based tiering of data to ECS from Isilon.
• Data Domain Cloud Tier - Automated native tiering of deduplicated data to ECS.
Storage engine

At the core of ECS is the storage engine. The storage engine layer contains the main services responsible for storing, retrieving, protecting, and replicating data.
Storage services
The ECS storage engine includes the services shown in the following figure:
The services of the Storage Engine are encapsulated within a Docker container that runs on every ECS node to provide a distributed and shared service.
Data
• Identifiers and descriptors - A set of attributes used internally to identify objects and their versions. Identifiers are either numeric IDs or hash values, which are not of use outside the ECS software context. Descriptors define information such as the type of encoding.
• Encryption keys in encrypted format - Data encryption keys are considered system metadata. They are stored in encrypted form inside the core directory table structure.
• Timestamps - A set of attributes that track times, such as object create or update.
Data and system metadata are written in chunks on ECS. An ECS chunk is a 128MB logical container of contiguous space. Each chunk can have data from different objects, as shown in the following figure. ECS uses indexing to keep track of all the parts of an object that may be spread across different chunks and nodes.
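The chunk-packing idea above can be sketched as follows: multiple objects are appended into a shared 128MB logical container, and an index records which chunk, offset, and length holds each part. This is a conceptual sketch, not ECS's implementation; the class and chunk-naming scheme are invented for illustration.

```python
CHUNK_SIZE = 128 * 1024 * 1024  # 128MB logical chunk, per the text

class Chunk:
    """A logical container of contiguous space shared by many objects."""
    def __init__(self, chunk_id):
        self.chunk_id = chunk_id
        self.used = 0

    def append(self, size):
        """Reserve size bytes; return (offset, length), or None if full."""
        if self.used + size > CHUNK_SIZE:
            return None
        offset = self.used
        self.used += size
        return (offset, size)

class ChunkWriter:
    """Packs object parts into open chunks and records their locations."""
    def __init__(self):
        self.chunks = []
        self.index = {}  # object name -> list of (chunk_id, offset, length)

    def write(self, name, size):
        loc = self.chunks[-1].append(size) if self.chunks else None
        if loc is None:  # no open chunk, or the current one is full
            self.chunks.append(Chunk(f"C{len(self.chunks) + 1}"))
            loc = self.chunks[-1].append(size)
        self.index.setdefault(name, []).append((self.chunks[-1].chunk_id, *loc))
```

Because chunks are append-only containers shared across objects, an object's parts can land in different chunks, which is exactly why the index layer described later is needed.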
ECS has a built-in snappy compression mechanism. The granularity is 2MB for small objects and 128MB for large objects. ECS employs smart compression logic: it only compresses data that is compressible, saving resources by not trying to compress already-compressed or incompressible data (such as encrypted data or video files). If more sophisticated compression is required, the ECS Java SDK supports client-side ZIP and LZMA.
ECS uses a set of logical tables to store information relating to the objects. Key-value pairs are eventually stored on disk in a B+ tree for fast indexing of data locations. By storing the key-value pairs in a balanced search tree like a B+ tree, the location of the data and metadata can be accessed quickly. ECS implements a two-level log-structured merge tree with two tree-like structures: a smaller tree in memory (memory table) and the main B+ tree on disk. Lookup of key-value pairs occurs first in memory and subsequently in the main B+ tree on disk if needed. Entries in these logical tables are first recorded in journal logs, and these logs are written to disk in triple-mirrored chunks.
Figure 8. Memory table dumped to B+ tree
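The two-level scheme above can be sketched as follows: writes land in a journal and a memory table, and the memory table is periodically dumped into a sorted on-disk structure. A sorted Python list stands in for the B+ tree, and the flush threshold is invented for illustration; this is a conceptual sketch, not ECS's implementation.

```python
import bisect

class TwoLevelStore:
    """Sketch of a two-level log-structured merge scheme, as described
    in the text: journal first, then memory table, then a periodic dump
    into a sorted structure standing in for the on-disk B+ tree."""

    def __init__(self, flush_at=4):
        self.journal = []        # append-only log (triple-mirrored in ECS)
        self.memtable = {}       # the smaller in-memory tree
        self.disk = []           # sorted (key, value) pairs: the "B+ tree"
        self.flush_at = flush_at

    def put(self, key, value):
        self.journal.append((key, value))   # record in the journal first
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_at:
            self.flush()

    def flush(self):
        """Dump the memory table into the sorted on-disk structure."""
        merged = dict(self.disk)
        merged.update(self.memtable)        # newer entries win
        self.disk = sorted(merged.items())
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:            # lookup hits memory first...
            return self.memtable[key]
        i = bisect.bisect_left(self.disk, (key,))
        if i < len(self.disk) and self.disk[i][0] == key:
            return self.disk[i][1]          # ...then the on-disk tree
        return None
```

Writing the journal before updating the memory table is what makes the in-memory state recoverable after a node failure, which is why ECS triple-mirrors those journal chunks.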
The following table shows the information stored in the Object (OB) table. The OB table contains the names of objects and their chunk location at a certain offset and length within that chunk. In this table, the object name is the key to the index and the value is the chunk location. The index layer within the Storage Engine is responsible for the object name-to-chunk mapping.
For example, an object that spans two chunks would have a value listing both locations: C2:offset:length, C3:offset:length.
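The OB table's name-to-chunk mapping can be sketched as a simple dictionary: the object name is the key, and the value is the ordered list of (chunk, offset, length) locations holding the object's bytes. The names and locations below are illustrative, not taken from the source.

```python
# Sketch of the OB table described above: object name -> chunk locations.
# Names and locations are illustrative.
ob_table = {
    "file-b": [("C2", 0, 64), ("C3", 128, 32)],  # an object spanning two chunks
    "img-a":  [("C1", 512, 96)],                 # an object within one chunk
}

def locate(name):
    """Return the ordered chunk locations for an object, as the index
    layer would, or None when the name is not in the table."""
    return ob_table.get(name)

def total_size(name):
    """Sum the part lengths to recover the object's total size."""
    return sum(length for _, _, length in ob_table[name])
```

On a read, the index layer resolves the name to these locations and the storage engine then fetches each part from its chunk at the recorded offset.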
Storage services are available from any node. Data is protected by distributed EC segments across drives, nodes, and racks. ECS runs a checksum function and stores the result with each write. If the first few bytes of the data are compressible, ECS compresses the data. On reads, data is decompressed and its stored checksum is validated. Here is an example of the data flow for a write:
1. Client sends object create request to a node.
Figure 10 shows an example of the data flow for a read on hard disk drive architectures such as Gen2 and the EX300, EX500, and EX3000.
1. A read object request is sent from client to Node 1.
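The checksum-on-write, validate-on-read behavior described above can be sketched as follows. zlib.crc32 stands in for ECS's internal checksum function, and the in-memory store is purely illustrative.

```python
import zlib

store = {}  # object name -> (data, stored checksum); illustrative only

def write_object(name, data: bytes):
    """Store the data along with a checksum, as ECS does on every write
    (zlib.crc32 stands in for the internal checksum function)."""
    store[name] = (data, zlib.crc32(data))

def read_object(name):
    """Return the data only after validating its stored checksum,
    mirroring the read-path validation described in the text."""
    data, checksum = store[name]
    if zlib.crc32(data) != checksum:
        raise IOError(f"checksum mismatch for {name}")
    return data
```

A corrupted on-disk copy would fail the comparison and surface as an error instead of silently returning bad data.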