XINFINI Technology | XSEA Architecture
Helping enterprises through the all-data, all-flash transformation
Redefining distributed storage architecture to drive the adoption of all-flash data centers
Customer pain points

The era of all-flash data centers has arrived, bringing both opportunities and challenges

NVMe SSD performance and capacity keep improving while prices keep falling; high-performance lossless 25Gb/100Gb networks are now widespread and 400Gb is emerging; and NVMe over RoCE lets any location in the data center access storage with only microsecond-level latency. Together, these trends have accelerated the arrival of the all-flash data center. An all-flash architecture offers enterprise customers significant performance gains, cost savings, and business flexibility, helping them gain an edge in a highly competitive market. However, existing all-flash storage suffers from the following pain points:


Problems with local NVMe DAS
  • TCO remains high: capacity and performance utilization are uneven across servers, wasting resources
  • Operations are difficult: servers hold large numbers of local disks with no unified disk-management tooling and no automatic alerting or handling of media failures and sub-health issues, driving up operating costs
  • Reliability is low: local disks carry no data redundancy protection
Problems with traditional centralized all-flash array
  • Difficult to scale horizontally; software and hardware cannot be decoupled
  • Poor openness; cannot meet the requirements of the cloud trend
  • Poor economics; cannot deliver high performance, large scale, and low cost at the same time, so customers cannot realize the value of SSDs
Problems with Shared-Nothing storage
  • Cannot cover all business scenarios: it relies on three-way replication and lacks data-reduction capability, so the cost per TB of usable capacity is too high and customers cannot obtain the capacity advantages of all-flash at a reasonable cost
  • When an SSD slows down or a node fails, failover takes 5 seconds or more with large performance fluctuations, falling short of the service-quality demands of critical OLTP workloads
Architecture overview

XSKY eXtreme Shared-Everything Architecture

Drawing on the Shared-Everything architecture of high-end storage and its years of experience in distributed storage, XSKY designed the revolutionary eXtreme Shared-Everything Architecture (XSEA). Built on standard storage protocols and network technologies, it upends the data center's storage hierarchy, replaces some NVMe DAS and hybrid-flash storage, and resolves through breakthrough methods the performance, reliability, scale, and cost compromises that have constrained the Shared-Nothing architecture for the past 20 years.

Core advantages

Provide all-flash storage products with high reliability, high performance, and high efficiency capabilities

Through three technical innovations (Shared-Everything, single-layer flash media, and end-to-end NVMe), the XSEA architecture achieves three "100s":

  • Shared-Everything: fully shared data storage, 100 ms failover time
  • Single-layer flash media: built for TLC NVMe SSDs, 100% effective storage ratio
  • End-to-end NVMe: maximal hardware offloading, 100 µs ultra-low latency

Shared-Everything data storage for extreme reliability

The XSEA architecture adopts the Shared-Everything model to achieve fully shared data storage: every node can directly access all SSDs, improving data-access speed and flexibility. In slow-disk and sub-health scenarios, failover completes within 100 ms.
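As a minimal sketch of why shared access makes failover fast (all names and the ownership scheme here are invented for illustration, not XSKY's implementation): when every node already has a path to every SSD, failing over a node only requires updating an ownership map, with no data movement.

```python
# Sketch: in a shared-everything cluster every node can reach every SSD,
# so failing over a node's disks is a metadata update, not a data copy.
# All names here are illustrative, not XSKY's actual implementation.

class Cluster:
    def __init__(self, nodes, ssds):
        self.nodes = set(nodes)
        # Round-robin initial ownership; every node still sees every SSD.
        self.owner = {ssd: nodes[i % len(nodes)] for i, ssd in enumerate(ssds)}

    def fail_node(self, node):
        """Reassign ownership of the failed node's SSDs to survivors."""
        self.nodes.discard(node)
        survivors = sorted(self.nodes)
        for i, (ssd, owner) in enumerate(sorted(self.owner.items())):
            if owner == node:
                # No rebuild or copy: the new owner already sees the SSD.
                self.owner[ssd] = survivors[i % len(survivors)]

cluster = Cluster(["node-a", "node-b", "node-c"],
                  ["ssd-%d" % i for i in range(6)])
cluster.fail_node("node-a")
assert all(owner != "node-a" for owner in cluster.owner.values())
```

In a Shared-Nothing cluster, by contrast, the failed node's data must be rebuilt from replicas before full service resumes, which is why its failover is measured in seconds rather than milliseconds.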

Shared-Everything vs Shared-Nothing: Performance scales linearly
Shared Nothing
  • Each node processes its own data independently
  • As nodes are added, the benefit of expansion diminishes
Shared Everything
  • Nodes need no cross-node service communication
  • As nodes are added, capacity and performance scale linearly


Shared-Everything vs Shared-Nothing: Flexible resource allocation
Shared Nothing
  • Resources on each node cannot be pooled
  • Large resource reserves must be planned up front, causing waste
Shared Everything
  • Storage capacity and performance are decoupled from CPU and memory resources
  • Resources are allocated on demand to match actual business needs


Shared-Everything vs Shared-Nothing: Global scheduling perspective
Shared Nothing
  • Each node is carved into independent units
  • A partial view of the cluster undermines business stability
Shared Everything
  • Every node can read and write data globally
  • Global flow control greatly improves space-utilization efficiency


Shared-Everything vs Shared-Nothing: Higher quality of service
Shared Nothing
  • Insensitive to business priority levels
  • In sub-health conditions, failover takes 10 seconds
Shared Everything
  • Business continuity is guaranteed
  • In sub-health conditions, failover takes 100 ms


Optimized for TLC NVMe SSDs, with an effective storage ratio above 100%

The XSEA architecture builds its storage pool from a single tier of TLC NVMe SSDs, simplifying the cluster's storage hardware. Data is written append-only, which reduces write amplification, and a carefully designed space layout lets each SSD serve as both cache and persistent storage. Together these techniques deliver stable performance without a dedicated cache medium.

In typical mixed read/write workloads, single-tier flash cuts media cost by more than 20% compared with a tiered-cache design. Combined with the global EC and compression enabled by the Shared-Everything model, the cluster's effective storage ratio exceeds 100%.
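To make the ">100% effective ratio" claim concrete, here is a back-of-the-envelope calculation under assumed parameters: the EC 4+2 layout and 2:1 compression ratio below are illustrative examples, not published XSKY figures.

```python
# Effective storage ratio = usable logical capacity / raw capacity.
# Assumed, illustrative parameters: EC 4+2 layout, 2:1 compression.
data_strips, parity_strips = 4, 2
ec_efficiency = data_strips / (data_strips + parity_strips)   # = 4/6 ≈ 0.667
compression_ratio = 2.0                                       # 2:1, assumed

effective_ratio = ec_efficiency * compression_ratio           # ≈ 1.33
print(f"effective storage ratio ≈ {effective_ratio:.0%}")     # prints "≈ 133%"

# Contrast with three-way replication and no data reduction:
replication_ratio = 1 / 3                                     # ≈ 33% usable
assert effective_ratio > 1.0 > replication_ratio
```

Under these assumptions, each TB of raw flash yields about 1.33 TB of usable logical capacity, whereas three-way replication without data reduction yields only about 0.33 TB.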



End-to-end NVMe design of the I/O path, achieving 100 µs ultra-low latency

The XSEA architecture builds the entire I/O path on the standard NVMe over Fabrics protocol: clients access storage via NVMe over Fabrics, and the storage-internal interconnect uses NVMe over Fabrics as well. This complete end-to-end NVMe design lets every storage node reach every NVMe SSD efficiently, avoiding the overhead of storage-protocol conversion. Along this path, XSEA processes each I/O request in an efficient polling mode and uses NUMA binding to optimize the memory-access efficiency of different services, achieving end-to-end latency as low as 100 microseconds. Any location in the data center can therefore access XINFINI all-flash storage with only microsecond-level latency.
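To illustrate the polling idea in the abstract (real polled-mode drivers such as SPDK's run busy loops in C on pinned cores; this toy sketch is not XSEA's implementation), the key point is that completions are reaped by repeatedly checking a queue rather than sleeping on an interrupt:

```python
# Toy sketch of polling-mode I/O completion handling (illustrative only).
from collections import deque

completion_queue = deque()          # stands in for an NVMe completion queue

def submit(request):
    # Model instantaneous device completion; in polling mode the caller
    # never blocks on an interrupt to learn about it.
    completion_queue.append(f"done:{request}")

def poll_once():
    """Reap all currently visible completions without ever blocking."""
    reaped = []
    while completion_queue:
        reaped.append(completion_queue.popleft())
    return reaped

submit("read-4k")
submit("write-4k")
assert poll_once() == ["done:read-4k", "done:write-4k"]
assert poll_once() == []            # empty queue: returns immediately, no sleep
```

Polling trades a dedicated busy core for the elimination of interrupt and context-switch latency, which is what makes sub-100 µs tail latency achievable on fast NVMe media.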

Data persistence layer

The Persistent Layer provides data-persistence services to the layers above it, with three core designs:

  • It exposes AppendLog-semantic calls to the upper layer;
  • It delivers extremely low write latency;
  • Chunk metadata is centrally managed.
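A minimal sketch of what AppendLog semantics on a chunk look like (the interface names here are hypothetical, not the Persistent Layer's real API): writes may only append at the tail of a chunk, and the returned offset becomes the record's permanent address.

```python
# Hypothetical sketch of AppendLog-style chunk semantics: append-only writes,
# random reads by (offset, length). Not the Persistent Layer's actual API.

class Chunk:
    def __init__(self):
        self._buf = bytearray()

    def append(self, data: bytes) -> int:
        """Append-only write; returns the offset where the data landed."""
        off = len(self._buf)
        self._buf += data
        return off

    def read(self, off: int, length: int) -> bytes:
        return bytes(self._buf[off:off + length])

chunk = Chunk()
o1 = chunk.append(b"alpha")
o2 = chunk.append(b"beta")
assert (o1, o2) == (0, 5)
assert chunk.read(o2, 4) == b"beta"
```

Because writes never land at arbitrary offsets, the device sees large sequential writes, which is exactly the access pattern that minimizes write amplification on TLC NVMe SSDs.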
Data service layer

The Service Layer provides block storage and file storage services to the upper layer through two components, BlockServer and FileServer.

BlockServer is the storage engine for block storage. Built on the high-performance chunk read/write interface of the Persistent Layer, it organizes data in a log-structured manner and abstracts a virtual block layer. Supported storage access protocols are NVMe/RoCE, NVMe/TCP, iSCSI, and KVM vhost-blk.
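A log-structured virtual block layer can be sketched as follows (an illustrative toy, not BlockServer's actual design): a logical overwrite appends a new record to the log and repoints the block mapping, so the backing chunks are still written strictly append-only.

```python
# Toy sketch of a log-structured virtual block layer (illustrative only):
# logical overwrites become physical appends plus a mapping update.

class VirtualBlockDevice:
    def __init__(self):
        self._log = []        # append-only record log
        self._map = {}        # logical block address -> latest log index

    def write_block(self, lba: int, data: bytes):
        self._log.append(data)              # never overwrite in place
        self._map[lba] = len(self._log) - 1

    def read_block(self, lba: int) -> bytes:
        return self._log[self._map[lba]]

dev = VirtualBlockDevice()
dev.write_block(7, b"v1")
dev.write_block(7, b"v2")                   # logical overwrite, physical append
assert dev.read_block(7) == b"v2"
assert len(dev._log) == 2                   # both versions remain in the log
```

In a real engine, stale versions left behind in the log are reclaimed by garbage collection; this sketch omits that step.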

Protocol access layer

BlockServer exposes an NVMe over RoCE/TCP target externally for client access.

BlockDataClient is a private client deployed on KVM compute nodes; it provides the vhost-blk block-storage protocol interface to KVM.

XINFINI The name of XSKY’s new generation of all-flash technology
XSEA The name of XSKY’s new generation all-flash architecture
eXtreme Shared-Everything Architecture Abbreviated "XSEA"; the extremely fast fully shared architecture, i.e., the "XSEA Architecture"
Shared-Everything Architecture Fully shared architecture, a type of distributed system architecture
Shared-Nothing Architecture Shared nothing architecture, a type of distributed system architecture
Persistent Layer (data) persistence layer
Service Layer (data) service layer
Access Layer (Protocol) Access Layer
AppendLog Append-only log writing, an important technique for ensuring data consistency and fault tolerance in distributed systems
QAT Intel Xeon Scalable processor built-in hardware accelerator for compression and decompression operations
NVMe DAS NVMe direct attached storage means that the server uses a local NVMe disk
NVMe Non-Volatile Memory Express, a storage protocol over PCIe
RDMA Remote Direct Memory Access, remote direct memory access network protocol
CE Converged Ethernet, lossless Ethernet network
NVMe-oF Target NVMe over Fabrics storage side
NVMe-oF Initiator NVMe over Fabrics client
RoCE RDMA over Converged Ethernet, which allows Remote Direct Memory Access (RDMA) over Ethernet networks