In the digital economy era, applications constantly generate and store large amounts of data, yet other programs often cannot access these data in time, resulting in "data islands". The data lake was born to solve this problem: it helps enterprises break down barriers between business areas, truly integrates data with business, and enables enterprises to use data more efficiently and strengthen their decision-making.
Enterprises pay increasing attention to the value of data, which raises the question of how data should be processed and stored. If each application collects, transmits, and stores data independently, without making the data available to other applications, data islands form easily, and teams become entangled in repetitive, tedious questions such as where data come from and how to access them.
Enterprises therefore urgently need a universal, flexible data platform that reduces repeated data processing across applications and lets teams focus on realizing business requirements. Such a platform should meet the following requirements:
- It should store not only traditional structured data but any other type of data as well, and support further processing and analysis on that basis to fully tap the value of the data.
- It should support both batch data and streaming data, integrating and processing the two before feeding them into the lake (see the ingestion sketch after this list).
- It should provide a sound data security mechanism: encrypted data transmission and storage, together with network and permission security policies, to prevent data leakage and ensure data security (see the encrypted-client sketch after this list).
- From the acquisition points to the data lake, metadata should be captured and managed from the perspectives of data traceability and the data life cycle, according to the sensitivity of the data at each stage (a possible metadata record is sketched after this list).
- It should support open-source technologies including Spark, Flink, Kafka, Flume, HBase, and Hive.
- Its modular, hybrid technology architecture should allow users to adapt performance and capacity flexibly to their business volume.
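As an illustration of how batch and streaming views can be unified at ingestion, the minimal sketch below uses Spark Structured Streaming to read a Kafka topic and land the records in the lake as Parquet files. The broker address, topic name, and storage paths are assumptions for the example, not details from the text.

```python
# Minimal sketch of stream ingestion into a data lake with Spark
# Structured Streaming. Broker, topic, and paths are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("lake-ingestion")
         .getOrCreate())

# Read a continuous stream of events from Kafka.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
          .option("subscribe", "app-events")                 # assumed topic
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)"))

# Land the stream in the lake as Parquet; the same directory can later
# be read back as an ordinary batch DataFrame, so the batch and
# streaming views of the data converge in one place.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://lake/raw/app-events")        # assumed path
         .option("checkpointLocation", "s3a://lake/_chk/app-events")
         .start())

query.awaitTermination()
```

A downstream batch job would read the same directory with `spark.read.parquet("s3a://lake/raw/app-events")`. Running this sketch also requires the Spark-Kafka connector package on the classpath.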
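For the encrypted-transmission requirement, one common pattern is to force TLS between data producers and the ingestion layer. The sketch below does this with the kafka-python client; the broker address, topic, and certificate paths are assumptions.

```python
# Minimal sketch of an encrypted producer connection using kafka-python.
# Broker address and certificate file paths are hypothetical.
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["broker:9093"],      # assumed TLS listener
    security_protocol="SSL",                # encrypt data in transit
    ssl_cafile="/etc/kafka/ca.pem",         # CA that signed the broker cert
    ssl_certfile="/etc/kafka/client.pem",   # client cert for mutual TLS
    ssl_keyfile="/etc/kafka/client.key",    # client private key
)

# Records now travel to the lake's ingestion layer over an encrypted channel.
producer.send("app-events", b"payload")
producer.flush()
```

Encryption at rest and permission policies would be configured on the storage and cluster side (for example, HDFS or object-store encryption and access-control rules) rather than in client code.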
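To make the metadata requirement concrete, the hypothetical record below shows the kind of traceability and life-cycle fields a platform might capture for each dataset on its way into the lake. Every field name here is an illustrative assumption, not a real catalog schema.

```python
# Hypothetical metadata record for lineage and life-cycle management.
# All field names are illustrative assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DatasetMetadata:
    name: str                      # logical dataset name
    source: str                    # acquisition point, e.g. a Kafka topic
    lake_path: str                 # where the data lands in the lake
    sensitivity: str               # e.g. "public", "internal", "restricted"
    ingested_at: datetime          # when the data entered the lake
    retention_days: int            # life-cycle policy for this sensitivity
    upstream: list = field(default_factory=list)  # lineage: parent datasets

meta = DatasetMetadata(
    name="app_events_raw",
    source="kafka://app-events",
    lake_path="s3a://lake/raw/app-events",
    sensitivity="internal",
    ingested_at=datetime.now(timezone.utc),
    retention_days=365,
    upstream=[],
)
```

Tying the retention policy to the sensitivity label is one way such a record supports life-cycle management: restricted data can be expired or re-encrypted on a stricter schedule than public data.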