Summary

The Problem

As data becomes more precise and more historical data is retained, data volumes grow beyond what legacy relational databases can handle.

Challenge #1:

Reduced Performance as Dataset Grows

As the amount of data increases and as the business logic changes repeatedly, processing performance gradually decreases, causing problems for the business.

Challenge #2:

Cost Increases as Dataset Grows

Larger datasets require specialized high-performance hardware and advanced middleware, increasing both the initial investment and the ongoing maintenance cost.

Legacy Solutions

Purchase and deploy the latest high-performance specialized hardware and advanced middleware.

Performance is improved, but costs skyrocket.

Rewrite software using the latest techniques (Hadoop, etc.).

Costs are high, and engineers are difficult to recruit and train.

Why is Unicage Fast?

  1. We do not use middleware with huge overhead
    We use only the core functions of the OS, without any database, runtime or middleware. In this respect, UNIX-like OSes such as FreeBSD are excellent, since they have compact kernel code and you can select the required peripheral software from the Ports Collection.
  2. USP Unicage commands have been precisely tuned
    We have developed the commands used in the shell scripts in C, and they control memory and CPU directly. They are extensively tuned, for example by using inline SIMD instructions. For this reason, they are tens of times faster than equivalent commands written in Java. (This is evident from the difference in the size of the compiled assembly code.)
  3. Parallel Processing using Pipelines
    Shell scripts can easily use the "pipe", a hallmark feature of UNIX. Because each command in a pipeline runs as its own process, connecting USP Unicage commands with a pipeline achieves parallel processing, which improves processing speed. In one project for an investment bank, we utilized 95% of the CPU on a 16-core machine to process 30 million records 60 times faster than their legacy system.
  4. ush
    To eliminate the overhead of the shell itself, we have created our own shell called "ush", which is based on "ash". The same shell script runs 1.7 times faster on "ush" than on standard "bash". We continue to improve the "ush" shell, for example by changing the implementation of pipes to "mmap" (kernel memory) with ID passing.
  5. Pompa Technology
    To search large datasets, we employ directory tree division and memory cache control. Our "Pompa Technology" embeds the search key in the path name, enabling two-layer search at the OS level and the Unicage level. Using this technology, we were able to return search results from 10 TB of log data (from a Korean search engine) in less than 0.1 seconds without using expensive appliances.
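
As a rough illustration of item 3, the sketch below chains ordinary POSIX tools (sort and awk) into a pipeline. These are stand-ins, not USP Unicage commands, and the function name sum_by_key and the sample records are hypothetical. It only shows that each stage runs as a separate process, so the OS can schedule the stages on different cores.

```shell
#!/bin/sh
# Illustrative only: standard tools stand in for Unicage commands.
# Input format (assumed): "<key> <amount>" per line.

# Sum the amounts per key. sort and awk run as two concurrent
# processes connected by a pipe.
sum_by_key() {
    sort -k1,1 |
    awk '
        $1 != key { if (NR > 1) print key, total; key = $1; total = 0 }
        { total += $2 }
        END { if (NR > 0) print key, total }
    '
}

# Hypothetical sample records.
printf '%s\n' 'A001 100' 'B002 50' 'A001 25' 'B002 5' | sum_by_key
# prints:
# A001 125
# B002 55
```

Adding a further stage (a filter or a formatter, for example) means adding another process to the pipeline, and the earlier stages do not need to change.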
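
The idea in item 5 of embedding the search key in the path name can be sketched with plain shell functions. This is not the Pompa implementation: the two-character directory prefix, the .log suffix, and the function names put_record and get_records are all assumptions made for illustration, and memory cache control is not shown. Keys are assumed to be safe to use as file names.

```shell
#!/bin/sh
# Illustrative only; assumed layout: <root>/<first 2 chars of key>/<key>.log

# Append one record to the file whose path is derived from the key.
put_record() {   # usage: put_record ROOT KEY TEXT...
    root=$1; key=$2; shift 2
    prefix=$(printf '%s' "$key" | cut -c1-2)
    mkdir -p "$root/$prefix"
    printf '%s\n' "$*" >> "$root/$prefix/$key.log"
}

# Look up a key. The path is computed from the key, so the OS resolves
# it directly and no other files are scanned.
get_records() {  # usage: get_records ROOT KEY
    prefix=$(printf '%s' "$2" | cut -c1-2)
    file="$1/$prefix/$2.log"
    [ -f "$file" ] && cat "$file"
}

# Hypothetical usage with made-up records in a temporary directory.
root=$(mktemp -d)
put_record "$root" user42 "2024-01-01 login"
put_record "$root" user42 "2024-01-02 logout"
put_record "$root" admin7 "2024-01-01 login"
get_records "$root" user42
rm -rf "$root"
```

In this layout the file system does the first layer of the search (resolving the path), and any further filtering only has to read the single file for that key.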