- 19 Jul 2022
- 6 Minutes to read
- Updated on 19 Jul 2022
The ElastiCube is Sisense’s unique, high-performance analytics database with super-fast data stores that are specifically designed to withstand extensive querying typically required by business intelligence applications.
ElastiCubes allow you to bring in data from multiple sources, and then merge, manipulate and query the data as if it were one consolidated data set. ElastiCubes perform so well that in most cases the creation of dedicated OLAP cubes and/or optimized data marts is completely unnecessary, even when dealing with hundreds of millions of rows of raw data.
One of the biggest advantages of ElastiCubes is the ability to easily mash up multiple data sources. An ElastiCube is made up of fields, where each value in one field has a corresponding value in another field. The data for an ElastiCube can come from one source, multiple sources, or even multiple physical locations. Once the data is inside the ElastiCube, it is all treated the same way, and every field from every table can be analyzed quickly in the context of any other.
ElastiCube technology makes queries over hundreds of millions of rows of raw data return in seconds, with moderate hardware requirements that include standard desktop-class computers with commodity hardware. More importantly, ElastiCubes can do this without pre-aggregating and pre-calculating the data ahead of time and storing it on the hard drive, which radically reduces the required import/processing time and storage space.
ElastiCubes are most useful when one or more of the following is true:
- Large amounts of data need to be analyzed
- Data for analysis originates from multiple disparate sources
ElastiCubes – Technical Overview
Relational databases (RDBMS) like SQL Server, Oracle, MySQL and even Access all store tabular data row-by-row. This structure is best for transactional/operational systems that require large numbers of concurrent insertions. With indexes, it can also provide realistic query response times for row-based queries that do not frequently require aggregations or joining of many tables.
Data analysis often requires aggregation of data as well as merging of data located in multiple disparate tables. When dealing with these types of queries, relational databases reach their limits pretty quickly. The only ways to extend these limits are to put in stronger hardware and to pre-aggregate data, reducing the amount of calculation that occurs in real time.
The ElastiCube Columnar Database
ElastiCube data is held in a Columnar Database Management System (CDBMS) that stores data field-by-field. Each field is individually stored in a memory-mapped file.
When a query is executed over an ElastiCube, only the fields referenced in the query need to be loaded into memory. This leaves enough space to process the query entirely in memory, without any reads or writes to the hard drive, which are the prime reason for poor query performance. Once a field is no longer used, it is removed from memory and the space it consumed is freed.
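The field-by-field layout described above can be illustrated with a minimal Python sketch. It is not Sisense's implementation: the one-file-per-field layout, the `.col` extension, the fixed 8-byte integer encoding, and the function names `write_column` and `sum_column` are all assumptions made for illustration. It shows only the general idea that an aggregation over one field maps and reads that field's file alone.

```python
import mmap
import os
import struct
import tempfile

# Assumed layout for illustration: one file per field, 8-byte signed integers.
ITEM = struct.Struct("<q")

def write_column(directory, name, values):
    """Store a single field as its own file of fixed-width integers."""
    path = os.path.join(directory, name + ".col")
    with open(path, "wb") as f:
        for value in values:
            f.write(ITEM.pack(value))
    return path

def sum_column(path):
    """Memory-map one field file and aggregate it; other fields are never opened."""
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for offset in range(0, len(mapped), ITEM.size):
                (value,) = ITEM.unpack_from(mapped, offset)
                total += value
    return total

with tempfile.TemporaryDirectory() as directory:
    paths = {
        "order_id": write_column(directory, "order_id", [1, 2, 3, 4]),
        "quantity": write_column(directory, "quantity", [5, 1, 2, 2]),
        "price_cents": write_column(directory, "price_cents", [250, 1000, 400, 400]),
    }
    # Only the "quantity" file is mapped for this query.
    total_quantity = sum_column(paths["quantity"])
```

In a row-by-row layout the same query would have to read every field of every row, even though only one field contributes to the result.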
This approach has several advantages:
Query Response Time
Queries over data sets containing millions of rows of data return in seconds even under modest hardware configurations such as desktop computers.
ElastiCubes do not require pre-aggregations and/or the creation of indexes to assure fast query response; therefore, the actual creation of an ElastiCube takes a fraction of the time needed for a data mart or an OLAP cube.
Because pre-aggregations and indexes are not needed to assure fast query response, an ElastiCube is significantly smaller than a data mart or an OLAP cube.
This columnar storage strategy makes the data much more suitable for high levels of compression, without loss of detail or accuracy. This means that less hardware, disk space, and RAM are needed than for an equivalent-sized, traditional Business Intelligence database.
The ElastiCube was written and designed to natively support 64-bit processing. The 64-bit architecture vastly increases the amount of memory the system can address at any given time, far beyond the 4 GB limit of a 32-bit address space. This means you can work with virtually unlimited amounts of data.
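One reason a single field compresses well is that its values tend to repeat. The sketch below uses dictionary encoding, a common technique for columnar data, purely as an illustration. The source does not say which compression scheme the ElastiCube uses, and the function names and sample data here are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup table."""
    table = []   # code -> original value
    index = {}   # original value -> code
    codes = []   # one code per row
    for value in values:
        if value not in index:
            index[value] = len(table)
            table.append(value)
        codes.append(index[value])
    return table, codes

def dictionary_decode(table, codes):
    """Rebuild the original field exactly; nothing is lost."""
    return [table[code] for code in codes]

country = ["US", "US", "DE", "US", "DE", "FR", "US"]
table, codes = dictionary_encode(country)
restored = dictionary_decode(table, codes)
```

Each distinct value is stored once in `table`, and every row holds only a small integer. Decoding restores the field exactly, which matches the "without loss of detail or accuracy" property described above.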
True Multi-User, Multi-Application Architecture
ElastiCubes are not tightly coupled with the application layer of the system. This frees up a single ElastiCube to handle multiple applications and users. Not having to reproduce your data model for every application saves significant time developing and maintaining your dashboards and reports.
Just-In-Time, In-Memory Processing
Smart Cache and Instruction Recycling
CPU cycles and RAM space are the two most precious resources in any computer, and ElastiCube is designed to use both as efficiently and quickly as possible. Using our sophisticated caching algorithm, the data is only loaded into memory when it’s needed. As part of this algorithm, compute- and time-intensive calculations are also intelligently cached to further reduce I/O calls.
Additional sophisticated algorithms further increase Sisense’s performance. Once data is loaded into memory, the main performance bottleneck becomes CPU cache misses that naturally come with random access. The ElastiCube is specifically designed to minimize these cache misses by employing a unique cache-aware algorithm, further increasing Sisense’s performance by an additional order of magnitude.
Many databases compress data to save disk space and RAM. ElastiCube is designed to work directly on this compressed data, so that the need for decompression is virtually eliminated, further increasing ElastiCube’s performance.
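The idea of caching expensive calculations so they are not repeated can be sketched with Python's standard `functools.lru_cache`. This is a generic least-recently-used cache, not Sisense's caching algorithm, and the `SALES` data, the `region_total` function, and the call counter are hypothetical names used only to make the effect visible.

```python
from functools import lru_cache

# Hypothetical in-memory data standing in for a field already loaded from disk.
SALES = {"north": [120, 80, 45], "south": [60, 30], "west": [200]}

# Counts how many times the expensive calculation actually runs.
calls = {"count": 0}

@lru_cache(maxsize=128)
def region_total(region):
    """Compute a total once per region; repeat requests are served from the cache."""
    calls["count"] += 1
    return sum(SALES[region])

first = region_total("north")    # computed
second = region_total("north")   # returned from the cache, no recomputation
other = region_total("south")    # a different argument is computed separately
```

After these three requests the calculation has run only twice, because the repeated request for `"north"` was answered from the cache.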
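To make "working directly on compressed data" concrete, the sketch below aggregates a run-length encoded field without expanding it back to one value per row. Run-length encoding is chosen here only because it is simple to show. The source does not state which encoding the ElastiCube uses, and all names and data below are hypothetical.

```python
def rle_encode(values):
    """Compress a field into (value, run_length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, count) for value, count in runs]

def rle_sum(runs):
    """Sum the field from its runs: one multiplication per run, not one add per row."""
    return sum(value * count for value, count in runs)

def rle_count_equal(runs, target):
    """Count rows equal to target without decompressing."""
    return sum(count for value, count in runs if value == target)

discount = [0, 0, 0, 5, 5, 0, 0, 10, 10, 10]
runs = rle_encode(discount)
total_discount = rle_sum(runs)
rows_without_discount = rle_count_equal(runs, 0)
```

The ten rows collapse into four runs, and both aggregates are computed from those four pairs alone.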
Designed with Standard Hardware in Mind
Just about every new computer on the market, even portable devices like iPhones and iPads, is built with a very powerful multi-core processor that puts several CPU cores on one chip. ElastiCube was built specifically to take advantage of these powerful CPUs, further increasing Sisense’s performance on standard hardware and enabling you to run multiple applications and support multiple users.
The thing we know for sure about DBs is that they grow. Fast. So no matter how much fancy footwork is done with completely in-memory DBs, eventually you run out of RAM space and need to upgrade—at least your RAM (best case) or your entire hardware platform (worst, very expensive, case). At Sisense we know this, so we spent years designing the ElastiCube to be able to handle terabytes—billions of rows—of data efficiently and quickly, even on standard PC hardware.
Unified Analytics Engine
Sisense can execute queries against a wide variety of data sources as if they were all of the same type, essentially making the individual characteristics of each physical data source unimportant. Our Unified Analytics Engine is what makes this possible.
When Sisense imports data, the Unified Analytics Engine creates a metadata layer, or abstraction layer, which is then used to formulate queries across any number of tables from any number of data sources in any number of formats. It even supports the combined querying of resident and external (live) database sources without first loading data into the database!
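A metadata layer of this kind can be pictured as a catalog that hides where each table comes from. The sketch below is a simplified illustration, not the Unified Analytics Engine: the `Catalog` class, the loader functions, and the sample tables are hypothetical, and the two "sources" are a CSV string and an in-memory SQLite database from Python's standard library.

```python
import csv
import io
import sqlite3

def rows_from_csv(text):
    """Load a CSV source as a list of field-name -> value dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]

def rows_from_sqlite(conn, table):
    """Load a SQLite table in the same shape (table name is trusted here)."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]

class Catalog:
    """Hypothetical metadata layer: logical table name -> loader function."""
    def __init__(self):
        self._loaders = {}

    def register(self, name, loader):
        self._loaders[name] = loader

    def rows(self, name):
        return self._loaders[name]()

def join(left, right, key):
    """Join two tables on a shared field, regardless of their original source."""
    index = {}
    for row in right:
        # Keys are compared as strings because CSV values arrive as text.
        index.setdefault(str(row[key]), []).append(row)
    return [{**l, **r} for l in left for r in index.get(str(l[key]), [])]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])
orders_csv = "order_id,customer_id,amount\n10,1,25\n11,2,40\n12,1,5\n"

catalog = Catalog()
catalog.register("orders", lambda: rows_from_csv(orders_csv))
catalog.register("customers", lambda: rows_from_sqlite(conn, "customers"))

joined = join(catalog.rows("orders"), catalog.rows("customers"), "customer_id")
totals = {}
for row in joined:
    totals[row["name"]] = totals.get(row["name"], 0) + int(row["amount"])
```

The query code only ever sees logical table names and field names, so the join and the aggregation are written the same way whichever source each table came from.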
These capabilities provide the user with unparalleled flexibility and speed in creating, executing and sharing highly complex reports, dashboards, and analytic applications, with any number and variety of data sources.
Compliant with Industry Standards
Supports SQL-92 Standard
Even with all this advanced technology, Sisense knew that none of it would be any good if users couldn’t access their existing data. So, we built an SQL layer into the system, which allows users to integrate Sisense with external applications without needing to learn new scripting languages.
Seamless Integration with Existing Data Sources
Got an ODBC/OleDB compliant DB today? Great, we built in the ability to access those, too. ElastiCube will seamlessly connect to those data sources so, again, there is no need to learn a new language or write special code to connect to your existing data. With ElastiCube there’s no need to start over; your analysis just gets faster, easier, and more scalable, with minimal need for IT.