Preparing Your Excel Files
Applicable to Sisense on Linux and Microsoft Windows
Use the following guidelines to prepare and verify Excel files before connecting them with the ElastiCube.
Learn more about Sisense's Excel connector.
Structure Your Excel Sheet
All Excel data must be imported into Sisense in a tabular format, to allow for accurate data categorization.
Characteristics of Tabular Data include:
- Each sheet represents a class, for example customers, sales, employees.
- Each column represents a variable, for example revenue, name, address.
- Each row represents a single observation, for example customer X’s details.
Remove Blank Rows and Duplicate Headers
Remove blank rows OR rows after the first column that have duplicate headers. Blank rows may return incorrect calculations and results.
Blank Rows: Delete rows 4 and 8 if they contain no data.
Additional Headers: Delete row 13 as it has the same titles as the main headers. If this heading represented a different class of data, a new sheet should be created.
Remove All Unstructured Data
Remove all unstructured data from each sheet, such as charts, images or non-data elements. Unstructured data cannot be imported, and may result in other data being obscured.
Images, Charts and Non-Data Elements: All non-data elements must be removed from the sheet, such as the chart in the image above. All visualization based on Excel data can be created in a dashboard using Sisense.
Rename/Remove Duplicate Column Names
Rename or remove columns with the same name. Ensure all columns have a unique name. Columns with the same name cannot be imported into the ElastiCube.
Column Names: Rename columns with the same name, and ensure all columns have a unique name. For example, in the above example ‘CompanyName’ appears twice. Once column should be given a different name such as ‘Secondary CompanyName’. In the ElastiCube, you can rename any column as you wish.
Modify Columns for Quicker Analysis
Consolidate columns as much as possible into a single variable. For example, instead of having a column for each month of the year, it is best to have a single column with all dates. Unconsolidated columns limit the ability to analyze and segment data.
Consolidate Columns: Data should be consolidated into a single column for each variable. In the example above instead of having revenue reflected separately as a column for each month, it is best to consolidate the data into a single column for date and another for revenue. This logic should be applied to any other segments such as summary by country, or channel, and so forth.
Additional Methods and Tools for Consolidating (Transposing) Columns
To consolidate or transpose columns in your files, you can use one or both of the following methods:
- Use an Excel macro to transpose data. This is useful when your data is already in Excel or easy to put into Excel.
- Create custom SQL. This method should be used if importing your data into Excel is more difficult, or the amount of data is causing poor performance in Excel. This method requires basic SQL knowledge.
To learn more about both of the above methods, and to download the macro, click here.
Convert Scientific Formatting to Numeric Formatting
Convert cells with scientific formatting to another number format. Data in scientific format cannot be imported correctly into the ElastiCube.
Scientific Format: Convert all values in scientific format to another number format such as number, currency, accounting, fraction and so forth.
Create Identical Files when Appending Several Files
When appending data from several Excel files together, the data in the files must have the same column structure and data types, and the sheets you want to append must all have the same name.
For example, the three files below have the same structure and data types, but different sheet names (Jan, Feb, March). Sisense cannot append the data in these files, because it does not recognize the files as identical.
However, if you rename the sheets so they all have the same name (Sheet1), Sisense recognizes these files as identical, and can successfully append the data in these files.
When importing an Excel spreadsheet with the XLSX format, if you get the error: "Unable to connect to the specified file. Corrupt OpenXML document.", try saving the file as XLS and importing it again. If this works, keep in mind that in an XLS workbook, the row limit is 65,536 (216) and 256 columns (28) which correspond to the column IV. With XLSX workbooks, limits are 1,048,576 rows (220) and 16,384 columns (214).