JDBC Federator
Applicable to Sisense on Linux
JDBC Federator is a data connector that can combine multiple single-tenant databases or schemas into a single multi-tenant ElastiCube. This is useful, for example, for performing cross-analyses between different tenants. If you have multiple clients with their own database schemas, creating an ElastiCube for each will be difficult to manage, and writing a Union statement isn't scalable. Using JDBC Federator solves these problems.
JDBC Federator is built on the JVM Connector framework and is conceptually similar to the Generic JDBC connector. For JDBC Federator to work, it requires a JDBC driver for the databases you want to connect to. You can download JDBC drivers from here.
Support for JDBC Federator is available in Sisense for Linux deployments, version L8.0.3.148 and later.
This document contains the following main sections:
Setting Up JDBC Federator
Setting up the JDBC Federator requires several steps:
- Editing the description.json File
- Editing the usedConnectors.json File
- Setting Up the jdbcdrivers and tenants Folders
- Restarting the Connector Pod
- Configuring the tenantdetails.csv File
- Configuring the jdbc_queries.properties File
- Setting Up Incremental Load
- Setting the Federator Connection URL
Editing the description.json File
To edit the description.json file:
- In Sisense, click Admin and navigate to System Management > File Management.
Note: On your local Linux machine, this maps to /opt/sisense/storage.
Important: Do not add or remove any files or folders from the root directory. - Navigate to connectors > federator > description.json.
- Append the following two lines to the payload section:
"additionalClassPaths": ["/opt/sisense/storage/connectors/federator/jdbcdrivers/*"], "jvmParameters" : ["-Xmx8G"]
Note: Set the RAM allocation (-Xmx8G) in GB appropriately. For example, 4G, 8G, or 12G.
The description.json file is now as follows:
{
"fileVersion": "config://sisense/connector/federator#1",
"configurationType": "json",
"payload": {
"provider": "federator",
"displayName": "Federator",
"version": "1.1.0",
"category": "Database",
"jdbcClass": "com.sisense.connectors.jdbc.JayDBCQueryFederator",
"isAccumulativeSupported": true,
"isWebECMSupported": true,
"startAsUser": false,
"additionalClassPaths": ["/opt/sisense/storage/connectors/federator/jdbcdrivers/*"],
"jvmParameters" : ["-Xmx8G"]
}
}
Editing the usedConnectors.json File
To edit the usedConnectors.json file:
- In Sisense, click Admin and navigate to System Management > File Management.
- Navigate to connectors > usedConnectors.json.
- Append “federator” to both of the arrays:
"availableConnectors": [..., ”federator”]
"displayConnectors": [..., ”federator”]
The usedConnectors.json file should now look like this:
{
"customConnectors":true,
"availableConnectors":[
"athena",
"EC2EC",
"CSV",
"GoogleAnalytics",
"GenericJDBC",
"Google Spreadsheets",
"sql",
"Excel",
"GoogleAdwords",
"SalesforceJDBC"
”federator”
],
"displayConnectors":[
"athena",
"EC2EC",
"CSV",
"GoogleAnalytics",
"GenericJDBC",
"Google Spreadsheets",
"sql",
"Excel",
"GoogleAdwords",
"SalesforceJDBC"
”federator”
],
"externalConnectorsPath":"/opt/sisense/storage/connectors"
}
Setting Up the jdbcdrivers and tenants Folders
To set up the jdbcdrivers and tenants folders:
- Download the following files:
- Save the downloaded files to System Management > File Management > connectors > federator.
- In that location, create two new folders:
- tenants
- jdbcdrivers
- Add the relevant JDBC driver JAR file to the jdbcdrivers folder.
Restarting the Connector Pod
Restart the Kubernetes Connector Pod. Note that a restart is required when:
- You updated the descriptor.json file
- You add or update the jdbc driver JAR file
You don't have to restart the Connector Pod when you update the tenantdetails.csv and jdbc_query.properties files (next steps).
Configuring the tenantdetails.csv File
The tenantdetails.csv file lists the details for each tenant and is saved here:
/opt/sisense/storage/connectors/federator/tenantdetails.csv.
The CSV file must contain at least one tenant column, which typically maps to your database schemas. The tenant column must contain unique values.
If your tenants are located across multiple database servers with different usernames and passwords, you can add additional columns for each.
You can add as many columns as you like, based on the variables across tenants.
The following is an example of a tenantdetails.csv file:
Configuring the jdbc_queries.properties File
- Navigate to System Management > File Management > connectors > federator.
- Open the jdbc_query.properties file for editing.
- In the table.names property, add a comma-delimited list of tables.
- For each value in the table.names list, add a corresponding entry for its query. In the example, below, there's a table called table.names=Customer,Invoice and so there are corresponding query entries for Customer.query and Invoice.query.
Notes:- The table names are logical names and don't need to match the database names
- Don't leave any spaces in the tables.names list
Any value from the tenantdetails.csv file is reverenced using the <column name> format. For example: <tenant>
- Make sure that the details for each of the properties are correct.
# tenant csv file location
tenant.config.file=/opt/sisense/storage/connectors/federator/tenantdetails.csv
# folder where the incremental state of the tables is stored
#tenant.incr.folder=/opt/sisense/storage/connectors/federator/tenants
# table names as a comma delimited list, do NOT leave spaces
table.names=Customer,Invoice
# sql queries for each table
Customer.query=SELECT '<tenant>' as "Tenant", * from <tenant>.Customer
Invoice.query=SELECT '<tenant>' as "Tenant", * from <tenant>.Invoice
# column names for incremental loads (optional)
#xxxx.increment.col=CreatedDate
#yyyy.increment.col=ID
# misc settings
incr.date.format=yyyy-MM-dd HH:mm:ss.SSS
fault.tolerance=Y
connection.test.query=SELECT 1
#For DB2
#connection.test.query=SELECT * from SYSIBM.SYSDUMMY1
#For PostGres
#record.fetch.size=100000
Setting Up Incremental Load
When you run a build, the ElastiCube reloads the entire data set. If you have a lot of data, it could take a long time to load. An incremental load, however, only loads new data. This configuration instructs JDBC Federator to look at specific columns (such as CustomerID or InvoiceDate) to determine what the latest data is, and to only load that data.
To set up incremental load:
- Navigate to System Management > File Management > connectors > federator.
- Open the jdbc_query.properties file for editing.
- Specify the incremental columns for each table you want to load incrementally. For example:
customer.increment.col=CustomerID
invoice.increment.col=InvoiceDate - In Sisense, in the ElastiCube, set the table build behavior to Append.
Setting the Federator Connection URL
Federator appears in the list as an available connector.
Set the connection URL for Federator as follows:
JDBC URL|ConfigFile=<path>/jdbc_query.properties
For example:
jdbc:redshift://redshift.aws.com/sales|ConfigFile=/opt/sisense/storage/connectors/federator/jdbc_query.properties
Note: The connection URL can contain dynamic elements from the tenantdetails.csv file.
For example:
jdbc:redshift://redshift.aws.com/<tenant>|ConfigFile=/opt/sisense/storage/connectors/federator/jdbc_query.properties
Troubleshooting
Connection Error - Could Not Connect to DB. java.lang.ClassNotFoundException:
Check the following
- The JDBC driver JAR file for the database is uploaded to the jdbcdrivers folder
- additionalClassPaths entry is added to the descriptor.json file
- The connector pod was restarted after the settings were made
Can connect but don't see any tables.
Check the following:
- The ConfigFile property is appended to the JDBC URL
- The tenantdetails.csv file and the jdbc_query.properties file exist
- The SQL queries in jdbc_query.properties are valid.
- Detailed error message can be found on the logs for the connector pod
Viewing Logs
Logs are located in:
/var/log/sisense/sisense/<connector pod logs>
You can also check the logs within the connector pod:
kubectl -n sisense logs <connector pod name> --since=5m
