Connecting to SparkSQL

The Sisense SparkSQL connector is a certified connector that allows you to import data from the Apache SparkSQL API into Sisense via the Sisense generic JDBC connector. The SparkSQL connector offers the most natural way to connect to Apache Spark data and provides additional powerful features.

Support for the connector is provided by Sisense, with assistance from the certification partner's support team if needed. For any support issues or additional functionality requests, contact your Sisense representative or open a request through the Sisense [Help Center](https://sisensesupport.zendesk.com/agent/dashboard). For advanced inquiries specific to driver functionality, you can also contact the certification partner's support directly at support@cdata.com.

After you have downloaded the driver, you can connect through a connection string in Sisense. The connection string is used to authenticate users who connect to the SparkSQL APIs. Once you have connected to SparkSQL, you can import a variety of tables from the SparkSQL API.

This page describes how to download and deploy the SparkSQL driver, how to connect to SparkSQL with a connection string, how to add SparkSQL tables to your ElastiCube, and where to find information about the SparkSQL data model.

Note:

For the list of supported connectors, see Data Source Connectors.

Downloading the SparkSQL JDBC Driver

You can download the SparkSQL JDBC driver here.

For a short video about downloading the driver, see below (the video uses the Box driver as an example).

Note:

  • The driver is certified for Sisense v7.2 and above.
  • Sisense v7.4 and above: Click the above link to download a ready-to-use driver.
  • Sisense prior to v7.4: Click the above link to download a 30-day free trial of the driver. Contact Sisense for the fully licensed version.

Deploying the SparkSQL JDBC Driver

Prerequisite: The installation file (setup.jar) is a Java application that requires Java 6 (J2SE) or above to run.

To install the driver, double-click the setup.jar file and proceed with the instructions in the installation wizard.

Depending on the machine on which you are accessing the Sisense application, install the driver in one of the following locations:

  • When Sisense is installed on your local machine, deploy the driver locally.
  • For a non-local installation (when accessing Sisense on a remote Windows server, or accessing the Sisense hosted cloud environment), select one of the below methods:

    • Deploy the driver on the Sisense server machine, and then perform all the authentication on the server machine.
    • Deploy the driver on your local machine (or any other machine, as convenient), perform all the authentication on that machine, and then copy the JAR file to the remote server.

      For detailed instructions, see Copying a CData JAR File Installed Locally to a Remote Server.

  • If you are on a Linux deployment, deploy the driver on your local machine (or any other machine), perform all the authentication on that machine, and then copy the JAR file to this location:

    /opt/sisense/storage/connectors/jdbcdrivers/driver_name_folder

    For detailed instructions, see Copying a CData JAR File Installed Locally to a Remote Server.

Note:

The default location of the JAR file is: C:\Program Files\CData\CData JDBC Driver for <Driver Name> 2019\lib.
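
The copy step described above can be sketched in Python. This is a minimal illustration, not a Sisense tool: the function name copy_driver_jar is hypothetical, the JAR file name cdata.jdbc.sparksql.jar is an assumed stand-in for the file in your lib directory, and the sketch assumes the target folder is reachable as an ordinary file system path. The demo runs against temporary folders so that it does not touch a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def copy_driver_jar(jar_path, target_dir):
    """Copy a driver JAR into target_dir and return the path of the copy.

    Hypothetical helper: it only performs a plain file copy and creates
    the target folder when it does not exist yet.
    """
    jar = Path(jar_path)
    if jar.suffix.lower() != ".jar" or not jar.is_file():
        raise FileNotFoundError(f"not a JAR file: {jar}")
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves the file's timestamps
    return Path(shutil.copy2(jar, target / jar.name))


# Demo with temporary folders standing in for the real locations.
with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for ...\CData JDBC Driver for <Driver Name> 2019\lib
    source = Path(tmp) / "lib" / "cdata.jdbc.sparksql.jar"  # assumed file name
    source.parent.mkdir()
    source.write_bytes(b"placeholder")
    # Stand-in for /opt/sisense/storage/connectors/jdbcdrivers/driver_name_folder
    destination = Path(tmp) / "jdbcdrivers" / "driver_name_folder"
    copied = copy_driver_jar(source, destination)
    copied_ok = copied.is_file()
    print(copied.name)
```

On a real deployment, you would pass the actual lib path and the jdbcdrivers folder instead of the temporary folders. The page Copying a CData JAR File Installed Locally to a Remote Server remains the authoritative procedure.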


For a short video of the process, see below (the video uses the Box driver as an example).

Java Troubleshooting

If you do not have Java 6 installed, you may download it from here.

If your system is not set up to run Java applications, execute the following command: java -jar setup.jar.
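
To check whether the installed Java meets the Java 6 prerequisite, you can inspect the output of java -version. The following Python sketch shows one way to read the major version from that output. The function names are hypothetical, and the sketch assumes the usual banner format in which the version appears in double quotes (releases before Java 9 report 1.x, so 1.6 means Java 6).

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no quoted version string found")
    first = int(match.group(1))
    second = match.group(2)
    # Releases before Java 9 use the 1.x scheme, for example 1.6.0_45 for Java 6
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and parse its banner (printed to stderr)."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return java_major_version(result.stderr)


# Parsing examples on sample banners (no Java installation needed for these)
print(java_major_version('java version "1.6.0_45"'))
print(java_major_version('openjdk version "11.0.2" 2019-01-15'))
```

Calling installed_java_major() on the machine where you plan to run setup.jar returns a number that you can compare against 6. If Java is not on the PATH, the call raises FileNotFoundError.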

Connecting to SparkSQL

Sisense uses connection strings to connect to SparkSQL and import data into Sisense. Each connection string contains authentication parameters that the data source uses to verify your identity and to determine which information you can export to Sisense.

To create the connection string:

  1. Open the lib directory for the connector. The default path is: C:\Program Files\CData\CData JDBC Driver for <Driver Name> 2019\lib

  2. Double-click the JAR file in the lib directory.

    Alternatively, to open the JAR file from the command line, enter the following command in the command prompt (change the driver name to your driver):

    cd C:\Program Files\CData\CData JDBC Driver for <Driver Name> 2019\lib

    Press Enter and then enter the following command (change the driver name to your driver):

    "C:\Program Files\Sisense\infra\jre\bin\java.exe" -jar cdata.jdbc.<Driver Name>.jar

    Press Enter again.

    The Connection String Builder opens.

  3. Enter the values for the following connection properties (click in the Value column to enter a value or to modify an existing value):

    • Server: Set this to the host name or IP address of the server hosting the SparkSQL database.
    • User: Set this to the username used to authenticate with SparkSQL.
    • Database: Set this to the name of the SparkSQL database.
    • Password: Set this to the password used to authenticate with SparkSQL.
  4. If the Connection String Builder has an InitiateOAuth property, set it to OFF to skip the OAuth authorization process.

    Note:

    This property may not appear for some connectors.

  5. Press Enter to add all the connection properties to the connection string.

    Example:

    jdbc:sparksql:Server=127.0.0.1;

  6. Click Test Connection. A new browser tab opens where you need to log in to your application in order to grant access. (Each application will display a different window and messages.)

    Close the Authorization Successful! message that opens.

  7. Go back to the Connection String Builder dialog, and click OK in the Test Connection Successful message to close it.

  8. Click Copy to Clipboard to obtain the connection string.

For a short video of the process, see below (the video uses the XML driver as an example):

You need to complete the above instructions only the first time you connect, and again whenever your credentials for the application change.

For help with creating a connection string and testing the connection, see Connection String Builder for Certified Connectors.

If you have any issues connecting to your data source, see Troubleshooting JDBC Data Connectors.
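
The Connection String Builder assembles the string for you. To illustrate the format it produces, here is a minimal Python sketch that joins the properties from step 3 into a string of the same shape as the example in step 5. The function name build_connection_string and the sample values are hypothetical, and the sketch assumes that values contain no semicolons or equals signs, because the driver's escaping rules are not covered on this page.

```python
def build_connection_string(properties, prefix="jdbc:sparksql:"):
    """Join property names and values into a 'Name=Value;' connection string."""
    parts = []
    for name, value in properties.items():
        # Leave out properties that were not filled in
        if value is None or value == "":
            continue
        value = str(value)
        # Assumption: no escaping is attempted, so reject delimiter characters
        if ";" in value or "=" in value:
            raise ValueError(f"unsupported character in value for {name}")
        parts.append(f"{name}={value};")
    return prefix + "".join(parts)


# Sample values only; replace them with your own server details
connection_string = build_connection_string(
    {
        "Server": "127.0.0.1",
        "Database": "default",
        "User": "analyst",
        "Password": "example-password",
    }
)
print(connection_string)
```

With only the Server property set, the function returns jdbc:sparksql:Server=127.0.0.1; which matches the example in step 5.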

Adding SparkSQL Tables to your ElastiCube

  1. Open Sisense. (For a non-local installation, open Sisense on the hosted cloud environment.)
  2. In the Data page, open an ElastiCube or create a new ElastiCube.

  3. In the Model Editor, click the button for adding data. The Add Data dialog box is displayed.

  4. Click Generic JDBC to open the JDBC settings.

  5. In Connection String, paste the string you obtained above.
  6. In JDBC JARs Folder, enter the name of the directory where the SparkSQL JAR file is located (see Deploying the SparkSQL JDBC Driver).
  7. In Driver's Class Name, enter the following class name: cdata.jdbc.sparksql.SparkSQLDriver.
  8. If you wish to secure the connection, enter your SparkSQL credentials in User Name and Password and remove the relevant properties from the connection string. Otherwise, leave these fields blank.
  9. Click Next. A new window lists all tables and views associated with the database.
  10. From the Tables list, select the relevant table or view you want to work with. You can click next to the relevant table or click Preview to see a preview of the data inside it.
  11. (Optional) Click + to customize the data you want to import with SQL. See Importing Data with Custom Queries for more information.
  12. After you have selected all the relevant tables, click Done. The tables are added to your data model.
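
Step 8 asks you to remove the credential properties from the connection string when you enter them in User Name and Password instead. The following Python sketch shows one way to do that split. The function name split_credentials is hypothetical, and the sketch assumes that the properties are named User and Password as in the Connection String Builder, that names can be matched case-insensitively, and that values contain no semicolons.

```python
def split_credentials(connection_string, secret_names=("User", "Password")):
    """Return (string without credentials, dict of removed credentials)."""
    prefix = "jdbc:sparksql:"
    if not connection_string.startswith(prefix):
        raise ValueError("expected a jdbc:sparksql: connection string")
    wanted = {name.lower() for name in secret_names}
    kept = []
    removed = {}
    for item in connection_string[len(prefix):].split(";"):
        if not item.strip():
            continue  # skip the empty piece after the final semicolon
        name, _, value = item.partition("=")
        name = name.strip()
        if name.lower() in wanted:
            removed[name] = value
        else:
            kept.append(f"{name}={value};")
    return prefix + "".join(kept), removed


# Sample string only; the values are placeholders
cleaned, credentials = split_credentials(
    "jdbc:sparksql:Server=127.0.0.1;User=analyst;Password=example-password;Database=default;"
)
print(cleaned)
```

Paste the cleaned string into Connection String and type the removed values into User Name and Password.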

For a short video of the process, see below (the video uses the XML driver as an example):

SparkSQL Connector: Additional Resources

For the full documentation set for the SparkSQL connector, click here.

For connection string options, click here.

For information about the SparkSQL data model, click here.