Working with a dataset

In this section, you will learn how to work with a dataset:

Creating a dataset

You can create a new dataset within a workbook:

  1. Open the workbook to create a dataset in.
  2. In the top-right corner, click Create and select Dataset.

Adding data

You can add data to a dataset from tables available in the source or run an SQL query against the source.

Note

You can use SQL queries for database sources only.

  1. In the dataset, go to the Sources tab.

  2. In the connection panel, click Add.

    screenshot

  3. Select a data source connection.

  4. Add data from the data source:

    Drag the tables you need from the table panel to the workspace. The system will automatically link these tables together based on the first matching field name and data type.

    Warning

    • The maximum number of tables in a single dataset is 32.
    • The maximum number of fields in a single dataset is 1,200.

    Note

    • To use subqueries as a source, in the connection settings, enable SQL query access levelAllow subqueries in datasets when creating or editing a connection.
    • Describing a dataset via a SQL query does not support parameters. However, you can use parameterized queries in QL charts.
    1. In the table panel, click Add.

      screenshot

    2. Enter a Source name and enter the SQL code in the Subquery field.

    3. Click Create.

      Example of SQL query

      A SQL query joins data from samples.MS_SalesFacts, samples.MS_Products, and samples.MS_Shops and groups them on the Store and the Product type fields. A sales total is computed for each group. At the same time, only sales records from 2019 are taken into consideration:

      SELECT 
          t3.ShopName AS "Store",
          t2.ProductCategory AS "Product type",
          COUNT(t1.OrderID) AS "Number of sales",
          SUM(t1.Price*t1.ProductCount) AS "Sales total"
      FROM
          samples.MS_SalesFacts t1
          INNER JOIN samples.MS_Products t2 ON t2.ProductID=t1.ProductID
          INNER JOIN samples.MS_Shops t3 ON t3.ShopID=t1.ShopID
      WHERE
          toYear(t1.OrderDatetime)='2019' -- condition for selecting sales for the specified year (2019)
      GROUP BY "Store", "Product type" -- group by Store and Product type
      ORDER BY "Store", "Product Type" -- sort by Store and Product type
      

Replacing a connection

To replace a connection in a dataset:

  1. In the dataset, go to the Sources tab.

  2. Delete all the tables from the workspace.

  3. In the connection panel, click next to the connection and select Replace connection.

    screenshot

  4. Select another connection.

  5. Drag the tables to be used as a data source for the dataset to the workspace.

  6. Edit or delete the calculated fields which use fields not present in the new table.

  7. In the top-right corner, click Save.

Note

If the data schema in the source changed, update the dataset fields.

You can join data from source tables.

Note

When you merge data from multiple tables, a dataset will get duplicates of the fields used to create links. You can remove them manually from the list of dataset fields. You can also describe the dataset using an SQL query and exclude duplicate fields from the final set.

  1. In the dataset, go to the Sources tab.

  2. Click the table link icon.

    screenshot

  3. Select the link type: inner, left, right, or full.

    screenshot

  4. Select the fields to join the tables on. You can only link fields with the same data type. DataLens gets information about the field's data type directly from the connection. If you need to join tables by fields with different data types, use one of these options:

    • Convert all the fields to the same data type at the source level.

    • Describe the dataset with an SQL query. Use the CAST or CONVERT functions to convert the data type. For example:

      SELECT * FROM lead INNER JOIN user ON lead.assigned_by_id = CONVERT(user.external_id, CHAR)
      
  5. If required, link other fields in the tables. To do this, click Add link.

  6. Optionally, you can disable the Optimize link option to make the link required. In this case, the JOIN operation will be performed even if you select fields from a single table.

  7. Click Apply.

By default, when you add a new table from a data source to the workspace, it is automatically linked to the first table listed there. If required, you can link a new table to a specific table. To do this, hover the new table over a previously added one and release when you see a gray border around the latter. A new link to the specified table is created.

screenshot

You can replace a previously added table. When dragging a new table, hover it over the icon with circular arrows to the right of the table you are replacing for the icon background to turn blue, then release.

screenshot

Note

If the updated table retains the same set of fields with the same names, the field IDs in the dataset will not change. In this case, the charts created based on this dataset will not be disturbed. If the updated table's schema has changed, update the dataset fields.

Creating fields

You can add fields to a dataset from data source tables or create calculated fields using formulas.

  1. In the dataset, go to the Fields tab.

  2. At the top of the screen, click Add field.

    screenshot

  3. Create a field:

    1. Enter a name for the field.

    2. Enter a formula to calculate the field value using data source functions.

    3. Optionally, enable Don't display in wizard to hide the field in the wizard when creating a chart.

    4. Click Create. The field will appear in the dataset.

      screenshot

    1. Enter a name for the field.

    2. Go to the Field from source tab.

    3. Specify the field parameters:

      • Field ID: ID of the field you are creating that is used to provide parameters for filtering by this field on the dashboard or in the chart URL. It may contain lowercase Latin letters, as well as the _ and - characters.
      • Source: Table in the data source with the field you need.
      • Source field: Field from the selected table in the data source.
      • Field type: Field data type.
      • Aggregation: Aggregation function applied to the field.
    4. Optionally, enable Don't display in wizard to hide the field in the wizard when creating a chart.

    5. Click Create. The field will appear in the dataset.

      screenshot

Configuring fields

You can perform the following operations on dataset fields:

  1. Rename a field: To rename a field, click its current name and enter a new one.

  2. Change the source: If the table schema in the source has changed, you can select a new field from the source to match the dataset field.

  3. Change the field data type: Data type determines how to process data and which functions can apply to the field. To change the data type, click the current type and select a new one from the list of available types.

  4. Select an aggregation type: Aggregation type determines the aggregation function to apply to the field. To change the aggregation type, click the current type and select a new one from the list of available aggregation types.

  5. Sort fields: You can sort fields by name, data type, aggregation type, and other parameters. To do this, use the sorting buttons at the top of the Fields table.

  6. Add a description: You can add a comment to a field in the Description column. This can help other users understand what the field is used for or how it is calculated.

  7. Hide fields: You can hide fields you do not need at the moment. Other users will not see these fields in the wizard when creating a chart. To do this, click .

    screenshot

Updating fields

You need to update fields if you changed the schema of a table (or tables) in the data source or manually deleted fields from a dataset.

To update fields in a dataset:

  1. In the dataset, go to the Fields tab.

  2. At the top of the screen, click Update fields.

    • If a field is missing in the dataset but present in the data source, you will see that field in the list.
    • If a field is present in both the dataset and the data source, the field will be updated, e.g., its type will change.
    • If a field is present in the dataset but missing in the data source, that field will be marked in red. Delete it from the dataset or replace it with a different field from the data source.

    screenshot

  3. In the top-right corner, click Save.

Replacing fields

If a field has been removed from a data source, it will not be updated when a dataset is updated. Charts featuring this field will show the ERR.DS_API.FORMULA.UNKNOWN_SOURCE_COLUMN error. You can delete this field from the chart or replace its source in the dataset.

To replace the field source in the dataset:

  1. In the left-hand panel, click Datasets and select the dataset you need.

  2. In the top-left corner, select the Fields tab.

  3. In the dataset, select the field you want to replace.

  4. Go to the Field source column.

  5. In the field settings, select the field you wish to replace the deleted field with from the Field from source list.

    screenshot

  6. Click Save.

Adding parameters

You can use parameters in calculated dataset fields.

Note

Parameters added at the dataset level are available in all charts created based on this dataset.

  1. In the dataset, go to the Parameters tab.

  2. Click Add.

  3. In the Add parameter window, specify:

    • Name: Sets the parameter name.
    • Type: Parameter data type.
    • Default value. This is a required value. It is used when no parameter value is set in the dashboard, chart URL, or chart settings.
  4. Click Add.

You can also create parameters at the chart level.

Adding filters

At the dataset level, you can add default filters that will automatically apply to any new chart created from the current dataset.

  1. In the dataset, go to the Filtering tab.
  2. Click Add.
  3. Select the field to apply the filter to.
  4. Set a filtering rule.
  5. In the bottom-right corner, click Apply filter.

Note

Default filters are not applied to data in the dataset preview area.

Deleting a dataset

Before deleting a dataset, check which objects are using it. To do this, at the top of the dataset, click Linked objects.

To delete a dataset:

  1. At the top of the dataset, click Delete.
  2. Click Delete.

See also