Step-by-Step Analytic Model Creation and Usage

Follow these simple steps to create a Self-Service BI 'Analytic Model' from your data in minutes. Kyubit analytic models are designed to require minimal user interaction and to be easy to comprehend and use.


Create Model from CSV File

Data stored in a CSV file, delimited by semicolons (;), can be quickly uploaded to the Kyubit application and is immediately ready for the 'Data Definition' step.
Numeric and date values in the CSV file should follow the Kyubit BI server's current regional settings for the decimal separator and the date format.
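As an illustration, the snippet below parses a small semicolon-delimited sample the way the upload expects it. The column names and values are hypothetical; the decimal separator ("." here) and date format must be adjusted to match the Kyubit BI server's regional settings.

```python
import csv
import io

# Hypothetical semicolon-delimited content. "." is used as the decimal
# separator and an ISO-style date format is assumed; both must match the
# Kyubit BI server's regional settings.
csv_content = """OrderDate;Country;Product;Amount
2023-01-15;Germany;Widget;1250.50
2023-01-16;France;Gadget;980.00
"""

rows = list(csv.reader(io.StringIO(csv_content), delimiter=";"))
header, data = rows[0], rows[1:]
print(header)     # column names become the model's columns
print(len(data))  # number of data rows
```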



Create Model from Query Results

If the data for the analytic model comes from an existing relational database, create a SQL query that will be used to retrieve the data for the new model. Any valid SQL query can be run against the SQL Server or ODBC data sources registered in the Kyubit application.
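The sketch below shows the shape of a typical model query: one flat row set with numeric, text, and date columns. SQLite stands in for a registered data source purely for illustration, and the table and column names are hypothetical; in practice the query is run against your SQL Server or ODBC source.

```python
import sqlite3

# In-memory stand-in for a registered data source (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderDate TEXT, Country TEXT, Product TEXT, Amount REAL)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", [
    ("2023-01-15", "Germany", "Widget", 1250.50),
    ("2023-01-16", "France", "Gadget", 980.00),
])

# A flat SELECT like this is the row set the 'Data Definition' step consumes.
query = """
SELECT OrderDate, Country, Product, SUM(Amount) AS Amount
FROM Orders
GROUP BY OrderDate, Country, Product
"""
result = conn.execute(query).fetchall()
print(len(result))  # one row per date/country/product combination
```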



Data Definition

When the source for the new analytic model is defined (CSV file or query), the next step is to define the role of each column of the provided data set on the 'Data' tab. The possible roles for a column are 'Measure', 'Dimension', 'Date' and 'Details'.

Measure

A measure is a value from the dataset that is aggregated while analyzing the analytic model, typically using Sum, Avg, Count, and other aggregations. In a Kyubit Analytic Model, a measure must contain numeric values.

Dimension

Dimension values are used to slice and analyze measure values in the required context. Analytic results are based on measures and on dimension members placed on categories and series.

Date

A date column can be used as a filter in the analysis. Because analyzing business data usually involves filtering by a certain period, this is an extremely useful element of analysis. Date columns cannot be used on analysis categories and series.

Details

Some information in the dataset is not good analytic material for analysis categories and series and should only be available when the details of aggregated data are requested. For example, information such as 'address' and 'phone number' is rarely useful for aggregation, but is still worth showing as 'Details' on aggregated results. Setting such columns as 'Details' ensures a more compact analytic model, faster processing, and a smaller size on disk.
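The four column roles can be pictured with a small in-memory dataset. In this hypothetical sketch, 'Amount' plays the Measure role (aggregated with Sum), 'Country' is a Dimension on the category axis, 'OrderDate' is a Date used only as a filter, and 'Address' is a Details column that surfaces only when drilling to row-level data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: Amount -> Measure, Country -> Dimension,
# OrderDate -> Date, Address -> Details.
rows = [
    {"OrderDate": date(2023, 1, 15), "Country": "Germany", "Amount": 1250.50, "Address": "Main St 1"},
    {"OrderDate": date(2023, 1, 16), "Country": "Germany", "Amount": 300.00,  "Address": "Main St 2"},
    {"OrderDate": date(2023, 2, 1),  "Country": "France",  "Amount": 980.00,  "Address": "Rue A 5"},
]

# Date column used only as a filter (January 2023 here), never on categories.
january = [r for r in rows if r["OrderDate"].month == 1]

# Measure aggregated (Sum) over a Dimension placed on the category axis.
total_by_country = defaultdict(float)
for r in january:
    total_by_country[r["Country"]] += r["Amount"]

# 'Details' columns are shown only when row-level details are requested.
details = [r["Address"] for r in january if r["Country"] == "Germany"]
```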




Column Caption and Description

While defining data for the analytic model, you can optionally click on a column name and set its caption (if it should differ from the source) and column description.

Format Values

Measure values should be provided as plain numeric values. To present a measure as a formatted numeric value (currency, for example), click 'Format values' below the column name and choose one of the formatting options or write your own.
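The idea is that the model stores the raw number and the format string only changes presentation. Python's format mini-language is used below purely to illustrate this; Kyubit's own format-string syntax may differ.

```python
# The raw numeric value is what the model stores and aggregates.
raw_value = 1250.5

# A currency-style format affects only how the value is displayed.
formatted = "${:,.2f}".format(raw_value)
print(formatted)
```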

Organize Structures Additionally

When each column definition is complete, you can proceed immediately to processing the analytic model, but the third tab, 'Structure', offers additional options for organizing the analytic model's structures. Instead of leaving all dimension levels in the 'Default dimension', you can create new dimensions and organize the levels appropriately to make them more comprehensible for end users.

New Dimension

Create a new dimension based on a topic (for example, 'Customer' or 'Product') and assign the appropriate analytic levels to it. This way you group analytic levels around a certain topic and make the analysis more comprehensible.

New Hierarchy

When analytic levels are related to each other in a parent-child relationship, it is practical to organize them into 'Hierarchies', for example Year-Month-Day or Continent-Country-City. This makes it easy for end users to drill down from a higher-level view to more detailed values.
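Drilling down a hierarchy can be sketched as expanding a parent level into its child members. The example below uses a hypothetical Year-Month hierarchy over a few sample amounts.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (date, amount) facts; Year and Month form a hierarchy.
rows = [
    (date(2023, 1, 15), 1250.50),
    (date(2023, 1, 16), 300.00),
    (date(2023, 2, 1), 980.00),
]

by_year = defaultdict(float)        # parent level: Year
by_year_month = defaultdict(float)  # child level: Month within Year
for d, amount in rows:
    by_year[d.year] += amount
    by_year_month[(d.year, d.month)] += amount

# Drill-down from the 2023 total to its monthly members:
months_2023 = {m: v for (y, m), v in by_year_month.items() if y == 2023}
```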




Processing of Analytic Model

While preparing the analytic model, a user can save and reopen the same analytic model many times while it is in 'Unprocessed' status. When all data and structures are prepared, click the 'Process Analytic Model' button on the 'Structure' tab to start processing the analytic model's data and make it ready for analysis and visualizations. Processing can take from a few seconds to several minutes (or more), depending on the number of rows and columns defined for the analytic model. Set all columns that do not need to be a 'Dimension' to the 'Details' structure type, which speeds up processing and saves space in the internal 'Analytic Model' database.

Analytic Model Status

There are three analytic model statuses: 'Unprocessed', 'Processing' and 'Processed'. While the analytic model is processing, it cannot be used by any user. If processing fails for any reason, the model reverts to 'Unprocessed' status.
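The status flow described above can be summarized as a small state table. This is only a sketch mirroring the documented behavior, not Kyubit's internal implementation.

```python
# Documented status flow: processing either succeeds ('Processed')
# or fails and reverts the model to 'Unprocessed'.
TRANSITIONS = {
    ("Unprocessed", "process"): "Processing",
    ("Processing", "success"): "Processed",
    ("Processing", "failure"): "Unprocessed",
}

def next_status(status: str, event: str) -> str:
    """Return the status the model moves to after the given event."""
    return TRANSITIONS[(status, event)]
```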

Log

After processing the analytic model, the processing details can be inspected by clicking the 'Log' button on the 'General' tab. If processing of the analytic model has failed, this is a good place to start troubleshooting the possible cause of the problem.

Schedule model updates

The analytic model can be updated automatically with 'Scheduled Jobs'. Go to Schedule -> Jobs, create a new 'Process Analytic Model' job, and set the time preference for when updates should occur. If the analytic model is based on a query, the same query is run against the defined data source to bring fresh data into the analytic model. If the analytic model is based on a CSV file, a scheduled job can be created only if the CSV file was uploaded from a shared folder and its path begins with "\\..." (for example, \\SomeMachine\FolderWithData).
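The shared-folder requirement comes down to the path being a UNC path (starting with two backslashes) so the scheduler can re-read the file. A minimal sketch of that check, with hypothetical paths:

```python
def is_unc_path(path: str) -> bool:
    """True if the path is a UNC share path (begins with two backslashes)."""
    return path.startswith("\\\\")

# A shared-folder path qualifies for a scheduled CSV refresh;
# a local upload path does not (both paths are hypothetical).
print(is_unc_path(r"\\SomeMachine\FolderWithData\sales.csv"))
print(is_unc_path(r"C:\Uploads\sales.csv"))
```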





Set Offline / Online

After the analytic model is processed, it is ready to be used. If needed, a user with permissions can set the analytic model to 'Offline' or 'Online' status and manage its availability to end users.


Usage in Analysis and Dashboards

After the analytic model is processed, it is ready for analysis and visualizations. Create analyses and reports in grid/chart view by adding measures and dimensions to the categories/series axis or the slicer axis. Use ordering, aggregation, isolation, and other analytic actions to prepare the analytic report of interest. Once the analysis is prepared, it can be added to a dashboard and visualized with many chart and visualization widgets, while at the same time the data can be further analyzed by the end user working with the dashboard (drill-down, drill-through, drill-by, slicing, and other useful analytic actions).


Processing Performance with Larger Sets

To describe the relation between the amount of data and processing time, we ran a series of tests on various machines to get an aggregated impression of the relation between the number of rows and the average processing time. For the tests, we used the Microsoft Contoso BI Demo Dataset for Retail Industry, a sample database available for download.

Each processing run was done with 5 measures and 15 dimensions.