Product Documentation

Auto-Cluster

Purpose

The Auto-Cluster visual lets you quickly identify possible relationships and similarities in your semantic model data without resorting to R or Python: it's largely drag, drop, and click! It uses an algorithm based on "Partitioning Around Medoids" (PAM), with extensions that let you weight features independently and search for parameters that yield more effective clustering. This is an example of "unsupervised machine learning." You can learn more by reading about "k-medoids", "PAM", and "clustering".
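
To make the idea of medoid-based clustering with per-feature weights more concrete, here is a minimal sketch in Python. It is not the visual's actual implementation: the function names, the weighted Manhattan distance, and the simple "first k rows" starting point are all assumptions chosen for illustration.

```python
def weighted_distance(a, b, weights):
    """Weighted Manhattan distance between two feature vectors (an assumed metric)."""
    return sum(w * abs(x - y) for x, y, w in zip(a, b, weights))


def total_cost(points, medoids, weights):
    """Sum of every point's distance to its nearest medoid."""
    return sum(
        min(weighted_distance(p, points[m], weights) for m in medoids)
        for p in points
    )


def pam(points, k, weights):
    """PAM-style swap search: start with the first k rows as medoids, then keep
    swapping a medoid for a non-medoid while the total cost strictly improves."""
    medoids = list(range(k))
    best = total_cost(points, medoids, weights)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial, weights)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    # Label each row with the index of its nearest medoid.
    labels = [
        min(medoids, key=lambda m: weighted_distance(p, points[m], weights))
        for p in points
    ]
    return sorted(medoids), labels


# Hypothetical data: two features per row, equal weights.
points = [(1.0, 1.0), (1.5, 2.0), (1.2, 1.1), (8.0, 8.0), (9.0, 9.5), (8.5, 8.2)]
medoids, labels = pam(points, k=2, weights=(1.0, 1.0))
print(medoids, labels)
```

Unlike a centroid in k-means, each medoid is an actual row from the data, which is why a cluster can be named after its medoid.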

Data Bucket

The data requirements for the visual include 3 bucket fields, listed here:

Field	Type	Description
Feature Source	Number, Date, Text	Features are measurable properties that you wish to use in finding clusters. Pick values which you feel may be related.
Primary Measure Source	Number, Date	A single scalar value that is used to size the innermost circles in the circle pack display style. If omitted, the size is based on the "distance from medoid" for each group.
Predefined Categories	Text	If provided, applies groupings which manifest as nested circles in the circle packing style.

Settings

Clustering Settings

Field	Default	Description
Enable Clustering	On	When disabled, clustering is not performed and the visual reverts to simple circle packing of predefined categories only. (in 1.1)
Cluster Naming	Build from Medoids	Controls how clusters are named.
Cluster Name Separator	;	When cluster names are automatically constructed, serves as the separator between terms. Can use \n for a new line.
Cluster Name Prefix		An optional prefix for generated cluster names. For example, the prefix 'Similar to' would produce the cluster name 'Similar to ABC' if ABC were emitted as the cluster name by the 'Cluster Naming' setting.
Cluster Name Mapping		An optional way to set the name of clusters. Provide one mapping per line, of the format 'find|target' (or 'find,target') where 'find' is text that is checked against the generated cluster name (or failing that, data within the cluster) and if there is a case-insensitive match, the cluster name becomes 'target'. Note: whole-word matching is used.
Ordinal Identification		For categorical features that represent ordinals, you can specify the ordering of the ordinal values. Provide one line per ordinal feature, of the form 'featurename|val1|val2|...'. The feature name should match what shows in the data well, and values are weighted based on their position in the sequence. For example, 'Rating|Strongly Disagree|Disagree|Neutral|Agree|Strongly Agree' would apply '0' for 'Strongly Disagree', '0.25' for 'Disagree', and so on, for the 'Rating' feature (assuming normalized).
Aggregation Mode	Take First	If the source data includes multiple rows per lowest-level grouping, controls how those rows are handled for output.
Row ID Prefix		When the Aggregation Mode is 'Add Row ID', sequential numbering is used to assign row identifiers, ensuring uniqueness of source data rows. This setting allows a prefix such as 'Row' to be placed before the number.
Solution Searching	None	Controls whether solution searching is enabled. If not, you can provide an explicit 'k' (cluster count), or a heuristic rule will pick a number of clusters based on your row count.
Solution Searching Button	None	Controls whether a solution searcher button will be available for end-users to perform searching on-demand.
Timeout (seconds)	20	The number of seconds that the solution finder can run for before it times out. Increasing this number means it can run for longer to find possibly better solutions.
Searching: Iterations		By default, the solution searcher determines the number of iterations it uses based on your row count, but you can override that value here. Providing a small value increases performance at the expense of the solution quality.
Cluster Count		You can explicitly provide the number of clusters you want to use. If using the solution finder, 'k' can be searched for, otherwise the count will be inferred using a simple rule based on your row count. A value less than '2' implies no value is set for this.
Maximum Cluster Count	20	When solution searching is enabled, this value serves as the maximum number of clusters that can be included in a solution.
Cluster Position	1	Controls at what level the inferred cluster is shown. '1' places it as the first-level grouping and successive numbers nest it deeper. A number higher than the available number of groupings makes it the deepest grouping possible.
Default Weighting	Leave Raw	Controls how each feature is weighted with respect to other features. This allows some distances to be emphasized and others deemphasized. 'Normalize' places all values on a [0,1] scale; 'Raw' uses numbers exactly as they are in the source data; 'Manual' requires use of the 'Manual Weighting' setting.
Manual Weighting		When the weighting rule is set to 'Manual Override', provide a comma-separated list of weighting factors (typically between 0 and 1), one for each of the Feature Source columns, in order.
Maximum Rows	20000	A global limit on the number of rows that can be processed. This helps ensure performance does not accidentally degrade. You can raise or lower this value as required (at your own risk).
Allow Timeout Warning	On	When enabled, if the solution finder times out, a warning will be shown. When disabled, no warning is shown and the best solution available will be shown.
Case Must Match	On	When disabled, case is not required to match (allowing for the possibility of merging values).
Keep Whitespace	Off	When enabled, whitespace in compared values must also match, for merge purposes (regardless of the 'Ignore Non-alphanumeric' setting).
Diacritic Neutral	Off	When enabled, diacritic (accented) characters are compared internally as if they were their unaccented ASCII equivalents, for merge purposes.
Allow Tooltips	On	When enabled, tooltips may be shown when hovering over data elements.
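
As an illustration of the Ordinal Identification format, the following Python sketch parses one pipe-separated line per feature and assigns evenly spaced values on a [0,1] scale, matching the 'Rating' example in the table. The function name and the decision to skip lines with fewer than two values are assumptions, not documented behavior of the visual.

```python
def parse_ordinals(setting_text):
    """Map each feature name to {ordinal value: position on a [0, 1] scale}."""
    scales = {}
    for line in setting_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) < 3:
            continue  # need a feature name plus at least two ordered values
        name, values = parts[0], parts[1:]
        step = 1.0 / (len(values) - 1)
        scales[name] = {value: i * step for i, value in enumerate(values)}
    return scales


scales = parse_ordinals(
    "Rating|Strongly Disagree|Disagree|Neutral|Agree|Strongly Agree"
)
print(scales["Rating"])
```

With five ordered values the step is 0.25, so 'Neutral' falls at the midpoint of the scale.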

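The Cluster Name Mapping rules described in the table above can be sketched in Python as follows. This is only an approximation under stated assumptions: the function names are hypothetical, the first matching rule wins, and the fallback of matching against data within the cluster is not modeled.

```python
import re


def parse_mappings(setting_text):
    """Read one 'find|target' or 'find,target' rule per line."""
    rules = []
    for line in setting_text.splitlines():
        parts = re.split(r"[|,]", line, maxsplit=1)
        if len(parts) == 2 and parts[0].strip():
            rules.append((parts[0].strip(), parts[1].strip()))
    return rules


def apply_mappings(cluster_name, rules):
    """Return the target of the first rule whose 'find' text appears in the
    name as a whole word (case-insensitive); otherwise keep the name."""
    for find, target in rules:
        pattern = r"\b" + re.escape(find) + r"\b"
        if re.search(pattern, cluster_name, flags=re.IGNORECASE):
            return target
    return cluster_name


rules = parse_mappings("abc|Group A\nxyz,Group X")
print(apply_mappings("ABC; DEF", rules))  # whole-word, case-insensitive match
print(apply_mappings("ABCD", rules))      # no whole-word match, name unchanged
```
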
Circle Packing

Field	Default	Description
Keep Square Layout	Off	When enabled, the layout will be treated as a square even if the visual is laid out as a rectangle; in a practical sense this means you may see a vertical scroll bar to move up and down through the layout, but there also can be less horizontal whitespace.
Text Font		The font family for text labels.
Text Size		The size of the font for text labels.
Text Color		The text color for text labels.
Level 1: Color Like Values	Off	When enabled, level 1 grouping will color circles based on like values.
Level 2: Color Like Values	Off	When enabled, level 2 grouping will color circles based on like values.
Level 3: Color Like Values	Off	When enabled, level 3 grouping will color circles based on like values.
Start Color		The color used for the background. It is blended with the End Color, with each successive level of grouping moving closer to the End Color. If omitted, becomes a lighter version of the End Color.
End Color		The innermost level of grouping is influenced by this color. If omitted, becomes a darker version of the starting color.
Value Background		The background color for value (smallest) circles.
Medoid Background		The background color for value circles, which are also medoids (the 'central' item identified, per cluster). Overrides Value Background color. When omitted, medoids are not colored differently.
Quick Find Highlight		The fill color to use for circles where there is a quick find match. (in 1.1)
Show Zoom In/Out	Off	When enabled, zoom in and out buttons/links are shown to support zooming within the currently displayed layout level.
Alt Click to Zoom (vs Shift)	On	When enabled, zooming to a clicked area is done by holding down the Alt key and left-clicking the mouse. This is the default behavior; it leaves other click types with their default Power BI behavior, to support cross-filtering. When disabled, the Shift key is used for zoom clicks instead.
Label Line Length	60	The general maximum number of characters per label line.
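
One way to picture how Start Color and End Color combine across grouping levels is a simple linear blend, sketched below in Python. The visual's real blending rule is not documented here, so the linear RGB interpolation, the function names, and the hex color format are all assumptions.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def level_colors(start, end, levels):
    """Return one hex color per grouping level, moving from start toward end."""
    s, e = hex_to_rgb(start), hex_to_rgb(end)
    colors = []
    for level in range(levels):
        # t runs from 0.0 (outermost level) to 1.0 (innermost level).
        t = level / (levels - 1) if levels > 1 else 1.0
        rgb = [round(a + (b - a) * t) for a, b in zip(s, e)]
        colors.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return colors


palette = level_colors("#ffffff", "#000000", 3)
print(palette)
```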

Color Table

Field	Default	Description
Sort Criteria	Primary Measure Average / Alphanumeric	Controls how sorting is applied to the color table. If no primary measure is provided, that sort is ignored.
Sort Ascending	On	When enabled, the Sort Criteria is applied in ascending order; otherwise descending order is used.
Show Values	All Features	Controls which values are shown, per row.
Show Captions	On	When enabled and showing feature values, captions will be added for each feature, repeated once for each main grouping level.
Max Column Width		Normally the visual tries to calculate a column width based on rules, but you can override this percentage setting here.
Cluster Name Font		The font family for cluster names.
Cluster Name Text Size		The size of the cluster name text.
Cluster Name Color		The text color for the cluster name text.
Group Value Font		The font family for grouping values.
Group Value Text Size		The size of the group value text.
Grouping Value Text Color		The text color for the grouping value text.
Feature Value Font		The font family for feature values.
Feature Value Text Size		The size of the feature values text.
Feature Value Text Color		The text color for the feature values text.
Primary Measure Font		The font family for the primary measure, as shown in the feature values list.
Primary Measure Text Size		The size of the primary measure text, as shown in the feature values list.
Primary Measure Text Color		The text color for the primary measure text, as shown in the feature values list.
Feature Caption Font		The font family for the feature caption (if enabled), as shown above the feature values list.
Feature Caption Text Size		The size of the feature caption text (if enabled), as shown above the feature values list.
Feature Caption Text Color		The text color for the feature caption text (if enabled), as shown above the feature values list.
Medoid Font		The font family for the feature values, specifically when the row in question is for the medoid data point.
Medoid Text Size		The size of the feature values, specifically when the row in question is for the medoid data point.
Medoid Text Color		The text color for the feature values, specifically when the row in question is for the medoid data point.
Medoid Background Color		The background color for the feature values, specifically when the row in question is for the medoid data point.

General Formatting

Field	Default	Description
Display Style	Circle Packing	Identifies the primary display style for the visual.
Summary Stats	Hide	Controls if/where summary stats are shown, on the layout surface. This is most useful when dealing with non-standard weightings, determined through the solution finder.
Quick Find	Hide	Controls if/where the quick find text box is shown, on the layout surface. (in 1.1)
Quick Find Minimum Length	2	The minimum number of characters needed in the quick find before a match is attempted. (in 1.1)
Show Links	Off	When disabled, buttons are used instead of links for various commands.
Show Copy Clipboard	Off	When disabled, the copy data to clipboard link will be hidden. Data is copied in the format defined by the ** setting. Data includes the inferred cluster, which is NOT available in the source data. Note: copy functionality varies based on browser used.
Copied Message Duration	4	The number of seconds to display the 'copied data to clipboard' message. Use zero to show no message.
Show Counts	Off	When enabled, the number of rows loaded is shown at the bottom of the visual.
Show Tips	On	When enabled, tips are shown below the main display area. Note: this can only be disabled when NOT inside Power BI Desktop OR when using an organization license key.
Export: Cluster Name	ClusterName	When 'copying to clipboard', the column header value for the calculated name for the cluster. A blank value will omit the field.
Export: Row ID Name	RowID	When 'copying to clipboard', the column header value for the row ID which is assigned as data is loaded into the visual. A blank value will omit the field.
Export: Is Medoid	IsMedoid	When 'copying to clipboard', the column header value for the indicator of whether a given row is for a medoid. A blank value will omit the field.
Export: Distance from Medoid	MedoidDistance	When 'copying to clipboard', the column header value for the calculated distance for the given row, from the medoid of the cluster. A blank value will omit the field.
Export: Weights Suffix		When 'copying to clipboard', additional columns can be added for each feature, with the column value being the feature's weight. The column name for weights will be the feature name, followed by this setting. A blank value will not include weights.
Export: Normal Suffix		When 'copying to clipboard', additional columns can be added for each feature, with the column value being the feature's normalized value. The column name for the normalized values will be the feature name, followed by this setting. A blank value will not include normalized values.
Counter Text Size		The size of the counter text.
Counter Text Font		The font family for the counter text.
Loaded Text	loaded	Text to show for the 'loaded' count text.
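
The 'Export:' settings above determine which extra columns appear when data is copied to the clipboard. The Python sketch below shows how those rules could combine into a header row. The function name, the dictionary keys, and the column order are assumptions for illustration; only the rules "a blank value omits the field" and "feature name followed by the suffix" come from the table.

```python
def export_headers(features, settings):
    """Build clipboard column headers; a blank setting drops its column(s)."""
    headers = list(features)
    for key in ("cluster_name", "row_id", "is_medoid", "medoid_distance"):
        name = settings.get(key, "").strip()
        if name:
            headers.append(name)
    for suffix_key in ("weights_suffix", "normal_suffix"):
        suffix = settings.get(suffix_key, "")
        if suffix:
            # One extra column per feature: feature name followed by the suffix.
            headers.extend(feature + suffix for feature in features)
    return headers


headers = export_headers(
    ["Age", "Income"],
    {
        "cluster_name": "ClusterName",
        "row_id": "",  # blank: the row ID column is omitted
        "is_medoid": "IsMedoid",
        "medoid_distance": "MedoidDistance",
        "weights_suffix": "_w",
        "normal_suffix": "",  # blank: no normalized-value columns
    },
)
print(headers)
```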

Site License and Diagnostics

Field	Default	Description
Licensed By		Provide the name that the product was registered with, as found in the confirmation email.
Site License Key		For site licensing (not from AppSource). Provide the 22-character license key, as found in the confirmation email. (Note: a reload of the report is needed after the key and licensee are set.)

Other Resources / Links:

AppSource Purchase Guide

Learn more about purchasing one or more user licenses from Microsoft AppSource.