DAX Function: SAMPLE
Category: Statistical Functions
The SAMPLE function in Power BI is a DAX function that retrieves a specified number of random rows from a table. It is particularly useful for creating randomized subsets of data for analysis, testing, or visualization purposes.
Purpose of the Function
To return a randomized subset of rows from a table based on a given sample size.
Type of Calculations
Randomized row selection without replacement.
Practical Use Cases
- Extracting smaller datasets for performance testing or prototyping.
- Generating randomized samples for statistical analysis.
- Visualizing a representative sample of a large dataset.
SAMPLE(<nValue>, <Table>, [<OrderBy_Expression1>], [<Order>])
| Parameter | Type | Description |
|---|---|---|
| <nValue> | Integer | The number of rows to return from the table. |
| <Table> | Table | The table from which rows will be sampled. |
| [<OrderBy_Expression1>] | Scalar | (Optional) The column or expression used for sorting the table before sampling. |
| [<Order>] | Boolean | (Optional) Specifies sorting order: ASC (ascending) or DESC (descending). Default is ASC. |
How Does the SAMPLE DAX Function Work?
Logical Process
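Input Parameters: Accepts a table and the desired number of rows (nValue).
Sorting (Optional): If an OrderBy_Expression is provided, the table is sorted first.
Row Sampling: The function then selects nValue rows from the (optionally sorted) table. With a sort order, the sample is spread across the sorted rows rather than taken only from the top; without one, the rows are picked arbitrarily.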
What Does It Return?
Returns a table containing the randomly selected rows based on the specified sample size (nValue).
The output is deterministic if sorting (OrderBy_Expression) is specified. Without sorting, the rows are selected arbitrarily.
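As a minimal sketch of the deterministic case, the query below samples five rows with an ordering expression supplied. The Sales table and the Sales[Amount] column are assumptions borrowed from the examples in this article, and the query is meant to be run in a DAX query tool such as DAX Studio or the DAX query view.

```dax
-- Assumes a table named Sales with a numeric column Sales[Amount].
-- Because an ordering expression is supplied, repeated runs over
-- unchanged data should return the same five rows.
EVALUATE
SAMPLE ( 5, Sales, Sales[Amount], ASC )
```

Removing the last two arguments gives the unordered form, where the rows returned may differ from one run to the next.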
When Should We Use It?
Prototyping: When working with a large dataset and needing a smaller subset to iterate quickly.
Data Sampling: To create representative datasets for statistical analysis or machine learning.
Performance Testing: To test queries or calculations on a smaller subset of data.
Examples
Basic Usage
Retrieve 5 random rows from a table:
SAMPLE(5, Sales)
Output: A table with 5 randomly selected rows from the Sales table.
Column Usage
Sample rows while sorting by sales amount in descending order:
SAMPLE(10, Sales, Sales[Amount], DESC)
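Use Case: Returns 10 rows sampled across the range of Sales[Amount], with the table evaluated in descending order. To get strictly the 10 largest amounts, use TOPN instead.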
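For comparison, when the goal is strictly the ten largest amounts, TOPN expresses that directly. A minimal sketch, assuming the same hypothetical Sales table and Sales[Amount] column:

```dax
-- Assumes Sales[Amount] exists. Returns the rows with the 10 largest
-- amounts; TOPN can return more than 10 rows when values tie at the cutoff.
EVALUATE
TOPN ( 10, Sales, Sales[Amount], DESC )
ORDER BY Sales[Amount] DESC
```

The ORDER BY clause only controls how the query result is displayed, since TOPN itself does not guarantee the order of the rows it returns.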
Advanced Usage
Sample rows dynamically based on a calculated sample size:
SAMPLE(
    INT(COUNTROWS(Sales) / 10),
    Sales,
    Sales[Region],
    ASC
)
Use Case: Dynamically samples 10% of the total rows, sorted by region.
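One way to check that a dynamic sample size behaves as intended is to compare row counts before and after sampling. The sketch below assumes a Sales table with a Sales[Region] column and derives the sample size from the row count; the variable names are illustrative only.

```dax
-- Assumes a Sales table with a Sales[Region] column (hypothetical model).
EVALUATE
VAR TotalRows = COUNTROWS ( Sales )
VAR SampleSize = INT ( TotalRows / 10 ) -- roughly 10% of the rows
VAR Sampled = SAMPLE ( SampleSize, Sales, Sales[Region], ASC )
RETURN
    ROW (
        "Total Rows", TotalRows,
        "Requested Rows", SampleSize,
        "Sampled Rows", COUNTROWS ( Sampled )
    )
```

Storing the size in a variable keeps the percentage logic in one place, so it can be changed without touching the SAMPLE call.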
Tips and Tricks
Ensure Reproducibility: Use sorting (OrderBy_Expression) to ensure consistent results in scenarios requiring determinism.
Performance Optimization: Avoid sampling very large tables unless necessary. Use filters to narrow down the dataset first.
Check the Sample Size: Keep nValue less than or equal to the total number of rows in the table. A larger value returns every row, so the result is no longer a sample.
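The advice to filter before sampling can be sketched as follows. The region value "West" and the sample size of 100 are hypothetical, and the Sales table with its Region and Amount columns is assumed from the earlier examples.

```dax
-- Assumes Sales[Region] contains the value "West" (hypothetical).
-- FILTER narrows the table first, so SAMPLE only has to sort and
-- pick rows from the reduced set.
EVALUATE
SAMPLE (
    100,
    FILTER ( Sales, Sales[Region] = "West" ),
    Sales[Amount], ASC
)
```

Supplying Sales[Amount] as the ordering expression also follows the reproducibility tip, since the same rows should come back while the data is unchanged.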
Performance Impact of SAMPLE DAX Function:
Sampling large tables without sorting may lead to arbitrary results, impacting report consistency.
Use aggregated or summarized data to reduce computational overhead.
Related Functions You Might Need
TOPN: Retrieves the top N rows from a table based on a ranking expression.
FILTER: Filters rows from a table based on specified conditions.
ADDCOLUMNS: Adds calculated columns to a table for enhanced sampling logic.
Want to Learn More?
For more information, check out the official Microsoft documentation for SAMPLE. You can also experiment with this function in your Power BI reports to explore its capabilities.
SAMPLE retrieves a specified number of rows from a table, optionally sorted by a given expression.
By default, the function selects rows arbitrarily unless sorting is specified.
TOPN retrieves the top N rows based on an expression, while SAMPLE is designed for random sampling.
If nValue exceeds the number of rows in the table, the function returns all rows without errors.
The sample size can be defined dynamically using DAX measures or calculated expressions.