
# Role Definitions

## Overview

**Role Definitions** is an automated tool that generates optimized role assignments for your organization based on existing user permissions. It uses a deterministic algorithm to analyze patterns in how users access resources and creates a minimal set of distinct roles that can replace individual permission assignments.

## Purpose

This feature helps organizations:

* **Simplify access management** by reducing the number of individual permission assignments
* **Identify permission patterns** across users and resources
* **Create role-based access control (RBAC)** from existing permission data
* **Optimize role assignments** to minimize the number of distinct roles needed

## How It Works

Role Definitions processes uploaded permissions data using an algorithm that:

1. Reads the user-resource-permission mappings from your CSV data
2. Identifies common permission patterns across users
3. Generates a minimal set of distinct roles that cover all permission requirements
4. Assigns appropriate roles to each user based on their permission needs

## Step-by-Step Guide

### Step 1: Prepare Your CSV File

Create a CSV file with the following structure. Each row represents a single user-resource-permission relationship, showing what permissions a user has on a specific resource.

```csv
user_id,resource_id,permissions_for_resource
u1,table_1,"read,write"
u2,table_2,"write,truncate,drop"
u3,table_1,"drop,delete"
u4,database_1,admin
u1,table_2,"write,truncate"
```

**Understanding the data model:**

* Each row defines which permissions a user has on a specific resource
* The same user can appear in multiple rows for different resources (e.g., `u1` appears twice above)
* The same resource can appear in multiple rows for different users
* The algorithm analyzes these relationships to identify common permission patterns and generate optimal roles
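
As a rough illustration of this data model, the sketch below groups the sample rows into one set of (resource, permission) pairs per user. The helper name `load_entitlements` is hypothetical, and splitting each cell on commas is an assumption about how the permission list is interpreted, not a description of Veza's parser.

```python
import csv
import io
from collections import defaultdict

SAMPLE = '''user_id,resource_id,permissions_for_resource
u1,table_1,"read,write"
u2,table_2,"write,truncate,drop"
u3,table_1,"drop,delete"
u4,database_1,admin
u1,table_2,"write,truncate"
'''

def load_entitlements(text):
    """Return {user_id: {(resource_id, permission), ...}} from CSV text."""
    entitlements = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: permissions within one cell are comma-separated.
        for perm in row["permissions_for_resource"].split(","):
            entitlements[row["user_id"]].add((row["resource_id"], perm.strip()))
    return dict(entitlements)

entitlements = load_entitlements(SAMPLE)
for user, pairs in sorted(entitlements.items()):
    print(user, sorted(pairs))
```

Because `u1` appears in two rows, its set combines the permissions from both `table_1` and `table_2`.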

**CSV Format Requirements:**

The file must follow standard CSV format (RFC 4180 compliant):

* **Headers (required, case-sensitive):** `user_id`, `resource_id`, `permissions_for_resource`
* **All fields are required:** Every row must have all three columns with non-empty values
* **Empty rows:** Automatically skipped during processing

**Field Specifications:**

* **user\_id:** Unique identifier for each user (e.g., username, email, or ID)
  * Must be a non-empty string
  * Will be used to identify users in the generated role assignments
* **resource\_id:** Identifier for the resource being accessed (e.g., table name, database name, file path)
  * Must be a non-empty string
  * Should use consistent naming across your organization
  * Can be as simple as `table_1` or as complex as Azure ARM paths like `/subscriptions/1111/resourceGroups/rg-app/providers/Microsoft.Web/sites/api-prod`
* **permissions\_for\_resource:** Comma-separated list of permissions for this resource
  * Must be a non-empty string
  * Use quotes when the list contains commas: `"read,write,delete"`
  * A single permission can be unquoted: `admin`
  * Multiple permissions joined without commas can be unquoted: `Storage Blob Data Reader + List`
  * Permissions can be descriptive: `Storage Blob Data Reader`, `SQL DB Contributor + Execute`, or `"Get Secret,Set Secret"`
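
A local pre-upload check can catch most of these problems early. The following is a minimal sketch using Python's standard `csv` module; the function name `validate_csv` and its messages are hypothetical and do not reproduce the platform's own validation or error text.

```python
import csv
import io

REQUIRED_HEADERS = ["user_id", "resource_id", "permissions_for_resource"]

def validate_csv(text):
    """Return a list of problems found; an empty list means none were detected."""
    reader = csv.DictReader(io.StringIO(text))
    headers = reader.fieldnames or []
    missing = [h for h in REQUIRED_HEADERS if h not in headers]  # case-sensitive
    if missing:
        return [f"missing headers: {', '.join(missing)}"]
    problems = []
    for row in reader:  # DictReader skips completely empty lines
        line = reader.line_num
        if None in row:
            # Extra cells usually mean a permission list was not quoted.
            problems.append(f"line {line}: too many columns (unquoted comma?)")
        for name in REQUIRED_HEADERS:
            if not (row.get(name) or "").strip():
                problems.append(f"line {line}: empty {name}")
    return problems

good = 'user_id,resource_id,permissions_for_resource\nu1,table_1,"read,write"\n\nu4,database_1,admin\n'
unquoted = 'user_id,resource_id,permissions_for_resource\nu1,table_1,read,write\n'
print(validate_csv(good))
print(validate_csv(unquoted))
```

The second input shows why quoting matters: an unquoted `read,write` is parsed as two separate cells, so the row ends up with four columns instead of three.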

**Example (Azure resources):**

```csv
user_id,resource_id,permissions_for_resource
az_a1,/subscriptions/1111/resourceGroups/rg-analytics/providers/Microsoft.Storage/storageAccounts/saingest,Storage Blob Data Reader
az_a2,/subscriptions/1111/resourceGroups/rg-analytics/providers/Microsoft.Storage/storageAccounts/saingest,Storage Blob Data Reader + List
az_a1,/subscriptions/1111/resourceGroups/rg-analytics/providers/Microsoft.Storage/storageAccounts/saingest/blobServices/default/containers/raw,"Read,Write,List"
az_c1,/subscriptions/1111/resourceGroups/rg-sec/providers/Microsoft.KeyVault/vaults/kv-prod,"Get Secret,Set Secret"
az_c2,/subscriptions/1111/resourceGroups/rg-sec/providers/Microsoft.KeyVault/vaults/kv-prod,"Get Secret,Set Secret,Delete Secret"
```

### Step 2: Navigate to Role Definitions

1. Open the Veza platform
2. Navigate to **Access Intelligence** (in the Products section of the navigation sidebar)
3. Click on **Role Definitions** in the top navigation bar

   ![Role Definitions landing page showing upload interface](/files/YdUEfTFw41jJeBVGK4F4)

### Step 3: Upload Your CSV File

1. On the "Uploaded Data" tab, you'll see an upload interface
2. Click the **"Upload CSV File"** button
3. Select your prepared CSV file
4. The system will parse and validate your file

Once validation succeeds, each row is converted into a user-permission mapping and the data is displayed in a preview table for verification.

### Step 4: Review Uploaded Data

After a successful upload, you will see a preview table showing all uploaded entries, a search box to filter the data, and the total number of uploaded items.

![Uploaded data preview showing CSV entries in table format](/files/j4p9WWNdEP0ICvgv8gBs)

**Review checklist:**

* ✓ Verify all users are present
* ✓ Check that resource IDs are correct
* ✓ Confirm permissions are properly formatted
* ✓ Look for any duplicate or conflicting entries
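
The last checklist item can also be checked offline before uploading. The sketch below assumes a particular definition that is not taken from the product: a "duplicate" is a user-resource pair repeated with the same permission set, and a "conflict" is a user-resource pair that appears with different permission sets. The function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

SAMPLE = '''user_id,resource_id,permissions_for_resource
u1,table_1,"read,write"
u1,table_1,"write,read"
u2,table_2,admin
u2,table_2,read
u3,table_1,read
'''

def find_duplicates_and_conflicts(text):
    """Return (duplicates, conflicts) as lists of (user_id, resource_id) pairs."""
    seen = defaultdict(list)  # (user, resource) -> permission sets, one per row
    for row in csv.DictReader(io.StringIO(text)):
        perms = frozenset(p.strip() for p in row["permissions_for_resource"].split(","))
        seen[(row["user_id"], row["resource_id"])].append(perms)
    duplicates, conflicts = [], []
    for key, perm_sets in seen.items():
        if len(perm_sets) > 1:
            # Same set repeated -> duplicate; differing sets -> conflict.
            (duplicates if len(set(perm_sets)) == 1 else conflicts).append(key)
    return duplicates, conflicts

duplicates, conflicts = find_duplicates_and_conflicts(SAMPLE)
print("duplicates:", duplicates)
print("conflicts:", conflicts)
```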

### Step 5: Generate Role Assignments

1. Click the **"Generate Role Assignments"** button
2. The system will process your data (this may take a few moments)
3. You'll be automatically switched to the "Generated Role Assignments" tab

**What happens during generation:**

* The algorithm analyzes permission patterns across all users
* Common permission sets are identified and grouped using frequent subset analysis
* Optimal roles are created to minimize redundancy
* Each user is assigned appropriate roles based on their permissions

### Step 6: Review Generated Roles

The Generated Role Assignments tab shows:

* **Role Assignments:** Which roles each user has been assigned
* **Distinct Roles:** The unique roles created and their permissions
* **Statistics:**
  * Number of options explored during optimization
  * Number of distinct input permissions
  * Number of distinct output permissions

**Understanding the results:**

* Each distinct role has a generated name (e.g., "role\_1", "role\_2")
* Roles contain specific permission sets for resources
* Users may be assigned multiple roles to cover all their permissions

![Generated role assignments grouped by user](/files/QnpHFZUGNRhSN63GHfaf)

You can expand individual users to see detailed role assignments:

![Expanded user view showing role details and permissions](/files/OVckiovoPhdek6i9kp90)

You can also switch to view results grouped by role to see which users are assigned to each role:

![Generated role assignments grouped by role](/files/whodhNO15msLU3EYiHHm)

### Step 7: Export Results

1. Click the **"Export"** button on the Generated Role Assignments tab
2. The results will be downloaded as a CSV file named `generated-role-assignments.csv`

**Export file format:**

The exported CSV contains three columns:

* **User ID**: The unique identifier for each user
* **Assigned Roles**: Comma-separated list of role names assigned to the user (e.g., `role_1,role_2,role_3`)
* **Assigned Role Definitions**: JSON object mapping each role name to its permissions
  * Format: `{"role_1": ["permission1", "permission2"], "role_2": ["permission3"]}`
  * Contains the complete permission definitions for all roles assigned to that user

**Example export:**

```csv
User ID,Assigned Roles,Assigned Role Definitions
u1,"role_1,role_2","{""role_1"":[""read"",""write""],""role_2"":[""admin""]}"
u2,role_1,"{""role_1"":[""read"",""write""]}"
```

Each row represents one user with all their assigned roles and the complete permission definitions for those roles.
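
If you post-process the export in a script, the third column has to be decoded as JSON after the CSV layer is parsed. The sketch below reads the example export shown above; `load_export` is a hypothetical helper name, and the column names are taken from that example.

```python
import csv
import io
import json

EXPORT = '''User ID,Assigned Roles,Assigned Role Definitions
u1,"role_1,role_2","{""role_1"":[""read"",""write""],""role_2"":[""admin""]}"
u2,role_1,"{""role_1"":[""read"",""write""]}"
'''

def load_export(text):
    """Return (user -> role names, role name -> permissions) from an export."""
    user_roles = {}
    role_definitions = {}
    for row in csv.DictReader(io.StringIO(text)):
        user_roles[row["User ID"]] = row["Assigned Roles"].split(",")
        # Each row repeats the definitions of that user's roles; merge them.
        role_definitions.update(json.loads(row["Assigned Role Definitions"]))
    return user_roles, role_definitions

user_roles, role_definitions = load_export(EXPORT)
print(user_roles)
print(role_definitions)
```

Merging the per-row definitions gives a single role catalog, which is usually easier to review than the repeated per-user JSON.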

Role Definitions uses a deterministic role mining algorithm based on frequent subset analysis. The algorithm:

* Analyzes permission patterns to find common permission sets across users
* Uses a greedy approach to minimize the number of distinct roles
* Produces consistent results with the same input data
* Optimizes for minimal role count while maintaining complete permission coverage
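
To make the idea concrete, here is a deliberately simplified greedy heuristic. It is not Veza's implementation: the candidate generation (each user's full permission set plus pairwise intersections and differences) is only a stand-in for frequent subset analysis, and the function name `mine_roles` and the tie-breaking rule are assumptions made for this sketch.

```python
from itertools import permutations

def mine_roles(user_perms):
    """Greedy sketch: pick roles until every user's permissions are covered."""
    perms = {u: frozenset(p) for u, p in user_perms.items()}
    sets = list(perms.values())
    # Candidate roles: full sets, pairwise intersections, pairwise differences.
    candidates = set(sets)
    for a, b in permutations(sets, 2):
        candidates.update({a & b, a - b})
    candidates.discard(frozenset())
    uncovered = {u: set(p) for u, p in perms.items()}
    roles, assignments = {}, {u: [] for u in perms}

    def gain(role):
        # A role is only assignable to users who already hold all of it,
        # so no user ever receives a permission they did not have.
        return sum(len(role & uncovered[u]) for u, p in perms.items() if role <= p)

    while any(uncovered.values()):
        ordered = sorted(candidates, key=sorted)  # fixed order -> deterministic
        best = max(ordered, key=lambda c: (gain(c), -len(c)))
        name = f"role_{len(roles) + 1}"
        roles[name] = sorted(best)
        for u, p in perms.items():
            if best <= p and best & uncovered[u]:
                assignments[u].append(name)
                uncovered[u] -= best
        candidates.discard(best)
    return roles, assignments

roles, assignments = mine_roles({
    "alice": {"read", "write"},
    "bob": {"read", "write"},
    "carol": {"read", "write", "admin"},
    "dave": {"read", "write", "admin"},
})
print(roles)
print(assignments)
```

In this toy input, four users collapse into two roles: a shared `read`/`write` role and an `admin` add-on for the two users who need it. A greedy choice like this is fast and repeatable, but it does not guarantee the smallest possible number of roles on every input.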

## Common Use Cases

Generated roles receive sequential names such as `role_1`; the descriptive names in the outputs below illustrate how you might rename roles after review.

### 1. Database Access Management

Convert individual table/database permissions into database roles:

* Input: Users with various read/write/admin permissions on tables
* Output: Roles like "DatabaseAdmin", "TableReader", "TableWriter"

### 2. Application Access Control

Transform application-specific permissions into application roles:

* Input: Users with different feature access in an application
* Output: Roles like "BasicUser", "PowerUser", "Administrator"

### 3. File System Permissions

Consolidate file and folder permissions into file system roles:

* Input: Users with various read/write/delete permissions on folders
* Output: Roles like "ReadOnlyUser", "Contributor", "Owner"

## Best Practices

1. **Clean your data first:** Remove test users and obsolete permissions before uploading
2. **Use consistent naming:** Ensure resource IDs and permission names are standardized
3. **Start with a subset:** Test with a small group of users first
4. **Review carefully:** Verify generated roles make sense for your organization
5. **Document changes:** Keep track of which roles replace which individual permissions
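
For the "start with a subset" practice, one option is to filter the full CSV down to a pilot group before uploading. This is a minimal sketch that assumes the input already passes the format requirements; `subset_csv` and the pilot user list are hypothetical.

```python
import csv
import io

FIELDS = ["user_id", "resource_id", "permissions_for_resource"]

FULL = '''user_id,resource_id,permissions_for_resource
u1,table_1,"read,write"
u2,table_2,"write,truncate,drop"
u1,table_2,"write,truncate"
'''

def subset_csv(text, pilot_users):
    """Return CSV text containing only the rows for the given users."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in csv.DictReader(io.StringIO(text)):
        if row["user_id"] in pilot_users:
            writer.writerow(row)  # re-quotes cells that contain commas
    return out.getvalue()

pilot = subset_csv(FULL, {"u1"})
print(pilot)
```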

## Troubleshooting

### Common Issues

**CSV Upload Fails:**

If you see an error during CSV upload, check the error message:

* **"Invalid CSV format. Please ensure the file has the correct headers."**
  * Verify all three required headers are present: `user_id`, `resource_id`, `permissions_for_resource`
  * Headers are case-sensitive and must match exactly
  * Ensure the file is saved as CSV format (not Excel)
* **"Invalid CSV format. Please ensure all rows have the required fields."**
  * Check that every row has values in all three columns
  * Ensure no cells are empty (all fields must be non-empty strings)

**No Roles Generated:**

* Ensure you have uploaded data before clicking **Generate Role Assignments**
* Check that permissions are properly formatted
* Verify there are actual permission patterns to mine

**Too Many Roles Generated:**

* This may indicate highly unique permission sets
* Consider grouping similar resources together
* Review if all permissions are necessary
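
One way to gauge how unique your permission sets are is to count how many users share exactly the same overall footprint. The sketch below is a rough diagnostic under that assumption, not a metric reported by the product; `footprint_report` and the `resource:permission` string format are hypothetical.

```python
from collections import Counter

def footprint_report(user_perms):
    """Summarize how many users share an identical set of permissions."""
    footprints = Counter(frozenset(p) for p in user_perms.values())
    return {
        "users": len(user_perms),
        "distinct_footprints": len(footprints),
        "users_with_unique_footprint": sum(1 for n in footprints.values() if n == 1),
    }

report = footprint_report({
    "u1": {"t1:read"},
    "u2": {"t1:read"},
    "u3": {"t1:read", "t2:write"},
})
print(report)
```

If most users have a footprint shared by nobody else, a large number of generated roles is expected, and standardizing resource or permission names before upload may help more than rerunning the generation.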

## Limitations

* Maximum CSV file size depends on system configuration
* Processing time increases with the number of unique permission combinations
* The algorithm optimizes for minimal roles, which may not always align with organizational structure

## Security Considerations

* Uploaded data is processed in memory and not permanently stored
* Generated roles should be reviewed before implementation
* Export files may contain sensitive permission information

## Next Steps

After generating role definitions:

1. Review and refine the generated roles
2. Map roles to your organization's structure
3. Implement roles in your access control system
4. Monitor and adjust as needed
5. Consider automating role assignment for new users

