Tutorial / Cram Notes
Creating a Keyword Dictionary
To create a keyword dictionary in the Microsoft 365 compliance center, you would follow these general steps:
- Navigate to the Microsoft 365 compliance center.
- Go to ‘Data loss prevention’ and select ‘Sensitive info types’.
- Click on ‘Create’ to start the process of building a new sensitive info type.
- In the creation process, you need to give your sensitive info type a name and description.
- Choose ‘Use a dictionary’ and upload a CSV file containing your list of keywords.
- Once uploaded, you can set confidence levels and match accuracy to fine-tune how these terms trigger DLP policies.
For example, if you are tasked with safeguarding personally identifiable information (PII), your keyword dictionary may contain terms like “social security number,” “passport number,” “credit card number,” etc.
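To make the idea of dictionary matching concrete, here is a minimal Python sketch of how a list of terms might be checked against a piece of text. It is not how Microsoft 365 implements detection; the function name `find_keyword_matches` and the list `PII_KEYWORDS` are hypothetical, and the matching rule (whole phrase, case ignored) is an assumption for illustration.

```python
import re

# Hypothetical dictionary for illustration; real terms depend on your organization.
PII_KEYWORDS = ["social security number", "passport number", "credit card number"]


def find_keyword_matches(text, keywords):
    """Return the dictionary terms found in text, ignoring case."""
    matches = []
    for keyword in keywords:
        # Word boundaries keep a term from matching inside a longer word.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matches.append(keyword)
    return matches


sample = "Please send your Passport Number and credit card number by Friday."
print(find_keyword_matches(sample, PII_KEYWORDS))
# prints ['passport number', 'credit card number']
```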
CSV File Format
Your CSV file must be structured to accurately import keywords into your dictionary:
Keyword 1
Keyword 2
Keyword 3
…
Do not include a header row, and put each keyword on its own line.
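If the keyword list is maintained elsewhere, a short script can produce a file in this shape. The sketch below uses Python's standard `csv` module to emit one keyword per line with no header row. The function name `build_keyword_csv` and the sample terms are hypothetical, and trimming whitespace and dropping case-insensitive duplicates are assumptions about what you would want, not requirements stated by Microsoft.

```python
import csv
import io


def build_keyword_csv(keywords):
    """Return CSV text with one keyword per line and no header row."""
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for keyword in keywords:
        cleaned = keyword.strip()
        key = cleaned.casefold()
        # Skip blank entries and case-insensitive duplicates.
        if not cleaned or key in seen:
            continue
        seen.add(key)
        writer.writerow([cleaned])
    return buffer.getvalue()


csv_text = build_keyword_csv(["Project Falcon", "project falcon", "  Merger Plan ", ""])
print(csv_text)
```

To save the result, write `csv_text` to a file opened with `newline=""` so that Python does not alter the line endings.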
Using a Keyword Dictionary
With the keyword dictionary created, you would then integrate it into DLP policies:
- Still in the data loss prevention section, create a new DLP policy or modify an existing one.
- In the policy creation or edit flow, select ‘Custom sensitive info types’ and choose the dictionary you created.
- Define the rules and conditions under which detections of these keyword matches will result in alerts, notifications, or other defined protective actions.
For instance, if the policy is to protect against data leakage of financial data, a rule may be set that any document containing keywords from the financial data dictionary must not be shared externally.
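The logic of such a rule can be sketched in a few lines of Python. This is only a model of the decision, not the DLP engine: the `ShareRequest` class, the `evaluate_share` function, the sample terms, and the `contoso.com` domain are all hypothetical, and simple substring matching is assumed.

```python
from dataclasses import dataclass

# Hypothetical dictionary; real terms depend on what your organization treats as financial data.
FINANCIAL_KEYWORDS = {"quarterly forecast", "revenue projection", "audit report"}


@dataclass
class ShareRequest:
    document_text: str
    recipient_domain: str


def evaluate_share(request, internal_domain="contoso.com"):
    """Return 'block' when an external share contains a dictionary term, else 'allow'."""
    text = request.document_text.casefold()
    has_keyword = any(term in text for term in FINANCIAL_KEYWORDS)
    is_external = request.recipient_domain.casefold() != internal_domain
    return "block" if has_keyword and is_external else "allow"


print(evaluate_share(ShareRequest("Draft Audit Report for Q3", "partner.example")))
# prints block
print(evaluate_share(ShareRequest("Draft Audit Report for Q3", "contoso.com")))
# prints allow
```

Both conditions must hold for the rule to fire, which mirrors the way a DLP rule pairs a content condition with a sharing condition.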
Effective Keyword Dictionary Usage
Effectiveness comes from being specific and comprehensive:
- Scope Definition: Clearly define what constitutes sensitive information in your context.
- Keyword Selection: Balance the specificity of keywords against potential false positives.
- Regularity of Updates: As your organizational needs evolve, continuously update your keyword lists.
- Testing and Refinement: Regularly test the dictionary against sample data to gauge its accuracy and make necessary refinements.
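One way to approach the testing step is to score a candidate keyword list against a small set of hand-labeled samples. The sketch below computes precision (how many flagged samples were truly sensitive) and recall (how many sensitive samples were flagged). The function name `score_dictionary` and the sample data are hypothetical, and substring matching stands in for the real detection engine.

```python
def score_dictionary(keywords, labeled_samples):
    """Compute precision and recall of a keyword list over (text, is_sensitive) pairs."""
    true_pos = false_pos = false_neg = 0
    lowered = [k.casefold() for k in keywords]
    for text, is_sensitive in labeled_samples:
        flagged = any(k in text.casefold() for k in lowered)
        if flagged and is_sensitive:
            true_pos += 1
        elif flagged and not is_sensitive:
            false_pos += 1
        elif not flagged and is_sensitive:
            false_neg += 1
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


samples = [
    ("Project Falcon budget attached", True),
    ("The falcon is a bird of prey", False),
    ("Merger plan draft v2", True),
    ("Lunch menu for Friday", False),
]
print(score_dictionary(["falcon"], samples))          # broad term: (0.5, 0.5)
print(score_dictionary(["project falcon"], samples))  # specific term: (1.0, 0.5)
```

In this toy data the more specific term removes the false positive without changing recall, which is the trade-off described under Keyword Selection.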
Benefits of a Keyword Dictionary
Utilizing a keyword dictionary can streamline the process of identifying sensitive data because it:
- Provides an organization-specific approach to DLP.
- Enhances the precision of data detection with tailored terms.
- Minimizes the chances of leaks by automating the discovery of sensitive information.
Challenges and Solutions
One challenge with keyword dictionaries is striking a balance: the list must be comprehensive enough to catch every instance of sensitive information without being so broad that it produces numerous false positives. To combat this:
- Use additional context or supporting evidence rules, such as proximity to other sensitive information or location within a document.
- Implement regular expression patterns alongside keywords for more complex detections.
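The two techniques above can be combined. The following Python sketch accepts a pattern match only when a supporting keyword appears nearby. The 3-2-4 digit pattern, the keyword list, the 50-character window, and the function name `match_with_evidence` are all assumptions chosen for illustration, not values taken from Microsoft 365.

```python
import re

# Assumed pattern for illustration: nine digits in a 3-2-4 grouping.
ID_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical supporting keywords, matched without regard to case.
SUPPORTING_KEYWORDS = ["social security", "ssn"]
KEYWORD_PATTERN = re.compile(
    "|".join(re.escape(k) for k in SUPPORTING_KEYWORDS), re.IGNORECASE
)


def match_with_evidence(text, window=50):
    """Return pattern matches that have a supporting keyword within `window` characters."""
    confirmed = []
    for match in ID_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        # Only the text surrounding the match is searched for evidence.
        if KEYWORD_PATTERN.search(text[start:end]):
            confirmed.append(match.group())
    return confirmed


print(match_with_evidence("Employee SSN: 123-45-6789 on file."))
# prints ['123-45-6789']
print(match_with_evidence("Order reference 123-45-6789 shipped."))
# prints []
```

Requiring nearby evidence lets a broad pattern stay in place while filtering out look-alike numbers such as order references.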
In summary, creating and using a keyword dictionary is a strategic approach to customizing how sensitive information is identified and protected in your organization. Through thoughtful integration with DLP policies, a keyword dictionary becomes a powerful tool for Microsoft Information Protection Administrators to meet the security and compliance requirements of their organization.
Practice Test with Explanation
T/F: A keyword dictionary in Microsoft 365 can only contain words, not phrases or patterns.
False
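A keyword dictionary in Microsoft 365 can include both single words and multi-word phrases. Regular expression patterns are not stored in the dictionary itself; they are defined as a separate element of a sensitive information type and can be combined with a dictionary.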
T/F: To create a keyword dictionary, you must have at least one item in the list before it can be saved and used.
True
You need to have at least one keyword or value entered to create a keyword dictionary; otherwise, it cannot be saved and utilized in policies.
T/F: Keyword dictionaries are case-sensitive by default.
False
By default, keyword dictionaries are not case-sensitive, although you can configure them to be case-sensitive if needed.
Which of the following can be used to implement a keyword dictionary in Microsoft 365 compliance solutions? (Select all that apply)
- A) Data loss prevention (DLP) policies
- B) Sensitivity labels
- C) Communication compliance
- D) Information governance
A, B, C
Keyword dictionaries are utilized in Data loss prevention (DLP) policies, Sensitivity labels, and Communication compliance to identify and classify data.
When using a keyword dictionary in a DLP policy, what kind of data can you protect?
- A) Data at rest
- B) Data in use
- C) Data in motion
- D) All of the above
D
A keyword dictionary can be used in DLP policies to protect data at rest, data in use, and data in motion by identifying sensitive information across these states.
T/F: A keyword dictionary can be imported from a .csv file.
True
You can import a keyword dictionary from a .csv file which allows for bulk addition of terms and efficient management of the dictionary.
How many terms can a single keyword dictionary in Microsoft 365 contain?
- A) 10,000
- B) 50,000
- C) 100,000
- D) Unlimited
B
A single keyword dictionary in Microsoft 365 can contain up to 50,000 terms.
T/F: Keyword dictionaries are automatically shared across all tenants in Microsoft 365.
False
Keyword dictionaries are tenant-specific and not shared across different tenants in Microsoft 365, ensuring the privacy and security of each tenant’s data.
Which role must you have assigned to create a keyword dictionary in Microsoft 365?
- A) Compliance administrator
- B) Global administrator
- C) User administrator
- D) Both A and B
D
To create a keyword dictionary, you must have the role of either a Compliance administrator or a Global administrator.
T/F: Once a keyword dictionary is created, it can be deleted but not edited.
False
Keyword dictionaries can be both edited and deleted after creation. You can update the terms as needed to reflect changes in your organization’s data identification needs.
T/F: Keyword dictionaries support the use of international characters, such as letters with accents or characters from non-Latin scripts.
True
Keyword dictionaries do support international characters, allowing organizations to cater to a multi-lingual environment and recognize terms in different languages.
For which of the following scenarios would you need to create a keyword dictionary? (Select all that apply)
- A) Preventing the accidental sharing of confidential project names
- B) Blocking social security numbers from being transmitted via email
- C) Protecting trade secrets mentioned in documents
- D) Enforcing a legal hold on documents containing specific litigation-related terms
A, C, D
Keyword dictionaries are suitable for preventing the accidental sharing of confidential project names, protecting trade secrets, and enforcing legal holds on documents with specific litigation-related terms. Social security numbers would typically be protected using predefined sensitive information types rather than a keyword dictionary.
Interview Questions
What is a keyword dictionary in Microsoft 365’s Information Protection feature?
A keyword dictionary is a list of terms that are used to identify sensitive information within digital documents.
How does a keyword dictionary work in Microsoft 365?
A keyword dictionary works by identifying and classifying sensitive information in emails, documents, and other types of digital assets based on a pre-defined list of keywords.
What are the benefits of creating and using a keyword dictionary in Microsoft 365?
The benefits include an organization-specific approach to DLP, more precise detection through tailored terms, and a lower chance of leaks because the discovery of sensitive information is automated.
How can an organization create a keyword dictionary in Microsoft 365?
An organization can create a keyword dictionary in Microsoft 365 by going to the “Data classification” page and selecting “Keyword dictionaries,” then clicking “Create a keyword dictionary” and adding the terms that should trigger the classification of sensitive information.
Can an organization customize the keyword dictionary to suit their specific needs?
Yes, organizations can customize the keyword dictionary to suit their specific needs by adding, editing, or removing terms after it is created.
How can an organization use a keyword dictionary in Microsoft 365 to identify sensitive information?
An organization can use a keyword dictionary in Microsoft 365 to identify sensitive information by selecting “Keyword” as the method for identifying sensitive information when creating a sensitive information type, and choosing the appropriate keyword dictionary.
What are some tips for reviewing and testing a keyword dictionary in Microsoft 365?
Tips for reviewing and testing a keyword dictionary in Microsoft 365 include ensuring that the dictionary accurately identifies the sensitive information and refining the dictionary as needed.
Can a keyword dictionary be used in conjunction with other security measures?
Yes, a keyword dictionary can be used in conjunction with other measures, such as regular expression patterns and supporting-evidence rules, to protect sensitive information.
How can employees be trained on the use of a keyword dictionary in Microsoft 365?
Employees can be trained on the use of a keyword dictionary in Microsoft 365 through workshops, online training, and regular communication.
Can a keyword dictionary be used to identify sensitive information in non-English documents?
Yes, a keyword dictionary can be used to identify sensitive information in non-English documents.
What is the process for refining a keyword dictionary in Microsoft 365?
The process for refining a keyword dictionary in Microsoft 365 involves reviewing the results, modifying the dictionary as needed, and retesting the dictionary.
Can a keyword dictionary be shared with other organizations?
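Not automatically. Keyword dictionaries are tenant-specific, so sharing one means handing over the term list (for example, as a CSV file) for the other organization to import into its own tenant.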
What is the difference between a keyword dictionary and document fingerprinting?
A keyword dictionary is a list of pre-defined terms used to identify sensitive information, while document fingerprinting is a way to identify and protect sensitive information based on a unique “fingerprint” of a document.
Can a keyword dictionary be used to identify sensitive information in images or PDFs?
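Partly. Text-based PDFs can be scanned for dictionary terms like other documents. Text that exists only inside an image, including a scanned PDF, is detected only where optical character recognition (OCR) is available and enabled.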
What are some best practices for creating and using a keyword dictionary in Microsoft 365?
Best practices for creating and using a keyword dictionary in Microsoft 365 include testing the dictionary, refining the dictionary as needed, and training employees on its use.