IBM VEST Workshops
25 min
Last updated 08/23/2024

Discovery and Classification

1 Introduction

This lab covers the Discovery and Classification functions that help manage the assets and sensitive objects in the data sources.

Discovery and Classification

Discovery and classification refer to locating and identifying objects in the environment that
must be tracked for security and compliance purposes.

Discovery is the process of finding important objects such as privileged users, sensitive
data, and data sources. Classification is the process of identifying what is discovered for
security and compliance purposes. These processes of discovery and classification are
important in large organizations where mergers, acquisitions, and existing systems
introduce new objects to the environment in unstructured or unpredictable ways. Guardium
helps you incorporate these objects into the environment so you can enforce effective
security policies and verify compliance.

A common scenario involves the discovery of sensitive data. Sensitive data refers to
regulated information like credit card numbers, personal financial data, social security
numbers, and other information that requires special handling. Guardium supports two
different approaches for discovering sensitive data: by using the Discover Sensitive Data
workflow builder, or by using the Policy Builder with other Guardium tools. The Discover
Sensitive Data workflow builder is an all-inclusive tool for establishing discovery and
classification processes for sensitive data. Use it to specify rules for discovery, define
actions to take on discovered data, specify which datasources to scan, distribute reports,
and run the workflow on an automated schedule. For more advanced users, the Policy
Builder supports more granular discovery and classification rules that can be easily
incorporated into existing processes and Guardium applications.

  • In the C200 GUI, sign in as labadmin.
  • On the navigation menu, go to Discover > Database Discovery > Auto-discovery Configuration.

Databases can exist undetected on the network and expose the network to potential risk.
Old databases might be forgotten and unmonitored, or a new database might be added as
part of an application package. A rogue DBA might also create a new instance of a
database to conduct malicious activity outside of the monitored databases.

  • Select the DB Discovery process and click Edit (Pencil).
  • Review the discovery configuration details. Note that it is possible to scan for open ports and probe for ports found open by the latest scan for DB services, however you will do both simultaneously.

Auto-discovery uses scan and probe jobs to verify that no database goes undetected in
the environment.

  • A scan job scans each specified host (or hosts in a specified subnet) and compiles a list
    of open ports that are specified for that host.

  • A probe job uses the results of the scan to determine whether database services are
    running on the open ports.You specify the hosts and ports the Auto-discovery process scans.

  • Scroll down and click Back.

Discovery finds only running databases. Databases need to be started if discovery is to be
used during the installation.

  • Select the DB Discovery process and click Run Once Now.

The Auto-discovery process is run on demand or on a scheduled basis.

  • To close the dialog, click OK.
  • Click Progress/Summary.

After you start or schedule a job, you can click Progress Summary to display the status of
this process.

  • Expand the host details, click Refresh until the process completes, and then click Close.

You can find more details for each host in the Auto-discovery process progress dialog.
Now, you can check the reports for the databases discovered.

  • On the navigation menu, go to Discover > Reports > Databases Discovered.
  • Review the report.

In the Databases Discovered report, each individual port that is discovered has its own row
in the report. This report has Time Probed, Server IP address, Server hostname, DB Type,
Port, Port Type (usually TCP), and a count of occurrences as the columns.
You can create custom reports with the Auto-discovery Query Builder.

  • On the navigation menu, go to Discover > Reports > Discovered Instances.
  • Review the report.

Note: If the report does not contain data, try changing the start date to July 1, 2024.

Another auto discovery report is Discovered Instances.
In the Discovered Instances report, discovered instances are listed and each individual
database instance that is discovered has its own row in the report. This report has host,
database protocol, ports, instance name, and more details as the columns.

  • Go to Discover > Classification > Discover Sensitive Data.

The Discover Sensitive Data end-to-end scenario builder streamlines the processes of
discovery, protection, and compliance by integrating several Guardium tools into a single
interface.

  • Click the Plus icon to create a new scenario.

Guardium provides prebuilt sensitive data classification scenarios or templates. You can
reuse the classification policy that is associated with the scenario.

  • Specify a name, select PCI [template] as the classification policy and Relational (SQL) as datasource type.

Provide a name and description for the new discovery scenario. You can also specify security roles that can access the discovery scenario. You have 2 options for datasource type: Relational (SQL) for relational databases like MySQL or Document for document type databases like MongoDB.

  • To expand the rules section, click Next.

  • Review the rules.

Classification policies contain ordered sets of rules and rule actions that identify and act on sensitive data. Each rule in a policy defines a conditional action that is taken when the rule matches. The conditional test can be simple, for example, a wildcard string anywhere in a table or collection, or a complex test that considers multiple conditions. For sensitive data discovery scenarios, the action that is triggered by a rule can add the matches to a specified group or trigger a notification. Multiple grouping and alerting actions can be combined and ordered to create sophisticated responses to matched rules. This wizard guides you through creating and editing classification rules and rule actions in the discovery scenario.

  • To expand the datasources section, click Next.
  • Select the Raptor_DB2 datasource and click right arrow to move it to the selected datasources.

Datasources store information about the database or repository such as the type of
database, the location of the repository, or authentication credentials associated with it.
Adding datasources to a discovery scenario creates a classification process where
classification policies are applied to the selected datasources.
In this task, you identify the datasources to search for sensitive data.

  • Scroll down and click Save.
  • Expand the Run Discovery section and click Run Now. It will take several minutes to complete the discovery.
  • Expand the Review report section and view the results.

You defined policies for discovering sensitive data and identifying datasources to search.
Now, you can run the classification process and review the results.
In this example, Guardium discovered multiple sets of credit card data.
Running the process and reviewing the results enables policy refinement. For example, if
the results are too broad, specify more search criteria. It might be necessary to go through
several iterations to refine policies, run the process, and assess the results to achieve the
wanted results.

  • Expand the audit section and click New.
  • Select audit as the role. Then click OK.

You can define any number of receivers for the results of a discovery workflow, and you
can control the order in which they receive results. In addition, you can specify process
control options, such as whether a receiver needs to sign off on the results before they are
sent to the next receiver.

  • Review the receivers and click Save.

This demonstration discussed discovery and classification. To summarize, data discovery
and data classification help examine content and metadata to identify and classify sensitive
data.

Congratulations, you've reached the end of lab 102.

We reviewed the Discovery and Classification tool.

Click, lab 103 to start the next lab.