Frequently Asked Questions

Kogni is a data security product. Kogni discovers sensitive data in enterprise data sources, secures it, and continuously monitors for new sensitive data. Kogni helps organizations comply with regulations and standards such as HIPAA, PCI, GDPR, FERPA, and others, and protect regulated data such as PHI.

  • Kogni automatically discovers sensitive data (credit card numbers, SSNs) in:
    • Hadoop, cloud storage (S3), NoSQL, Oracle, MySQL
    • Parquet, Avro
    • Text
    • Images
  • Secures it by encrypting, masking, tokenizing, or redacting
  • Monitors for new sensitive data locations and suspicious access patterns

Traditional data cataloging tools create data catalogs for structured datasets (tables, columns). Kogni automatically catalogs sensitive data in structured datasets, cloud object stores, log files, images, and scanned documents. After discovery, Kogni secures sensitive data by encrypting/masking/tokenizing/redacting, and it monitors and alerts on Sensitive Data Proliferation and suspicious access of sensitive data.
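To make the difference between masking, tokenizing, and redacting concrete, here is a minimal Python sketch. It is not Kogni's implementation: the function names, the `TOKEN_KEY` value, and the simple 16-digit card pattern are all assumptions chosen for illustration, and encryption is left out because it depends on the key-management setup.

```python
import hashlib
import hmac
import re

# Hypothetical key for illustration only; a real deployment would use a managed secret.
TOKEN_KEY = b"demo-key-not-for-production"

# Simple 16-digit card pattern with optional space or hyphen separators (illustrative only).
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` digits with '*', keeping separators."""
    total_digits = sum(ch.isdigit() for ch in value)
    to_hide = max(total_digits - visible, 0)
    out = []
    for ch in value:
        if ch.isdigit() and to_hide > 0:
            out.append("*")
            to_hide -= 1
        else:
            out.append(ch)
    return "".join(out)


def tokenize(value: str) -> str:
    """Return a deterministic keyed token, so equal inputs map to equal tokens."""
    digits = re.sub(r"\D", "", value)
    digest = hmac.new(TOKEN_KEY, digits.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def redact(text: str, label: str = "[REDACTED]") -> str:
    """Replace every card-like match in free text with a fixed label."""
    return CARD_PATTERN.sub(label, text)
```

In this sketch, masking keeps the value partly readable, tokenizing keeps it joinable across datasets without exposing it, and redacting removes it from free text entirely.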

The performance of scanning a large cluster depends on the amount of resources allocated to Kogni inspection jobs, the size of the data, and the number of sensitive information types. In our benchmarks, we were able to scan and catalog a few hundred terabytes of data within a few hours. Kogni has several optimizations built in to speed up scans:
  • Sampling
  • Incremental scanning
  • Uses Spark's scale-out architecture
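The first two optimizations can be sketched in a few lines of Python. This is an illustration of the general techniques, not Kogni's code: the `Table` type, its fields, and both function names are hypothetical, and the Spark scale-out part is not shown.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a dataset known to the scanner."""
    name: str
    modified_at: int  # last-modified time, epoch seconds
    rows: tuple


def tables_to_scan(tables, last_scan_at):
    """Incremental scanning: keep only tables modified since the previous scan."""
    return [t for t in tables if t.modified_at > last_scan_at]


def sample_rows(table, fraction, seed=0):
    """Sampling: inspect a random fraction of rows (at least one if the table is non-empty)."""
    if not table.rows:
        return []
    k = max(1, int(len(table.rows) * fraction))
    return random.Random(seed).sample(list(table.rows), k)
```

A scan loop under these assumptions would call `tables_to_scan` first and then run classifiers only on `sample_rows` of each remaining table, which is why both options reduce the work per scan.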

  • Authorized users can define new classifiers in the Kogni UI.
  • Authorized users can also flag an existing database column as the newly defined sensitive type. Behind the scenes, Kogni will fingerprint that column’s data to accurately detect the newly defined sensitive type in other locations.
  • Kogni also supports code-based User-Defined Classifiers for complex cases requiring custom logic.
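One way to picture column fingerprinting is as a set of hashes of the flagged column's values, which can then be compared against values found elsewhere. The Python sketch below assumes exactly that approach; how Kogni actually fingerprints data is not described here, and `normalize`, `fingerprint`, and `match_ratio` are hypothetical names.

```python
import hashlib


def normalize(value):
    """Strip all whitespace and lowercase so trivially different spellings match."""
    return "".join(str(value).split()).lower()


def _digest(value):
    return hashlib.sha256(normalize(value).encode("utf-8")).hexdigest()


def fingerprint(values):
    """Build a set of SHA-256 digests from the values of the flagged column."""
    return {_digest(v) for v in values}


def match_ratio(fp, candidate_values):
    """Fraction of candidate values whose digest appears in the fingerprint."""
    candidate_values = list(candidate_values)
    if not candidate_values:
        return 0.0
    hits = sum(_digest(v) in fp for v in candidate_values)
    return hits / len(candidate_values)
```

Under this assumption, a column in another location whose `match_ratio` exceeds a chosen threshold would be reported as containing the newly defined sensitive type. Storing digests rather than raw values also means the fingerprint itself does not hold the sensitive data in the clear.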

Kogni can be configured to send the automatically discovered sensitive data catalog to third-party cataloging tools, Cloudera Navigator, or Apache Atlas. This information can be used to define fine-grained access control policies using native RBAC providers such as Apache Sentry, Apache Ranger, BlueTalon, etc.

Kogni can be used together with tools such as Nmap to first discover the systems available in the environment. Once those systems are identified and their credentials are provided to Kogni, a Kogni scan can be initiated to identify sensitive data.

We support multiple licensing models, with the annual subscription model being the most common. Contact sales for more information.

Kogni has classifiers for a broad range of sensitive types. The currently supported classifiers include:


1. First Name
2. Last Name
3. Credit Card
4. Email Address
5. IP Address
6. MAC Address
7. URL
8. Date of Birth
9. Full Face Photographic Image
11. Vehicle Identification Number (VIN)
13. Credit Card Image
14. Passport Image
15. Street Address
17. Biometric Identifiers
18. International Phone Number
19. Driver's License Image
20. Facial Image
21. Fingerprints
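As an illustration of how a pattern-based classifier for a type such as Credit Card can cut down on false positives, the Python sketch below pairs a loose digit pattern with the standard Luhn checksum. This is a generic technique and not a description of Kogni's classifiers; the `CANDIDATE` pattern and both function names are assumptions.

```python
import re

# Loose candidate pattern: 13 to 19 digits with optional space or hyphen separators.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str):
    """Return candidate matches in `text` that also pass the Luhn check."""
    return [m.group() for m in CANDIDATE.finditer(text) if luhn_valid(m.group())]
```

The checksum step is what separates a likely card number from an arbitrary 16-digit reference number that happens to match the pattern.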


1. Social Security Number
2. Individual Taxpayer Identification Number (ITIN)
3. US Passport Number
4. US Phone Number
5. US Driver's License (All States)
6. US State
7. US Zip Code
8. Social Security Card Image

National Identifiers

1. Austria - ZMR-Zahl, ASVG, ssPIN
2. Belgium - BE.ID
3. Bulgaria - EGN
4. Czech Republic - RČ, ČOP
5. Slovakia - RČ, ČOP
6. Denmark - CPR
7. Estonia - IK
8. Europe - IBAN
9. Finland - HETU
10. France - NIR
11. Germany - PK, Steuer-ID, VSNR, RVNR
12. Greece - Tautotita
13. Hungary - TAJ, Szam
14. India - Aadhaar Card Number
15. Ireland - PPS
16. Italy - CF
17. Latvia - PL
18. Lithuania - AK
19. Netherlands - BSN
20. Norway - FN
21. Poland - PESEL
22. Romania - CNF
23. Spain - DNI
24. Sweden - Personnr
25. Switzerland - AVS, AVS2008
27. US - SSN


1. Bank Account Number
2. ABA Routing Number
4. SWIFT Code
5. CVV Number
6. American Bankers CUSIP ID

Health Care

1. US National Provider Identifier
2. US DEA Number
3. ICD9 Code
4. ICD10 Code
5. FDA Code

Yes. Kogni integrates with and complements third-party cataloging tools, Navigator, and Atlas. When Kogni discovers new sensitive data, it can automatically push the discovery information to these tools. This information can be used to define fine-grained access control policies using Apache Sentry or Apache Ranger.

Yes. Kogni integrates with third-party encryption and tokenization tools.

Yes. The sensitive data catalog is available through an API in addition to the UI.

A typical engagement starts with the Kogni team doing a data risk assessment. Based on the data risk assessment, Kogni is set up and configured for the customer's risk profile. The one-time setup and configuration is included in the license pricing. For new installations, we recommend starting with sampling:

  • Scan only a few tables, and only a sample of rows.
  • Then increase coverage in steps.
  • Schedule scans for off-peak hours.

Kogni's secure functionality secures data as it is ingested into the data lake. For data that is already on the data lake, securing a column requires rewriting the data on HDFS; for this case we provide custom workflows that perform a one-time securing of the existing data.

Yes. Kogni automatically finds sensitive data covered by HIPAA, including images (facial images, fingerprints).

Kogni does not access data at the network packet layer. Kogni is a Java-based application and needs read access to Hadoop and the databases it scans. Like most vulnerability management solutions, Kogni relies on authenticated scanning, and it requires this access to properly discover and catalog data elements.
