Frequently Asked Questions
Learn more about Kogni


Kogni is a data security product. It discovers sensitive data in enterprise data sources, secures it, and continuously monitors for new sensitive data. Kogni helps organizations comply with regulations and standards such as HIPAA (including its protection of PHI), PCI, GDPR, FERPA, and others.
  • Kogni automatically discovers sensitive data (credit card numbers, SSNs) in:
    • Hadoop, cloud storage (S3), NoSQL, Oracle, MySQL
    • Parquet, Avro
    • Text
    • Images
  • Secures it by encrypting, masking, tokenizing, or redacting
  • Monitors for new sensitive data locations and suspicious access patterns
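To make the difference between masking, tokenizing, and redacting concrete, here is a minimal Python sketch. It is not Kogni's implementation: the function names, the SSN pattern, and the use of HMAC-SHA256 as a one-way token are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import re

# Hypothetical pattern for US SSNs written as 123-45-6789.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hide every character except the last `visible` ones."""
    if visible <= 0:
        return fill * len(value)
    return fill * max(len(value) - visible, 0) + value[-visible:]


def tokenize(value: str, key: bytes) -> str:
    """Replace a value with a deterministic keyed token.

    The same value and key always give the same token, so joins
    still work, but the token cannot be reversed without a lookup table.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def redact(text: str, label: str = "[REDACTED]") -> str:
    """Remove every SSN-shaped substring from free text."""
    return SSN_RE.sub(label, text)
```

Masking keeps part of the value readable, tokenizing keeps values comparable without exposing them, and redacting removes the value from free text entirely.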
Kogni’s sensitive data discovery goes beyond basic regex and checksum:
  • It looks at context words to better decide the semantics of data.
  • When an authorized user flags a column as a certain sensitive type (e.g., medical_acct_nr), Kogni computes the fingerprint of that column’s data. The fingerprint captures the distinguishing characteristics of the column’s data. Kogni uses fingerprints to accurately detect sensitive information in other data stores.
  • Kogni learns from user feedback. Authorized users can accept/reject Kogni’s discovery results. Behind the scenes, Kogni incorporates user feedback into its underlying models to continually improve their accuracy.
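As a rough illustration of how a regex, a checksum, and context words can be combined, the sketch below detects credit card numbers in free text. It is an assumption-laden example, not Kogni's detection logic: the pattern, the context word list, the window size, and the confidence labels are all hypothetical.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
# Hypothetical context words that make a card number more likely.
CONTEXT_WORDS = {"card", "credit", "visa", "mastercard", "amex", "payment"}


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str, window: int = 40) -> list:
    """Return regex matches that pass Luhn, scored by nearby context words."""
    findings = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if not luhn_ok(digits):
            continue  # regex matched, but the checksum rules it out
        nearby = text[max(m.start() - window, 0): m.end() + window].lower()
        has_context = any(word in nearby for word in CONTEXT_WORDS)
        findings.append({
            "match": m.group(),
            "confidence": "high" if has_context else "medium",
        })
    return findings
```

The checksum removes most random digit runs, and the context check separates a number that appears next to "credit card" from one that merely happens to be Luhn-valid.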
Yes.
  • Authorized users can define new classifiers in the Kogni UI.
  • Authorized users can also flag an existing database column as the newly defined sensitive type. Behind the scenes, Kogni will fingerprint that column’s data to accurately detect the newly defined sensitive type in other locations.
  • Kogni also supports code-based User-Defined Classifiers for complex cases requiring custom logic.
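One way to picture column fingerprinting is as a summary of the "shapes" of the values in a flagged column, which can then be compared against other columns. The Python sketch below is purely illustrative: Kogni's actual fingerprint format is not described here, and `shape`, `fingerprint`, and `similarity` are hypothetical names.

```python
from collections import Counter


def shape(value: str) -> str:
    """Reduce a value to its character-class shape, e.g. 'MA-100234' -> 'AA-999999'."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # keep punctuation and separators as-is
    return "".join(out)


def fingerprint(values) -> dict:
    """Return the relative frequency of each shape in a column."""
    counts = Counter(shape(v) for v in values)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def similarity(fp_a: dict, fp_b: dict) -> float:
    """Overlap of two shape distributions, from 0.0 (disjoint) to 1.0 (identical)."""
    return sum(min(p, fp_b.get(s, 0.0)) for s, p in fp_a.items())
```

For example, a column flagged as medical_acct_nr with values like "MA-100234" would score 1.0 against another column holding "ZX-000001" and 0.0 against a column of email addresses. A production fingerprint would likely capture more than shapes (lengths, value ranges, checksums), so treat this as a starting point only.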
Kogni can be configured to send the automatically discovered sensitive data catalog to third-party cataloging tools such as Cloudera Navigator or Apache Atlas. This information can be used to define fine-grained access control policies using native RBAC providers such as Apache Sentry, Apache Ranger, BlueTalon, etc.
Kogni’s secure functionality secures data as it is ingested into the data lake. If the data is already present in the data lake, Kogni rewrites it on HDFS when a column is secured. For this case, we provide custom workflows that perform a one-time securing of the existing data.
The performance of scanning a large cluster depends on the resources allocated to Kogni inspection jobs, the size of the data, and the number of sensitive information types. In our benchmarks, Kogni scanned and cataloged a few hundred terabytes of data within a few hours. Kogni has a number of optimizations built in to speed up scans:
  • Sampling
  • Incremental scanning
  • Uses Spark’s scale-out architecture
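Sampling and incremental scanning can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not Kogni's scheduler: the function names are hypothetical, tables are represented as a mapping from name to last-modified timestamp, and no Spark is involved.

```python
import random


def sample_rows(rows, fraction: float, seed: int = 0) -> list:
    """Return a random subset of rows; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


def tables_to_scan(tables: dict, last_scan: dict) -> list:
    """Pick tables that are new or were modified after their last scan.

    `tables` maps table name -> last-modified timestamp.
    `last_scan` maps table name -> timestamp of the previous scan.
    """
    return sorted(
        name
        for name, modified in tables.items()
        if modified > last_scan.get(name, float("-inf"))
    )
```

Sampling bounds the cost of each table, while incremental scanning skips tables that have not changed since the previous run, so repeated scans touch only new or modified data.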
Yes. Kogni integrates with and complements third-party cataloging tools such as Navigator and Atlas. When Kogni discovers new sensitive data, it can automatically push the discovery information to these tools. This information can be used to define fine-grained access control policies using Apache Sentry or Apache Ranger.
Yes. The sensitive data catalog is available via API in addition to the UI.
Kogni does not operate at the network packet layer. Kogni is a Java-based application and needs read access to Hadoop and databases. It follows the same model as most vulnerability management solutions, which use authenticated scanning. Kogni requires such access to properly discover and catalog the data elements.
Yes. Kogni automatically finds sensitive data covered by HIPAA, including images (facial images, fingerprints).
A typical engagement starts with the Kogni team doing a data risk assessment. Based on the data risk assessment, Kogni is set up and configured for the customer’s risk profile. The one-time setup and configuration is included in the license pricing. For new installations, we recommend starting with sampling:
  • Scan only a few tables, and only a sample of rows.
  • Then increase coverage in steps.
  • Schedule scans for off-peak hours.
Kogni can be used together with tools such as Nmap to first discover the systems available in the environment. Once these systems are identified and their credentials are provided to Kogni, a Kogni scan can be initiated to identify sensitive data.
We support multiple licensing models, with the annual subscription model being the most common. Contact sales for more information.
Traditional data cataloging tools create data catalogs for structured datasets (tables, columns). Kogni automatically catalogs sensitive data in structured datasets, cloud object stores, log files, images, and scanned documents. After discovery, Kogni secures sensitive data by encrypting, masking, tokenizing, or redacting it, and it monitors and alerts on sensitive data proliferation and suspicious access to sensitive data.
Kogni has classifiers for a broad range of sensitive types. The list of currently supported classifiers includes:

    Global

    1. First Name
    2. Last Name
    3. Credit Card
    4. Email Address
    5. IP Address
    6. MAC Address
    7. URL
    8. Date of Birth
    9. Full Face Photographic Image
    10. City
    11. Vehicle Identification Number (VIN)
    12. Password
    13. Credit Card Image
    14. Passport Image
    15. Street Address
    16. Country
    17. Biometric Identifiers
    18. International Phone Number
    19. Driver's License Image
    20. Facial Image
    21. Fingerprints

    USA

    1. Social Security Number
    2. Individual Taxpayer Identification Number (ITIN)
    3. US Passport Number
    4. US Phone Number
    5. US Driver's License (All States)
    6. US State
    7. US Zip Code
    8. Social Security Card Image

    National Identifiers

    1. Austria - ZMR-Zahl, ASVG, ssPIN
    2. Belgium - BE.ID
    3. Bulgaria - EGN
    4. Czech Republic - RČ, ČOP
    5. Slovakia - RČ, ČOP
    6. Denmark - CPR
    7. Estonia - IK
    8. Europe - IBAN
    9. Finland - HETU
    10. France - NIR
    11. Germany - PK, Steuer-ID, VSNR, RVNR
    12. Greece - Tautotita
    13. Hungary - TAJ, Szam
    14. India - Aadhaar Card Number
    15. Ireland - PPS
    16. Italy - CF
    17. Latvia - PL
    18. Lithuania - AK
    19. Netherlands - BSN
    20. Norway - FN
    21. Poland - PESEL
    22. Romania - CNF
    23. Spain - DNI
    24. Sweden - Personnr
    25. Switzerland - AVS, AVS2008
    26. UK - NI, NINO, NHS
    27. US - SSN

    Finance

    1. Bank Account Number
    2. ABA Routing Number
    3. IBAN
    4. SWIFT Code
    5. CVV Number
    6. American Bankers CUSIP ID

    Health Care

    1. US National Provider Identifier
    2. US DEA Number
    3. ICD9 Code
    4. ICD10 Code
    5. FDA Code