Application Usage

Installation

Prerequisites

Install Python and Poetry.

Installation Command

Clone this repository, then run poetry install to install the application together with its dependencies.

Usage

See psykoda -h.

This application is designed for periodic execution, followed by manual inspection of the detected anomalies and response to them.

digraph G
{
  {
    node [shape=box]
    log [label="IDS log"]
    el [label="Exclude Lists"]
    Config
    Anomalies
    FP [label="False Positives"]
    TP [label="True Positives"]
  }
  {
    node [shape=ellipse style=dashed]
    human [label="Manual Inspection"]
    response
  }
  {
    node [shape=ellipse]
    ees [label="extract-exclude-screening"]
    feature [label="Feature Extraction"]
    split [label="train-apply Split"]
    detection [label="training and detection"]
  }
  log -> ees
  el -> ees
  ees -> feature -> split -> detection -> Anomalies -> human -> FP -> feature
  human -> TP -> response
  Config:s -> {ees:e, feature:e, split:e, detection:e} [style=dotted]
}

A date range is required to run detection. For each date in the range, a model is trained on the IDS log and the false positives from before that date. The model is then applied to that date's IDS log to detect that date's anomalies.
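The per-date train-apply split described above can be sketched as follows. This is a minimal illustration, not the application's actual code: the helper names iter_dates and split_for_date, the (date, payload) record layout, and the inclusive end date are all assumptions, and false-positive labels are omitted for brevity.

```python
from datetime import date, timedelta


def iter_dates(start: date, end: date):
    """Yield every date from start to end (end inclusive: an assumption)."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def split_for_date(records, target: date):
    """Split (date, payload) records for one detection date.

    Records strictly before `target` form the training set;
    records dated `target` are the ones the model is applied to.
    """
    train = [r for r in records if r[0] < target]
    to_apply = [r for r in records if r[0] == target]
    return train, to_apply


# Hypothetical toy log: one record per day, 2021-01-01 through 2021-01-05.
records = [(date(2021, 1, d), f"record-{d}") for d in range(1, 6)]

for target in iter_dates(date(2021, 1, 4), date(2021, 1, 5)):
    train, to_apply = split_for_date(records, target)
    print(target, "train:", len(train), "apply:", len(to_apply))
```

Because each date uses only earlier records for training, a later date in the range trains on strictly more data than an earlier one.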

Input

IDS log

The main input of this application is IDS log files. Each log record must include the following fields:

  • timestamp

  • source IP address

  • destination IP address

  • destination port number

  • IDS signature ID

Type          Support
FS-CSV        fully supported
snort-CSV     partially tested for Snort 2.x
snort-syslog  on roadmap
Snort 3       on roadmap

Exclude lists

Pattern matching on field values can be used to exclude part of the log from the entire analysis.

Data Type   Pattern
IP address  CIDR format
other       exact match
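The two matching rules can be illustrated with Python's standard ipaddress module. This is only a sketch under stated assumptions: the field names (src_ip, dest_ip, dest_port), the dict-based record and exclude-list layout, and the rule that a single matching pattern is enough to exclude a record are all hypothetical and not taken from the application.

```python
import ipaddress

# Hypothetical names of the fields that hold IP addresses.
IP_FIELDS = {"src_ip", "dest_ip"}


def matches(field: str, value: str, pattern: str) -> bool:
    """Match IP fields against a CIDR pattern; match other fields exactly."""
    if field in IP_FIELDS:
        network = ipaddress.ip_network(pattern, strict=False)
        return ipaddress.ip_address(value) in network
    return value == pattern


def is_excluded(record: dict, exclude_lists: dict) -> bool:
    """Return True if any field value matches any pattern listed for it."""
    return any(
        matches(field, record[field], pattern)
        for field, patterns in exclude_lists.items()
        if field in record
        for pattern in patterns
    )


# Hypothetical exclude lists: a private network and DNS traffic.
exclude_lists = {"src_ip": ["10.0.0.0/8"], "dest_port": ["53"]}
record = {"src_ip": "10.1.2.3", "dest_ip": "192.0.2.10", "dest_port": "443"}
print(is_excluded(record, exclude_lists))
```

A single address can be written as a /32 (IPv4) or /128 (IPv6) network, so CIDR patterns also cover exact IP matches.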

Labels

Known normal samples help semi-supervised anomaly detection reduce false positives. They can be provided as a list of (timestamp, source IP address) pairs. Their feature values are constructed from the corresponding log records saved in previous executions.

Configuration

All configuration goes in a single configuration file, whose path must be passed with the required --path_config option. The configuration file must be in JSON format, with an object as its top-level value.
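For illustration, the two constraints above (a required --path_config option and a JSON file whose top-level value is an object) could be checked as in the sketch below. It is not the application's own loader: parse_args, parse_config, and load_config are hypothetical helpers, and no configuration keys are assumed.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the command line; --path_config is mandatory."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--path_config", required=True)
    return parser.parse_args(argv)


def parse_config(text: str) -> dict:
    """Decode JSON text and require a top-level object."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object at the top level")
    return config


def load_config(path: str) -> dict:
    """Read and validate the configuration file at `path`."""
    with open(path, encoding="utf-8") as f:
        return parse_config(f.read())


# Explicit argv for demonstration; the file name is a placeholder.
args = parse_args(["--path_config", "config.json"])
print(args.path_config)
```

Rejecting a non-object top-level value early gives a clearer error than a failed key lookup later in the run.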

Refer to:

Output

Inside the output directory specified in the configuration, a subdirectory is created for each detection unit.

File                Description
stats.json          metadata
report.csv          anomalies detected
plot_detection.png  visualization
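The output files can be collected for review with a few lines of Python. The sketch below assumes only the layout described above (one subdirectory per detection unit containing stats.json and report.csv). The helper names are hypothetical, and because the columns of report.csv are not specified here, rows are read generically by header name.

```python
import csv
import json
from pathlib import Path


def load_unit_outputs(unit_dir):
    """Return (stats, anomalies) for one detection-unit subdirectory."""
    unit_dir = Path(unit_dir)
    with open(unit_dir / "stats.json", encoding="utf-8") as f:
        stats = json.load(f)
    with open(unit_dir / "report.csv", newline="", encoding="utf-8") as f:
        anomalies = list(csv.DictReader(f))  # one dict per reported anomaly
    return stats, anomalies


def iter_units(output_dir):
    """Yield (unit name, (stats, anomalies)) for each complete subdirectory."""
    for unit_dir in sorted(Path(output_dir).iterdir()):
        has_stats = (unit_dir / "stats.json").is_file()
        has_report = (unit_dir / "report.csv").is_file()
        if has_stats and has_report:
            yield unit_dir.name, load_unit_outputs(unit_dir)
```

Calling iter_units on the configured output directory then gives one entry per detection unit, which can be printed or passed on for manual inspection.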

Log records for each anomaly are also provided, both for manual inspection and for use as known false positives in subsequent executions.