Application Usage¶
Installation¶
Prerequisites¶
Installation Command¶
Clone this repository and run `poetry install` to install it with its dependencies.
Usage¶
See `psykoda -h`.
This application is designed to be run periodically, with a human then inspecting the detected anomalies and responding to them.
digraph G
{
    {
        node [shape=box]
        log [label="IDS log"]
        el [label="Exclude Lists"]
        Config
        Anomalies
        FP [label="False Positives"]
        TP [label="True Positives"]
    }
    {
        node [shape=ellipse style=dashed]
        human [label="Manual Inspection"]
        response
    }
    {
        node [shape=ellipse]
        ees [label="extract-exclude-screening"]
        feature [label="Feature Extraction"]
        split [label="train-apply Split"]
        detection [label="training and detection"]
    }
    log -> ees
    el -> ees
    ees -> feature -> split -> detection -> Anomalies -> human -> FP -> feature
    human -> TP -> response
    Config:s -> {ees:e, feature:e, split:e, detection:e} [style=dotted]
}
Detection requires a range of dates. For each date in the range, a model is trained on the IDS log and the known false positives from before that date. The model is then applied to the IDS log of that date to detect that date's anomalies.
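The per-date split described above can be sketched as follows. This is only an illustration, not the application's actual code: the function names, the `(record_date, payload)` record shape, and the choice to treat the date range as inclusive are all assumptions.

```python
from datetime import date, timedelta


def iter_dates(start: date, end: date):
    """Yield each date from start to end (inclusive; an assumption)."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def split_for_date(records, target: date):
    """Split records into (training, application) lists for one target date.

    records: iterable of (record_date, payload) tuples (hypothetical shape).
    """
    training = [r for r in records if r[0] < target]       # strictly before
    application = [r for r in records if r[0] == target]   # the date itself
    return training, application


# Hypothetical sample data: one record per day.
records = [
    (date(2021, 1, 1), "a"),
    (date(2021, 1, 2), "b"),
    (date(2021, 1, 3), "c"),
]

for day in iter_dates(date(2021, 1, 2), date(2021, 1, 3)):
    training, application = split_for_date(records, day)
    print(day, "train on", len(training), "apply to", len(application))
```

Known false positives from before the target date would be added to the training side in the same way.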
Input¶
IDS log¶
The main input of this application is IDS log files. Each log record must include the following fields:
- timestamp
- source IP address
- destination IP address
- destination port number
- IDS signature ID
Log format | Support |
---|---|
FS-CSV | fully supported |
snort-CSV | partially tested for Snort 2.x |
snort-syslog | on roadmap |
Snort 3 | on roadmap |
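As a rough illustration of the five required fields, a record could be modelled as below. The class name, the attribute names, the dictionary keys, and the ISO 8601 timestamp format are all assumptions made for this sketch; the actual column names depend on the log format in use.

```python
from dataclasses import dataclass
from datetime import datetime
from ipaddress import IPv4Address, IPv6Address, ip_address
from typing import Union

IPAddress = Union[IPv4Address, IPv6Address]


@dataclass(frozen=True)
class LogRecord:
    """One IDS log record with the five required fields (names hypothetical)."""
    timestamp: datetime
    src_ip: IPAddress
    dest_ip: IPAddress
    dest_port: int
    signature_id: str


def parse_record(row: dict) -> LogRecord:
    """Build a LogRecord from a dict of strings.

    Raises KeyError if a required field is missing and ValueError if a
    value cannot be parsed.
    """
    return LogRecord(
        timestamp=datetime.fromisoformat(row["timestamp"]),
        src_ip=ip_address(row["src_ip"]),
        dest_ip=ip_address(row["dest_ip"]),
        dest_port=int(row["dest_port"]),
        signature_id=row["signature_id"],
    )


record = parse_record({
    "timestamp": "2021-01-01 12:00:00",
    "src_ip": "10.0.0.5",
    "dest_ip": "192.0.2.10",
    "dest_port": "443",
    "signature_id": "2001",
})
print(record)
```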
Exclude lists¶
Log records can be excluded from the whole analysis by pattern matching on their field values.
Data Type | Pattern |
---|---|
IP address | CIDR format |
other | exact match |
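The two matching rules in the table can be sketched with the standard `ipaddress` module. The function names, the field names, and the rule that a record is excluded when any single pattern matches are assumptions for illustration, not the application's actual behaviour.

```python
from ipaddress import ip_address, ip_network


def matches_cidr(value: str, patterns) -> bool:
    """Return True if the IP address falls inside any CIDR pattern."""
    addr = ip_address(value)
    return any(addr in ip_network(p, strict=False) for p in patterns)


def matches_exact(value, patterns) -> bool:
    """Return True if the value equals any pattern exactly."""
    return value in set(patterns)


def is_excluded(record: dict, ip_rules: dict, exact_rules: dict) -> bool:
    """Exclude a record if any field matches any of its patterns (assumed)."""
    for field, patterns in ip_rules.items():
        if matches_cidr(record[field], patterns):
            return True
    for field, patterns in exact_rules.items():
        if matches_exact(record[field], patterns):
            return True
    return False


# Hypothetical exclude lists keyed by field name.
ip_rules = {"src_ip": ["10.0.0.0/8"]}
exact_rules = {"dest_port": [53]}

records = [
    {"src_ip": "10.1.2.3", "dest_port": 443},     # excluded by CIDR
    {"src_ip": "192.0.2.7", "dest_port": 53},     # excluded by exact match
    {"src_ip": "192.0.2.7", "dest_port": 443},    # kept
]
kept = [r for r in records if not is_excluded(r, ip_rules, exact_rules)]
print(kept)
```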
Labels¶
Known normal samples help semi-supervised anomaly detection reduce false positives. They can be provided as a list of (timestamp, source IP address) pairs. Their feature values are constructed from the corresponding log records saved in previous executions.
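A minimal sketch of how such labels might be paired with saved log records is shown below. The in-memory record shape, the field names, and exact timestamp matching are assumptions; the application may store records differently and match at a coarser time granularity.

```python
from datetime import datetime


def records_for_labels(labels, saved_records):
    """Return saved records whose (timestamp, src_ip) appears in labels."""
    wanted = set(labels)
    return [
        r for r in saved_records
        if (r["timestamp"], r["src_ip"]) in wanted
    ]


# Hypothetical records saved by a previous execution.
saved_records = [
    {"timestamp": datetime(2021, 1, 1, 12), "src_ip": "10.0.0.5", "dest_port": 443},
    {"timestamp": datetime(2021, 1, 1, 13), "src_ip": "10.0.0.9", "dest_port": 22},
]

# Known normal samples as (timestamp, source IP address) pairs.
labels = [(datetime(2021, 1, 1, 12), "10.0.0.5")]

normal_records = records_for_labels(labels, saved_records)
print(normal_records)
```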
Configuration¶
All configuration goes in a single configuration file, whose path must be passed with the required `--path_config` option.
The configuration file must be JSON with an object at the top level.
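The following sketch shows one way a required path option and a top-level JSON object could be handled with the standard library. It is not the application's own parser: the parser, the `load_config` helper, and the `output_dir` key are hypothetical, and only the option name `--path_config` comes from this document.

```python
import argparse
import json
import os
import tempfile


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser with a required --path_config option."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("--path_config", required=True,
                        help="path to the JSON configuration file")
    return parser


def load_config(path: str) -> dict:
    """Load the configuration and check that the top level is an object."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object at the top level")
    return config


# Demonstration with a temporary file and a hypothetical key.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"output_dir": "out"}, f)
    args = build_parser().parse_args(["--path_config", path])
    config = load_config(args.path_config)
    print(config)
```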
Refer to:
- API Reference for keys and definitions.
- `config.json` included in this repository for an example.
- Best Practices for Working with Configuration in Python Applications when modifying the example app to create your own app with configuration: use dataclasses to define the configuration and dacite to convert from a `dict`.
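The dataclass approach can be sketched as below. To keep the example free of third-party dependencies, the nested conversion is written by hand; it stands in for dacite rather than using it. The class names and configuration keys are hypothetical and do not reflect the application's real schema, for which the API Reference is the authority.

```python
import json
from dataclasses import dataclass


@dataclass
class OutputConfig:
    """Hypothetical nested section of the configuration."""
    dir: str


@dataclass
class AppConfig:
    """Hypothetical top-level configuration."""
    output: OutputConfig
    threshold: float = 0.5

    @classmethod
    def from_dict(cls, data: dict) -> "AppConfig":
        # Hand-written nested conversion. With dacite installed, a call
        # such as dacite.from_dict(data_class=AppConfig, data=data)
        # would perform this step instead.
        return cls(
            output=OutputConfig(**data["output"]),
            threshold=float(data.get("threshold", 0.5)),
        )


raw = json.loads('{"output": {"dir": "out"}, "threshold": 0.9}')
config = AppConfig.from_dict(raw)
print(config)
```

Typed dataclasses make missing or misspelled keys fail at load time instead of deep inside a run.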
Output¶
Inside the output directory specified in config, a subdirectory will be created for each detection unit.
File | Description |
---|---|
stats.json | metadata |
report.csv | anomalies detected |
plot_detection.png | visualization |
Log records for each anomaly are also provided, both for manual inspection and for use as known false positives in subsequent executions.
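A small sketch of collecting results from such an output directory follows. Only the file names `stats.json` and `report.csv` come from the table above; the helper name, the subdirectory name, and the file contents used in the demonstration are made up.

```python
import csv
import json
import tempfile
from pathlib import Path


def load_results(output_dir) -> dict:
    """Map each subdirectory name to its parsed stats and report rows."""
    results = {}
    for subdir in sorted(Path(output_dir).iterdir()):
        stats_path = subdir / "stats.json"
        report_path = subdir / "report.csv"
        # Skip anything that is not a complete result subdirectory.
        if not (subdir.is_dir() and stats_path.is_file() and report_path.is_file()):
            continue
        with open(stats_path, encoding="utf-8") as f:
            stats = json.load(f)
        with open(report_path, newline="", encoding="utf-8") as f:
            rows = list(csv.DictReader(f))
        results[subdir.name] = {"stats": stats, "anomalies": rows}
    return results


# Demonstration with a temporary directory and hypothetical contents.
with tempfile.TemporaryDirectory() as tmp:
    unit = Path(tmp) / "unit-a"
    unit.mkdir()
    (unit / "stats.json").write_text('{"records": 10}', encoding="utf-8")
    (unit / "report.csv").write_text("src_ip,score\n10.0.0.5,0.97\n",
                                     encoding="utf-8")
    results = load_results(tmp)
    print(results)
```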