How to access the ADS-IDS aggregated data from MongoDB
This article outlines how to access aggregated NetFlow, ADS and IDS data in MongoDB.
Data Flow Overview
Data Ingestion via Logstash: ADS and IDS data are first written into OpenSearch using Logstash. This step centralizes raw network security event data in an on-premises OpenSearch instance.
Serverless Function Processing: Serverless functions then read the ADS/IDS data from OpenSearch. These functions aggregate the raw data into a summarized format for easier consumption.
Aggregated Data Storage: The aggregated results are written into a MongoDB database, which is hosted on the Lambda server at UND CEM.
User Consumption: The final aggregated data in MongoDB is made available for user consumption, as detailed in this document.
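To make the aggregation step in the flow above more concrete, here is a minimal sketch of what summarizing raw events could look like. It is not the actual serverless function: the field names `src_ip` and `event_type` and the function `aggregate_events` are hypothetical and chosen only for illustration. The real aggregated schema is described in the database documentation.

```python
from collections import Counter


def aggregate_events(events):
    """Count raw events per (src_ip, event_type) pair.

    `events` is a list of dicts; the keys used here are hypothetical
    and do not reflect the real ADS/IDS schema.
    """
    counts = Counter((e["src_ip"], e["event_type"]) for e in events)
    # Sort by key so the summary order is deterministic
    return [
        {"src_ip": ip, "event_type": kind, "count": n}
        for (ip, kind), n in sorted(counts.items())
    ]


raw_events = [
    {"src_ip": "10.0.0.1", "event_type": "alert"},
    {"src_ip": "10.0.0.1", "event_type": "alert"},
    {"src_ip": "10.0.0.2", "event_type": "flow"},
]
for row in aggregate_events(raw_events):
    print(row)
```

Each output row stands in for one aggregated document of the kind that would be written to MongoDB in step three.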
Before you start
Make sure that:
You can reach the Lambda server:
    ping <Lambda machine's IP address>
Expected response:
    PING lambda-scalar.ad.und.edu (172.16.232.253) 56(84) bytes of data.
    64 bytes from <Lambda ip>: icmp_seq=1 ttl=61 time=14.7 ms
    64 bytes from <Lambda ip>: icmp_seq=2 ttl=61 time=14.1 ms
    ^C
    --- <Lambda ip>: ping statistics ---
    6 packets transmitted, 6 received, 0% packet loss, time 5009ms
    rtt min/avg/max/mdev = 13.187/14.400/15.936/0.830 ms
If you cannot ping the server, you will not be able to reach the database it hosts.
Access to DB: Documents in the DB are protected and require credentials to access. To obtain these credentials, please contact the DB administrator. This guide assumes you already have them.
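A ping only shows that the machine answers ICMP. As a complementary check, the sketch below tests whether the MongoDB port itself accepts TCP connections. It assumes the database listens on port 27017, the port used later in this guide. The helper name `can_reach` is hypothetical, and a successful connection says nothing about whether your credentials are valid.

```python
import socket


def can_reach(host: str, port: int = 27017, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


# Example usage (replace the placeholder with the Lambda server's address):
# print(can_reach("<Lambda ip>"))
```

If this returns False while ping succeeds, the port may be blocked by a firewall or the database may not be running, which is worth mentioning when you contact the DB administrator.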
How to Access Data from the Database
The following steps outline how to consume the aggregated data from MongoDB.
Create a Virtual Environment (Optional):
If you are not using a Jupyter Notebook, it is good practice to create and activate a virtual environment.
    python -m venv env
    source env/bin/activate  # On Windows, use: env\Scripts\activate
Install PyMongo:
Install the PyMongo package, which is the official MongoDB driver for Python.
    pip install pymongo
Accessing the data:
Below is a sample script that lists the collections in the database and queries a single document.
    from urllib.parse import quote_plus

    from pymongo import MongoClient

    ####################### Credentials: Do not push this block to GitHub
    username = "<username>"
    password = "<password>"
    database_name = "c2sr_testbed_aggregated"
    host = "<Lambda ip>"
    port = 27017
    ###################### Do not push this block to GitHub

    # Percent-encode the credentials so special characters do not break the URI
    uri = (
        f"mongodb://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{database_name}?authSource={database_name}"
    )
    client = MongoClient(uri)
    db = client[database_name]

    # List the collections available in the database
    collections = db.list_collection_names()
    print("Collections in the database:", collections)

    # Fetch one document from a collection of your choice
    collection = db["<collection name>"]
    document = collection.find_one()
    print("One document from the collection:", document)
Detailed documentation about the database and its collections is available here.
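Because the credentials must not be pushed to GitHub, one option is to keep them out of the script entirely and read them from environment variables. The sketch below shows that approach. The variable names (`MONGO_USERNAME`, `MONGO_PASSWORD`, `MONGO_HOST`, `MONGO_PORT`, `MONGO_DATABASE`) and the helpers `build_uri` and `fetch_sample` are assumptions made for this example, not part of the existing setup.

```python
import os
from urllib.parse import quote_plus


def build_uri(env=os.environ) -> str:
    """Build a MongoDB URI from environment variables (names are hypothetical)."""
    # Percent-encode credentials so characters like '@' or ':' stay valid in the URI
    username = quote_plus(env["MONGO_USERNAME"])
    password = quote_plus(env["MONGO_PASSWORD"])
    host = env["MONGO_HOST"]
    port = int(env.get("MONGO_PORT", "27017"))
    database = env.get("MONGO_DATABASE", "c2sr_testbed_aggregated")
    return f"mongodb://{username}:{password}@{host}:{port}/{database}?authSource={database}"


def fetch_sample(collection_name: str, limit: int = 5) -> list:
    """Return up to `limit` documents from the given collection."""
    # Imported here so build_uri can be used even where PyMongo is not installed
    from pymongo import MongoClient

    database = os.environ.get("MONGO_DATABASE", "c2sr_testbed_aggregated")
    # Fail after 5 seconds instead of the default 30 if the server is unreachable
    client = MongoClient(build_uri(), serverSelectionTimeoutMS=5000)
    try:
        return list(client[database][collection_name].find().limit(limit))
    finally:
        client.close()
```

With the variables exported in your shell, `fetch_sample("<collection name>")` returns a small list of documents you can inspect before writing larger queries.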