c2sr-testbed-user-docs 0.0.5 Help

How to access the ADS-IDS aggregated data from MongoDB

This article outlines how to access aggregated NetFlow, ADS, and IDS data in MongoDB.

Data Flow Overview

  • Data Ingestion via Logstash: ADS and IDS data are first written into OpenSearch using Logstash. This step centralizes raw network security event data in an on-premises OpenSearch instance.

  • Serverless Function Processing: Serverless functions then read the ADS/IDS data from OpenSearch. These functions aggregate the raw data into a summarized format for easier consumption.

  • Aggregated Data Storage: The aggregated results are written into a MongoDB database, which is hosted on the Lambda server at UND CEM.

  • User Consumption: The final aggregated data in MongoDB is made available for user consumption, as detailed in this document.
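To make the aggregation step more concrete, the following is a minimal sketch of how raw events could be rolled up into per-source summaries. The field names (`src_ip`, `severity`) and the `aggregate_events` helper are hypothetical and are not taken from the testbed's serverless functions; the actual schema of the aggregated documents is defined by the database itself.

```python
from collections import Counter, defaultdict


def aggregate_events(events):
    """Group raw events by source IP and count them per severity.

    The "src_ip" and "severity" fields are hypothetical examples.
    """
    summary = defaultdict(Counter)
    for event in events:
        summary[event["src_ip"]][event["severity"]] += 1
    # Convert to plain dicts so each summary could be stored as a document
    return [
        {"src_ip": ip, "total": sum(counts.values()), "by_severity": dict(counts)}
        for ip, counts in sorted(summary.items())
    ]


# Made-up sample events, for illustration only
raw_events = [
    {"src_ip": "10.0.0.1", "severity": "high"},
    {"src_ip": "10.0.0.1", "severity": "low"},
    {"src_ip": "10.0.0.2", "severity": "high"},
]
summaries = aggregate_events(raw_events)
for doc in summaries:
    print(doc)
```

Each summary is a plain dictionary, which is the shape of record a document store such as MongoDB accepts directly.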

Before you start

Make sure that:

  • You can reach the Lambda server:

    ping <Lambda machine's IP address>

    Expected response:

    PING lambda-scalar.ad.und.edu (172.16.232.253) 56(84) bytes of data.
    64 bytes from <Lambda ip>: icmp_seq=1 ttl=61 time=14.7 ms
    64 bytes from <Lambda ip>: icmp_seq=2 ttl=61 time=14.1 ms
    ^C
    --- <Lambda ip> ping statistics ---
    6 packets transmitted, 6 received, 0% packet loss, time 5009ms
    rtt min/avg/max/mdev = 13.187/14.400/15.936/0.830 ms

    If the ping fails, you will not be able to reach the server that hosts the database, or the data it contains.

  • Access to the database: Documents in the database are protected, so you need credentials to read them. To obtain credentials, contact the database administrator. This guide assumes you already have them.
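A successful ping only shows that the host answers ICMP; it does not confirm that the MongoDB port is reachable from your machine. As an optional extra check, the sketch below attempts a TCP connection to port 27017 (the port used later in this guide). The `can_reach` helper is a hypothetical name introduced here for illustration.

```python
import socket


def can_reach(host, port=27017, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


# Example usage (replace the placeholder with the Lambda server's IP address):
# print(can_reach("<Lambda ip>"))
```

If this returns False while ping succeeds, a firewall rule or a stopped database service is a likely cause, and the database administrator should be able to help.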

How to Access Data from the Database

The following steps outline how to consume the aggregated data from MongoDB.

  1. Create a Virtual Environment (Optional):
    If you are not using a Jupyter Notebook, it's a good practice to create and activate a virtual environment.

    python -m venv env
    source env/bin/activate  # On Windows, use: env\Scripts\activate
  2. Install PyMongo: Install the PyMongo package, which is the official MongoDB driver for Python.

    pip install pymongo
  3. Access the data: Below is a sample script that queries a single document.

    from pymongo import MongoClient

    #######################
    # Credentials: Do not push this block to GitHub
    username = "<username>"
    password = "<password>"
    database_name = "c2sr_testbed_aggregated"
    host = "<Lambda ip>"
    port = 27017
    ###################### Do not push this block to GitHub

    # Authenticate against the same database that holds the data
    uri = f"mongodb://{username}:{password}@{host}:{port}/{database_name}?authSource={database_name}"
    client = MongoClient(uri)
    db = client[database_name]

    # List the collections available in the database
    collections = db.list_collection_names()
    print("Collections in the database:", collections)

    # Fetch a single document from the chosen collection
    collection = db["<collection name>"]
    document = collection.find_one()
    print("One document from the collection:", document)
  4. Detailed documentation about the database and its collections is available here.
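Because the credentials must not be pushed to GitHub, one option is to keep them out of the script entirely and read them from environment variables. The sketch below shows one way this might look. The variable names `MONGO_USERNAME` and `MONGO_PASSWORD` and the `build_mongo_uri` helper are assumptions made for this example, not part of the testbed setup. The credentials are percent-escaped with `urllib.parse.quote_plus`, which matters when a password contains characters such as `@`, `:` or `/`.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(host, database_name, port=27017):
    """Build a MongoDB URI from credentials stored in environment variables."""
    # MONGO_USERNAME and MONGO_PASSWORD are hypothetical variable names
    username = quote_plus(os.environ["MONGO_USERNAME"])
    password = quote_plus(os.environ["MONGO_PASSWORD"])
    return (
        f"mongodb://{username}:{password}@{host}:{port}/"
        f"{database_name}?authSource={database_name}"
    )
```

The returned string can be passed to `MongoClient(uri)` exactly as in the sample script in step 3, for example `build_mongo_uri("<Lambda ip>", "c2sr_testbed_aggregated")`.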

Last modified: 07 May 2025