
Using graph database for modeling large computer infrastructures
Disclosure Number: IPCOM000248663D
Publication Date: 2016-Dec-22
Document File: 6 page(s) / 166K

Publishing Venue

The Prior Art Database


Because computing environments keep growing (tens or hundreds of thousands of machines), managing data about them is challenging and demands significant processing power. The core idea is to optimize software discovery scans in the environments that have to be monitored. Several solutions exist, but we propose using a graph database and modeling the whole environment (many machines/endpoints) as a tree structure, which improves performance compared with existing solutions. Initially, every machine that belongs to the monitored environment has to be scanned (a file listing with file paths is gathered), and this information has to be modeled in the Environment Model Database (a graph database). The file system scan is executed on the endpoint, and the scan results are then sent to a dedicated server (the Environment Model Database Server), where the Environment Model Database is stored.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 53% of the total text.


Using graph database for modeling large computer infrastructures

When new file system scan results are sent to the Environment Model Database Server, the database is queried to detect every change in the file system structure.

Example scenario

Assumption 1:

We have three different computer systems (A, B and C) and an Environment Model Database Server (EMDS) that can trigger scans on the endpoints by calling the appropriate action on the Agents installed on every machine in the monitored environment (Fig. 1).

Fig. 1. EMDS network

Assumption 2:

To reduce the amount of data that has to be sent from a computer system to the EMDS, rules must be defined that describe what information the EMDS expects.

Let's define the rules as follows:

– size of file is less than or equal to 10 MB

– file extension has to belong to the following collection: .sys, .sys2, .sh, .bin, .exe …

– the following locations have to be ignored during scan: /tmp …

The gathered data has to be sent by the Agent to the EMDS. The EMDS has to be able to parse the scan results and import them into the database.
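
The three example rules can be sketched as an Agent-side filter. This is a minimal illustration, not the disclosed implementation: the function name `should_report` and the constant names are hypothetical, "10 MB" is read as 10 MiB, and the extension and location lists contain only the entries named above (the source lists are open-ended).

```python
from pathlib import PurePosixPath

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 10 * 1024 * 1024
# Only the extensions named in the rules; the source list continues with "…".
ALLOWED_EXTENSIONS = {".sys", ".sys2", ".sh", ".bin", ".exe"}
# Only the location named in the rules; the source list continues with "…".
IGNORED_LOCATIONS = ("/tmp",)


def should_report(path: str, size_bytes: int) -> bool:
    """Return True if a scanned file passes all three example rules."""
    p = PurePosixPath(path)
    if size_bytes > MAX_SIZE_BYTES:
        return False
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        return False
    # Skip files that lie inside an ignored location (or are that location).
    for location in IGNORED_LOCATIONS:
        loc = PurePosixPath(location)
        if p == loc or loc in p.parents:
            return False
    return True
```

An Agent would call `should_report(path, size)` for every file found during the scan and send only the entries for which it returns True.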

Importing data to Environment Model Database


1. Importing data from the first Linux computer system

The received data is parsed and a file system model is created in the Environment Model Database (EMDB). Every file and its path is modeled as a tree structure in which 'db-root' is the first node. Every directory in the path is represented by a separate node, and the file name is the second-to-last (n-1) node. The last node in every path is reserved for the computer id (Fig. 2).

Fig. 2. EMDB model for first Linux computer

Between each file node and the computer node there is an 'is on' relation. Every computer node can be connected to many file nodes (a 'one to many' relation).
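
The model for the first computer can be sketched in memory as follows. This is only an illustration under stated assumptions: a real EMDB would live in a graph database, and the names `Node` and `import_first_scan` are hypothetical. Tree edges are held in `children`, and the 'is on' relation is held in `is_on`, so one computer node is shared by all of its file nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "root", "dir", "file" or "computer"
    children: dict = field(default_factory=dict)  # name -> child Node (tree edges)
    is_on: dict = field(default_factory=dict)     # file nodes only: computer id -> computer Node


def import_first_scan(computer_id, paths):
    """Build the tree for one computer from a list of absolute file paths."""
    root = Node("db-root", "root")
    computer = Node(computer_id, "computer")  # one node, linked from many file nodes
    for path in paths:
        parts = [p for p in path.split("/") if p]
        if not parts:
            continue
        node = root
        # One node per directory in the path, created only if it is missing.
        for directory in parts[:-1]:
            if directory not in node.children:
                node.children[directory] = Node(directory, "dir")
            node = node.children[directory]
        # The file name is the n-1 node; the computer node closes the path.
        if parts[-1] not in node.children:
            node.children[parts[-1]] = Node(parts[-1], "file")
        node.children[parts[-1]].is_on[computer_id] = computer
    return root, computer
```

For example, `import_first_scan("A", ["/usr/bin/tool.exe", "/usr/bin/run.sh"])` yields a single `usr` node and a single `bin` node under 'db-root', with both file nodes linked to the same computer node.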

2. Importing data from the second Linux computer system


The received data is parsed, and every file with its path is looked up in the existing EMDB model.

a) if the path does not exist, a new path (directory nodes and a file node) is created in the model, and a new computer node is created and linked to the file node.

b) if the path exists (at least in part), the missing part of the path (directory nodes) and the file node are created, and a new computer node is created and linked to the file node.

c) if both the file and the path exist but the new file has a different property (e.g. size or another gathered property), a new file node is created, and a new computer node is created and linked to the file node.

d) if both the path and the file exist (all file properties are also matched with...
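
The case analysis above can be sketched as a single import routine. This is a hedged illustration, not the disclosed algorithm: the function `import_file`, the `dirs`/`files` containers and the returned labels are all hypothetical. The source text is truncated at the final case (path and file exist and all properties match); the sketch assumes that the existing file node is reused there and only the link to the computer is added.

```python
def import_file(dirs, files, computer_id, path, props):
    """Import one scanned file and report which case applied.

    dirs:  set of directory paths (tuples of names) already in the model
    files: dict mapping (directory tuple, file name) -> list of file nodes,
           each a dict {"props": {...}, "is_on": set of computer ids}
    """
    parts = tuple(p for p in path.split("/") if p)
    if not parts:
        raise ValueError("path must name a file")
    dir_parts, name = parts[:-1], parts[-1]
    prefixes = [dir_parts[:i] for i in range(1, len(dir_parts) + 1)]
    variants = files.get((dir_parts, name))

    if variants is None:
        # No file node yet: the path is either entirely new or partly present.
        case = "extended_path" if any(p in dirs for p in prefixes) else "new_path"
        dirs.update(prefixes)  # create only the missing directory nodes
        files[(dir_parts, name)] = [{"props": props, "is_on": {computer_id}}]
        return case

    for variant in variants:
        if variant["props"] == props:
            # ASSUMPTION (source truncated here): reuse the matching file node.
            variant["is_on"].add(computer_id)
            return "existing_file"

    # Same path and name, different properties: a separate file node.
    variants.append({"props": props, "is_on": {computer_id}})
    return "new_variant"
```

Under the reuse assumption, a file that is identical on many machines is stored once and only gains additional 'is on' links, which is where a shared tree model would save space over per-machine listings.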