This commit is contained in:
NikolajDanger
2023-05-23 11:01:41 +02:00
parent de580a1b91
commit cf4816052f
2 changed files with 36 additions and 12 deletions

Binary file not shown.

@ -107,19 +107,33 @@
\end{figure}
\subsubsection{The \texttt{meow\_base} codebase}
\begin{tcolorbox}[colback=lightgray!30!white]
Specific (but not too granular) implementation details of \texttt{meow\_base}.
\texttt{meow\_base}\autocite{MeowBase} is an implementation of MEOW written in Python. It is designed to be modular, using base classes for each element to ease the implementation of additional handlers, monitors, etc.
\begin{tcolorbox}[colback=blue!30!white]
How much should I include here?
\end{tcolorbox}
The current implementation of MEOW, \texttt{meow\_base}\autocite{MeowBase}, \dots
\begin{tcolorbox}[colback=lightgray!30!white]
\begin{itemize}
\item The runner
\item Conductors
\item Recipes and handlers
\item File event monitor (Watchdog)
\item Events (important to clarify how file events work since I refer to it in the method section)
\item Testing
\end{itemize}
\end{tcolorbox}
\subsubsection{The \texttt{socket} library}
The \texttt{socket} library\autocite{SocketDoc}, included in the Python Standard Library, serves as an interface for the Berkeley sockets API. The Berkeley sockets API, originally developed for the Unix operating system, has become the standard for network communication across multiple platforms. It allows programs to create 'sockets', which are endpoints in a network communication path, for the purpose of sending and receiving data.
Many other libraries and modules focusing on transferring data exist for Python, some of which may be better suited to certain MEOW use cases. The \texttt{ssl} library, in particular, allows for SSL/TLS-encrypted communication, which may be a requirement in workflows with sensitive data. However, implementing network triggers using the \texttt{socket} library will provide MEOW with a basic implementation of network events, which can later be expanded or improved with other features.
Many other libraries and modules focusing on transferring data exist for Python, some of which may be better suited to certain MEOW use cases. The \texttt{ssl} library, in particular, allows for SSL/TLS-encrypted communication, which may be a requirement in workflows with sensitive data. However, implementing network triggers using the \texttt{socket} library will provide MEOW with a fundamental implementation of network events, which can later be expanded or improved with other features.
In my project, all sockets use the Transmission Control Protocol (TCP), which provides reliable, ordered delivery of data over a connection established between the sender and receiver. I make use of the following socket methods, which have the same names and functions in the \texttt{socket} library and the Berkeley sockets API:
In my project, all sockets use the Transmission Control Protocol (TCP), which provides reliable, ordered delivery of data over a connection established between the sender and receiver.
I make use of the following socket methods, which have the same names and functions in the \texttt{socket} library and the Berkeley sockets API:
\begin{tcolorbox}[colback=blue!30!white]
Too granular?
@ -131,24 +145,34 @@
\item \texttt{listen()}: Puts the socket in a listening state, where it waits for a sender to request a TCP connection to the socket.
\item \texttt{accept()}: Accepts the incoming TCP connection request, creating a connection.
\item \texttt{recv()}: Receives data from the given socket.
\item \texttt{connect()}: Sends a TCP connection request to a listening socket. This is only used in testing the monitor.
\item \texttt{sendall()}: Sends data to a socket. This is only used in testing the monitor.
\item \texttt{close()}: Closes a connection to a given socket.
\end{itemize}
During testing of the monitor, the following methods are used to send data to the running monitor:
\begin{itemize}
\setlength{\itemsep}{-5pt}
\item \texttt{connect()}: Sends a TCP connection request to a listening socket.
\item \texttt{sendall()}: Sends data to a socket.
\end{itemize}
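The methods listed above can be seen working together in a minimal sketch. This is not the \texttt{meow\_base} implementation; the function name \texttt{send\_over\_loopback} is hypothetical, and the example assumes a small payload sent over the loopback interface, so that the sender and receiver can run one after the other in a single thread.

```python
import socket


def send_over_loopback(payload: bytes) -> bytes:
    """Send a small payload to a local listening socket and return what arrives."""
    # Receiver side: create a TCP socket, bind it, and start listening.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]

    # Sender side (the role played by the tests): connect and send.
    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(("127.0.0.1", port))
    sender.sendall(payload)  # assumes the payload fits in the socket buffers
    sender.close()  # closing tells the receiver that no more data will follow

    # Receiver side: accept the queued connection and read until end-of-stream.
    conn, _addr = listener.accept()
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # an empty bytes object means the sender closed
            break
        chunks.append(chunk)
    conn.close()
    listener.close()
    return b"".join(chunks)
```

Calling \texttt{send\_over\_loopback(b"hello")} returns the bytes that were sent. For larger transfers the sender and receiver would need to run concurrently, for example in separate threads or processes.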
\section{Method}
To address the identified limitations of MEOW and to expand its capabilities, I will be incorporating network event triggers into the existing event-based scheduler, to supplement the current file-based event triggers. My method focuses on leveraging Python's \texttt{socket} library to enable the processing of network events. The following subsections detail the specific methodologies employed in expanding the codebase, the design of the network event trigger mechanism, and the integration of this mechanism into the existing MEOW system.
\subsection{Design of the network event pattern}
A main concern with implementing a pattern for network events is to seamlessly integrate it with the existing codebase. Because of this, the design of the pattern has a heavy focus on behaving similarly to the file event pattern when interacting with the other elements of the scheduler. Ideally, this should preserve loose coupling of the patterns and recipes, so any pattern can be put in a rule with any recipe. While this might not be possible for every theoretical recipe and pattern, designing for it could greatly improve future compatibility.
In the implementation of a pattern for network events, a key consideration was to integrate it seamlessly with the existing MEOW codebase. This required designing the pattern to behave similarly to the file event pattern when interacting with other elements of the scheduler. A central principle in this design was maintaining the loose coupling between patterns and recipes, minimizing direct dependencies between separate components. While this might not be possible for every theoretical recipe and pattern, designing for it could greatly improve future compatibility.
Network event patterns are initialized with a triggering port, similar to the triggering path of the file event patterns. While this limits the number of possible unique patterns to the number of ports that can be opened on the machine, that number is large enough that it will likely not be an issue. It would have been possible to have the patterns be triggered by part of the sent message, acting as a "header". However, this would complicate the process, since the monitor will otherwise be expecting to receive raw data. The port-based design was chosen in order for the implementation to be as simple as possible, so that any feature or improvement can be added later as its own pattern type.
Network event patterns are initialized with a triggering port, analogous to the triggering path used in file event patterns. This approach inherently limits the number of unique patterns to the number of ports that can be opened on the machine. However, given the large number of potential ports, this constraint is unlikely to present a practical issue. An alternative approach could have involved triggering patterns using a part of the sent message, essentially acting as a "header". However, this would complicate the process since the monitor is otherwise designed to receive raw data. To keep the implementation as straightforward as possible and to allow for future enhancements, I opted for the simpler port-based design.
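As an illustration of a pattern keyed on a port, the sketch below shows a minimal data class. The class name \texttt{NetworkPatternSketch} and its fields are hypothetical and are not taken from \texttt{meow\_base}; the only assumption carried over from the text is that a pattern holds a single triggering port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkPatternSketch:
    """Hypothetical stand-in for a network event pattern."""

    name: str
    triggering_port: int  # plays the role of the file pattern's triggering path

    def __post_init__(self):
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(self.triggering_port, bool) or not isinstance(
            self.triggering_port, int
        ):
            raise TypeError("triggering_port must be an integer")
        # TCP port numbers are 16-bit; port 0 is reserved, so 1..65535 is usable.
        if not 1 <= self.triggering_port <= 65535:
            raise ValueError("triggering_port must be between 1 and 65535")
```

Validating the port when the pattern is created means a misconfigured pattern fails early, rather than when a monitor later tries to bind the socket.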
The network monitor, when started, opens sockets that start listening on the ports specified in the patterns it was initialized with.
Once the network monitor is started, it opens sockets that start listening on each of the ports specified in the patterns it was initialized with. This is consistent with the behavior of the file event monitor, which monitors the triggering paths of the patterns it was initialized with.
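The start-up behavior described here can be sketched as a function that opens one listening socket per port. The function name \texttt{open\_listeners} and the default host are assumptions made for illustration; the actual monitor in \texttt{meow\_base} may be structured differently.

```python
import socket


def open_listeners(ports, host="127.0.0.1"):
    """Open one listening TCP socket per port and return them keyed by bound port."""
    listeners = {}
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen()
        except OSError:
            # Clean up everything opened so far before reporting the failure.
            sock.close()
            for opened in listeners.values():
                opened.close()
            raise
        # getsockname() reports the real port, also when port 0 was requested.
        listeners[sock.getsockname()[1]] = sock
    return listeners
```

A monitor would then wait for connections on these sockets, for instance with one thread per socket or with the standard \texttt{selectors} module, and close them all when it is stopped.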
\subsection{Integrating it into the existing codebase}
Data received by the network monitor is written to a temporary file, which serves two purposes. Firstly, writing the received data to a file while receiving it saves on memory, since the entire file doesn't have to be saved in memory at once. This is especially useful for large data transfers. Secondly, writing the received data to a file allows network events to reuse most of the infrastructure written for file events, passing the newly written temporary file as the "triggering path" of the event. This means that recipes taking the triggering path as their input can still be used with network events, preserving loose coupling.
\subsection{Integrating network events into the existing codebase}
The data received by the network monitor is written to a temporary file, a design choice that serves two purposes.
Firstly, this method is a practical solution for managing memory usage during data transfer, particularly for large transfers. By writing received data directly to a file as it arrives, the monitor avoids holding the entire payload in memory at once.
Secondly, this approach allows network events to reuse the infrastructure built for file events. The newly written temporary file is passed as the "triggering path" of the event, mirroring the behavior of file events. Recipes that take the triggering path as their input can therefore be used with network events without modification, preserving the principle of loose coupling.
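A minimal sketch of this idea is shown below, assuming an already accepted connection. The function name \texttt{receive\_to\_tempfile} and the chunk size are hypothetical and do not come from \texttt{meow\_base}.

```python
import socket
import tempfile


def receive_to_tempfile(conn: socket.socket, chunk_size: int = 4096) -> str:
    """Stream everything received on conn into a temporary file; return its path."""
    # delete=False keeps the file on disk after it is closed, so that its path
    # can be handed on as the "triggering path" of the event.
    with tempfile.NamedTemporaryFile(mode="wb", delete=False) as tmp:
        while True:
            chunk = conn.recv(chunk_size)
            if not chunk:  # the sender has closed the connection
                break
            tmp.write(chunk)  # at most chunk_size bytes are held in memory
        return tmp.name
```

Because the data is written chunk by chunk, memory use is bounded by the chunk size rather than by the size of the transfer. The caller is responsible for removing the temporary file once the job that uses it has finished.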
\subsection{Testing}