\documentclass[a4paper,11pt]{article}
\usepackage[margin=1.3in]{geometry}
\usepackage[most]{tcolorbox}
\usepackage{xcolor}
\usepackage{tikz}
\usepackage{fancyhdr} % for headers
% \usepackage[citestyle=verbose-ibid, backend=biber, autocite=footnote]{biblatex} % Footnote references. Use autocite{}.
\usepackage{biblatex}
\usepackage{float}
\usepackage{fontspec}
\usepackage{enumitem}
\usepackage{array}
\usetikzlibrary{arrows.meta, positioning, calc, quotes}

% --- Configuration ---
\bibliography{src/references}
\setmonofont[Scale=0.85, ItalicFont=Hermit Light]{Hermit Light}
% \pagestyle{fancy}
% \setlength{\parskip}{6pt}
% \setlength{\parindent}{0pt}
% \fancyfoot{}
% \lhead{\rightmark}
% \rhead{\thepage}
% \fancyheadoffset{0.005\textwidth}
\setlength{\parskip}{5pt}
\newcolumntype{P}[1]{>{\centering\arraybackslash}p{#1}}

\begin{document}

\section{Abstract}
\begin{tcolorbox}[colback=lightgray!30!white]
This report describes the design and implementation of network event triggers for \textit{Managing Event Oriented Workflows} (MEOW), an event-based scientific workflow scheduler. By extending MEOW's file-based monitoring with a network monitor built on Python's \texttt{socket} library, a running scheduler can react to data sent over a network connection, enabling remote job triggering, communication between parallel workflows, and distributed, heterogeneous workflows. The implementation passes a suite of 26 unit tests, and its performance is evaluated through single- and multiple-listener benchmarks.
\end{tcolorbox}

\section{Introduction}
\textit{Scientific Workflow Management Systems} (SWMSs) are an essential tool for automating, managing, and executing complex scientific processes involving large volumes of data and computational tasks. Jobs in an SWMS workflow are typically defined as the nodes of a Directed Acyclic Graph (DAG), in which the edges define the dependencies between jobs.

\begin{figure}[H]
\begin{center}
\begin{tikzpicture}[
    arrow/.style={-Triangle, thick, shorten >=4pt}
]
\node[draw,circle] at (0,0) (j1) {Job 1};
\node[draw,circle] at (3,2) (j2) {Job 2};
\node[draw,circle] at (3,0) (j3) {Job 3};
\node[draw,circle] at (3,-2) (j4) {Job 4};
\node[draw,circle] at (6,1) (j5) {Job 5};
\node[draw,circle] at (9,-0.5) (j6) {Job 6};
\draw[arrow] (j1) -- (j2);
\draw[arrow] (j1) -- (j3);
\draw[arrow] (j1) -- (j4);
\draw[arrow] (j2) -- (j5);
\draw[arrow] (j3) -- (j5);
\draw[arrow] (j4) -- (j6);
\draw[arrow] (j5) -- (j6);
\end{tikzpicture}
\caption{A workflow defined as a DAG.
Job 2, 3, and 4 are dependent on the completion of Job 1, etc.}
\end{center}
\end{figure}

While this method is suitable for many applications, it is not always the best solution. Processing the jobs in a set order can lead to inefficiencies in cases where the processing needs to adapt based on the results of earlier jobs, human interaction, or changing circumstances. In these contexts, the DAG method can fall short due to its inherently static nature.

In such scenarios, using a \textit{dynamic scheduler} can offer a more effective approach. Unlike traditional DAG-based systems, dynamic schedulers are designed to react to changing conditions, providing a more adaptive method for managing complex workflows.

One such dynamic scheduler is \textit{Managing Event Oriented Workflows} (MEOW)\autocite{DavidMEOW}. MEOW employs an event-based scheduler, in which jobs are executed independently, based on certain \textit{triggers}. Triggers can in theory be anything, but are currently limited to file events on local storage. By dynamically adapting the execution order based on the outcomes of previous tasks or external factors, MEOW provides a more flexible solution for processing large volumes of experimental data, with minimal human validation and interaction\autocite{DavidMEOWpaper}.

\begin{figure}[H]
\begin{center}
\begin{tikzpicture}[
    arrow/.style={-Triangle, thick, shorten >=4pt}
]
\node[draw,rectangle] at (0,0) (t1) {Trigger 1};
\node[draw,rectangle] at (0,-1.5) (t2) {Trigger 2};
\node[draw,rectangle] at (0,-3) (t3) {Trigger 3};
\node[draw,rectangle] at (0,-4.5) (t4) {Trigger 4};
\node[draw,circle] at (6,0) (j1) {Job 1};
\node[draw,circle] at (6,-1.5) (j2) {Job 2};
\node[draw,circle] at (6,-3) (j3) {Job 3};
\node[draw,circle] at (6,-4.5) (j4) {Job 4};
\draw[arrow] (t1) -- (j1);
\draw[arrow] (t2) -- (j2);
\draw[arrow] (t3) -- (j3);
\draw[arrow] (t4) -- (j4);
\end{tikzpicture}
\caption{A workflow using an event-based system.
Job 1 is dependent on Trigger 1, etc.}
\end{center}
\end{figure}

In this project, I introduce triggers for network events into MEOW. This enables a running scheduler to react to and act on data transferred over a network connection. By incorporating this feature, the capability of MEOW is significantly extended, facilitating the management of not just local file-based workflows, but also complex, distributed workflows involving communication between multiple systems over a network.

In this report, I will walk through the design and implementation process of this feature, detailing the challenges encountered and how they were overcome.

\subsection{Problem}
In its current implementation, MEOW is able to trigger jobs based on changes to monitored local files. This covers a range of scenarios where the data processing workflow involves the creation, modification, or removal of files. By monitoring file events, MEOW's event-based scheduler can dynamically execute tasks as soon as the required conditions are met, ensuring efficient and timely processing of the data.

Since the file monitor is triggered by changes to local files, MEOW is limited to local workflows. While file events work well as a trigger on their own, there are several scenarios where a different trigger would be preferred or even required, especially when dealing with distributed systems or remote operations. To address these shortcomings and further enhance MEOW's capabilities, the integration of network event triggers would provide significant benefits in several key use-cases.

Firstly, network event triggers would enable the initiation of jobs remotely through the transmission of a triggering message to the monitor, thereby eliminating the necessity for direct access to the monitored files. This is particularly useful in human-in-the-loop scenarios, where human intervention or decision-making is required before proceeding with the subsequent steps in a workflow.
While it is possible to manually trigger a job using file events by making changes to the monitored directories, this might lead to an already running job accessing the files at the same time, which could cause problems with data integrity.

Secondly, incorporating network event triggers would facilitate seamless communication between parallel workflows, ensuring that tasks can efficiently exchange information and updates on their progress. This would allow for a better perspective on the combined workflow, greatly improving visibility and control.

Finally, extending MEOW's event-based scheduler to support network event triggers would enable the simple and efficient exchange of data between workflows running on different machines. This feature is particularly valuable in distributed computing environments, where data processing tasks are often split across multiple systems to maximize resource utilization and minimize latency.

Integrating network event triggers into MEOW would provide an advantage specifically in the context of heterogeneous workflows, which incorporate a mix of different tasks running on diverse computing environments. By their nature, these workflows can involve tasks running on different systems, potentially even in different physical locations, which need to exchange data or coordinate their progress. In the figure below, an example heterogeneous workflow is presented.

\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth]{src/heterogeneous.png}
\end{center}
\caption{An example of a heterogeneous workflow}
\end{figure}

The example workflow requires several checkpoints in which data should be transferred between the instrument, the instrument storage, centralized storage, High Performance Computing (HPC) resources, and a human interaction point. Network events can, for the reasons outlined earlier in the section, be used to prevent the workflow from halting when these points are reached.
\subsection{Background}
\subsubsection{The structure of MEOW}
The MEOW event-based scheduler consists of four main components: \textit{monitors}, \textit{handlers}, \textit{conductors}, and \textit{the runner}.

Monitors listen for triggering events. They are initialized with a number of \textit{rules}, each of which includes a \textit{pattern} and a \textit{recipe}. \textit{Patterns} describe the triggering event; for file events, a pattern describes a path that should trigger the event when changed. \textit{Recipes} describe the specific action that should be taken when the rule is triggered. When a pattern's triggering event occurs, the monitor sends an event, which contains the rule and the specifics of the event, to the event queue.

Handlers manage the event queue. They unpack and analyze events in the event queue. If an event is valid, the handler creates a directory containing the script defined by the recipe. The location of the directory is then sent to the runner, to be added to the job queue.

Conductors manage the job queue. They execute the jobs in the locations specified by the handlers.

Finally, the runner is the main program that orchestrates all these components. Each instance of the runner incorporates at least one instance of a monitor, handler, and conductor, and it holds the event and job queues.
\begin{figure}[H] \begin{center} \begin{tikzpicture}[ element/.style={draw, rectangle, rounded corners, minimum height = 1cm}, arrow/.style={-Triangle, ultra thick,shorten >=4pt} ] \node[element,text width=8cm,align=center,fill=orange!30!white] at (0,2) (run) {Runner}; \node[element,fill=cyan!30!white] at (-2,1.3) (eq) {Event Queue}; \node[element,fill=yellow!50!white] at (2,1.3) (jq) {Job Queue}; \node[element,fill=blue!30!white] at (-5,-1.5) (mon) {Monitor}; \node[text width=2cm,align=center] at (-5,-2.8) {Listens for triggering events}; \node[element,fill=green!30!white] at (0,-4) (han) {Handler}; \node[text width=2cm,align=center] at (0,-5.35) {Validates events and creates jobs}; \node[element,fill=red!40!white] at (5,-1.5) (con) {Conductor}; \node[text width=2cm,align=center] at (5,-2.6) {Executes jobs}; \draw[arrow] (mon) -- (eq) node[pos=0.5,above left=-10pt,text width=2cm, align=center] {Schedules events}; \draw[arrow] (eq) -- (han) node[pos=0.8,below left=-20pt,text width=2cm, align=center] {Pulls events}; \draw[arrow] (han) -- (jq) node[pos=0.2,right,text width=2cm, align=center] {Schedules jobs}; \draw[arrow] (jq) -- (con) node[pos=0.5,above right=-10pt,text width=2cm, align=center] {Pulls jobs}; \end{tikzpicture} \end{center} \caption{How the elements of MEOW interact} \end{figure} \begin{figure}[H] \begin{center} \begin{tikzpicture}[ element/.style={draw, rectangle, rounded corners, minimum height = 1cm, text width=2cm, align=center}, every edge/.style={-Triangle, draw, ultra thick, bend left, text width= 2cm, align=center,shorten >=5pt,shorten <=5pt}, bend angle = 15 ] \node[element,fill=blue!30!white,anchor=south] at (90:2.5) (mon) {\textbf{Monitor}}; \node[element,fill=cyan!30!white,anchor=south west] at (30:2) (eq) {\textbf{Event Queue}}; \node[element,fill=green!30!white,anchor=north west] at (330:2) (han) {\textbf{Handler}}; \node[element,fill=yellow!50!white,anchor=north] at (270:2.5) (jq) {\textbf{Job Queue}}; 
\node[element,fill=red!40!white,anchor=north east] at (210:2) (con) {\textbf{Conductor}};
\node[element,fill=lightgray!80!white,anchor=south east] at (150:2) (sto) {\textbf{Storage}};
\draw (mon) edge ["Schedules events on"] (eq);
\draw (eq) edge ["Events are interpreted by"] (han);
\draw (han) edge ["Schedules jobs to"] (jq);
\draw (jq) edge ["Jobs executed by"] (con);
\draw (con) edge ["Writes output to"] (sto);
\draw (sto) edge ["Events are seen by"] (mon);
\end{tikzpicture}
\end{center}
\caption{The cycle of MEOW's file events}
\end{figure}

\subsubsection{The \texttt{meow\_base} codebase}
\texttt{meow\_base}\autocite{MeowBase} is an implementation of MEOW written in Python. It is designed to be modular, using base classes for each element in order to ease the implementation of additional handlers, monitors, etc. The relevant parts of the implementation are:
\begin{itemize}
\setlength{\itemsep}{0pt}
\item \textbf{Events} are Python dictionaries, containing the following items:
\begin{itemize}[topsep=-10pt]
\setlength{\itemsep}{-5pt}
\item \texttt{EVENT\_PATH}: The path of the triggering file.
\item \texttt{EVENT\_TYPE}: The type of event. File events have the type \texttt{"watchdog"}, since the files are monitored using the \texttt{watchdog} Python module.
\item \texttt{EVENT\_RULE}: The rule that triggered the event, which contains the recipe that the handler will turn into a job.
\item \texttt{EVENT\_TIME}: The timestamp of the triggering event.
\item Any extra data supplied by the monitor. File events are by default initialized with the base directory of the event and a hash of the event's triggering path.
\end{itemize}
\item \textbf{Event patterns} inherit from the \texttt{BasePattern} class. An instance of an event pattern class describes a specific trigger a monitor should be looking for.
\item \textbf{Monitors} inherit from the \texttt{BaseMonitor} class.
They listen for set triggers (defined by given event patterns), and create events when those triggers happen. The file event monitor uses the \texttt{Watchdog} module to monitor given directories for changes. The Watchdog monitor is initialized with an instance of the \texttt{WatchdogEventHandler} class to handle the watchdog events. When the Watchdog monitor is triggered by a file event, the \texttt{handle\_event} method is called on the event handler, which in turn creates an \texttt{event} based on the specifics of the triggering event. The event is then sent to the runner to be put in the event queue.
\item \textbf{The runner} is implemented as the class \texttt{MeowRunner}. When initialized with at least one instance of a monitor, handler, and conductor, it validates them. When started, all the monitors, handlers, and conductors it was initialized with are started. It also creates \texttt{pipes} for the communication between each element and the runner.
\item \textbf{Recipes} inherit from the \texttt{BaseRecipe} class. They serve primarily as a repository for the specific details of a given recipe. This typically includes identifying the particular script to be executed, but recipes also contain validation checks of these instructions. The contained data and procedures in a recipe collectively describe the distinct actions to be taken when a corresponding job is executed.
\item \textbf{Handlers} inherit from the \texttt{BaseHandler} class. Each handler class handles a specific type of job, like the execution of bash scripts. When started, a handler enters an infinite loop in which it repeatedly asks the runner for a valid event from the event queue, creates a job from the event's recipe, and sends it to the runner to be put in the job queue.
\item \textbf{Conductors} inherit from the \texttt{BaseConductor} class. Each conductor class executes a specific type of job, like the execution of bash scripts.
When started, a conductor enters an infinite loop in which it repeatedly asks the runner for a valid job from the job queue and attempts to execute it.
\end{itemize}

\subsubsection{The \texttt{socket} library}
The \texttt{socket} library\autocite{SocketDoc}, included in the Python Standard Library, serves as an interface to the Berkeley sockets API. The Berkeley sockets API, originally developed for the Unix operating system, has become the standard for network communication across multiple platforms. It allows programs to create `sockets', which are endpoints in a network communication path, for the purpose of sending and receiving data.

Many other libraries and modules focusing on transferring data exist for Python, some of which may be better suited to certain MEOW use-cases. The \texttt{ssl} library, for example, allows for SSL-encrypted communication, which may be a requirement in workflows with sensitive data. However, implementing network triggers using exclusively the \texttt{socket} library will provide MEOW with a fundamental implementation of network events, which can later be expanded or improved with other features (see section \ref{Additional Monitors}).

In my project, all sockets use the Transmission Control Protocol (TCP), which ensures reliable data transfer by establishing a stable connection between the sender and receiver. I make use of the following socket methods, which have the same names and functions in the \texttt{socket} library and the Berkeley sockets API:
\begin{itemize}
\setlength{\itemsep}{0pt}
\item \texttt{bind()}: Associates the socket with a given local IP address and port. It also reserves the port locally.
\item \texttt{listen()}: Puts the socket in a listening state, where it waits for a sender to request a TCP connection to the socket.
\item \texttt{accept()}: Accepts an incoming TCP connection request, creating a connection.
\item \texttt{recv()}: Receives data from the given socket.
\item \texttt{close()}: Closes a connection to a given socket.
\end{itemize}
During testing of the monitor, the following methods are used to send data to the running monitor:
\begin{itemize}
\setlength{\itemsep}{0pt}
\item \texttt{connect()}: Sends a TCP connection request to a listening socket.
\item \texttt{sendall()}: Sends data to a socket.
\end{itemize}

\section{Method}
To address the identified limitations of MEOW and to expand its capabilities, I will be incorporating network event triggers into the existing event-based scheduler, to supplement the current file-based event triggers. My method focuses on leveraging Python's \texttt{socket} library to enable the processing of network events. The following subsections detail the specific methodologies employed in expanding the codebase, the design of the network event trigger mechanism, and the integration of this mechanism into the existing MEOW system.

\subsection{Design of the network event pattern}
In the implementation of a pattern for network events, a key consideration was to integrate it seamlessly with the existing MEOW codebase. This required designing the pattern to behave similarly to the file event pattern when interacting with other elements of the scheduler. A central principle in this design was maintaining the loose coupling between patterns and recipes, minimizing direct dependencies between separate components. While this might not be possible for every theoretical recipe and pattern, designing for it could greatly improve future compatibility.

The \texttt{NetworkEventPattern} class is initialized with a triggering port, analogous to the triggering path used in file event patterns. This approach inherently limits the number of unique patterns to the number of ports that can be opened on the machine. However, given the large number of potential ports, this constraint is unlikely to present a practical issue.
An alternative approach could have involved triggering patterns using a part of the sent message, essentially acting as a ``header''. However, this would complicate the process, since the monitor is otherwise designed to receive raw data. To keep the implementation as straightforward as possible and to allow for future enhancements, I opted for simplicity and broad utility over complexity in this initial design.

When the \texttt{NetworkMonitor} instance is started, it starts a number of \texttt{Listener} instances, equal to the number of ports specified in its patterns. The list of patterns is pulled when starting the monitor, so patterns added at runtime are included. Patterns not associated with a rule are not considered, since they will not result in an event. Only one listener is started per port, so patterns with the same port use the same listener. When matching an event with a rule, all rules are considered, so if multiple rules use the same triggering port, they will all be triggered. The listeners each open a socket bound to their respective ports. This is consistent with the behavior of the file event monitor, which monitors the triggering paths of the patterns it was initialized with.

\subsection{Integrating network events into the existing codebase}
The data received by the network monitor is written as a stream to a temporary file, in chunks of 2048 bytes. The temporary files are created using the built-in \texttt{tempfile} library, and are placed in the OS's default directory for temporary files. The library is used to accommodate different operating systems, as well as to ensure the files have unique names. When the monitor is stopped, all generated temporary files are removed.

This design choice serves three purposes. Firstly, this method is a practical solution for managing memory usage during data transfer, particularly for large data sets.
By writing received data directly to a file 2048 bytes at a time, we bypass the need to store the entire file in memory at once, effectively addressing potential memory limitations.

Secondly, the method allows the monitor to receive multiple files simultaneously, since each file is received by a separate thread. This means that a single large file will not ``block up'' the network port for too long.

Lastly, this approach leverages the existing infrastructure built for file events. The newly written temporary file is passed as the ``triggering path'' of the event, mirroring the behavior of file events. This allows network events to utilize the recipes initially designed for file events without modification, preserving the principle of loose coupling. The integration maintains the overall flexibility and efficiency of MEOW while extending its capabilities to handle network events.

This method is slower than keeping the data in memory, since writing to storage takes longer, but I have judged that these benefits outweigh the cost.

\subsubsection{Data Type Agnosticism}
An important aspect to consider in the functioning of the network monitor is its data type agnosticism: the network monitor does not impose restrictions or perform checks on the type of incoming data. While this approach enhances the speed and simplicity of the implementation, it also places a certain level of responsibility on the recipes that work with the incoming data. The recipes, being responsible for defining the actions taken upon execution of a job, must be designed with a full understanding of this versatility. They should incorporate the necessary checks and handle potential inconsistencies or anomalies that might arise from diverse types of incoming data.

\subsection{Testing}
The unit tests for the network event monitor were inspired by the already existing tests for the file event monitor.
Since the aim of the monitor was to emulate the behavior of the file event monitor as closely as possible, reusing the existing tests with minimal changes proved an effective way of staying close to that goal. The tests verify the following behavior:
\begin{itemize}
\setlength{\itemsep}{0pt}
\item Network event patterns can be initialized, and raise exceptions when given invalid parameters.
\item Network events can be created, and they contain the expected information.
\item Network monitors can be created.
\item A network monitor is able to receive data sent to a listener, write it to a file, and create a valid event.
\item The patterns and recipes associated with the monitor can be accessed, added, updated, and removed at runtime.
\item When adding, updating, or removing patterns or recipes during runtime, rules associated with those patterns or recipes are updated accordingly.
\item The monitor only initializes listeners for patterns with associated rules, and rules updated during runtime are applied.
\end{itemize}

\section{Results}
The testing suite designed for the monitor comprises 26 distinct tests, all of which successfully passed. These tests were designed to assess the robustness, reliability, and functionality of the monitor. They evaluated the monitor's ability to successfully manage network event patterns, detect network events, and communicate with the runner to send events to the event queue.

\subsection{Performance Tests}
To assess the performance of the network monitor, I have implemented a number of performance tests.
The tests were run on these machines:
\begin{table}[H]
\centering
\begin{tabular}{|c||c|c|c|c|}\hline
\textbf{Identifier} & \textbf{CPU} & \textbf{Cores} & \textbf{Clock speed} & \textbf{Memory} \\ \hline
Laptop & Intel i5-8250U & 4 & 1.6GHz & 8GB \\ \hline
Desktop & & & & \\ \hline
\end{tabular}
\end{table}

\subsubsection{Single Listener}
To assess how a single listener handles many events at once, I implemented a procedure in which a single listener in the monitor was subjected to a varying number $n$ of events, ranging from 1 to 1,000. For each quantity of events, I sent the $n$ network events to the monitor and recorded the response time. To ensure the reliability of the results and mitigate the effect of any outliers, each test was repeated 50 times.

Given the inherent variability in network communication and event handling, I noted considerable differences between the highest and lowest recorded times for each test. To provide a comprehensive view of the monitor's performance, I have included not only the average response times, but also the minimum and maximum times observed for each set of 50 tests.
\begin{table}[H]
\centering
\begin{tabular}{|p{1.1cm}||P{1.5cm}|P{1.8cm}||P{1.5cm}|P{1.8cm}||P{1.5cm}|P{1.8cm}|}
\hline
\textbf{Event} & \multicolumn{2}{c||}{\textbf{Minimum time}} & \multicolumn{2}{c||}{\textbf{Maximum time}} & \multicolumn{2}{c|}{\textbf{Average time}} \\
\textbf{count} & Total & Per event & Total & Per event & Total & Per event \\ \hline\hline
\multicolumn{7}{|c|}{\textbf{Laptop}} \\ \hline
1 & 0.68ms & 0.68ms & 5.3ms & 5.3ms & 2.1ms & 2.1ms \\\hline
10 & 4.7ms & 0.47ms & 2.1s & 0.21s & 0.18s & 18ms \\\hline
100 & 45ms & 0.45ms & 7.2s & 72ms & 0.86s & 8.6ms \\\hline
1,000 & 0.63s & 0.63ms & 17s & 17ms & 5.6s & 5.6ms \\\hline\hline
\multicolumn{7}{|c|}{\textbf{Desktop}} \\ \hline
1 & & & & & & \\\hline
10 & & & & & & \\\hline
100 & & & & & & \\\hline
1,000 & & & & & & \\\hline
\end{tabular}
\caption{The results of the Single Listener performance tests, rounded to 2 significant digits.}
\end{table}

\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{src/performance_results/laptop_single_listener.png}
\caption{The results of the Single Listener performance test, plotted logarithmically.}
\end{figure}

Upon examination of the results, a pattern emerges. The minimum recorded response times consistently averaged around 0.5ms per event, regardless of the number of events sent. This time likely reflects an ideal scenario where events are registered seamlessly, without any delays or issues within the pipeline, thereby showcasing the efficiency potential of the network event triggers in the MEOW system.

In contrast, the maximum and average response times exhibited more variability. This fluctuation in response times may be attributed to various factors such as network latency, the internal processing load of the system, and the inherent unpredictability of concurrent event handling.

\subsubsection{Multiple Listeners}
The next performance test investigates how the introduction of multiple listeners affects the overall processing time.
This test aims to understand the implications of distributing events across different listeners on system performance. Specifically, it examines how having multiple listeners in operation might impact the speed at which events are processed.

In this test, I maintain a constant total of 1,000 events, but they will be distributed evenly across varying numbers of listeners: 1, 10, 100, and 1,000. By keeping the total number of events constant while altering the number of listeners, I aim to isolate the effect of multiple listeners on system performance.

A key expectation for this test is to observe if, and by how much, the overall processing time increases as the number of listeners goes up. This would give insight into whether operating more listeners concurrently introduces additional overhead, thereby slowing down the process. The results of this test can then inform decisions about optimal listener numbers in different usage scenarios, potentially leading to performance improvements in MEOW's handling of network events.
\begin{table}[H]
\centering
\begin{tabular}{|p{1.5cm}||P{2.5cm}|P{2.5cm}|P{2.5cm}|}
\hline
\textbf{Listeners} & \textbf{Minimum time} & \textbf{Maximum time} & \textbf{Average time} \\ \hline
\multicolumn{4}{|c|}{\textbf{Laptop}} \\ \hline
1 & 0.63s & 17s & 5.6s \\\hline
10 & 0.46s & 25s & 7.6s \\\hline
100 & 0.42s & 20s & 7.1s \\\hline
1,000 & 0.92s & 3.2s & 1.5s \\\hline
\multicolumn{4}{|c|}{\textbf{Desktop}} \\ \hline
1 & & & \\\hline
10 & & & \\\hline
100 & & & \\\hline
1,000 & & & \\\hline
\end{tabular}
\caption{The results of the Multiple Listeners performance tests, rounded to 2 significant digits.}
\end{table}

\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{src/performance_results/laptop_multiple_listeners.png}
\caption{The results of the Multiple Listeners performance test, plotted logarithmically.}
\end{figure}

% \subsection{Discussion}

\subsection{Future Work}
\subsubsection{Use-cases for Network Events}
Since the purpose of the project was adding a feature to a workflow manager, it is important to consider its integration within real-life workflows and the future workflow designs that could capitalize on network events.

One specific example of an application where network event triggers could prove useful is the workflow for the Brain Imaging Data Structure (BIDS). The BIDS workflow requires data to be sent between multiple machines and validated by a user. Network event triggers could streamline this process by automatically initiating data transfer tasks when specific conditions are met, thereby reducing the need for manual management. Additionally, network triggers could facilitate user validation by allowing users to manually prompt the continuation of the workflow through specific network requests, simplifying the user's role in the validation process.

\begin{figure}[H]
\begin{center}
\includegraphics[width=0.6\textwidth]{src/BIDS.png}
\end{center}
\caption{The structure of the BIDS workflow.
Data is transferred to the user and to the cloud.}
\end{figure}

\subsubsection{Additional Monitors}\label{Additional Monitors}
The successful development and implementation of the network event monitor for MEOW serves as a precedent for the creation of additional monitors in the future. This framework could be utilized as a blueprint for developing new monitors tailored to meet specific demands, protocols, or security requirements.

For instance, security might play a crucial role in the processing and transfer of sensitive data across various workflows. The network event monitor developed in this project, which uses the Python \texttt{socket} library, might not satisfy the security requirements of all workflows, especially those handling sensitive data. In such cases, developing a monitor that leverages the \texttt{ssl} library could provide a solution, enabling encrypted communication and thus improving the security of data transfer. The architecture of the network event monitor can guide the development of an \texttt{ssl} monitor, taking advantage of the similarities between the \texttt{socket} and \texttt{ssl} libraries.

Similarly, we could envision monitors developed specifically for certain protocols. For example, a monitor designed to handle HTTP requests could be beneficial for workflows interacting with web services. As HTTP is a common protocol, this type of monitor would open up a vast array of potential interactions with external services, making MEOW even more versatile.

\section{Conclusion}
With the monitor performing effectively as tested, it can be anticipated that it will handle network event triggers correctly in live environments. This is a critical enhancement for MEOW, opening up possibilities for more complex, distributed, and heterogeneous workflows, as envisioned in the design objectives.

\newpage
\appendix
\printbibliography

\end{document}