This commit is contained in:
NikolajDanger
2023-05-15 14:47:09 +02:00
parent 7a28fadbfa
commit 50f74d9cd8
3 changed files with 41 additions and 17 deletions

Binary file not shown.


@ -2,10 +2,11 @@
\usepackage[margin=1.3in]{geometry} \usepackage[margin=1.3in]{geometry}
\usepackage[most]{tcolorbox} \usepackage[most]{tcolorbox}
\usepackage{xcolor} \usepackage{xcolor}
\usepackage{tikz}
\usepackage{fancyhdr} % for headers \usepackage{fancyhdr} % for headers
% \usepackage[citestyle=verbose-ibid, backend=biber, autocite=footnote]{biblatex} % Footnote references. Use autocite{}. % \usepackage[citestyle=verbose-ibid, backend=biber, autocite=footnote]{biblatex} % Footnote references. Use autocite{}.
\usepackage{biblatex} \usepackage{biblatex}
\usepackage{float}
% --- Configuration --- % --- Configuration ---
\bibliography{src/references} \bibliography{src/references}
@ -31,7 +32,7 @@
\textit{Scientific Workflow Management Systems} (SWMSs) are an essential tool for automating, managing, and executing complex scientific processes involving large volumes of data and computational tasks\footnote{citation?}. Traditional SWMSs employ a linear sequential approach, in which tasks are performed in a pre-defined order, as defined by the workflow. While this linear method is suitable for certain applications, it might not always be the best choice: processing sequentially can prove inefficient in cases where the next step of the process should adapt to the previous one. For these use-cases a dynamic scheduler is required, of which \textit{Managing Event Oriented Workflows}\autocite{DavidMEOW} (MEOW) is one. \textit{Scientific Workflow Management Systems} (SWMSs) are an essential tool for automating, managing, and executing complex scientific processes involving large volumes of data and computational tasks\footnote{citation?}. Traditional SWMSs employ a linear sequential approach, in which tasks are performed in a pre-defined order, as defined by the workflow. While this linear method is suitable for certain applications, it might not always be the best choice: processing sequentially can prove inefficient in cases where the next step of the process should adapt to the previous one. For these use-cases a dynamic scheduler is required, of which \textit{Managing Event Oriented Workflows}\autocite{DavidMEOW} (MEOW) is one.
\begin{tcolorbox}[colback=lightgray!30!white] \begin{tcolorbox}[colback=lightgray!30!white]
More about DAGs' inability to adapt Expand on DAGs' inability to adapt
\end{tcolorbox} \end{tcolorbox}
MEOW employs an event-based scheduler, in which jobs are performed non-linearly, triggered based on events\footnote{citation?}. By dynamically adapting the execution order based on the outcomes of previous tasks or external factors, MEOW provides a more efficient and flexible solution for processing large volumes of experimental data\footnote{citation?}. MEOW employs an event-based scheduler, in which jobs are performed non-linearly, triggered based on events\footnote{citation?}. By dynamically adapting the execution order based on the outcomes of previous tasks or external factors, MEOW provides a more efficient and flexible solution for processing large volumes of experimental data\footnote{citation?}.
@ -51,23 +52,36 @@
While file events work well as a trigger on their own, there are several scenarios where a different trigger would be preferred or even required, especially when dealing with distributed systems or remote operations. To address these shortcomings and further enhance MEOW's capabilities, the integration of network event triggers would provide significant benefits in several key use-cases. While file events work well as a trigger on their own, there are several scenarios where a different trigger would be preferred or even required, especially when dealing with distributed systems or remote operations. To address these shortcomings and further enhance MEOW's capabilities, the integration of network event triggers would provide significant benefits in several key use-cases.
\begin{tcolorbox}[colback=lightgray!30!white]
Introduce the BIDS graph. Draw the BIDS graph in tikz
\end{tcolorbox}
Firstly, network event triggers would allow for manual triggering of jobs remotely, without the need for direct access to the monitored files. This is particularly useful in scenarios where human intervention or decision-making is required before proceeding with the subsequent steps in a workflow. While it is possible to manually trigger a job using file events by making changes to the monitored directories, this might lead to an already running job accessing the files at the same time, which could cause problems with data integrity. Firstly, network event triggers would allow for manual triggering of jobs remotely, without the need for direct access to the monitored files. This is particularly useful in scenarios where human intervention or decision-making is required before proceeding with the subsequent steps in a workflow. While it is possible to manually trigger a job using file events by making changes to the monitored directories, this might lead to an already running job accessing the files at the same time, which could cause problems with data integrity.
Secondly, incorporating network event triggers would facilitate seamless communication between parallel jobs, ensuring that tasks can efficiently exchange information and synchronize their progress. Secondly, incorporating network event triggers would facilitate seamless communication between parallel jobs, ensuring that tasks can efficiently exchange information and synchronize their progress.
Finally, extending MEOW's event-based scheduler to support network event triggers would enable the simple and efficient exchange of data between workflows running on different machines. This feature is particularly valuable in distributed computing environments, where data processing tasks are often split across multiple systems to maximize resource utilization and minimize latency. By leveraging network event triggers, MEOW would be better equipped to manage complex workflows in these environments, ensuring seamless integration and streamlined data processing. Finally, extending MEOW's event-based scheduler to support network event triggers would enable the simple and efficient exchange of data between workflows running on different machines. This feature is particularly valuable in distributed computing environments, where data processing tasks are often split across multiple systems to maximize resource utilization and minimize latency. By leveraging network event triggers, MEOW would be better equipped to manage complex workflows in these environments, ensuring seamless integration and streamlined data processing.
One specific example of a use-case where network event triggers could prove useful is the workflow for the Brain Imaging Data Structure (BIDS). The BIDS workflow requires data to be sent between multiple machines and validated by a user. Network event triggers could streamline this process by automatically initiating data transfer tasks when specific conditions are met, thereby reducing the need for manual management. Additionally, network triggers could facilitate user validation by allowing users to manually prompt the continuation of the workflow through specific network requests, simplifying the user's role in the validation process.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.5\textwidth]{src/BIDS.png}
\end{center}
\caption{\textbf{Temp graph. Replace.} The structure of the BIDS workflow.}
\end{figure}
\subsection{Background} \subsection{Background}
\subsubsection{The structure of MEOW} \subsubsection{The structure of MEOW}
The MEOW event-based scheduler has three main parts: \textit{monitors}, \textit{handlers}, and \textit{the conductor}. The MEOW event-based scheduler has three main parts: \textit{monitors}, \textit{handlers}, and \textit{the conductor}.
\begin{tcolorbox}[colback=lightgray!30!white] \begin{figure}[H]
Draw a diagram of the structure. \begin{center}
\end{tcolorbox} \begin{tikzpicture}
\node[draw,rectangle,rounded corners] at (0,0) (con) {Conductor};
\node[draw,rectangle,rounded corners] at (3,-2) (mon) {Monitor};
\node[draw,rectangle,rounded corners] at (-3,-2) (han) {Handler};
\end{tikzpicture}
\end{center}
\caption{\textbf{WIP.} How the three elements of MEOW interact.}
\end{figure}
Monitors listen for triggering events. They are initialized with a number of \textit{patterns}, each of which describes a triggering event. When a pattern's triggering event occurs, the monitor signals to the conductor that the pattern has been triggered. Monitors listen for triggering events. They are initialized with a number of \textit{patterns}, each of which describes a triggering event. When a pattern's triggering event occurs, the monitor signals to the conductor that the pattern has been triggered.
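The relationship between a monitor and its patterns can be illustrated with a minimal sketch. This is not MEOW's actual API: the names \texttt{Pattern}, \texttt{Monitor}, \texttt{matches}, \texttt{observe}, and \texttt{signal} are hypothetical, and the conductor is stood in for by a plain callback.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Pattern:
    """Hypothetical pattern: a name plus a rule describing its triggering event."""
    name: str
    matches: Callable[[str], bool]


class Monitor:
    """Hypothetical monitor: checks each event against every pattern it holds."""

    def __init__(self, patterns: Iterable[Pattern], signal: Callable[[str, str], None]):
        self.patterns = list(patterns)
        self.signal = signal  # stand-in for the channel to the conductor

    def observe(self, event: str) -> None:
        # Signal once for every pattern whose triggering event has occurred.
        for pattern in self.patterns:
            if pattern.matches(event):
                self.signal(pattern.name, event)


triggered: List[Tuple[str, str]] = []
monitor = Monitor(
    [Pattern("csv", lambda path: path.endswith(".csv"))],
    lambda name, event: triggered.append((name, event)),
)
monitor.observe("data/run1.csv")   # matches the "csv" pattern
monitor.observe("data/notes.txt")  # matches nothing, so no signal is sent
```

Keeping the signal as an injected callback means the monitor does not need to know what receives it, which is in line with the loose coupling the report aims to preserve.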
@ -98,19 +112,26 @@
\end{tcolorbox} \end{tcolorbox}
\section{Method} \section{Method}
\begin{tcolorbox}[colback=lightgray!30!white]
Explain the code I wrote and why I made those choices.
To address the identified limitations of MEOW and expand its capabilities, I will incorporate network event triggers into the existing event-based scheduler, supplementing the current file-based event triggers. My method uses Python's socket library to receive and process network events. The following subsections detail how the codebase was expanded, the design of the network event trigger mechanism, and the integration of this mechanism into the existing MEOW system.
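As a rough illustration of what receiving a network event with Python's socket library can look like, the sketch below accepts a single TCP connection on the loopback interface and hands the received payload to a callback. It is an assumption-laden example rather than the implementation used in MEOW: the function names \texttt{listen\_once}, \texttt{send\_trigger}, and \texttt{demo}, the use of TCP, and the choice of address are all hypothetical.

```python
import socket
import threading
from typing import Callable, List


def listen_once(server: socket.socket, on_event: Callable[[bytes], None]) -> None:
    """Accept one connection, read its whole payload, and report it as an event."""
    conn, _addr = server.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # an empty read means the sender closed the connection
                break
            chunks.append(data)
    on_event(b"".join(chunks))


def send_trigger(host: str, port: int, payload: bytes) -> None:
    """Client side: connect, send the payload, and close the connection."""
    with socket.create_connection((host, port)) as client:
        client.sendall(payload)


def demo() -> List[bytes]:
    """Run a listener and a client in one process and return what was received."""
    received: List[bytes] = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen()
        port = server.getsockname()[1]
        listener = threading.Thread(target=listen_once, args=(server, received.append))
        listener.start()
        send_trigger("127.0.0.1", port, b"validated")
        listener.join(timeout=5)
    return received
```

Because the listener reads until the sender closes the connection, the payload can be arbitrary data, unlike a design that triggers on the arrival of a single packet.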
\subsection{Design of the network event pattern}
\begin{tcolorbox}[colback=lightgray!30!white]
\begin{itemize} \begin{itemize}
\item Expanding on existing code, reusing boiler-plate code \item Expanding on existing code, reusing boiler-plate code
\item Test-driven development \item Attempts to preserve loose coupling of modules (any trigger should be able to connect to any handler) (this might not be entirely possible, but it's a good idea to attempt)
\item Experiments with triggering on packet \begin{itemize} \item Experiments with triggering on packet \begin{itemize}
\item Removes the ability to send arbitrary data \item Removes the ability to send arbitrary data
\end{itemize} \end{itemize}
\end{itemize} \end{itemize}
\end{tcolorbox} \end{tcolorbox}
\subsection{Integrating it into the existing codebase}
\begin{tcolorbox}[colback=lightgray!30!white]
Reusing file event triggering by way of temp files.
\end{tcolorbox}
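One possible reading of reusing file event triggering by way of temp files is sketched below: the payload of a network event is written to a new file in a directory that a file monitor already watches, so the existing file-based machinery fires without modification. The function name \texttt{write\_event\_file}, the \texttt{.event} suffix, and the write-then-rename step are assumptions made for illustration and are not taken from the MEOW codebase.

```python
import os
import tempfile
from pathlib import Path


def write_event_file(payload: bytes, monitored_dir: Path) -> Path:
    """Store a network payload as a new file inside the monitored directory."""
    monitored_dir.mkdir(parents=True, exist_ok=True)
    # Write under a ".tmp" name first and rename afterwards, so that a watcher
    # matching "*.event" never sees a partially written file.
    fd, tmp_name = tempfile.mkstemp(dir=str(monitored_dir), suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
    final_path = Path(tmp_name).with_suffix(".event")
    os.replace(tmp_name, final_path)
    return final_path
```

A file pattern that matches only the \texttt{.event} suffix would then trigger once per received message, and the handler can read the message contents from the file.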
\subsection{Testing}
\section{Results} \section{Results}
\begin{tcolorbox}[colback=lightgray!30!white] \begin{tcolorbox}[colback=lightgray!30!white]
@ -127,6 +148,9 @@
\subsection{Future Work} \subsection{Future Work}
\begin{tcolorbox}[colback=lightgray!30!white] \begin{tcolorbox}[colback=lightgray!30!white]
What should someone do if they want to fix my mistakes, or expand on them further? What should someone do if they want to fix my mistakes, or expand on them further?
\begin{itemize}
\item Implementation of the other options mentioned when discussing the socket library.
\end{itemize}
\end{tcolorbox} \end{tcolorbox}
\section{Conclusion} \section{Conclusion}

BIN
src/BIDS.png Normal file

Binary file not shown.
