✨
Binary file not shown.
@ -34,14 +34,15 @@
\textit{Scientific Workflow Management Systems} (SWMSs) are essential tools for automating, managing, and executing complex scientific processes involving large volumes of data and computational tasks\footnote{citation?}. Traditional SWMSs employ a linear, sequential approach, in which tasks are performed in an order fixed in advance by the workflow definition. While this approach suits certain applications, it is not always the best choice: sequential processing can prove inefficient when the next step of the process should adapt to the outcome of the previous one. For these use cases a dynamic scheduler is required, of which \textit{Managing Event Oriented Workflows}\autocite{DavidMEOW} (MEOW) is one.
\begin{tcolorbox}[colback=lightgray!30!white]
Expand on DAGs' inability to adapt
Expand on DAGs' inability to adapt. Plagiarize David's thesis.
\end{tcolorbox}
MEOW employs an event-based scheduler, in which jobs are performed non-linearly, triggered based on events\footnote{citation?}. By dynamically adapting the execution order based on the outcomes of previous tasks or external factors, MEOW provides a more efficient and flexible solution for processing large volumes of experimental data\footnote{citation?}.
MEOW employs an event-based scheduler, in which jobs are performed non-linearly (\textbf{Better word here}), triggered based on events\footnote{citation?}. By dynamically adapting the execution order based on the outcomes of previous tasks or external factors, MEOW provides a more efficient and flexible solution for processing large volumes of experimental data\footnote{citation?}.
\begin{tcolorbox}[colback=lightgray!30!white]
\begin{itemize}
\item Expand on what ``efficient'' is
\item What work am I doing on MEOW?
\item How did it go?
\item Introduce the concept of network events.
@ -55,13 +56,13 @@
While file events work well as a trigger on their own, there are several scenarios where a different trigger would be preferred or even required, especially when dealing with distributed systems or remote operations. To address these shortcomings and further enhance MEOW's capabilities, the integration of network event triggers would provide significant benefits in several key use-cases.
Firstly, network event triggers would allow for manual triggering of jobs remotely, without the need for direct access to the monitored files. This is particularly useful in scenarios where human intervention or decision-making is required before proceeding with the subsequent steps in a workflow. While it is possible to manually trigger a job using file events by making changes to the monitored directories, this might lead to an already running job accessing the files at the same time, which could cause problems with data integrity.
Firstly, network event triggers would allow for manual triggering of jobs remotely, without the need for direct access to the monitored files. This is particularly useful in human-in-the-loop scenarios, where human intervention or decision-making is required before proceeding with the subsequent steps in a workflow. While it is possible to manually trigger a job using file events by making changes to the monitored directories, this might lead to an already running job accessing the files at the same time, which could cause problems with data integrity.
Secondly, incorporating network event triggers would facilitate seamless communication between parallel runners, ensuring that tasks can efficiently exchange information and synchronize their progress.
Secondly, incorporating network event triggers would facilitate communication between parallel runners, allowing tasks to exchange information and progress updates efficiently. This gives a better overview of the whole workflow and improves visibility and control.
Finally, extending MEOW's event-based scheduler to support network event triggers would enable the simple and efficient exchange of data between workflows running on different machines. This feature is particularly valuable in distributed computing environments, where data processing tasks are often split across multiple systems to maximize resource utilization and minimize latency.
Integrating network event triggers into MEOW would provide an advantage specifically in the context of heterogeneous workflows, which incorporate a mix of different tasks running on diverse computing environments. By their nature, these workflows can involve tasks running on different systems, potentially even in different physical locations, which need to exchange data or coordinate their progress. Currently, MEOW's reliance on local file events as triggers can be a limiting factor in these scenarios. Network event triggers offer a powerful solution to this challenge. They can not only handle tasks running across different machines, but also dynamically adapt to the changing requirements of a heterogeneous workflow, such as triggering new tasks based on the results of remote computations. Thus, the addition of network event triggers is a significant step in enhancing MEOW's already robust handling of heterogeneous workflows, bolstering its utility in today's diverse and distributed computing landscape.
Integrating network event triggers into MEOW would provide an advantage specifically in the context of heterogeneous workflows, which incorporate a mix of different tasks running in diverse computing environments. By their nature, these workflows can involve tasks running on different systems, potentially even in different physical locations, which need to exchange data or coordinate their progress. The figure below presents an example of a heterogeneous workflow.
\begin{figure}[H]
\begin{center}
@ -70,6 +71,8 @@
\caption{An example of a heterogeneous workflow}
\end{figure}
The example workflow requires several ``halting-points'', at which data must be transferred between the instrument, the instrument storage, centralized storage, High Performance Computing (HPC) resources, and a human interaction point. Network events can, for the reasons outlined earlier in this section, be used to prevent the workflow from halting when these points are reached.
\subsection{Background}
\subsubsection{The structure of MEOW}
@ -81,7 +84,7 @@
\begin{center}
\includegraphics[width=0.6\textwidth]{src/monitor.png}
\end{center}
\caption{The monitor's role in MEOW's event-based system.}
\caption{\textbf{Redo this to fit with the current version.} The monitor's role in MEOW's event-based system.}
\end{figure}
\begin{tcolorbox}[colback=blue!30!white]