Parallel flows - Forking a flow

Parallel flows are still sequential - but they flow along more than one path at the same time. Thus they are also parallel. Maybe you want to process data in two different ways at the same time, e.g. not only you want to transform the words of a text but also log them.

fig4.gif

Of course both activities should not concern the splitting of the text. So what you can do is, instead of passing the words directly to the transformation you can fork the word flow and pass them to the transformation stage and the logging stage at the same time.

var flow = Flow<string>
    .Do<string>(SplitIntoWords)
    .Fork()
        .Do<string, EmptyValue>(TransformWords, LogWords);

A fork is a flow stage which has one input "channel" and (currently) two output "channels". All those channels can be of the same type as in the above example. Or they can be all different.

fig5.gif

If they are all the same, then the input value can be automatically passed on to all the branches like above. But you always can take control of that and do your own fork processing, e.g.

var flow = new ForkedFlow<Tuple<string, int>, string, int>(
    (tp, ps, pi) => { ps.Post(tp.Item0);
                      pi.Post(tp.Item1); }
    );

The example scenario thus could also have been implemented like this:

var flow = Flow<string>
    .Fork<string>(ForkedSplitIntoWords)
        .Do<string, EmptyValue>(TransformWords, LogWords);
...
void ForkedSplitIntoWords(string text, Port<string> words0, Port<string> words1)
{
    foreach (string word in text.Split(' '))
    {
        words0.Post(word);
        words1.Post(word);
    }
}

Compare this to the first forking solution: The forked splitting knows about the forking. It has two output ports. That makes the splitting logic dependent on a certain kind of flow, namely a forked flow. To foster re-use of stages you should avoid this. Keep the forking code as dump as possible. The first solution can use the splitting stage in many different contexts; it´s just taking a text and producing words. That´s as simple as it can get - and can be combined with other stages before or after it.

Multi stage branches

As you already saw above, a fork has two output "channels" both of which can be connected to input "channels" of subsequent stages. You do this by putting a Do<>() after a fork:

var flow = Flow<int>
  .Fork<int, string, bool>((i, ps, pb)=>{ps.Post(i.ToString()); pb.Post(i % 2==0);})
    .Do<...>(ProcessStrings, ProcessEvenNumbers);

The result of the Fork<>() as well as its Do<>() return a ForkedFlow<> object with an output "channel" for each branch.

Of course the parallel paths after a fork can consist of more than one stage. They are (sub)flows in their own right. Consider this forked flow:

fig6.gif

You could implement it like this (leaving out the "channel" types):

var flow = new ForkedFlow<,,>(...)
    .Do<,>(new Flow<>.Do<>(A).Do<>(B),
           new Flow<>.Do<>(C).Do<>(D).Do<>(E));

As you can see, the Do<>() of the ForkedFlow<> is used to nest sub-flows within the main flow. That means, you could of course also nest another fork in an outer fork´s branch:

var flow = new ForkedFlow<,,>(...)
    .Do<,>(new ForkedFlow<,,>(...).Do<,>(...)..., 
           ...);

This works - but needs one more details: joins. A flow that forks needs to be joined if it´s nested. So read on about JoiningFlows.

Last edited Jul 2, 2009 at 3:42 PM by ralfw, version 6

Comments

No comments yet.