I really don’t like overloading pipes like this. I would rather chain methods, the way the Django ORM does.
You could reassign on every line, but it would look nicer with chained functions.
pipeline = task(get_data, branch=True)
pipeline = pipeline | task(step1, workers=20)
pipeline = pipeline | task(step2, workers=20)
pipeline = pipeline | task(step3, workers=20, multiprocess=True)
Edit: I would be tempted to do something like this:
steps = [
    task(step1, workers=20),
    task(step2, workers=20),
    task(step3, workers=20, multiprocess=True),
]
pipeline = task(get_data, branch=True)
for step in steps:
    pipeline = pipeline.__or__(step)
According to the docs, | is syntactic sugar for the .pipe method.
pipeline = (
    task(get_data, branch=True)
    .pipe(task(step1, workers=20))
    .pipe(task(step2, workers=20))
    .pipe(task(step3, workers=20, multiprocess=True))
)
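For anyone wondering how the two spellings can be equivalent, here is a minimal sketch using a toy class. `Stage`, `add_one`, and `double` are made up for illustration and are not Pyper's implementation; the sketch only shows `__or__` forwarding to a `pipe` method.

```python
class Stage:
    """Hypothetical stand-in for a pipeline object; not part of Pyper."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def pipe(self, other):
        # Return a new Stage that runs our functions, then the other's.
        return Stage(*self.funcs, *other.funcs)

    def __or__(self, other):
        # `a | b` simply forwards to `a.pipe(b)`.
        return self.pipe(other)

    def __call__(self, value):
        # Feed the value through each function in order.
        for func in self.funcs:
            value = func(value)
        return value


def add_one(x):
    return x + 1


def double(x):
    return x * 2


piped = Stage(add_one) | Stage(double)
chained = Stage(add_one).pipe(Stage(double))
```

Under these assumptions both objects hold the same functions in the same order, so which spelling to use is purely a matter of taste.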
The .pipe version is probably the chained-method approach for those who prefer it. This style looks pretty good to me:
pipeline = task(...)
pipeline |= task(...)
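The `|=` form should work for any object that defines `__or__`, even without `__ior__`: Python falls back to `__or__` and rebinds the name to the result. A small sketch with a hypothetical `Stage` class (not Pyper's) shows that fallback:

```python
class Stage:
    """Hypothetical toy pipeline; defines __or__ but not __ior__."""

    def __init__(self, names):
        self.names = tuple(names)

    def __or__(self, other):
        # Build a new object instead of mutating either operand.
        return Stage(self.names + other.names)


pipeline = Stage(["get_data"])
original = pipeline  # keep a reference to the first object

# With no __ior__, each `|=` is evaluated as `pipeline = pipeline | ...`.
pipeline |= Stage(["step1"])
pipeline |= Stage(["step2"])
```

Here `original` is left unchanged, and `pipeline` ends up pointing at a new object that holds all three names.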
This style looks good too:
steps = [task(...), task(...)]
pipeline = functools.reduce(operator.or_, steps)
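A runnable sketch of the reduce approach, again with a hypothetical `Stage` class standing in for whatever `task(...)` returns (an assumption, not Pyper's actual type). `functools.reduce(operator.or_, steps)` folds the list left to right, just as writing `steps[0] | steps[1] | steps[2]` by hand would.

```python
import functools
import operator


class Stage:
    """Hypothetical toy pipeline that records stage names in order."""

    def __init__(self, *names):
        self.names = names

    def __or__(self, other):
        return Stage(*self.names, *other.names)


steps = [Stage("get_data"), Stage("step1"), Stage("step2")]

# Equivalent to steps[0] | steps[1] | steps[2].
pipeline = functools.reduce(operator.or_, steps)
```

One caveat: without an initial value, `functools.reduce` raises `TypeError` on an empty list, so this style assumes at least one step.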
But it appears you can just change "task" to "Task" and then:
pipeline = pyper.Pipeline([Task(...), Task(...)])