kirillovmr/python-pipeline

Python Data Pipeline

Process any type of data in your projects easily and control the flow of your data!

Supports Python 3.5, 3.6, and 3.7.


Installation

Install smart-pipeline with:

pip install --upgrade smart-pipeline

Usage

The 'smart_pipeline' package provides a Pipeline class:

# Import Pipeline class
from smart_pipeline import Pipeline

# Create an instance
pl = Pipeline()

The Pipeline class has three types of pipes: item, data, and stat.

An item pipe modifies each item in the dataset individually, without adding or removing items:

data = [1, 2, 3, 4, 5]

# Define an item function
def addOne(item):
    return item + 1

# Add the function to the pipeline
pl.addItemPipe(addOne)
# Pass the data through the pipeline
res = pl(data)

# res = [2, 3, 4, 5, 6]
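
For intuition, the result above is the same as what Python's built-in map produces. The sketch below does not use smart_pipeline at all; it is only a plain-Python analogy, and the name add_one is chosen here for illustration.

```python
# Plain-Python analogy of an item pipe (not the library's implementation).
def add_one(item):
    return item + 1

data = [1, 2, 3, 4, 5]

# map applies the function to every item; the number of items is unchanged.
mapped = list(map(add_one, data))
print(mapped)
```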

A data pipe is a filter: it keeps only the items for which the function returns True:

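data = [1, 2, 3, 4, 5]

# Create a fresh pipeline so previously added pipes do not affect the result
pl = Pipeline()

# Return True for the items to keep
def onlyOdd(item):
    return item % 2 != 0

pl.addDataPipe(onlyOdd)
res = pl(data)

# res = [1, 3, 5]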
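
The same selection can be written with Python's built-in filter. Again, this is a library-free sketch meant only to show the idea; only_odd is an illustrative name, not part of smart_pipeline.

```python
# Plain-Python analogy of a data pipe (not the library's implementation).
def only_odd(item):
    return item % 2 != 0

data = [1, 2, 3, 4, 5]

# filter keeps the items for which the predicate returns True;
# the surviving items themselves are not modified.
kept = list(filter(only_odd, data))
print(kept)
```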

A stat pipe reduces over the data: an accumulator function is called once per item with the accumulated value, and a second function is called at the end with the final result:

data = [1, 2, 3, 4, 5]

# Create a fresh pipeline so previously added pipes do not affect the result
pl = Pipeline()

# Accumulator function, called once for each item in the dataset
def countNumberStat(stats, item):
    stats["total"] += 1
    if item % 2 == 0:
        stats["even"] += 1
    else:
        stats["odd"] += 1
    return stats

# Function called once at the end with the accumulated stats
def printNumberStat(stats):
    print(stats["total"], "items were processed in total.")
    print(stats["even"], "of them are even.")
    print(stats["odd"], "of them are odd.")

# Pass the initial state as the third argument
pl.addStatPipe(countNumberStat, printNumberStat, {"total": 0, "even": 0, "odd": 0})
pl(data)

# Output:
# 5 items were processed in total.
# 2 of them are even.
# 3 of them are odd.
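
The accumulation pattern above resembles functools.reduce with an explicit initial value. The following is a standalone sketch under that assumption; it does not call smart_pipeline, and count_number_stat is an illustrative name.

```python
from functools import reduce

# Plain-Python analogy of a stat pipe (not the library's implementation).
def count_number_stat(stats, item):
    stats["total"] += 1
    if item % 2 == 0:
        stats["even"] += 1
    else:
        stats["odd"] += 1
    return stats

data = [1, 2, 3, 4, 5]

# The third argument to reduce plays the role of the initial state.
initial = {"total": 0, "even": 0, "odd": 0}
final_stats = reduce(count_number_stat, data, initial)
print(final_stats)
```

Because the accumulator mutates the dictionary in place, reusing the same initial dictionary for a second run of this sketch would carry over the earlier counts, so build a new dictionary for each run.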


If this library solved some of your problems, please consider starring the project 😉

And feel free to create pull requests!
