Commit 81906a2

Merge branch 'master' into 393-foster-import

2 parents: 2e713e3 + 3cd8f80

20 files changed: +722 −929 lines

README.md

Lines changed: 19 additions & 47 deletions

@@ -12,61 +12,33 @@ animal care programs), administration and development efforts are coordinated by
 
 ## [The Data Pipeline](https://codeforphilly.org/projects/paws_data_pipeline)
 
-This project seeks to provide PAWS with an easy-to-use and easy-to-support tool to extract
-data from multiple source systems, confirm accuracy and appropriateness,
-clean/validate data where necessary (a data hygiene and wrangling step),
-and then load relevant data into one or more repositories to facilitate
-(1) a highly-accurate and rich 360-degree view of PAWS constituents
-(Salesforce is a likely candidate target system; already in use at PAWS) and
-(2) flexible ongoing data analysis and insights discovery (e.g. a data lake / data warehouse).
 Through all of its operational and service activities, PAWS accumulates data regarding donations,
 adoptions, fosters, volunteers, merchandise sales, event attendees (to name a few),
-each in their own system and/or manual (Google Sheet) tally. This vital data that can
+each in their own system and/or manual tally. This vital data that can
 drive insights remains siloed and is usually difficult to extract, manipulate, and analyze.
-Taking all of this data, making it readily available, and drawing inferences through analysis
-can drive many benefits:
-
-- PAWS operations can be better informed and use data-driven decisions to guide programs
-and maximize effectiveness;
-- Supporters can be further engaged by suggesting additional opportunities for involvement
-based upon pattern analysis;
-- Multi-dimensional supporters can be consistently (and accurately) acknowledged for all
-the ways they support PAWS (i.e. a volunteer who donates and also fosters kittens),
-not to mention opportunities to further tap the potential of these enthusiastic supporters.
-
-## [Code of Conduct](https://codeforphilly.org/pages/code_of_conduct)
-
-This is a Code for Philly project operating under their code of conduct.
-
-## Getting started
-see [Getting Started](GettingStarted.md) to run the app locally
 
-## Project Plan
+This project provides PAWS with an easy-to-use and easy-to-support tool to extract
+constituent data from multiple source systems, standardize extracted data, match constituents across data sources,
+load relevant data into Salesforce, and run an automation in Salesforce to produce an RFM score.
+Through these processes, the PAWS data pipeline has laid the groundwork for an up-to-date 360-degree view of PAWS constituents and
+flexible ongoing data analysis and insights discovery.
 
-### Phase 1 (now - Jan 15 2020)
+## Uses
 
-**Goal**: Create a central storage of data where
+- The pipeline can inform the PAWS development team of new constituents through volunteer or foster engagement
+- Instead of manually matching constituents from volunteering, donations, and fosters/adoptions, PAWS staff only need to upload the volunteer dataset into the pipeline, and the pipeline handles the matching
+- Volunteer and foster data are automatically loaded into the constituent's Salesforce profile
+- An RFM score is calculated for each constituent using the most recent data
+- Data analyses can use the output of the PDP matching logic to join datasets from different sources; PAWS can benefit from such analyses in the following ways:
+  - PAWS operations can be better informed and use data-driven decisions to guide programs and maximize effectiveness;
+  - Supporters can be further engaged by suggesting additional opportunities for involvement based upon pattern analysis;
+  - Multi-dimensional supporters can be consistently (and accurately) acknowledged for all the ways they support PAWS (i.e. a volunteer who donates and also fosters kittens), not to mention opportunities to further tap the potential of these enthusiastic supporters.
 
-1. Datasets from top 3 relevant sources can be uploaded as csvs to a central system: a) Donors, b) Volunteers,
-c) Adopters
-2. All datasets in the central system can be linked to each other on an ongoing basis
-3. Notifications can be sent out to relevant parties when inconsistencies need to be handled by a human
-4. Comprehensive report on a person's interactions with PAWS can be pulled via a simple UI (must include full known history)
-
-### Phase 2 (Jan 15 - May 15 2020)
-
-**Goal**: Expand above features to include all relevant datasets and further automate data uploads
-Datasets from all other relevant sources can be uploaded as csvs to a central system ( a) Adoption and Foster applicants,
-b) Foster Parents, c) Attendees, d) Clinic Clients e) Champions, f) Friends)
-Where APIs exist, create automated calls to those APIs to pull data
-
-### Phase 3 (May 15 - Sept 15 2020)
+## [Code of Conduct](https://codeforphilly.org/pages/code_of_conduct)
 
-**Goal**: Create more customizable analytics reports and features (eg noshow rates in clinicHQ)
+This is a Code for Philly project operating under their code of conduct.
 
 ## Links
 
-[Slack Channel](https://codeforphilly.org/chat?channel=paws_data_pipeline)
-
-[Google Drive](https://drive.google.com/open?id=1O8oPWLT5oDL8q_Tm4a0Gt8XCYYxEIcjiPJYHm33lXII)
+[Slack Channel](https://codeforphilly.org/chat?channel=paws_data_pipeline)
+[Wiki](https://github.com/CodeForPhilly/paws-data-pipeline/wiki)
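
The cross-source matching the README describes can be sketched roughly as follows. This is an illustrative sketch only: the real PDP matching logic is more involved (it also considers names and phone numbers, per the indexes in the migration below this commit), and `normalize_email`, `match_constituents`, and the sample records here are invented for the example.

```python
# Illustrative sketch of matching constituents across source systems on a
# normalized email key. All function names and records here are invented;
# the real matching rules live in the pipeline code.

def normalize_email(email):
    """Lowercase and strip whitespace so 'Jane@X.org ' matches 'jane@x.org'."""
    return email.strip().lower() if email else None

def match_constituents(sources):
    """Group records from several source systems by normalized email.

    `sources` maps a source name (e.g. 'volgistics') to a list of record
    dicts; returns {normalized_email: [(source, record), ...]}.
    """
    matched = {}
    for source, records in sources.items():
        for record in records:
            key = normalize_email(record.get("email"))
            if key:
                matched.setdefault(key, []).append((source, record))
    return matched

sources = {
    "volgistics": [{"first_name": "Jane", "email": "Jane@example.org "}],
    "salesforcecontacts": [{"first_name": "Jane", "email": "jane@example.org"}],
}
groups = match_constituents(sources)
# 'jane@example.org' now links the volunteer record to the Salesforce contact.
```

A grouping like this is what lets volunteer and foster activity land on the right Salesforce profile without manual matching by staff.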

src/client/src/pages/Admin.js

Lines changed: 1 addition & 1 deletion

@@ -243,7 +243,7 @@ export default function Admin(props) {
             <Typography variant="h5" styles={{paddingBottom: 5}}>Run New Analysis</Typography>
             <form onSubmit={handleExecute}>
                 <Button type="submit" variant="contained" color="primary"
-                        disabled={statistics === 'Running' || isNewFileExist === false}>
+                        disabled={statistics === 'Running'}>
                     Run Data Analysis
                 </Button>
             </form>

src/helm-chart/templates/ingress.yaml

Lines changed: 6 additions & 7 deletions

@@ -1,11 +1,7 @@
 {{- if .Values.ingress.enabled -}}
 {{- $fullName := include "helm-chart.fullname" . -}}
 {{- $svcPort := .Values.service.port -}}
-{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
-apiVersion: networking.k8s.io/v1beta1
-{{- else -}}
-apiVersion: extensions/v1beta1
-{{- end }}
+apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
   name: {{ $fullName }}
@@ -33,9 +29,12 @@ spec:
       paths:
       {{- range .paths }}
       - path: {{ . }}
+        pathType: Prefix
        backend:
-          serviceName: {{ $fullName }}
-          servicePort: {{ $svcPort }}
+          service:
+            name: {{ $fullName }}
+            port:
+              number: {{ $svcPort }}
     {{- end }}
   {{- end }}
 {{- end }}
Lines changed: 106 additions & 0 deletions

@@ -0,0 +1,106 @@
+"""postgres matching
+
+Revision ID: 45a668fa6325
+Revises: fc7325372396
+Create Date: 2022-02-10 16:19:13.283250
+
+"""
+from alembic import op
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+# revision identifiers, used by Alembic.
+revision = '45a668fa6325'
+down_revision = 'fc7325372396'
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.create_table('manual_matches',
+        sa.Column('source_type_1', sa.String(), nullable=False),
+        sa.Column('source_id_1', sa.String(), nullable=False),
+        sa.Column('source_type_2', sa.String(), nullable=False),
+        sa.Column('source_id_2', sa.String(), nullable=False),
+        sa.PrimaryKeyConstraint('source_type_1', 'source_id_1', 'source_type_2', 'source_id_2')
+    )
+    op.create_table('salesforcecontacts',
+        sa.Column('_id', sa.Integer(), nullable=False),
+        sa.Column('contact_id', sa.String(), nullable=True),
+        sa.Column('first_name', sa.String(), nullable=True),
+        sa.Column('last_name', sa.String(), nullable=True),
+        sa.Column('account_name', sa.String(), nullable=True),
+        sa.Column('mailing_country', sa.String(), nullable=True),
+        sa.Column('mailing_street', sa.String(), nullable=True),
+        sa.Column('mailing_city', sa.String(), nullable=True),
+        sa.Column('mailing_state_province', sa.String(), nullable=True),
+        sa.Column('mailing_zip_postal_code', sa.String(), nullable=True),
+        sa.Column('phone', sa.String(), nullable=True),
+        sa.Column('mobile', sa.String(), nullable=True),
+        sa.Column('email', sa.String(), nullable=True),
+        sa.Column('json', postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        sa.Column('created_date', sa.DateTime(), nullable=True),
+        sa.PrimaryKeyConstraint('_id')
+    )
+    op.create_table('shelterluvpeople',
+        sa.Column('_id', sa.Integer(), nullable=False),
+        sa.Column('firstname', sa.String(), nullable=True),
+        sa.Column('lastname', sa.String(), nullable=True),
+        sa.Column('id', sa.String(), nullable=True),
+        sa.Column('internal_id', sa.String(), nullable=True),
+        sa.Column('associated', sa.String(), nullable=True),
+        sa.Column('street', sa.String(), nullable=True),
+        sa.Column('apartment', sa.String(), nullable=True),
+        sa.Column('city', sa.String(), nullable=True),
+        sa.Column('state', sa.String(), nullable=True),
+        sa.Column('zip', sa.String(), nullable=True),
+        sa.Column('email', sa.String(), nullable=True),
+        sa.Column('phone', sa.String(), nullable=True),
+        sa.Column('animal_ids', postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        sa.Column('json', postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        sa.Column('created_date', sa.DateTime(), nullable=True),
+        sa.PrimaryKeyConstraint('_id')
+    )
+    op.create_table('volgistics',
+        sa.Column('_id', sa.Integer(), nullable=False),
+        sa.Column('number', sa.String(), nullable=True),
+        sa.Column('last_name', sa.String(), nullable=True),
+        sa.Column('first_name', sa.String(), nullable=True),
+        sa.Column('middle_name', sa.String(), nullable=True),
+        sa.Column('complete_address', sa.String(), nullable=True),
+        sa.Column('street_1', sa.String(), nullable=True),
+        sa.Column('street_2', sa.String(), nullable=True),
+        sa.Column('street_3', sa.String(), nullable=True),
+        sa.Column('city', sa.String(), nullable=True),
+        sa.Column('state', sa.String(), nullable=True),
+        sa.Column('zip', sa.String(), nullable=True),
+        sa.Column('all_phone_numbers', sa.String(), nullable=True),
+        sa.Column('home', sa.String(), nullable=True),
+        sa.Column('work', sa.String(), nullable=True),
+        sa.Column('cell', sa.String(), nullable=True),
+        sa.Column('email', sa.String(), nullable=True),
+        sa.Column('json', postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        sa.Column('created_date', sa.DateTime(), nullable=True),
+        sa.PrimaryKeyConstraint('_id')
+    )
+    op.create_index('idx_pdp_contacts_source_type_and_id', 'pdp_contacts', ['source_type', 'source_id'], unique=False)
+    op.create_index(op.f('ix_pdp_contacts_mobile'), 'pdp_contacts', ['mobile'], unique=False)
+    op.create_index(op.f('idx_pdp_contacts_lower_first_name'), 'pdp_contacts', [sa.text('lower(first_name)')], unique=False)
+    op.create_index(op.f('idx_pdp_contacts_lower_last_name'), 'pdp_contacts', [sa.text('lower(last_name)')], unique=False)
+    op.create_index(op.f('idx_pdp_contacts_lower_email'), 'pdp_contacts', [sa.text('lower(email)')], unique=False)
+    # ### end Alembic commands ###
+
+
+def downgrade():
+    # ### commands auto generated by Alembic - please adjust! ###
+    # Note: index names must match those created in upgrade() above.
+    op.drop_index(op.f('idx_pdp_contacts_lower_email'), table_name='pdp_contacts')
+    op.drop_index(op.f('idx_pdp_contacts_lower_last_name'), table_name='pdp_contacts')
+    op.drop_index(op.f('idx_pdp_contacts_lower_first_name'), table_name='pdp_contacts')
+    op.drop_index(op.f('ix_pdp_contacts_mobile'), table_name='pdp_contacts')
+    op.drop_index('idx_pdp_contacts_source_type_and_id', table_name='pdp_contacts')
+    op.drop_table('volgistics')
+    op.drop_table('shelterluvpeople')
+    op.drop_table('salesforcecontacts')
+    op.drop_table('manual_matches')
+    # ### end Alembic commands ###
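
The `lower(...)` expression indexes created above exist so that name and email matching can be case-insensitive without a full scan on every lookup. A minimal illustration of the idea, using an in-memory SQLite table standing in for `pdp_contacts` (the real table, indexes, and queries live in Postgres; this is only a sketch of the technique):

```python
import sqlite3

# Stand-in for pdp_contacts: an index on the lowered expression lets
# case-insensitive comparisons be served by the index rather than a scan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pdp_contacts (first_name TEXT, email TEXT)")
conn.execute("CREATE INDEX idx_lower_email ON pdp_contacts (lower(email))")
conn.execute("INSERT INTO pdp_contacts VALUES ('Jane', 'Jane@Example.org')")

# Matches despite the differing case, because both sides are lowered.
rows = conn.execute(
    "SELECT first_name FROM pdp_contacts WHERE lower(email) = lower(?)",
    ("jane@example.org",),
).fetchall()
print(rows)  # [('Jane',)]
```

Postgres supports the same pattern (`CREATE INDEX ... ON pdp_contacts (lower(email))`), which is what the `sa.text('lower(email)')` column expressions in the migration produce.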
Lines changed: 2 additions & 2 deletions

@@ -1,7 +1,7 @@
 from api.API_ingest import shelterluv_api_handler
 
-def start():
+def start(conn):
     print("Start Fetching raw data from different API sources")
     # Run each source to store the output in dropbox and in the container as a CSV
-    shelterluv_api_handler.store_shelterluv_people_all()
+    shelterluv_api_handler.store_shelterluv_people_all(conn)
     print("Finish Fetching raw data from different API sources")
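
The `start(conn)` change threads a single database connection from the caller down into each ingest step, rather than having every handler open its own. A minimal self-contained analog of that pattern, using `sqlite3` purely for illustration (the project itself passes a Postgres connection, and `store_people` here is a hypothetical stand-in for `store_shelterluv_people_all`):

```python
import sqlite3

# Hypothetical analog of start(conn): the caller owns the connection's
# lifecycle, and each ingest step receives it as an argument.
def store_people(conn):
    conn.execute("INSERT INTO people (name) VALUES ('Jane')")

def start(conn):
    print("Start fetching raw data")
    store_people(conn)  # analogous to store_shelterluv_people_all(conn)
    print("Finish fetching raw data")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
start(conn)
count = conn.execute("SELECT count(*) FROM people").fetchone()[0]
```

Injecting the connection this way keeps transaction scope and cleanup in one place and makes the ingest steps easy to test against a throwaway database.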

src/server/api/API_ingest/shelterluv_api_handler.py

Lines changed: 8 additions & 5 deletions

@@ -1,10 +1,12 @@
-import os
-import requests
 import csv
+import os
 import time
 
-from constants import RAW_DATA_PATH
+import requests
+import pandas as pd
 from api.API_ingest.dropbox_handler import upload_file_to_dropbox
+from constants import RAW_DATA_PATH
+from models import ShelterluvPeople
 
 try:
     from secrets_dict import SHELTERLUV_SECRET_TOKEN
@@ -60,7 +62,7 @@ def write_csv(json_data):
 
 ''' Iterate over all Shelterluv people and store in a JSON file in the raw data folder.
 We fetch 100 items in each request, since that is the limit based on our research '''
-def store_shelterluv_people_all():
+def store_shelterluv_people_all(conn):
     offset = 0
     LIMIT = 100
     has_more = True
@@ -90,8 +92,9 @@ def store_shelterluv_people_all():
     file_path = write_csv(shelterluv_people)
     print("Finish storing latest shelterluvpeople results to container")
 
-
     print("Start storing " + '/shelterluv/' + "results to dropbox")
     upload_file_to_dropbox(file_path, '/shelterluv/' + file_path.split('/')[-1])
     print("Finish storing " + '/shelterluv/' + "results to dropbox")
 
+    print("Uploading shelterluvpeople csv to database")
+    ShelterluvPeople.insert_from_df(pd.read_csv(file_path, dtype="string"), conn)
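
The offset/limit loop inside `store_shelterluv_people_all` can be sketched as follows. `fetch_page` is a fake stand-in for the real Shelterluv API request (the real handler also writes a CSV, uploads it to Dropbox, and inserts it into the database), and the page size of 100 comes from the comment in the source:

```python
# Sketch of the offset-based pagination used to pull all people from the
# Shelterluv API. fetch_page fakes the HTTP request against 250 records.
LIMIT = 100  # the API returns at most 100 items per request

def fetch_page(offset, limit, _data=list(range(250))):
    """Fake paged endpoint: returns (items, has_more)."""
    items = _data[offset:offset + limit]
    return items, offset + limit < len(_data)

def fetch_all_people():
    people, offset, has_more = [], 0, True
    while has_more:
        items, has_more = fetch_page(offset, LIMIT)
        people.extend(items)
        offset += LIMIT
    return people

people = fetch_all_people()
len(people)  # 250 records gathered across three requests (100 + 100 + 50)
```

The loop terminates because `has_more` goes false on the first short page, so a final empty request is never issued.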
