Changes from all commits (26 commits)
b752705
moving Geography into models module; trimming api keys
jjackson12 Jun 20, 2023
5c2edb8
removing broken tests
jjackson12 Jun 20, 2023
5ba1dcc
templating get_elections
jjackson12 Jun 20, 2023
fab8491
templating elections more
jjackson12 Jun 20, 2023
eede1f6
passed tests for url building, api query
KathrynPanger Jul 5, 2023
8a0ce89
basic search function added for census tables
KathrynPanger Jul 5, 2023
89cfdd3
search function now includes variable (group level) names
KathrynPanger Jul 6, 2023
bd366a0
search function cleaned up
KathrynPanger Jul 6, 2023
2aa26aa
clean up comments
KathrynPanger Jul 6, 2023
dd8e25c
can now search variables without search string or number of results
KathrynPanger Jul 21, 2023
56680bf
query now possible with nested results
KathrynPanger Jul 25, 2023
75b757f
groups now accepted in api calls
KathrynPanger Jul 26, 2023
181bbec
repaired other tests to fit new geography structure
KathrynPanger Jul 26, 2023
862accb
dataframe creation under ideal conditions, functionality added
KathrynPanger Jul 26, 2023
55a2331
now possible to pull from more geographies with more unconventional a…
KathrynPanger Jul 26, 2023
bb63a0e
added examples
KathrynPanger Jul 31, 2023
23f9cac
fixed broken test, argument wasn't accepting groups
KathrynPanger Jul 31, 2023
8bbec97
started elections infrastructure
KathrynPanger Aug 1, 2023
7ea67a3
threading, but not grabbing everything yet
KathrynPanger Aug 14, 2023
ed40d8d
api functionality, with authentication
KathrynPanger Aug 24, 2023
e0f0639
can now download all csvs, also refactored
KathrynPanger Aug 26, 2023
3de666d
tool to collect all csv urls added,
KathrynPanger Aug 26, 2023
3277b7c
sql command functionality
KathrynPanger Sep 2, 2023
511f3c6
changed test main
KathrynPanger Sep 7, 2023
96a6ee2
Merge branch 'bluebonnet-data:master' into master
KathrynPanger Sep 7, 2023
0c85b39
cleaning up google cloud interface
KathrynPanger Dec 4, 2023
8 changes: 8 additions & 0 deletions .gitignore
@@ -6,6 +6,7 @@ __pycache__
.vscode
.tox
.python-version
.DS_Store

# Sublime environment
*.sublime-project
@@ -18,6 +19,8 @@ build

# User files
user
/local_files
/cached_csvs

# Shapefiles
*.dbf
@@ -31,6 +34,11 @@ user
*.json
*.html
*.csv
*.txt

# Shh, secrets
census_api_key.txt

#api key
config.py
*config.py
8 changes: 8 additions & 0 deletions .idea/.gitignore

19 changes: 19 additions & 0 deletions .idea/censusExplorer.iml

6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml

12 changes: 12 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

28 changes: 14 additions & 14 deletions README.md
@@ -38,14 +38,14 @@ For our working example, we'll use median household income (which is coded in th
We can simply download the census data like so:

```python
>>> from bbd import census
>>> data = census.get_acs(
>>> geography=census.Geography.CD,
>>> variables="group(DP03),NAME",
>>> year=2018,
>>> state="co",
>>> dataset=census.DataSets.ACS5_PROFILE,
>>> )
>>> from bbd import census
>>> data = census.get_acs(
>>> geography=census.GEOGRAPHY.CD,
>>> variables="group(DP03),NAME",
>>> year=2018,
>>> state="co",
>>> dataset=census.DataSet.ACS5_PROFILE,
>>> )
```

https://api.census.gov/data/2018/acs/acs1/profile?get=group(DP03),NAME&for=congressional%20district:*&in=state:08
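
For reference, that is roughly the raw query that `get_acs` builds and sends for you. A minimal sketch of the equivalent direct request using the `requests` library (an illustration only, not the library's internal implementation):

```python
import requests

# The same query, issued directly against the Census API.
url = (
    "https://api.census.gov/data/2018/acs/acs1/profile"
    "?get=group(DP03),NAME"
    "&for=congressional%20district:*"
    "&in=state:08"
)
rows = requests.get(url).json()  # first row is the header, the rest are data rows
```
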
@@ -86,12 +86,12 @@ When you request it from the API with `get_acs`, it is automatically converted t
Luckily, the census provides shapefiles for pretty much every `GEO_ID` you can find in the census API. You can automatically download them like this:

```python
>>> shapefile_path = census.get_shapefile(
>>> geography=census.Geography.CD,
>>> state="co",
>>> year=2018,
>>> cache=True,
>>> )
>>> shapefile_path = census.get_shapefile(
>>> geography=census.GEOGRAPHY.CD,
>>> state="co",
>>> year=2018,
>>> cache=True,
>>> )
```

To get these files manually, just head to [this website](https://www.census.gov/cgi-bin/geo/shapefiles/index.php) and select the relevant geography/location.
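
If you want to sanity-check the downloaded shapefile, you can open it with a library such as `geopandas` (a minimal sketch; `geopandas` is not part of `bbd` and must be installed separately):

```python
import geopandas as gpd  # pip install geopandas

gdf = gpd.read_file(shapefile_path)  # path returned by census.get_shapefile above
print(gdf.columns)  # inspect the available attributes, e.g. the GEOID used for joins
print(gdf.head())
```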
2 changes: 1 addition & 1 deletion examples/co_income.py
@@ -26,7 +26,7 @@
geography=census.Geography.TRACT,
variables=["NAME", "DP03_0062E"],
year=2018,
dataset=census.DataSets.ACS5_PROFILE,
dataset=census.DataSet.ACS5_PROFILE,
state="co",
county="069", # Larimer County
cache=True,
37 changes: 37 additions & 0 deletions examples/create_census_example.py
@@ -0,0 +1,37 @@
from pathlib import Path

from bbd.census import Census
from bbd.census.geographic_unit import GeographicUnit
# NOTE: the import path for these enums is an assumption; adjust it to match your bbd version.
from bbd.census import AnalysisLevel, DataSet, Geography

# HOW TO CREATE A CENSUS OBJECT FOR PERFORMING API PULLS
# 1. Read the API key securely into a variable
# 2. Create GeographicUnit objects
# 3. Create the Census object

###########################
# Import API key from file
###########################
api_key_file = Path(__file__).parent.absolute() / "census_api_key.txt"
with open(api_key_file, "r") as f:
    API_KEY = f.readline().strip()

###############################
# Create GeographicUnit Objects
###############################
# First, decide on the unit of analysis your final dataset will have.
# This determines how you set up your GeographicUnit objects.
# A GeographicUnit takes a geography, an analysis level, and a value.
# The analysis level indicates whether you want data "for" all locations of this type,
# or whether this geography is the top level: the place you want to find data "in".
# You can also use "none_level" for API pulls that require unorthodox arguments,
# which we will discuss in a future guide.

# Example: geographic units targeting data for all counties in the state of New York
# (36 = Census FIPS code for New York)
county = GeographicUnit(analysis_level=AnalysisLevel.FOR, geography=Geography.COUNTY, value="*")
state = GeographicUnit(analysis_level=AnalysisLevel.IN, geography=Geography.STATE, value="36")
geographic_units = [state, county]

########################
# Create a Census Object
########################
# Specify the year and dataset you want.
year = 2019
dataset = DataSet.ACS1
# Create the census object.
census = Census(api_key=API_KEY, geographic_units=geographic_units, year=year, dataset=dataset)
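
# The census object created above can be reused for multiple pulls in the same session;
# see get_data_example.py and search_variables_example.py for how to query data and
# search for variable codes with it.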
2 changes: 1 addition & 1 deletion examples/get_acs_example.py
@@ -14,7 +14,7 @@
geography=census.Geography.STATE,
variables="NAME,B03003_001E",
year=2018,
dataset=census.DataSets.ACS5_DETAIL,
dataset=census.DataSet.ACS5,
)

pprint(data)
33 changes: 33 additions & 0 deletions examples/get_data_example.py
@@ -0,0 +1,33 @@


# HOW TO PULL ACS DATA FROM A CENSUS OBJECT
# 1. Create a census object (see create_census_example.py for instructions)
# 2. Pull the data, as shown below

from pathlib import Path

from bbd.census import Census
from bbd.census.geographic_unit import GeographicUnit
# NOTE: the import path for these enums is an assumption; adjust it to match your bbd version.
from bbd.census import AnalysisLevel, DataSet, Geography

# Create a census object (see instructions in create_census_example.py)
api_key_file = Path(__file__).parent.absolute() / "census_api_key.txt"
with open(api_key_file, "r") as f:
    API_KEY = f.readline().strip()

county = GeographicUnit(analysis_level=AnalysisLevel.FOR, geography=Geography.COUNTY, value="*")
state = GeographicUnit(analysis_level=AnalysisLevel.IN, geography=Geography.STATE, value="36")
geographic_units = [state, county]
year = 2019
dataset = DataSet.ACS1
census = Census(api_key=API_KEY, geographic_units=geographic_units, year=year, dataset=dataset)


###############
# Pull the data
###############
# Specify any census variables you want.
variables = ["NAME", "B01001_001E"]
# Specify any census groups you want; results will include all variables belonging to that group.
groups = ["B02015"]
# Pull the data for those variables and groups.
result = census.get_data(variables=variables, groups=groups)
# The result will also be stored in census.results, which is a list of CensusResult objects created during this session.
# Get the dataframe out of the result object.
df = result.dataframe
print(df.head())
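
# Earlier pulls from this session can also be retrieved from census.results
# (assuming, for illustration, that results are appended in the order the pulls were made):
previous_result = census.results[-1]
print(previous_result.dataframe.head())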

30 changes: 30 additions & 0 deletions examples/search_variables_example.py
@@ -0,0 +1,30 @@
from pathlib import Path

from bbd.census import Census
from bbd.census.geographic_unit import GeographicUnit
# NOTE: the import path for these enums is an assumption; adjust it to match your bbd version.
from bbd.census import AnalysisLevel, DataSet, Geography

# HOW TO FIND THE CENSUS CODES FOR THE VARIABLES AND GROUPS YOU WANT TO EXAMINE
# Create a census object (see instructions in create_census_example.py)

api_key_file = Path(__file__).parent.absolute() / "census_api_key.txt"
with open(api_key_file, "r") as f:
    API_KEY = f.readline().strip()

county = GeographicUnit(analysis_level=AnalysisLevel.FOR, geography=Geography.COUNTY, value="*")
state = GeographicUnit(analysis_level=AnalysisLevel.IN, geography=Geography.STATE, value="36")
geographic_units = [state, county]
year = 2019
dataset = DataSet.ACS1
census = Census(api_key=API_KEY, geographic_units=geographic_units, year=year, dataset=dataset)

# Get a dataframe of all available variables in no particular order.
df = census.search_variables()
print(df)

# Get a dataframe of all available variables sorted by similarity of description to a given search string.
search_string = "female household"
df = census.search_variables(search_string=search_string)
print(df)

# Get a dataframe of variables sorted by similarity to a search string, limiting the number of results.
search_string = "housing cost"
number_of_results = 10
df = census.search_variables(search_string=search_string, number_of_results=number_of_results)
print(df)
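
# Once you have found the code you need, pass it to census.get_data (see get_data_example.py).
# For example, using the variable from that example (call sketch; whether `groups`
# can be omitted depends on your bbd version):
# result = census.get_data(variables=["B01001_001E"])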
2 changes: 1 addition & 1 deletion examples/tx_hispanic_or_latino.py
@@ -36,7 +36,7 @@
geography=census.Geography.BLOCKGROUP,
variables=["NAME", "B03003_001E", "B03003_002E", "B03003_003E"],
year=2018,
dataset=census.DataSets.ACS5_DETAIL,
dataset=census.DataSet.ACS5,
state="tx",
county="201", # Harris County
cache=True,
4 changes: 2 additions & 2 deletions examples/tx_zip_code_by_race.py
@@ -2,7 +2,7 @@

1) Make sure python is installed

2) Install the bluebonnet data package by running this command in your terminal:
2) Install the bbd data package by running this command in your terminal:
pip install bbd

3) Run this python file with the following command:
@@ -56,7 +56,7 @@
geography=census.Geography.ZCTA,
variables=list(variables.keys()),
year=2018,
dataset=census.DataSets.ACS5_DETAIL,
dataset=census.DataSet.ACS5,
# state="tx",
# county="201",  # Harris County
cache=True,