How to contribute to Geomancer

Is there data that you'd like to add to Geomancer? We built this as an open, extensible platform, so anyone can contribute. If you know how to program in Python, you can adapt Geomancer to your needs using the instructions below. Otherwise, post your request to our Google Group.


Setup

1. Fork & clone Geomancer

First, fork the geomancer repository. This requires a GitHub account.

Then, clone your Geomancer fork:

$ git clone https://github.com/YOURGITHUBUSERNAME/geomancer.git
$ cd geomancer

2. Install requirements

Make sure the OS-level dependencies are installed:

  • Python 2.7
  • Redis
  • libxml2
  • libxml2-dev
  • libxslt1-dev
  • zlib1g-dev

Install the required Python libraries. We recommend using virtualenv and virtualenvwrapper to work in an isolated development environment. Read how to set up virtualenv.

Once you have virtualenv set up:

$ mkvirtualenv geomancer
$ pip install -r requirements.txt

NOTE: Mac users might need this lxml workaround.

Afterwards, whenever you want to work on Geomancer, activate the virtual environment you created:

$ workon geomancer

3. Configure Geomancer

$ cp geomancer/app_config.py.example geomancer/app_config.py

In your newly created app_config.py file, the active data sources are defined by MANCERS. Add any relevant API keys to MANCER_KEYS.

4. Run Geomancer locally

There are three components that should be running simultaneously for the app to work: Redis, the Flask app, and the worker process that appends to the spreadsheets. For debugging purposes, it is useful to run each of these commands in a separate terminal session.

$ redis-server # This command may differ depending on your OS
$ python runworker.py # starts the worker for processing files
$ python runserver.py # starts the web server

Open your browser and navigate to http://localhost:5000

5. Make your changes!

If you'd like to share your work with the rest of the world, submit a pull request with your changes!

Overview of the Extensible Design

Each data source corresponds to a 'mancer' in geomancer/mancers/. For example, the BureauLaborStatistics class in geomancer/mancers/bls.py is the 'mancer' for BLS.

The datasets and columns that are available to a mancer are defined manually. Mancer datasets usually correspond to tables at the source, and mancer columns correspond to the columns within a table.

The methods defined for all mancers are:

  • get_metadata
    This method returns information about the datasets and columns included in the data source, and is used to populate the data sources page. This is a static method.
  • geo_lookup
    Given a search term and a geography type, this method returns a dictionary with the keys term (the original search term) and geoid (the full geographic id to be used by the search method). If need be, this method can look up geographic ids through specific APIs.
  • search
    Given a list of geography ids and a list of columns to append, the search method returns all the data to be appended to the original spreadsheet: the appropriate values for each column & each geography, as well as a header.
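As a rough illustration of the three methods described above, the sketch below uses a stand-in class with made-up data rather than a real mancer. Only the term and geoid keys come from the description; the layout of the metadata and of the search result (a dict with header and rows) is an assumption made for this example, so check the repository for the structure Geomancer actually expects.

```python
class ToyMancer(object):
    """Stand-in for a mancer; it does not inherit from the real BaseMancer."""

    # Hypothetical in-memory "data source" with invented values.
    _GEOIDS = {'Springfield': 'toy:001', 'Shelbyville': 'toy:002'}
    _DATA = {
        'toy:001': {'population': 100, 'households': 40},
        'toy:002': {'population': 200, 'households': 75},
    }

    @staticmethod
    def get_metadata():
        # Assumed layout: one dataset and the columns it offers.
        return [{'dataset': 'Toy dataset',
                 'columns': ['population', 'households']}]

    def geo_lookup(self, search_term, geo_type=None):
        # The 'term' and 'geoid' keys follow the description above.
        return {'term': search_term, 'geoid': self._GEOIDS.get(search_term)}

    def search(self, geo_ids=None, columns=None):
        geo_ids = geo_ids or []
        columns = columns or []
        rows = {}
        for geo_id in geo_ids:
            values = self._DATA.get(geo_id, {})
            # One value per requested column, in the same order as the header.
            rows[geo_id] = [values.get(column) for column in columns]
        return {'header': list(columns), 'rows': rows}
```

In this sketch, a caller would first pass each spreadsheet value through geo_lookup, collect the resulting geoid values, and then hand them to search together with the chosen columns.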

Add columns or datasets to an existing data source

Because the datasets and columns are defined manually, and because of the high granularity of available data, the mancers don't include all possible data from a data source. For example, BLS QCEW data is available for a wide range of geographies and many industries, but the BLS mancer currently only has data at the state level and at the highest industry summary level (all industries).

Adding Data

If you're interested in adding data to an existing mancer (say, adding columns with statistics for specific industries to the BLS mancer), all you'll need to do is modify the mancer metadata (defined in the get_metadata method) and ensure that the search method knows how to return the data you've added.
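One way to catch a column that was added to the metadata but not wired into search is a small consistency check. The sketch below is only an illustration: StubMancer, unserved_columns, and the dict shapes they use are hypothetical and do not reflect the real BLS mancer or Geomancer's actual response format.

```python
class StubMancer(object):
    """Hypothetical mancer that advertises two columns but serves only one."""

    @staticmethod
    def get_metadata():
        # Assumed layout: a flat list of column names.
        return {'columns': ['all_industries', 'manufacturing']}

    def search(self, geo_ids=None, columns=None):
        # 'manufacturing' was added to the metadata but is not handled here yet.
        known = {'all_industries': 10}
        header = [c for c in (columns or []) if c in known]
        rows = dict((g, [known[c] for c in header]) for g in (geo_ids or []))
        return {'header': header, 'rows': rows}


def unserved_columns(mancer, geo_id):
    """List columns named in the metadata that search() leaves out."""
    advertised = mancer.get_metadata()['columns']
    returned = mancer.search(geo_ids=[geo_id], columns=advertised)['header']
    return [c for c in advertised if c not in returned]
```

Here, unserved_columns(StubMancer(), 'some-geoid') reports the column that still needs to be handled in search.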

Add a new data source

Each data source corresponds to a 'mancer' in geomancer/mancers/.

1. Create a new class for your data source

Geomancer implements a base class that establishes a pattern for setting up a new data source. In a new .py file in geomancer/mancers/, inherit from the BaseMancer class like so:

from geomancer.mancers.base import BaseMancer

class MyGreatMancer(BaseMancer):

    name = 'My Great Mancer'
    machine_name = 'my_great_mancer'
    base_url = 'http://lotsadata.gov/api'
    info_url = 'http://lotsadata.gov'
    description = 'This is probably the best mancer ever written'

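    @staticmethod
    def get_metadata():
        # Placeholder: return information about this source's datasets and columns
        return 'woo'

    def search(self, geo_ids=None, columns=None):
        # Placeholder: return the header and the values to append for each geography
        return 'woo'

    def geo_lookup(self, search_term, geo_type=None):
        # Placeholder: return a dict with the keys 'term' and 'geoid'
        return 'woo'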

Override the name, machine_name, base_url, info_url, and description attributes to describe your data source.

2. Implement class methods for your mancer

This is the bulk of the work. You will need to implement the get_metadata, search, & geo_lookup methods.

Detailed information about how the responses from these methods should be structured, as well as two example mancers, can be found in the GitHub repository.

3. Register your mancer in the application configuration

The basic configuration options for Geomancer live in app_config.py. Add the dotted import path of your mancer class to MANCERS, and it should appear as an option when you run the app.

MANCERS = (
    'geomancer.mancers.census_reporter.CensusReporter',
    'geomancer.mancers.usa_spending.USASpending',
    'geomancer.mancers.my_mancer.MyGreatMancer',
)
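Each entry in MANCERS is a dotted path of the form package.module.ClassName. Geomancer's own loading code may differ, but such paths are commonly resolved along these lines. The load_class helper is a hypothetical name, and it is demonstrated with a standard-library class so the snippet runs on its own.

```python
import importlib


def load_class(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_path, _, class_name = dotted_path.rpartition('.')
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstrated with a standard-library class instead of a real mancer.
ordered_dict_cls = load_class('collections.OrderedDict')
```

A misspelled module raises ImportError and a misspelled class name raises AttributeError, which is a quick way to check a new MANCERS entry from a Python shell.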

If your data source requires an API key, add the API key to app_config.py:

MANCER_KEYS = {
    'my_great_mancer': 'biGl0ngUu1dT4ing',
}

The key (i.e. my_great_mancer) should match the value you used for the machine_name property of your mancer class.
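The sketch below shows why the two names must match: the key is looked up by machine_name. How Geomancer actually hands keys to a mancer is not described here, so the constructor argument and the api_key attribute are assumptions, and the class is a stand-in that does not inherit from the real BaseMancer.

```python
# Mirrors the MANCER_KEYS setting from app_config.py.
MANCER_KEYS = {
    'my_great_mancer': 'biGl0ngUu1dT4ing',
}


class MyGreatMancer(object):  # stand-in; the real class inherits BaseMancer
    machine_name = 'my_great_mancer'

    def __init__(self, keys=MANCER_KEYS):
        # Look the key up by machine_name; None if no key is configured.
        self.api_key = keys.get(self.machine_name)
```

If the dictionary key and machine_name differ, the lookup in this sketch returns None and any authenticated request would fail.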

Add a new geography type

Do you have data with geography types that are not currently offered by Geomancer? Geography types are designed to be extensible. Each one is implemented as a GeoType subclass that overrides a few class attributes:

from geomancer.mancers.geotype import GeoType

class WizardSchoolDistrict(GeoType):
    human_name = 'Wizard school district'
    machine_name = 'wizard_school_district'
    formatting_notes = 'Full name of a school district for wizards and witches'
    formatting_example = 'Hogwarts School of Witchcraft and Wizardry'

More details on how to implement a new GeoType can be found here.
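When adding a geography type, it can help to confirm that every expected attribute has been overridden. The sketch below uses a stand-in GeoType base class so it runs without Geomancer installed; the real base class may define more than these four attributes, and missing_attributes is a hypothetical helper.

```python
class GeoType(object):
    """Stand-in for geomancer.mancers.geotype.GeoType."""
    human_name = None
    machine_name = None
    formatting_notes = None
    formatting_example = None


class WizardSchoolDistrict(GeoType):
    human_name = 'Wizard school district'
    machine_name = 'wizard_school_district'
    formatting_notes = 'Full name of a school district for wizards and witches'
    formatting_example = 'Hogwarts School of Witchcraft and Wizardry'


REQUIRED = ('human_name', 'machine_name',
            'formatting_notes', 'formatting_example')


def missing_attributes(geo_type):
    """Return the required attribute names that are still unset (None)."""
    return [name for name in REQUIRED if getattr(geo_type, name) is None]
```

An empty list from missing_attributes(WizardSchoolDistrict) means all four attributes shown in the example above were overridden.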