API

class tripkit.TripKit(config)

The base TripKit object provides an interface for working with .csv data exported from the Itinerum platform or QStarz GPS data loggers. The config is passed to the object at initialization and should be either an imported Python file of global variables or a class with the same attributes.

This TripKit object is the entry API for loading .csv data to an SQLite database (used as cache), running algorithms on the GPS data, and visualizing or exporting the results to GIS-friendly formats.

The TripKit instance is usually created in your main module like this:

from tripkit import TripKit
import tripkit_config

tripkit = TripKit(config=tripkit_config)
tripkit.setup()
Parameters:config – An imported Python file of global variables or a bare config class with the same attributes, see Generating the Configuration for more information.
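As a rough illustration of the "bare config class" alternative, the sketch below shows that a class with class-level attributes can be read the same way as an imported module of globals. The attribute names (`DATABASE_FN`, `INPUT_DATA_DIR`) and the `read_setting` helper are hypothetical and are not taken from TripKit; see Generating the Configuration for the attributes it actually expects.

```python
import types


class BareConfig:
    # Hypothetical attribute names, for illustration only.
    DATABASE_FN = "cache.sqlite"
    INPUT_DATA_DIR = "./input"


# A module of global variables exposes the same attribute-style interface.
module_config = types.ModuleType("tripkit_config")
module_config.DATABASE_FN = "cache.sqlite"
module_config.INPUT_DATA_DIR = "./input"


def read_setting(config, name):
    """Read a setting by attribute name from either kind of config object."""
    return getattr(config, name)


print(read_setting(BareConfig, "DATABASE_FN"))
print(read_setting(module_config, "DATABASE_FN"))
```

Either object could then be passed as the `config` argument, provided it carries every attribute the library looks up.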
check_setup()

Raises exception if UserSurveyResponse table is not found in database.

csv

Provides access to the CSV parser objects.

database

Provides access to the cache database object.

io

Provides access to the file reading and writing functions.

load_user_by_orig_id(orig_id, load_trips=True, start=None, end=None)

Returns the user matching the given original ID as a tripkit.models.User object from the database

Parameters:
  • orig_id – An individual user’s original ID from a non-Itinerum dataset to load
  • load_trips (boolean, optional) – Supply False to disable automatic loading of trips to tripkit.models.User objects on initialization
  • start (datetime, optional) – Minimum timestamp bounds (inclusive) for loading user coordinate and prompts data
  • end (datetime, optional) – Maximum timestamp bounds (inclusive) for loading user coordinate and prompts data
Return type:

tripkit.models.User

load_users(uuid=None, load_trips=True, load_locations=True, limit=None, start=None, end=None)

Returns all available users as tripkit.models.User objects from the database

Parameters:
  • uuid (string, optional) – Supply an individual user’s UUID to load
  • load_trips (boolean, optional) – Supply False to disable automatic loading of trips to tripkit.models.User objects on initialization
  • load_locations (boolean, optional) – Supply False to disable automatic loading of activity locations to tripkit.models.User objects on initialization
  • limit (integer, optional) – Maximum number of users to load
  • start (datetime, optional) – Minimum timestamp bounds (inclusive) for loading user coordinate and prompts data
  • end (datetime, optional) – Maximum timestamp bounds (inclusive) for loading user coordinate and prompts data
Return type:

list of tripkit.models.User
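The inclusive `start` and `end` bounds can be pictured with a small stand-alone filter. This is only a sketch of the inclusive-bounds semantics described above, using plain `datetime` values in place of cached coordinate rows; the `within_bounds` helper is hypothetical and is not part of TripKit.

```python
from datetime import datetime


def within_bounds(timestamp, start=None, end=None):
    """Return True if timestamp lies within [start, end]; None means unbounded."""
    if start is not None and timestamp < start:
        return False
    if end is not None and timestamp > end:
        return False
    return True


# Hypothetical coordinate timestamps standing in for cached rows.
timestamps = [
    datetime(2019, 5, 1, 8, 0),
    datetime(2019, 5, 2, 8, 0),
    datetime(2019, 5, 3, 8, 0),
]
start = datetime(2019, 5, 2, 8, 0)
end = datetime(2019, 5, 3, 8, 0)

# Both boundary timestamps are kept because the bounds are inclusive.
selected = [t for t in timestamps if within_bounds(t, start, end)]
print(selected)
```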

process

Provides access to the GPS point and trip processing algorithm submodules.

setup(force=False, generate_null_survey=False)

Create the cache database tables if the UserSurveyResponse table does not exist.

Parameters:
  • force (boolean, optional) – Supply True to force creation of a new cache database
  • generate_null_survey (boolean, optional) – Supply True to generate an empty survey responses table for coordinates-only data

CSV Parser

class tripkit.csvparser.ItinerumCSVParser(database)

Parses Itinerum platform csv files and loads them to a cache database.

Parameters:database – Open Peewee connection to the cache database
generate_null_survey(input_dir)

Wrapper function to generate null survey responses for each user in coordinates.

Parameters:input_dir – Directory containing input .csv data
load_export_cancelled_prompt_responses(input_dir)

Loads Itinerum cancelled prompt responses data to the cache database. For each .csv row, the data is fetched by column name if it exists and cast to appropriate types as set in the database.

Parameters:input_dir – The directory containing the self.cancelled_prompt_responses.csv data file.
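The "fetch by column name if it exists, then cast" step might look roughly like the following. The column names and the `CASTS` mapping are hypothetical stand-ins for the types defined on the database models, and `parse_row` is not a TripKit function.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names and target types, for illustration only.
CASTS = {
    "uuid": str,
    "latitude": float,
    "longitude": float,
    "displayed_at": datetime.fromisoformat,
}


def parse_row(row):
    """Fetch each known column if present and non-empty, casting to its type."""
    parsed = {}
    for column, cast in CASTS.items():
        value = row.get(column)
        parsed[column] = cast(value) if value not in (None, "") else None
    return parsed


# This sample has no displayed_at column, so that field falls back to None.
sample = io.StringIO(
    "uuid,latitude,longitude\n"
    "abc-123,45.5,-73.6\n"
)
rows = [parse_row(r) for r in csv.DictReader(sample)]
print(rows[0])
```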
load_export_coordinates(input_dir)

Loads Itinerum coordinates data to the cache database.

Parameters:input_dir – The directory containing the self.coordinates_csv data file.
load_export_prompt_responses(input_dir)

Loads Itinerum prompt responses data to the cache database. For each .csv row, the data is fetched by column name if it exists and cast to appropriate types as set in the database.

Parameters:input_dir – The directory containing the self.prompt_responses.csv data file.
load_export_survey_responses(input_dir)

Loads Itinerum survey responses data to the cache database.

Parameters:input_dir – The directory containing the self.survey_responses_csv data file.
load_trips(trips_csv_fp)

Loads trips processed by the web platform itself. This is mostly useful for comparing current algorithm results against the deployed platform’s version.

Parameters:trips_csv_fp – The full filepath of the downloaded trips .csv file for a survey.
class tripkit.csvparser.QstarzCSVParser(config, database)

Parses Qstarz csv files and loads them to a cache database.

Parameters:
  • config
  • database – Open Peewee connection to the cache database
  • csv_input_dir – Path to the directory containing the input coordinates .csv data
generate_null_survey(input_dir)

Wrapper function to generate null survey responses for each user in coordinates.

Parameters:input_dir – Directory containing input .csv data
load_export_coordinates(input_dir)

Loads QStarz coordinates data to the cache database.

Parameters:input_dir – The directory containing the self.coordinates_csv data file.
load_user_locations(input_dir)

Loads QStarz user locations data to the cache database.

Parameters:input_dir – The directory containing the self.locations_csv data file.

I/O

class tripkit.io.IO(cfg)
class tripkit.io.CSVIO(cfg)
write_activities_daily(daily_summaries, extra_cols=None, append=False)

Write the user activity summaries by date with a record for each day that a user participated in a survey.

Parameters:
  • daily_summaries (list of dict) – Iterable of user summaries for row records.
  • append (boolean, optional) – Toggles whether summaries should be appended to an existing output file.
write_activity_summaries(summaries, append=False)

Write the activity summary data consisting of complete days and trips tallies with a record per each user for a survey.

Parameters:
  • summaries (list of dict) – Iterable of user summaries for row records
  • append (boolean, optional) – Toggles whether summaries should be appended to an existing output file.
write_complete_days(trip_day_summaries, append=False)

Write complete day summaries to .csv with a record per day per user over the duration of their participation in a survey.

Parameters:
  • trip_day_summaries (list of dict) – Iterable of complete day summaries for each user enumerated by uuid and date.
  • append (boolean, optional) – Toggles whether summaries should be appended to an existing output file.
write_condensed_activity_locations(user, append=True)

Write or append the provided user’s activity locations to file.

Parameters:
  • locations (list of dict) – Iterable of user summaries for row records.
  • append (boolean, optional) – Toggles whether summaries should be appended to an existing output file.
write_condensed_trip_summaries(user, trip_summaries, complete_day_summaries, append=False)

Write the trip summaries with added columns for labeled trip origins/destinations and whether a trip occurred on a complete trip day.

Parameters:
  • daily_summaries (list of dict) – Iterable of user summaries for row records.
  • append (boolean, optional) – Toggles whether summaries should be appended to an existing output file.
write_trip_summaries(fn_base, summaries, extra_fields=None, append=False)

Write detected trip summary data to csv consisting of a single record for each trip.

Parameters:
  • fn_base (str) – The base filename to prepend to the output csv file.
  • summaries (list of dict) – Iterable of trip summaries for row records.
  • extra_fields (list, optional) – Additional columns to append to csv (must have matching key in summaries object).
  • append (boolean, optional) – Append data to an existing .csv file.
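How `extra_fields` and `append` interact can be sketched with `csv.DictWriter`. This is a minimal stand-in under stated assumptions, not the library's writer: the `write_summaries` helper, its explicit `path` and `base_fields` arguments, and the column names are hypothetical.

```python
import csv
import os
import tempfile


def write_summaries(path, summaries, base_fields, extra_fields=None, append=False):
    """Write one row per summary; the header is written only for a new file."""
    fields = list(base_fields) + list(extra_fields or [])
    new_file = not (append and os.path.exists(path))
    with open(path, "a" if append else "w", newline="") as f:
        # Keys not listed in `fields` are silently skipped.
        writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
        if new_file:
            writer.writeheader()
        writer.writerows(summaries)


out_path = os.path.join(tempfile.mkdtemp(), "example-trip_summaries.csv")
base = ["uuid", "trip_id"]  # hypothetical base columns
write_summaries(out_path, [{"uuid": "a", "trip_id": 1, "mode": "walk"}], base, ["mode"])
write_summaries(out_path, [{"uuid": "a", "trip_id": 2, "mode": "bus"}], base, ["mode"], append=True)

with open(out_path, newline="") as f:
    lines = f.read().splitlines()
print(lines)
```

Appending only makes sense when the same field list is used on every call, since the header is not rewritten.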
write_trips(fn_base, trips, extra_fields=None, append=False)

Write detected trips data to a csv file.

Parameters:
  • fn_base (str) – The base filename to prepend to the output csv file
  • trips (list of tripkit.models.Trip) – Iterable of database trips to write to csv file
class tripkit.io.GeoJSONIO(cfg)
write_activity_locations(fn_base, locations)

Write activity locations (from config or detected) to a geojson file.

Parameters:
  • fn_base (str) – The base filename to prepend to each output geojson file.
  • locations (dict) – A dictionary object of a user’s survey responses containing columns with activity location latitudes and longitudes.
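For readers unfamiliar with the output format, the sketch below builds a GeoJSON FeatureCollection of points from labeled coordinates. The input shape (`{label: (latitude, longitude)}`), the labels, and the `locations_to_geojson` helper are simplifying assumptions; the real `locations` dictionary holds survey response columns whose names are not shown here.

```python
import json


def locations_to_geojson(locations):
    """Build a FeatureCollection of points from {label: (latitude, longitude)}."""
    features = []
    for label, (lat, lon) in locations.items():
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude, latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"label": label},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical labels and coordinates.
collection = locations_to_geojson({"home": (45.50, -73.57), "work": (45.52, -73.60)})
geojson_text = json.dumps(collection)
print(geojson_text)
```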
write_inputs(fn_base, coordinates, prompts, cancelled_prompts)

Writes input coordinates, prompts and cancelled prompts data selected from cache to individual geojson files.

Parameters:
  • fn_base (str) – The base filename to prepend to each output geojson file.
  • coordinates (list of tripkit.database.Coordinate) – Iterable of database coordinates to write to geojson file.
  • prompts (list of tripkit.database.PromptResponse) – Iterable of database prompts to write to geojson file.
  • cancelled_prompts (list of tripkit.database.CancelledPromptResponse) – Iterable of database cancelled prompts to write to geojson file.
write_mapmatch(fn_base, results)

Writes map matching results from API query to geojson file.

Parameters:
  • fn_base (str) – The base filename to prepend to the output geojson file
  • results – JSON results from map matching API query
write_trips(fn_base, trips)

Writes detected trips data selected from cache to geojson file.

Parameters:
  • fn_base (str) – The base filename to prepend to the output geojson file
  • trips (list of tripkit.models.Trip) – Iterable of database trips to write to geojson file
class tripkit.io.GeopackageIO(cfg)
write_activity_locations(fn_base, locations)

Write activity locations (from config or detected) to a geopackage file.

Parameters:
  • fn_base (str) – The base filename to prepend to each output geopackage file.
  • locations (dict) – A dictionary object of a user’s survey responses containing columns with activity location latitudes and longitudes.
write_inputs(fn_base, coordinates, prompts, cancelled_prompts)

Writes input coordinates, prompts and cancelled prompts data selected from cache to individual geopackage files.

Parameters:
  • fn_base (str) – The base filename to prepend to each output geopackage file.
  • coordinates (list of tripkit.database.Coordinate) – Iterable of database coordinates to write to geopackage file.
  • prompts (list of tripkit.database.PromptResponse) – Iterable of database prompts to write to geopackage file.
  • cancelled_prompts (list of tripkit.database.CancelledPromptResponse) – Iterable of database cancelled prompts to write to geopackage file.
write_trips(fn_base, trips)

Writes detected trips data to a geopackage file.

Parameters:
  • fn_base (str) – The base filename to prepend to the output geopackage file
  • trips (list of tripkit.models.Trip) – Iterable of database trips to write to geopackage file

Database

class tripkit.database.Database(config)

Handles itinerum-tripkit interactions with the cached database using peewee.

bulk_insert(Model, rows, chunk_size=50000)

Bulk insert an iterable of dictionaries into a supplied Peewee model by chunk_size.

Parameters:
  • Model – Peewee database model of target table for inserts.
  • rows (list) – Iterable of dictionaries matching table model for bulk insert.
  • chunk_size (int, optional) – Number of rows to insert per transaction.
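The chunking idea behind `bulk_insert` can be shown without Peewee. The sketch below swaps in the standard `sqlite3` module and a hypothetical `coordinates` table, so it illustrates inserting one chunk per transaction rather than the library's actual code.

```python
import sqlite3


def chunked(rows, chunk_size):
    """Yield successive slices of at most chunk_size rows."""
    for i in range(0, len(rows), chunk_size):
        yield rows[i:i + chunk_size]


def bulk_insert(conn, rows, chunk_size=50000):
    """Insert dictionaries in chunks, committing one transaction per chunk."""
    n_chunks = 0
    for chunk in chunked(rows, chunk_size):
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO coordinates (uuid, latitude, longitude) "
                "VALUES (:uuid, :latitude, :longitude)",
                chunk,
            )
        n_chunks += 1
    return n_chunks


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coordinates (uuid TEXT, latitude REAL, longitude REAL)")
rows = [{"uuid": "u1", "latitude": 45.0 + i, "longitude": -73.0} for i in range(5)]

# Five rows with chunk_size=2 are written as chunks of 2, 2 and 1.
n_chunks = bulk_insert(conn, rows, chunk_size=2)
total = conn.execute("SELECT COUNT(*) FROM coordinates").fetchone()[0]
```

Smaller chunks keep each transaction short; larger ones reduce commit overhead.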
clear_trips(user=None)

Clears the detected trip points table, either entirely or for an individual user.

Parameters:user – (Optional) Delete trips for particular user only.
count_users()

Returns a count of all survey responses in cache database.

create()

Creates all the tables necessary for the itinerum-tripkit cache database.

delete_user_from_table(Model, user)

Deletes a given user’s records from a table in preparation for overwriting.

drop()

Drops all cache database tables.

get_uuid(orig_id)

Returns the database uuid for a user’s original id from a non-Itinerum dataset.

Parameters:orig_id – The original dataset’s user id for an individual user.
load_activity_locations(user)

Queries cache database for activity locations known for a user.

Parameters:user – A database user response record
load_subway_entrances()

Queries cache database for all available subway entrances.

load_trip_day_summaries(user)

Load the daily trip summaries for a given user as dict.

Parameters:user – A database user response record with a populated detected_trip_day_summaries relation.
load_trips(user, start=None, end=None)

Load the sorted trips for a given user as list.

Parameters:user – A database user response record with a populated detected_trip_coordinates relation.
load_user(uuid, start=None, end=None)

Loads user by uuid to an itinerum-tripkit User object.

Parameters:
  • uuid – An individual user’s UUID from within an Itinerum survey.
  • start – Optional. Naive datetime object (set within UTC) for selecting a user’s coordinates start period.
  • end – Optional. Naive datetime object (set within UTC) for selecting a user’s coordinates end period.
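Since `start` and `end` are expected as naive datetimes expressed in UTC, a timezone-aware value would need converting first. One way to do that with the standard library is sketched below; the `to_naive_utc` helper is hypothetical and not part of TripKit.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(dt):
    """Convert an aware datetime to UTC, then drop the tzinfo."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


# 08:00 at a fixed UTC-4 offset (a fixed offset avoids needing tz data here).
local = datetime(2019, 5, 1, 8, 0, tzinfo=timezone(timedelta(hours=-4)))
start = to_naive_utc(local)
print(start)
```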
save_trip_day_summaries(user, trip_day_summaries, timezone, overwrite=True)

Saves the daily summaries for detected trip days to cache database. This table will be recreated on each save by default.

Parameters:
  • user (tripkit.models.User) – A database user response record associated with the trip day summaries.
  • trip_day_summaries (list of tripkit.models.DaySummary) – List of daily summaries from a daily trip counts algorithm.
  • timezone (str) – The tz database timezone name for the location that was used to localize the complete days detection.
  • overwrite (boolean, optional) – Provide False to keep existing summaries for user in database.
save_trips(user, trips, overwrite=True)

Saves detected trips from processing algorithms to cache database. This table will be recreated on each save by default.

Parameters:
  • user (tripkit.models.User) – A database user response record associated with the trip records.
  • trips (list of tripkit.models.Trip) – Iterable of detected trips from a trip processing algorithm.
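One plausible reading of the `overwrite` flag is "delete the user's existing rows, then insert the new ones". The sketch below models that reading with `sqlite3`; it is an assumption about the semantics, and the `trips` table, its columns, and the simplified `save_trips` signature are hypothetical.

```python
import sqlite3


def save_trips(conn, user_uuid, trips, overwrite=True):
    """Insert a user's trips, first deleting their old rows when overwrite is True."""
    with conn:  # delete and insert happen in one transaction
        if overwrite:
            conn.execute("DELETE FROM trips WHERE uuid = ?", (user_uuid,))
        conn.executemany(
            "INSERT INTO trips (uuid, trip_num) VALUES (?, ?)",
            [(user_uuid, n) for n in trips],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (uuid TEXT, trip_num INTEGER)")
save_trips(conn, "u1", [1, 2])
save_trips(conn, "u1", [1, 2, 3])             # replaces the earlier rows
save_trips(conn, "u1", [4], overwrite=False)  # keeps them and adds one
count = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
```

Wrapping the delete and insert in a single transaction means a failed insert does not leave the user with no trips at all.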