leiap.time

This module contains functions for calculating survey search times.

Functions

clean_datetimes(df[, dt_col]) – Filter datetimes and correct for timezone issues.
correct_timezone(df[, dt_col]) – Account for some timezone issues.
calc_search_time(df[, dt_col, warn]) – Calculate search times in seconds.
filter_times(df[, t_col, t_lim, dist_col, …]) – Put time and distance restrictions on the time data.

leiap.time.clean_datetimes(df, dt_col='DataDate')

Filter datetimes and correct for timezone issues

Parameters:
  • df (pandas DataFrame) – DataFrame of point observations
  • dt_col (str) – Column name for datetime data
Returns:

df – Identical to the input DataFrame, with an added ‘dt_adj’ column holding the filtered and corrected datetimes (see Notes)

Return type:

pandas DataFrame

Notes

1. Points collected in 2014 did not have their times recorded (only dates), so the database assigns them a time of 00:00:00. In the new dt_adj column, these values are set to NaT. If you only need the dates, use the original dt_col (DataDate by default).
2. When points are downloaded from handheld GPS devices, their times are converted to the timezone of the laptop on which they are downloaded. Before we realized this, a lot of points were uploaded on machines set to U.S. Pacific Time. As a result, some times need to be adjusted by 9 hours.
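The first note can be illustrated with a small pandas sketch. This is not the library's implementation: `mask_midnight_times` is a hypothetical helper, and it assumes that date-only records can be recognized purely by a time of exactly 00:00:00.

```python
import pandas as pd

def mask_midnight_times(df, dt_col="DataDate"):
    """Hypothetical helper: add a 'dt_adj' column with date-only records set to NaT."""
    out = df.copy()
    stamps = pd.to_datetime(out[dt_col])
    # Assumption: a record with no recorded time is stored as exactly 00:00:00
    is_midnight = stamps == stamps.dt.normalize()
    out["dt_adj"] = stamps.where(~is_midnight, pd.NaT)
    return out

points = pd.DataFrame(
    {"DataDate": ["2014-06-03 00:00:00", "2016-06-07 09:15:30"]}
)
result = mask_midnight_times(points)
```

The original column is left untouched, so the dates of the 2014 points remain available through `DataDate`.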

leiap.time.correct_timezone(df, dt_col='DataDate')

Correct times that were shifted to the wrong timezone on download (see the Notes for clean_datetimes)

Parameters:
  • df (pandas DataFrame) – DataFrame of point observations
  • dt_col (str) – Column name for datetime data
Returns:

df – Identical to the input DataFrame, with datetimes adjusted so that all times of day fall between 06:30:00 and 21:30:00. In practice, the latest times are around 15:00:00.

Return type:

pandas DataFrame
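As a rough illustration of this kind of correction, the sketch below shifts any timestamp whose time of day is earlier than 06:30:00 forward by 9 hours. The helper name `shift_early_times`, the rule for detecting affected points, and the direction of the shift are all assumptions made for the example; the rule the library actually applies is not stated here.

```python
import pandas as pd

def shift_early_times(df, dt_col="DataDate", earliest="06:30:00", offset_hours=9):
    """Hypothetical helper: move implausibly early timestamps forward by a fixed offset."""
    out = df.copy()
    stamps = pd.to_datetime(out[dt_col])
    # Time of day as a timedelta since midnight (NaT stays NaT)
    time_of_day = stamps - stamps.dt.normalize()
    # Assumption: anything before `earliest` was recorded in the wrong timezone
    too_early = time_of_day < pd.Timedelta(earliest)
    out[dt_col] = stamps.where(~too_early, stamps + pd.Timedelta(hours=offset_hours))
    return out

points = pd.DataFrame(
    {"DataDate": ["2015-06-10 01:00:00", "2015-06-10 11:45:00"]}
)
fixed = shift_early_times(points)
```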

leiap.time.calc_search_time(df, dt_col='dt_adj', warn='enable')

Calculate search times in seconds

Parameters:
  • df (pandas DataFrame of points) – Must have columns ‘FieldNumber’, ‘SurveyorName’, ‘SurveyPointId’
  • dt_col (str) – Column name for datetime data
  • warn ({'enable', 'disable'}) – Whether to show the generic warning message.
Returns:

df – Identical to the input DataFrame, with added ‘search_time’ and ‘dist’ columns giving the search time in seconds and the distance from the previous point in meters

Return type:

pandas DataFrame

Notes

This is a naive calculation that doesn’t discard any times. You will want to filter the values further (for example with filter_times) before using them for interpretation.
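A naive calculation of this kind can be sketched as the time difference between consecutive points. The helper `naive_search_time` is hypothetical, and grouping by ‘FieldNumber’ and ‘SurveyorName’ is an assumption about how consecutive points are defined. The ‘dist’ column is left out because the coordinate column names are not given here.

```python
import pandas as pd

def naive_search_time(df, dt_col="dt_adj"):
    """Hypothetical helper: seconds elapsed since the previous point in the same group."""
    group_cols = ["FieldNumber", "SurveyorName"]  # assumed grouping
    out = df.sort_values(group_cols + [dt_col]).copy()
    # The first point in each group has no predecessor, so its gap is NaT
    gaps = out.groupby(group_cols)[dt_col].diff()
    out["search_time"] = gaps.dt.total_seconds()
    return out

points = pd.DataFrame(
    {
        "FieldNumber": [1, 1, 1, 2],
        "SurveyorName": ["A", "A", "A", "A"],
        "SurveyPointId": [101, 102, 103, 104],
        "dt_adj": pd.to_datetime(
            [
                "2016-06-07 09:00:00",
                "2016-06-07 09:00:40",
                "2016-06-07 09:02:00",
                "2016-06-07 09:30:00",
            ]
        ),
    }
)
timed = naive_search_time(points)
```

Nothing is discarded at this stage: long breaks within a group appear as very large search times, and the first point of each group gets a missing value.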

leiap.time.filter_times(df, t_col='search_time', t_lim=(1, 900), dist_col='dist', dist_lim=(1, 20))

Put time and distance restrictions on the time data.

Parameters:
  • df (pandas DataFrame) – Dataset of points
  • t_col (str) – Name of column with time info
  • t_lim (tuple) – Min and max limits (inclusive) for time
  • dist_col (str) – Name of column with distance info
  • dist_lim (tuple) – Min and max limits (inclusive) for distance
Returns:

filtered – A subset of the original DataFrame filtered according to the input parameters

Return type:

pandas DataFrame
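The inclusive limits can be illustrated with `Series.between`, which includes both endpoints by default. This is a minimal sketch of the filtering idea with a hypothetical `filter_sketch` helper, not the library's own code; it assumes rows with missing values are dropped.

```python
import pandas as pd

def filter_sketch(df, t_col="search_time", t_lim=(1, 900),
                  dist_col="dist", dist_lim=(1, 20)):
    """Hypothetical helper: keep rows whose time and distance fall within the limits."""
    # between() is inclusive on both ends; NaN values evaluate to False
    keep = df[t_col].between(*t_lim) & df[dist_col].between(*dist_lim)
    return df[keep]

points = pd.DataFrame(
    {
        "search_time": [float("nan"), 40.0, 1200.0, 900.0, 0.5],
        "dist": [float("nan"), 5.0, 10.0, 20.0, 3.0],
    }
)
filtered = filter_sketch(points)
```

With the default limits, the rows at index 1 and 3 are kept: the row with a 1200-second gap exceeds the time limit, the 0.5-second row falls below it, and the first row has no values to compare.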