leiap.io

This file contains functions related to I/O from the database

Functions

get_credentials([credentials_path]) Access database credentials from JSON file
connect2db([driver]) Open a connection to the database
db_query(query_text, **kwargs) Send any SQL query to the database
get_points([years]) Load a DataFrame of points
get_points_simple(**kwargs) Load a DataFrame of points with the most typical query
get_points_by_year(years, **kwargs) Load a DataFrame of points with the most typical query for specified year(s)
get_artifacts([sections, years, …]) Load a DataFrame of artifacts
get_artifacts_by_year(years[, discards]) Load a DataFrame of artifacts with the most typical query for specified year(s)
get_artifacts_simple([include_discards]) Load a DataFrame of artifacts with the most typical query
get_productions_simple(**kwargs) Load a DataFrame of productions with the most typical query
get_production_cts_wts(**kwargs) Load a DataFrame of all points with columns for counts and weights of all productions
get_points_times([warn]) Load a DataFrame of points with datetimes cleaned and search times calculated
leiap.io.get_credentials(credentials_path='credentials.json')

Access database credentials from JSON file

Parameters: credentials_path (str) – Location of the credentials JSON file
Returns: credentials – Dictionary containing the database connection details
Return type: dict
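
As a rough illustration of the kind of file this function reads, the sketch below writes a small JSON file and loads it back into a dictionary. The helper name load_credentials and the key names (server, database, username, password) are assumptions made for this example; this page does not specify the file's schema or how leiap parses it.

```python
import json
import os
import tempfile


def load_credentials(credentials_path="credentials.json"):
    """Hypothetical stand-in for get_credentials(): read a JSON file into a dict."""
    with open(credentials_path, encoding="utf-8") as f:
        return json.load(f)


# Assumed schema: these key names are illustrative, not taken from leiap.
example = {
    "server": "db.example.org",
    "database": "example_db",
    "username": "reader",
    "password": "not-a-real-password",
}

# Write the example file to a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "credentials.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(example, f)
    credentials = load_credentials(path)

print(sorted(credentials))
```

Keeping the connection details in a separate JSON file means they can stay out of version control while the analysis code remains shareable.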
leiap.io.connect2db(driver='{ODBC Driver 17 for SQL Server}', **kwargs)

Open a connection to the database

Parameters:
  • driver (str, optional) – Database driver needed to connect
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: connection
Return type: MS SQL database connection
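
The driver argument follows the usual ODBC naming, so one plausible way to combine it with the credentials dictionary is an ODBC connection string. The sketch below only builds that string; build_connection_string and the credential key names are hypothetical, and whether leiap assembles its connection this way is an assumption.

```python
def build_connection_string(credentials, driver="{ODBC Driver 17 for SQL Server}"):
    """Assemble an ODBC connection string from a credentials dict (assumed keys)."""
    parts = [
        f"DRIVER={driver}",
        f"SERVER={credentials['server']}",
        f"DATABASE={credentials['database']}",
        f"UID={credentials['username']}",
        f"PWD={credentials['password']}",
    ]
    return ";".join(parts)


# Illustrative values only; the key names are not taken from leiap.
credentials = {
    "server": "db.example.org",
    "database": "example_db",
    "username": "reader",
    "password": "not-a-real-password",
}

conn_str = build_connection_string(credentials)
print(conn_str)

# With pyodbc installed and a reachable server, a connection could then be
# opened with pyodbc.connect(conn_str); that step is omitted here.
```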

leiap.io.db_query(query_text, **kwargs)

Send any SQL query to the database

Parameters:
  • query_text (str) – Full SQL query to pass to the database
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: df – DataFrame of query results
Return type: pandas DataFrame

leiap.io.get_points(years=None, **kwargs)

Load a DataFrame of points

Parameters:
  • years (list) – List of desired years; can be strings or integers
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: points_df – DataFrame of all points
Return type: pandas DataFrame
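
Because years may hold strings or integers, the values presumably have to be normalized before they reach the SQL query. The sketch below shows one way that could look, producing a parameterized WHERE clause; year_filter and the column name Year are hypothetical and are not taken from leiap.

```python
def year_filter(years=None, column="Year"):
    """Return (where_clause, params) for an optional list of years.

    Years may be integers or numeric strings. `column` is a hypothetical
    column name used only for this illustration.
    """
    if not years:
        # No filter requested: select every year.
        return "", []
    normalized = [int(y) for y in years]
    placeholders = ", ".join("?" for _ in normalized)
    return f"WHERE {column} IN ({placeholders})", normalized


clause, params = year_filter([2014, "2016"])
print(clause, params)  # WHERE Year IN (?, ?) [2014, 2016]
```

Passing the years as query parameters rather than formatting them into the SQL text avoids quoting problems when the list mixes strings and integers.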

leiap.io.get_points_simple(**kwargs)

Load a DataFrame of points with the most typical query

Parameters: **kwargs – Optional arguments that are passed to get_credentials()
Returns: points_df – DataFrame of all points
Return type: pandas DataFrame

leiap.io.get_points_by_year(years, **kwargs)

Load a DataFrame of points with the most typical query for specified year(s)

Parameters:
  • years (list) – List of desired years; can be strings or integers
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: points_df – DataFrame of all points for specified year(s)
Return type: pandas DataFrame

leiap.io.get_artifacts(sections=['base'], years=None, include_discards=False, **kwargs)

Load a DataFrame of artifacts

Parameters:
  • sections (list) – Any subset of {‘all’, ‘base’, ‘metrics’, ‘classify’, ‘production’, ‘tile_brick’, ‘waretypes’, ‘vesselparts’, ‘macro_fabric’}. Sections to include in the output DataFrame. Each section refers to a group of column names.
  • years (list) – List of desired years; can be strings or integers
  • include_discards (bool) – If True, return all records, even artifacts marked as Discarded.
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: artifacts_df – DataFrame of all artifacts
Return type: pandas DataFrame
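
Since each section names a group of columns, selecting sections amounts to expanding those names into a column list. The sketch below illustrates the idea under stated assumptions: the SECTION_COLUMNS mapping, its column names, and the columns_for helper are all hypothetical, and only three of the documented sections are shown.

```python
# Hypothetical column groups; the real mapping lives in leiap and is not
# documented on this page.
SECTION_COLUMNS = {
    "base": ["ArtifactId", "PointId"],
    "metrics": ["Weight", "Thickness"],
    "production": ["Production"],
}


def columns_for(sections=("base",)):
    """Expand section names into an ordered, de-duplicated list of columns."""
    if "all" in sections:
        # 'all' stands for every known section.
        sections = list(SECTION_COLUMNS)
    unknown = [s for s in sections if s not in SECTION_COLUMNS]
    if unknown:
        raise ValueError(f"Unknown section(s): {unknown}")
    columns = []
    for section in sections:
        for column in SECTION_COLUMNS[section]:
            if column not in columns:
                columns.append(column)
    return columns


print(columns_for(["base", "metrics"]))
print(columns_for(["all"]))
```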

leiap.io.get_artifacts_simple(include_discards=False, **kwargs)

Load a DataFrame of artifacts with the most typical query

Parameters:
  • include_discards (bool) – If True, return all records, even artifacts marked as Discarded.
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: artifacts_df – DataFrame of all artifacts
Return type: pandas DataFrame

leiap.io.get_artifacts_by_year(years, discards=False, **kwargs)

Load a DataFrame of artifacts with the most typical query for specified year(s)

Parameters:
  • years (list) – List of desired years; can be strings or integers
  • discards (bool) – If True, return all records, even artifacts marked as Discarded.
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: artifacts_df – DataFrame of all artifacts for specified year(s)
Return type: pandas DataFrame

leiap.io.get_productions_simple(**kwargs)

Load a DataFrame of productions with the most typical query

Parameters: **kwargs – Optional arguments that are passed to get_credentials()
Returns: prods_df – DataFrame of all productions
Return type: pandas DataFrame

leiap.io.get_production_cts_wts(**kwargs)

Load a DataFrame of all points with columns for counts and weights of all productions

Parameters: **kwargs – Optional arguments that are passed to get_credentials()
Returns: cts_wts – DataFrame of all points with counts and weights for all productions
Return type: pandas DataFrame

Notes

Also pulls in some non-vessel artifact types (e.g., tile, brick, other construction material)
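
To make the shape of the result concrete, the sketch below builds a similar per-point table of counts and weights with plain Python dictionaries instead of a DataFrame. The sample records, the production names, and the "_ct" / "_wt" column suffixes are assumptions for illustration; the actual column naming in leiap is not given on this page.

```python
from collections import defaultdict

# Hypothetical artifact records: (point id, production, weight).
artifacts = [
    ("P1", "Amphora", 12.0),
    ("P1", "Amphora", 8.5),
    ("P1", "Tile", 40.0),
    ("P2", "Tile", 15.0),
]


def counts_and_weights(records):
    """Return {point_id: {"<production>_ct": n, "<production>_wt": total}}."""
    productions = sorted({prod for _, prod, _ in records})

    def empty_row():
        # Every point gets a column pair for every production, starting at 0.
        return {f"{p}_{s}": 0 for p in productions for s in ("ct", "wt")}

    table = defaultdict(empty_row)
    for point_id, prod, weight in records:
        table[point_id][f"{prod}_ct"] += 1
        table[point_id][f"{prod}_wt"] += weight
    return dict(table)


cts_wts = counts_and_weights(artifacts)
for point_id, row in cts_wts.items():
    print(point_id, row)
```

Unlike the documented function, this sketch only includes points that have at least one artifact record.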

leiap.io.get_points_times(warn='enable', **kwargs)

Load a DataFrame of points with datetimes cleaned and search times calculated

Parameters:
  • warn ({'enable', 'disable'}) – Argument passed to the calc_search_time() function specifying whether or not to print a generic warning message.
  • **kwargs – Optional arguments that are passed to get_credentials()
Returns: pts – DataFrame of all points with adjusted datetimes and search times
Return type: pandas DataFrame
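
As an illustration of what cleaning datetimes and calculating a search time could involve, the sketch below parses a start and end timestamp and returns the elapsed minutes. This is not the calc_search_time() implementation: the function search_minutes, the timestamp format, the rule for rejecting bad values, and the warning text are all assumptions.

```python
from datetime import datetime


def search_minutes(start, end, warn="enable", fmt="%Y-%m-%d %H:%M:%S"):
    """Return elapsed minutes between two timestamp strings, or None if invalid."""
    if warn == "enable":
        # Placeholder text; the message leiap prints is not documented here.
        print("Warning: search times are estimates and should be checked.")
    try:
        t0 = datetime.strptime(start.strip(), fmt)
        t1 = datetime.strptime(end.strip(), fmt)
    except (ValueError, AttributeError):
        # Unparseable or missing timestamp.
        return None
    if t1 < t0:
        # An end time before the start time is treated as invalid.
        return None
    return (t1 - t0).total_seconds() / 60


print(search_minutes("2016-06-01 09:00:00", " 2016-06-01 09:12:30 ", warn="disable"))  # 12.5
```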