Data Extraction (swimrs.data_extraction)
Earth Engine exports and meteorology ingest helpers.
Earth Engine
export_ptjpl_zonal_stats(shapefile: str, bucket: str, feature_id: str = 'FID', select: list[str] | None = None, start_yr: int = 2000, end_yr: int = 2024, mask_type: str = 'no_mask', check_dir: str | None = None, state_col: str = 'state', buffer: float | None = None, batch_size: int = 60, file_prefix: str = 'swim') -> None
Export per-scene PT-JPL ET fraction zonal means for polygons to GCS CSVs.
Parameters
shapefile : str
Path to polygon shapefile with feature IDs.
bucket : str
GCS bucket name (no scheme).
feature_id : str, optional
Field name for feature identifier.
select : list, optional
Optional list of feature IDs to process.
start_yr : int, optional
Inclusive start year (default: 2000).
end_yr : int, optional
Inclusive end year (default: 2024).
mask_type : {'no_mask', 'irr', 'inv_irr'}, optional
Irrigation masking strategy (default: 'no_mask').
check_dir : str, optional
If set, skip exports when CSV already exists at check_dir/
export_ssebop_zonal_stats(shapefile: str, bucket: str, feature_id: str = 'FID', select: list[str] | None = None, start_yr: int = 2000, end_yr: int = 2024, mask_type: str = 'no_mask', check_dir: str | None = None, state_col: str = 'state', buffer: float | None = None, batch_size: int = 15, file_prefix: str = 'swim') -> None
Export per-scene SSEBop ET fraction zonal means for polygons to GCS CSVs.
Parameters
shapefile : str
Path to polygon shapefile with feature IDs.
bucket : str
GCS bucket name (no scheme).
feature_id : str, optional
Field name for feature identifier.
select : list, optional
Optional list of feature IDs to process.
start_yr : int, optional
Inclusive start year (default: 2000).
end_yr : int, optional
Inclusive end year (default: 2024).
mask_type : {'no_mask', 'irr', 'inv_irr'}, optional
Irrigation masking strategy (default: 'no_mask').
check_dir : str, optional
If set, skip exports when CSV already exists at check_dir/
export_sims_zonal_stats(shapefile: str, bucket: str, feature_id: str = 'FID', select: list[str] | None = None, start_yr: int = 2000, end_yr: int = 2024, mask_type: str = 'no_mask', check_dir: str | None = None, state_col: str = 'state', buffer: float | None = None, batch_size: int = 15, file_prefix: str = 'swim') -> None
Export per-scene SIMS ET fraction zonal means for polygons to GCS CSVs.
Parameters
shapefile : str
Path to polygon shapefile with feature IDs.
bucket : str
GCS bucket name (no scheme).
feature_id : str, optional
Field name for feature identifier.
select : list, optional
Optional list of feature IDs to process.
start_yr : int, optional
Inclusive start year (default: 2000).
end_yr : int, optional
Inclusive end year (default: 2024).
mask_type : {'no_mask', 'irr', 'inv_irr'}, optional
Irrigation masking strategy (default: 'no_mask').
check_dir : str, optional
If set, skip exports when CSV already exists at check_dir/
export_geesebal_zonal_stats(shapefile: str, bucket: str, feature_id: str = 'FID', select: list[str] | None = None, start_yr: int = 2000, end_yr: int = 2024, mask_type: str = 'no_mask', check_dir: str | None = None, state_col: str = 'state', buffer: float | None = None, batch_size: int = 15, file_prefix: str = 'swim') -> None
Export per-scene geeSEBAL ET fraction zonal means for polygons to GCS CSVs.
Parameters
shapefile : str
Path to polygon shapefile with feature IDs.
bucket : str
GCS bucket name (no scheme).
feature_id : str, optional
Field name for feature identifier.
select : list, optional
Optional list of feature IDs to process.
start_yr : int, optional
Inclusive start year (default: 2000).
end_yr : int, optional
Inclusive end year (default: 2024).
mask_type : {'no_mask', 'irr', 'inv_irr'}, optional
Irrigation masking strategy (default: 'no_mask').
check_dir : str, optional
If set, skip exports when CSV already exists at check_dir/
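The four exporters above share one signature, so a common set of keyword arguments can be built once and reused. The following is a minimal sketch, not part of swimrs: `build_export_kwargs` is a hypothetical helper, and the shapefile path and bucket name are placeholders. It only assembles and validates arguments; the commented lines show where a real exporter could be called, assuming the functions are importable from `swimrs.data_extraction` as the page heading suggests.

```python
VALID_MASKS = {"no_mask", "irr", "inv_irr"}  # values listed for mask_type above


def build_export_kwargs(shapefile, bucket, start_yr=2000, end_yr=2024,
                        mask_type="no_mask", feature_id="FID", select=None):
    """Hypothetical helper: validate and collect arguments shared by the exporters."""
    if mask_type not in VALID_MASKS:
        raise ValueError(f"mask_type must be one of {sorted(VALID_MASKS)}")
    if start_yr > end_yr:
        raise ValueError("start_yr must not exceed end_yr")
    return {
        "shapefile": shapefile,
        "bucket": bucket,
        "feature_id": feature_id,
        "select": select,
        "start_yr": start_yr,
        "end_yr": end_yr,
        "mask_type": mask_type,
    }


# Placeholder inputs; replace with a real shapefile and GCS bucket name.
kwargs = build_export_kwargs("fields.shp", "my-bucket",
                             start_yr=2016, end_yr=2020, mask_type="irr")
print(kwargs["mask_type"], kwargs["start_yr"], kwargs["end_yr"])

# With Earth Engine initialized, the same kwargs could be passed to any exporter
# (the import path below is an assumption):
# from swimrs.data_extraction import export_ssebop_zonal_stats
# export_ssebop_zonal_stats(**kwargs)
```

Validating `mask_type` and the year range locally catches typos before any Earth Engine task is queued.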
GridMET / ERA5
GridMet
Bases: Thredds
University of Idaho gridMET.
Returns one numpy array per meteorological variable as a daily stack unless modified.
['bi', 'elev', 'erc', 'fm100', 'fm1000', 'pdsi', 'pet', 'pr', 'rmax', 'rmin', 'sph', 'srad',
'th', 'tmmn', 'tmmx', 'vs']
Observation elements to access. Currently available elements:
- 'bi' : burning index [-]
- 'elev' : elevation above sea level [m]
- 'erc' : energy release component [-]
- 'fm100' : 100-hour dead fuel moisture [%]
- 'fm1000' : 1000-hour dead fuel moisture [%]
- 'pdsi' : Palmer Drought Severity Index [-]
- 'pet' : daily reference potential evapotranspiration [mm]
- 'pr' : daily accumulated precipitation [mm]
- 'rmax' : daily maximum relative humidity [%]
- 'rmin' : daily minimum relative humidity [%]
- 'sph' : daily mean specific humidity [kg/kg]
- 'prcp' : daily total precipitation [mm]
- 'srad' : daily mean downward shortwave radiation at surface [W m-2]
- 'th' : daily mean wind direction clockwise from North [degrees]
- 'tmmn' : daily minimum air temperature [K]
- 'tmmx' : daily maximum air temperature [K]
- 'vs' : daily mean wind speed [m s-1]
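The temperature elements ('tmmn', 'tmmx') are listed in kelvin. A minimal sketch of converting them to degrees Celsius follows; `kelvin_to_celsius` is a hypothetical helper name and the sample values are made up for illustration, not real gridMET data.

```python
def kelvin_to_celsius(values):
    """Convert an iterable of temperatures from kelvin to degrees Celsius."""
    # Round to two decimals to hide floating-point noise from the subtraction.
    return [round(v - 273.15, 2) for v in values]


tmmx_k = [300.15, 295.65, 273.15]  # illustrative values only
tmmx_c = kelvin_to_celsius(tmmx_k)
print(tmmx_c)
```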
:param start: start of period of data, datetime.datetime object or string format 'YYYY-MM-DD'
:param end: end of period of data, datetime.datetime object or string format 'YYYY-MM-DD'
:param variables: List of available variables. At least one.
:param date: date of data, datetime.datetime object or string format 'YYYY-MM-DD'
:param bbox: bounds.GeoBounds object representing spatial bounds
:return: numpy.ndarray
Must have either start and end, or date. Must have at least one valid variable. Invalid variables will be excluded gracefully.
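The "excluded gracefully" behaviour can be pictured as a filter against the list of available names. This is a sketch of the idea only, not the class's actual implementation: `split_variables` is a hypothetical helper, and the list is copied from the `available` attribute documented on this page.

```python
# Names copied from the `available` attribute documented on this page.
AVAILABLE = ['elev', 'pr', 'rmax', 'rmin', 'sph', 'srad', 'th', 'tmmn', 'tmmx',
             'pet', 'vs', 'erc', 'bi', 'fm100', 'pdsi']


def split_variables(requested):
    """Hypothetical helper: separate requested names into valid and invalid."""
    valid = [v for v in requested if v in AVAILABLE]
    invalid = [v for v in requested if v not in AVAILABLE]
    return valid, invalid


valid, invalid = split_variables(['pr', 'tmmx', 'snow'])
if not valid:
    # Mirrors the stated rule that at least one valid variable is required.
    raise ValueError("at least one valid variable is required")
print("using:", valid, "ignored:", invalid)
```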
NetCDF dates are in Excel ('xl') 1900 format, i.e., the number of days since 1899-12-31 23:59.
xlrd.xldate handles this conversion for the time being.
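Converting a day count to a calendar date is a single offset from an epoch. The sketch below uses only the standard library rather than xlrd. `days_to_date` is a hypothetical helper; the epoch is passed explicitly because the exact origin should be read from the NetCDF file's time units, and the 1900-01-01 value used here is an assumption for illustration.

```python
from datetime import date, timedelta


def days_to_date(n_days, epoch):
    """Hypothetical helper: convert a day offset from `epoch` to a calendar date."""
    return epoch + timedelta(days=int(n_days))


epoch = date(1900, 1, 1)  # assumed origin; check the file's time units
first_day = days_to_date(0, epoch)
one_year_on = days_to_date(365, epoch)  # 1900 is not a leap year
print(first_day, one_year_on)
```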
date = date
instance-attribute
start = start
instance-attribute
end = end
instance-attribute
variable = variable
instance-attribute
bbox = bbox
instance-attribute
target_profile = target_profile
instance-attribute
clip_feature = clip_feature
instance-attribute
lat = lat
instance-attribute
lon = lon
instance-attribute
service = 'thredds.northwestknowledge.net:8080'
instance-attribute
scheme = 'http'
instance-attribute
temp_dir = mkdtemp()
instance-attribute
available = ['elev', 'pr', 'rmax', 'rmin', 'sph', 'srad', 'th', 'tmmn', 'tmmx', 'pet', 'vs', 'erc', 'bi', 'fm100', 'pdsi']
instance-attribute
kwords = {'bi': 'daily_mean_burning_index_g', 'elev': '', 'erc': 'energy_release_component-g', 'fm100': 'dead_fuel_moisture_100hr', 'fm1000': 'dead_fuel_moisture_1000hr', 'pdsi': 'daily_mean_palmer_drought_severity_index', 'etr': 'daily_mean_reference_evapotranspiration_alfalfa', 'pet': 'daily_mean_reference_evapotranspiration_grass', 'pr': 'precipitation_amount', 'rmax': 'daily_maximum_relative_humidity', 'rmin': 'daily_minimum_relative_humidity', 'sph': 'daily_mean_specific_humidity', 'srad': 'daily_mean_shortwave_radiation_at_surface', 'th': 'daily_mean_wind_direction', 'tmmn': 'daily_minimum_temperature', 'tmmx': 'daily_maximum_temperature', 'vs': 'daily_mean_wind_speed', 'vpd': 'daily_mean_vapor_pressure_deficit'}
instance-attribute
single_year = False
instance-attribute
__init__(variable: str | None = None, date=None, start=None, end=None, bbox=None, target_profile=None, clip_feature=None, lat: float | None = None, lon: float | None = None) -> None
subset_daily_tif(out_filename: str | None = None) -> np.ndarray
subset_nc(out_filename: str | None = None, return_array: bool = False)
get_point_timeseries() -> DataFrame
Retrieve daily time series for a point location.
Downloads meteorological data for the point specified by lat/lon coordinates over the date range defined at initialization.
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with datetime index and single column for the variable. |
Example
gm = GridMet(variable='etr', lat=45.5, lon=-116.5,
             start='2020-01-01', end='2020-12-31')
df = gm.get_point_timeseries()
print(df.head())