regression.py
-
lib.regression.check_regression_model(paths, tech)
This function checks the regression model parameters for NaN values and returns the FLH and TS model dataframes. If values are missing in the input CSV files, the user is prompted either to continue or to modify the corresponding files first.
- Parameters
paths (dict) – Dictionary of dictionaries containing the paths to the FLH and TS model regression CSV files.
tech (str) – Technology under study.
- Return (FLH, TS_reg)
Tuple of pandas dataframes for FLH and TS.
- Return type
Tuple of pandas dataframes
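The check described above can be sketched without pandas. This is a minimal illustration, assuming each regression table is read as a list of row dictionaries; `find_missing`, the `region` key column, and the sample values are hypothetical and not part of `lib.regression`.

```python
import csv
import io


def find_missing(rows, key="region"):
    """Return (row key, column) pairs whose value is empty or NaN."""
    missing = []
    for row in rows:
        for column, value in row.items():
            if column == key:
                continue
            text = (value or "").strip()
            if text == "" or text.lower() == "nan":
                missing.append((row[key], column))
    return missing


# Made-up FLH table with one blank and one NaN entry.
sample = io.StringIO("region,WindOn,PV\nDE,2100,950\nFR,,1100\nXK,nan,1300\n")
rows = list(csv.DictReader(sample))
missing = find_missing(rows)
```

A caller could prompt the user whenever `missing` is non-empty, before handing the tables to the regression.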
-
lib.regression.clean_FLH_regression(paths, param)
This function creates a CSV file containing the model FLH used for regression. If a region is present in the IRENA database, its FLH are extracted directly from there. If it is not present, a placeholder for the region is written to the CSV file, and it is the user’s responsibility to fill in an appropriate value. The function warns the user and prints all regions that are left blank.
- Parameters
param (dict) – Dictionary of dictionaries containing the list of regions.
paths (dict) – Dictionary of dictionaries containing the paths to IRENA_summary, IRENA_dict.
- Return missing
List of strings naming the missing regions. The CSV file with the FLH needed for the regression is saved directly in the given path, along with the corresponding metadata in a JSON file.
- Return type
list of str
- Raises
Missing Regions – No FLH values exist for certain regions.
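The placeholder behaviour can be illustrated with a small sketch. It assumes the IRENA values are already available as a plain dictionary; `fill_flh`, the empty-string placeholder, and the numbers are hypothetical.

```python
import warnings


def fill_flh(regions, irena_flh, placeholder=""):
    """Map each region to its FLH, leaving a placeholder where none is known."""
    table = {}
    missing = []
    for region in regions:
        if region in irena_flh:
            table[region] = irena_flh[region]
        else:
            table[region] = placeholder
            missing.append(region)
    if missing:
        warnings.warn("No FLH values for regions: " + ", ".join(missing))
    return table, missing


# Made-up values; "XK" has no entry and is reported as missing.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    table, missing = fill_flh(["DE", "FR", "XK"], {"DE": 2100.0, "FR": 2300.0})
```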
-
lib.regression.clean_TS_regression(paths, param, tech)
This function creates a CSV file containing the model time series used for regression. If a region is present in the EMHIRES text files, its TS is extracted directly from them. Otherwise, the generated TS with the highest FLH is used instead, and it is scaled to match the IRENA FLH if those are available.
- Parameters
paths (dict) – Dictionary containing paths to EMHIRES text files.
param (dict) – Dictionary containing the FLH_regression dataframe, list of subregions contained in shapefile, and year.
- Returns
The time series used for the regression are saved directly in the given path, along with the corresponding metadata in a JSON file.
- Return type
None
- Raises
Missing FLH – FLH values are missing for at least one region. No scaling is applied to the time series for those regions.
Missing EMHIRES – The EMHIRES database is missing; generated time series will be used as the model for all regions.
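The fallback for regions without EMHIRES data could look roughly like the sketch below. It assumes that the FLH of a series is the sum of its hourly capacity factors and that scaling is linear; both are assumptions, and `scale_to_flh` and the sample data are hypothetical.

```python
def scale_to_flh(ts, target_flh):
    """Scale an hourly capacity-factor series so that its sum equals target_flh."""
    current_flh = sum(ts)
    if current_flh == 0:
        raise ValueError("cannot scale a series with zero FLH")
    factor = target_flh / current_flh
    return [value * factor for value in ts]


# Made-up three-hour series for two quantiles.
candidates = {"q90": [0.2, 0.4, 0.6], "q50": [0.1, 0.2, 0.3]}

# Pick the generated series with the highest FLH, then scale it to the target.
best = max(candidates, key=lambda name: sum(candidates[name]))
scaled = scale_to_flh(candidates[best], 0.9)
```

Linear scaling can push individual values above 1 when the target is much higher than the generated FLH, so a real implementation may need to handle that case.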
-
lib.regression.combinations_for_regression(paths, param, tech)
This function reads the list of generated time series for different hub heights and orientations, compares it to the user-defined combinations, and returns a list of lists containing all the available combinations. The function raises a warning if the user input and the available time series are not consistent.
- Parameters
paths (dict) – Dictionary of dictionaries containing the paths to the regional analysis output folder.
param (dict) – Dictionary of dictionaries containing the subregions name, year, and user-defined combinations.
tech (str) – Technology under study.
- Return combinations
List of combinations for regression.
- Return type
list
- Raises
missing data – If no time series are available for this technology, a warning is raised.
missing combination – If a hub height or orientation is missing based on user-defined combinations, a warning is raised.
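The comparison can be sketched as follows. The sketch assumes that a combination with any missing setting is dropped entirely, which may differ from the actual behaviour; `available_combinations` and the hub heights are hypothetical.

```python
import warnings


def available_combinations(requested, available):
    """Keep the requested combinations whose settings were all generated."""
    available = set(available)
    if not available:
        warnings.warn("No time series are available for this technology.")
        return []
    usable = []
    for combo in requested:
        absent = [s for s in combo if s not in available]
        if absent:
            warnings.warn(f"Missing settings {absent} in combination {combo}.")
        else:
            usable.append(list(combo))
    return usable


# Hub height 120 was never generated, so the second combination is dropped.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    usable = available_combinations([[60, 80, 100], [80, 120]], [60, 80, 100])
```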
-
lib.regression.get_regression_coefficients(paths, param, tech)
This function solves the following optimization problem: find a combination of quantiles, hub heights, or orientations that minimizes the error relative to a given historical time series (e.g. from EMHIRES for European countries), subject to the constraint that the FLH match a given value (for example from IRENA). The settings of the combinations can be defined by the user.
The function starts by identifying the existing settings (hub heights, orientations) and quantiles. If the combinations of time series requested by the user cannot be found, a warning is raised.
It then runs the optimization and identifies the subregions for which a solution was found. If the optimization is infeasible (the FLH values are too high or too low compared to the reference to be matched), the time series whose FLH is closest to the reference value is used in the final output.
The output consists of coefficients between 0 and 1 that could be multiplied later with the individual time series in time_series.generate_stratified_timeseries. The sum of the coefficients for each combination is equal to 1.
- Parameters
paths (dict) – Dictionary including the paths to the time series for each subregion, technology setting, and quantile, and the output paths for the coefficients.
param (dict) – Dictionary including the dictionary of regression parameters, quantiles, and year.
tech (str) – Technology under study.
- Returns
The regression parameters (e.g. IRENA FLH and EMHIRES TS) are copied to the regression_in folder, and the regression coefficients are saved in a CSV file in the regression_out folder, along with the metadata in a JSON file.
- Return type
None
- Raises
Missing Data – No time series present for technology tech.
Missing Data for Setting – Missing time series for desired settings (hub heights / orientations).
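The infeasible case can be reasoned about with a short sketch. Because the coefficients are non-negative and sum to 1, the combined FLH is a weighted average of the individual FLH, provided FLH is linear in the time series (an assumption here). A reference outside the range of available FLH therefore cannot be matched. The helper names and numbers below are hypothetical.

```python
def is_feasible(flh_by_series, reference_flh):
    """A convex combination can only reach FLH between the smallest and largest input."""
    values = list(flh_by_series.values())
    return min(values) <= reference_flh <= max(values)


def fallback_coefficients(flh_by_series, reference_flh):
    """Return one-hot coefficients selecting the series closest to reference_flh."""
    closest = min(
        flh_by_series, key=lambda name: abs(flh_by_series[name] - reference_flh)
    )
    return {name: 1.0 if name == closest else 0.0 for name in flh_by_series}


# Made-up FLH per (hub height, quantile) pair.
flh = {("80", "q50"): 1800.0, ("80", "q90"): 2400.0, ("100", "q90"): 2700.0}

# A reference of 3000 is above every available series, so fall back.
feasible = is_feasible(flh, 3000.0)
coefficients = fallback_coefficients(flh, 3000.0)
```

The fallback coefficients still sum to 1, so they can be used in the same way as coefficients produced by a successful optimization.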
-
lib.regression.pyomo_regression_model()
This function returns an abstract pyomo model of a constrained least-squares problem for time series fitting. The model matches the model FLH while minimizing the error relative to the model time series.
- Return model
Abstract pyomo model.
- Return type
pyomo object
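One way to write such a constrained least-squares problem is given below. The symbols are assumptions introduced for illustration: $x_{s,q,t}$ is the generated time series for setting $s$ and quantile $q$ at hour $t$, $y_t$ is the model time series, $F$ is the model FLH, and $c_{s,q}$ are the coefficients. The objective and constraints of the actual pyomo model may differ in detail.

```latex
\[
\begin{aligned}
\min_{c} \quad & \sum_{t} \Big( \sum_{s,q} c_{s,q}\, x_{s,q,t} - y_t \Big)^{2} \\
\text{s.t.} \quad & \sum_{t} \sum_{s,q} c_{s,q}\, x_{s,q,t} = F, \\
& \sum_{s,q} c_{s,q} = 1, \\
& 0 \le c_{s,q} \le 1 \quad \text{for all } s, q.
\end{aligned}
\]
```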
-
lib.regression.read_generated_TS(paths, param, tech, settings, subregion)
This function returns a dictionary containing the available time series generated by the script based on the desired technology and settings.
- Parameters
paths (dict) – Dictionary including output folder for regional analysis.
param (dict) – Dictionary including list of subregions and year.
tech (str) – Technology under study.
settings (list) – List of lists containing setting combinations.
subregion (str) – Name of the subregion.
- Return GenTS
Dictionary of time series indexed by setting and quantile.
- Return type
dict
-
lib.regression.regmodel_load_data(paths, param, tech, settings, subregion)
This function returns a dictionary used to initialize a pyomo abstract model for the regression analysis of each region.
- Parameters
paths (dict) – Dictionary of dictionaries containing the paths to the CSV time series files.
param (dict) – Dictionary of dictionaries containing IRENA’s region list, FLHs and EMHIRES model time series.
tech (str) – Technology under study.
settings (list) – List of all the settings (hub heights/orientations) to be used in the regression.
subregion (str) – Name of subregion.
- Return data
Dictionary containing regression parameters.
- Return type
dict
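Pyomo abstract models are commonly initialized from a nested dictionary whose top-level key is the namespace `None`, with one entry per set or parameter. The sketch below builds such a dictionary in plain Python. The component names (`s`, `q`, `t`, `FLH`, `shape`, `TS`) and the function `build_model_data` are hypothetical and may not match the names used by `regmodel_load_data`.

```python
def build_model_data(settings, quantiles, gen_ts, flh, model_ts):
    """Assemble a pyomo-style data dictionary (namespace None) for one subregion."""
    hours = list(range(len(model_ts)))
    return {
        None: {
            # Sets are stored under the index None.
            "s": {None: list(settings)},
            "q": {None: list(quantiles)},
            "t": {None: hours},
            # Scalar parameter: the FLH value to match.
            "FLH": {None: flh},
            # Indexed parameters: model series and generated series.
            "shape": {t: model_ts[t] for t in hours},
            "TS": {
                (s, q, t): gen_ts[(s, q)][t]
                for s in settings
                for q in quantiles
                for t in hours
            },
        }
    }


# Made-up two-hour series for one hub height and two quantiles.
gen_ts = {(80, 50): [0.1, 0.2], (80, 90): [0.3, 0.4]}
data = build_model_data([80], [50, 90], gen_ts, 2000.0, [0.2, 0.3])
```

A dictionary of this shape could then be passed to `create_instance` on an abstract model whose components carry the same names.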