diff --git a/.gitignore b/.gitignore
index a5567b3..b523ea5 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,4 @@
 /example
 *__pycache__
+*.DS_Store
+*.ipynb
diff --git a/AUTHORS.rst b/AUTHORS.rst
index 548015b..c829c0d 100644
--- a/AUTHORS.rst
+++ b/AUTHORS.rst
@@ -6,10 +6,10 @@ Development Lead
 ----------------
 
 * Toomas Erik Anijärv
-* Rory Boyle
 
 Contributors
 ------------
 
 * Jules Mitchell
+* Rory Boyle
 * Cate Scanlon
\ No newline at end of file
diff --git a/HISTORY.rst b/HISTORY.rst
index 3b77a4b..e06e0e3 100644
--- a/HISTORY.rst
+++ b/HISTORY.rst
@@ -37,4 +37,9 @@ History
 0.2.2 (2024-03-3)
 ------------------
 
-* Updated documentation
\ No newline at end of file
+* Updated documentation
+
+0.2.3 (2024-03-7)
+------------------
+
+* Added that the plotting functions return matplotlib figure object
\ No newline at end of file
diff --git a/HLR/__init__.py b/HLR/__init__.py
index 2ef03fe..f50c419 100644
--- a/HLR/__init__.py
+++ b/HLR/__init__.py
@@ -2,7 +2,7 @@
 
 __author__ = """Toomas Erik Anijärv"""
 __email__ = 'toomaserikanijarv@gmail.com'
-__version__ = '0.2.0'
+__version__ = '0.2.3'
 
 from .diagnostic_tests import (test_durbin_watson,
                                test_pearsons_r,
diff --git a/HLR/diagnostic_tests.py b/HLR/diagnostic_tests.py
index c277e74..036db38 100644
--- a/HLR/diagnostic_tests.py
+++ b/HLR/diagnostic_tests.py
@@ -1,11 +1,4 @@
-"""Functions for diagnostic tests to test for assumptions.
-
-Authors
-------
-Toomas Erik Anijärv toomaserikanijarv@gmail.com github.com/teanijarv
-Rory Boyle rorytboyle@gmail.com github.com/rorytboyle
-"""
-
+"""Functions for diagnostic tests to test for assumptions."""
 import scipy.stats
 import numpy as np
 from statsmodels.stats.stattools import durbin_watson
diff --git a/HLR/plots.py b/HLR/plots.py
index e406a6e..f51e7d0 100644
--- a/HLR/plots.py
+++ b/HLR/plots.py
@@ -1,3 +1,4 @@
+"""Functions for plotting."""
 import pandas as pd
 import matplotlib.pyplot as plt
 import seaborn as sns
@@ -45,5 +46,4 @@ def create_subplot_partial_regression(model, fig_size, level):
     sm.graphics.plot_partregress_grid(model, fig=fig)
     fig.suptitle(f'Partial Regression Plots (Model Level {level})', y=1)
 
-    plt.tight_layout()
-    plt.show()
\ No newline at end of file
+    return fig
\ No newline at end of file
diff --git a/HLR/regression.py b/HLR/regression.py
index c5819f7..d91d14b 100644
--- a/HLR/regression.py
+++ b/HLR/regression.py
@@ -1,10 +1,4 @@
-"""Functions for running hierarchical linear regression.
-
-Authors
-------
-Toomas Erik Anijärv toomaserikanijarv@gmail.com github.com/teanijarv
-Rory Boyle rorytboyle@gmail.com github.com/rorytboyle
-"""
+"""HLR - Hierarchical Linear Regression."""
 import pandas as pd
 import statsmodels.api as sm
 import scipy.stats
@@ -215,7 +209,11 @@ def diagnostics(self, verbose=True):
         return diagnostics_results
 
     def plot_studentized_residuals_vs_fitted(self):
-        """Plots studentized residuals against fitted values for all model levels."""
+        """Plots studentized residuals against fitted values for all model levels.
+
+        Returns:
+            matplotlib.figure.Figure: The matplotlib figure object.
+        """
         model_results = self.fit_models()
         num_levels = len(model_results)
 
@@ -229,9 +227,15 @@ def plot_studentized_residuals_vs_fitted(self):
 
         plt.tight_layout()
         plt.show()
+
+        return fig
 
     def plot_qq_residuals(self):
-        """Plots Normal QQ Plots for all model levels."""
+        """Plots Normal QQ Plots for all model levels.
+
+        Returns:
+            matplotlib.figure.Figure: The matplotlib figure object.
+        """
         model_results = self.fit_models()
         fig, axs = plt.subplots(len(model_results), 1, figsize=(8, 4 * len(model_results)))
         if len(model_results) == 1:
@@ -243,8 +247,14 @@ def plot_qq_residuals(self):
         plt.tight_layout()
         plt.show()
 
+        return fig
+
     def plot_influence(self):
-        """Plots Influence Plots for all model levels."""
+        """Plots Influence Plots for all model levels.
+
+        Returns:
+            matplotlib.figure.Figure: The matplotlib figure object.
+        """
         model_results = self.fit_models()
         fig, axs = plt.subplots(len(model_results), 1, figsize=(8, 4 * len(model_results)))
         if len(model_results) == 1:
@@ -256,8 +266,14 @@ def plot_influence(self):
         plt.tight_layout()
         plt.show()
 
+        return fig
+
     def plot_std_residuals(self):
-        """Plots Box Plots of Standardized Residuals for all model levels."""
+        """Plots Box Plots of Standardized Residuals for all model levels.
+
+        Returns:
+            matplotlib.figure.Figure: The matplotlib figure object.
+        """
         model_results = self.fit_models()
         fig, axs = plt.subplots(len(model_results), 1, figsize=(8, 4 * len(model_results)))
         if len(model_results) == 1:
@@ -270,9 +286,15 @@ def plot_std_residuals(self):
 
         plt.tight_layout()
         plt.show()
+
+        return fig
 
     def plot_histogram_std_residuals(self):
-        """Plots Histogram of Standardized Residuals for all model levels."""
+        """Plots Histogram of Standardized Residuals for all model levels.
+
+        Returns:
+            matplotlib.figure.Figure: The matplotlib figure object.
+        """
         model_results = self.fit_models()
         fig, axs = plt.subplots(len(model_results), 1, figsize=(8, 4 * len(model_results)))
         if len(model_results) == 1:
@@ -286,11 +308,21 @@ def plot_histogram_std_residuals(self):
         plt.tight_layout()
         plt.show()
 
+        return fig
+
     def plot_partial_regression(self):
-        """Plots Partial Regression Plots for all model levels."""
+        """Plots Partial Regression Plots for all model levels.
+
+        Returns:
+            list: A list of matplotlib.figure.Figure objects, one for each model level.
+        """
         model_results = self.fit_models()
         num_ivs = max(len(ivs) for ivs in self.models.values())
         fig_size = (15, min(10, 5 * num_ivs))
+        fig_list = []
 
         for level, model in model_results.items():
-            plots.create_subplot_partial_regression(model, fig_size, level)
\ No newline at end of file
+            fig = plots.create_subplot_partial_regression(model, fig_size, level)
+            fig_list.append(fig)
+
+        return fig_list
\ No newline at end of file
diff --git a/README.md b/README.md
index 034bf08..46d4617 100644
--- a/README.md
+++ b/README.md
@@ -57,12 +57,12 @@ hreg.summary()
 hreg.diagnostics(verbose=True)
 
 # Different plots (see docs for more)
-hreg.plot_studentized_residuals_vs_fitted()
-hreg.plot_qq_residuals()
-hreg.plot_influence()
-hreg.plot_std_residuals()
-hreg.plot_histogram_std_residuals()
-hreg.plot_partial_regression()
+fig1 = hreg.plot_studentized_residuals_vs_fitted()
+fig2 = hreg.plot_qq_residuals()
+fig3 = hreg.plot_influence()
+fig4 = hreg.plot_std_residuals()
+fig5 = hreg.plot_histogram_std_residuals()
+fig_list = hreg.plot_partial_regression()
 ```
 Output:
 | | Model Level | Predictors | N (observations) | DF (residuals) | DF (model) | R-squared | F-value | P-value (F) | SSE | SSTO | MSE (model) | MSE (residuals) | MSE (total) | Beta coefs | P-values (beta coefs) | Failed assumptions (check!) | R-squared change | F-value change | P-value (F change) |
@@ -129,30 +129,29 @@ Please use Zenodo DOI for citing the package in your work.
 
 #### Example
 
-Anijärv, T. E., Mitchell, J. and Boyle, R. (2024) ‘teanijarv/HLR: v0.2.2’. Zenodo. https://doi.org/10.5281/zenodo.7683808
+Anijärv, T. E., Mitchell, J. and Boyle, R. (2024) ‘teanijarv/HLR: v0.2.3’. Zenodo. https://doi.org/10.5281/zenodo.7683808
 
 ```
 @software{toomas_erik_anijarv_2024_7683808,
   author = {Toomas Erik Anijärv, Jules Mitchell, Rory Boyle},
-  title = {teanijarv/HLR: v0.2.2},
+  title = {teanijarv/HLR: v0.2.3},
   month = mar,
   year = 2024,
   publisher = {Zenodo},
-  version = {v0.2.2},
+  version = {v0.2.3},
   doi = {10.5281/zenodo.7683808},
   url = {https://doi.org/10.5281/zenodo.7683808}
 }
 ```
 ## Development
-HLR was created by [Toomas Erik Anijärv](https://www.toomaserikanijarv.com) using original code by [Rory Boyle](https://github.com/rorytboyle). The package is maintained by Toomas during his spare time, thereby contributions are more than welcome!
+The HLR package was created and is maintained by [Toomas Erik Anijärv](https://www.toomaserikanijarv.com). It is updated during spare time, thereby contributions are more than welcome!
 
 This program is provided with no warranty of any kind and it is still under development. However, this code has been checked and validated against multiple same analyses conducted in SPSS.
 
 #### To-do
 Would be great if someone with more experience with packages would contribute with testing and the whole deployment process. Also, if someone would want to write documentation, that would be amazing.
- dict valus within df hard to read
+- dict values within df hard to read
 - add t stats for coefficients
-- give option for output only some columns not all
 - add regression type option (eg, for logistic regression)
 
 #### Contributors
diff --git a/docs/usage.rst b/docs/usage.rst
index 3ad5a07..2b69af2 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -102,7 +102,7 @@ Plotting options for all model levels
 
 .. code-block:: python
 
-    hlr.plot_studentized_residuals_vs_fitted()
+    fig = hlr.plot_studentized_residuals_vs_fitted()
 
 Output:
 
@@ -113,7 +113,7 @@ Output:
 
 .. code-block:: python
 
-    hlr.plot_qq_residuals()
+    fig = hlr.plot_qq_residuals()
 
 Output:
 
@@ -124,7 +124,7 @@ Output:
 
 .. code-block:: python
 
-    hlr.plot_influence()
+    fig = hlr.plot_influence()
 
 Output:
 
@@ -135,7 +135,7 @@ Output:
 
 .. code-block:: python
 
-    hlr.plot_std_residuals()
+    fig = hlr.plot_std_residuals()
 
 Output:
 
@@ -146,7 +146,7 @@ Output:
 
 .. code-block:: python
 
-    hlr.plot_histogram_std_residuals()
+    fig = hlr.plot_histogram_std_residuals()
 
 Output:
 
@@ -157,7 +157,7 @@ Output:
 
 .. code-block:: python
 
-    hlr.plot_partial_regression()
+    fig_list = hlr.plot_partial_regression()
 
 Output:
 
@@ -166,4 +166,4 @@ Output:
     :align: center
     :width: 50%
 
-(only Model Level 1 displayed, but actual output would plot all levels)
\ No newline at end of file
+(the fig_list contains a fig for each Model Level; only Model Level 1 displayed (i.e., fig_list[0]))
\ No newline at end of file
diff --git a/setup.py b/setup.py
index cadbfb9..f2906ce 100644
--- a/setup.py
+++ b/setup.py
@@ -44,6 +44,6 @@
     test_suite='tests',
     tests_require=test_requirements,
     url='https://github.com/teanijarv/HLR',
-    version='0.2.2',
+    version='0.2.3',
     zip_safe=False,
 )
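
Reviewer note: the point of returning the figure from each plotting method is that callers can adjust or save the plot after the call. The sketch below illustrates that pattern only. `plot_residuals_sketch` is a hypothetical stand-in written for this note, not a function from the HLR package, and the sample numbers are made up for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.figure import Figure


def plot_residuals_sketch(fitted, residuals):
    """Hypothetical stand-in for a plotting method that returns its figure."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.scatter(fitted, residuals)
    ax.axhline(0, linestyle="--")
    ax.set_xlabel("Fitted values")
    ax.set_ylabel("Residuals")
    fig.tight_layout()
    return fig  # same pattern as the `return fig` lines added in this diff


# Illustrative data only; not taken from the package or its docs.
fig = plot_residuals_sketch([1.0, 2.0, 3.0], [0.2, -0.1, 0.05])

# Because the figure is returned, the caller can keep working with it.
fig.suptitle("Residuals vs fitted (title set by the caller)")

# Save to an in-memory buffer; a file path would work the same way.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
png_bytes = buffer.getvalue()
plt.close(fig)
```

With the real package, the same calls would presumably apply to the objects returned by methods such as `plot_qq_residuals()`, and to each element of the list returned by `plot_partial_regression()`.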