need_functions

Python package for working with the National Energy Efficiency Data-Framework (NEED) dataset

Overview

The National Energy Efficiency Data-Framework (NEED) dataset is published by the UK Government here: https://www.gov.uk/government/collections/national-energy-efficiency-data-need-framework

This Python package contains a series of Python functions which can be used to:

Download the CSV files which make up the NEED dataset.
Import the data into a SQLite database.
Access the data in the database, for data analysis and visualisation.

A description of the CSV files in the NEED dataset, along with instructions for downloading and importing in SQLite, has been created using the format of a CSVW metadata Table Group object (saved as a JSON file), which is available here: https://github.com/building-energy/need_functions/blob/main/need_tables-metadata.json

Installation

pip install git+https://github.com/building-energy/need_functions

The python package csvw_functions_extra will also need to be installed.

Quick Start

import need_functions

# download all NEED data and import into database.
need_functions.download_and_import_all_data()

# get the distribution of property types in the data.
result=need_functions.get_distribution(
            'PROP_TYPE',
            'need_2021_anon_dataset_4million'
            )
print(result)

[
        {'PROP_TYPE': 'Bungalow', 'COUNT': 359717}, 
        {'PROP_TYPE': 'Detached', 'COUNT': 622875}, 
        {'PROP_TYPE': 'End terrace', 'COUNT': 358170}, 
        {'PROP_TYPE': 'Flat', 'COUNT': 862623}, 
        {'PROP_TYPE': 'Mid terrace', 'COUNT': 773569}, 
        {'PROP_TYPE': 'Semi detached', 'COUNT': 1023046}
]

API

get_available_csv_file_names

Description: Returns the CSV file names of all tables in the CSVW metadata file.

need_functions.get_available_csv_file_names()

Returns: A list of the https://purl.org/berg/csvw_functions_extra/vocab/csv_file_name value in each table.

download_and_import_data

Description: Downloads all the NEED data and imports all data into a SQLite database.

need_functions.download_and_import_all_data(
        csv_file_names = None,
        data_folder='_data',
        database_name='need_data.sqlite',
        verbose=False,
        )

The data to be downloaded is described in the CSVW metadata file here: https://raw.githubusercontent.com/building-energy/snecs_functions/main/snecs_tables-metadata.json

Running this function will:

create the data_folder if it does not already exist.
download the CSV files to the data folder.
download the CSVW metadata file to the data folder. This is named 'need_tables-metadata.json'.
create a SQLite database named database_name in the data folder if it does not already exist.
import the CSV data into the SQLite database.

Arguments:

csv_file_names (str, list or None): The CSV file name(s) to download and import (see get_available_csv_file_names). None will download the entire dataset.
data_folder (str): The filepath of a local folder where the downloaded CSV data is saved to and the SQLite database is stored.
database_name (str): The name of the SQLite database, relative to the data_folder.
verbose (bool): If True, then this function prints intermediate variables and other useful information.

Returns: None

get_need_table_names_in_database

Description: Returns the table names of all NEED tables in the SQLite database.

get_need_table_names_in_database(
        data_folder = '_data',
        database_name = 'need_data.sqlite',
        )

data_folder (str): The filepath of a local folder where the SQLite database is stored.
database_name (str): The name of the SQLite database, relative to the data_folder.

Returns: A list of table names.

get_metadata_columns_codes

Description: Returns lookup dictionaries for the NEED lookup codes for one or more columns.

need_functions.get_metadata_columns_codes(
        column_names,
        sql_table_name,
        data_folder = '_data',
        )

Arguments:

column_name (str): The name value of a column in a CSVW TableSchema object.
sql_table_name (str): The https://purl.org/berg/csvw_functions_extra/vocab/sql_table_name value of the table.
data_folder (str): The filepath of a local folder where the normalized CSVW metadata file is saved.

Returns (dict of dicts): A dictionary with:

keys: the names of the column(s)
values: a dictionary with keys as lookup codes and values as code descriptions.

get_distribution

Description: Returns the distribution of values in a column.

need_functions.get_distribution(
        field,
        sql_table_name,
        data_folder = '_data',
        database_name = 'need_data.sqlite',
        verbose=False
        )

Arguments:

field (str): A field in the database table.
sql_table_name (str): The name of the database table.
data_folder (str): The filepath of a local folder where the SQLite database is stored.
database_name (str): The name of the SQLite database, relative to the data_folder.

Returns (list): A list of dictionaries, where each dictionary is a set of results with a unique value and a count of those unique values.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
need_functions		need_functions
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
need_tables-metadata.json		need_tables-metadata.json
pyproject.toml		pyproject.toml
setup.cfg		setup.cfg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

need_functions

Overview

Installation

Quick Start

API

get_available_csv_file_names

download_and_import_data

get_need_table_names_in_database

get_metadata_columns_codes

get_distribution

About

Releases

Packages

Languages

License

building-energy/need_functions

Folders and files

Latest commit

History

Repository files navigation

need_functions

Overview

Installation

Quick Start

API

get_available_csv_file_names

download_and_import_data

get_need_table_names_in_database

get_metadata_columns_codes

get_distribution

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages