Short review of the data dictionary as it currently stands in
https://github.com/RobotPsychologist/bg_control/blob/main/data/data_dictionary.md

I have two main comments:

- The dictionary describes only the column names, not the full contents or context of the dataset, e.g., the object it describes, what the rows mean, and what the files are.
- There is possibly a separate extraction process that might be worth reporting as well, though this is perhaps best done in a second step, once a good description of the individual extracts is available.
  - This might be important in particular for model building, e.g., to understand the "full" data set in order to create a simulator that produces data of the same type.

For convenience, here is a link to a "data dictionary writing guide" I created a while ago:
https://github.com/sktime/datadict-howto

Side note: there is the raw data and the processed data. If we expect most analyses to start from the processed data, it might be worth writing clean data dictionaries for both data batches.
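
To make the first comment more concrete, here is a minimal sketch of the dataset-level context that could sit next to the column list. It assumes a Python representation purely for illustration: the class and field names (`DatasetEntry`, `ColumnEntry`, `describes`, `row_meaning`, `files`) are hypothetical, and the values are placeholders rather than anything taken from the repository or the writing guide.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnEntry:
    """One column of the dataset (what a column-only dictionary already has)."""
    name: str
    dtype: str
    description: str


@dataclass
class DatasetEntry:
    """Dataset-level context that a column-only dictionary leaves out."""
    describes: str          # the real-world object the dataset is about
    row_meaning: str        # what a single row represents
    files: dict[str, str]   # file name -> what the file contains
    columns: list[ColumnEntry] = field(default_factory=list)

    def missing_context(self) -> list[str]:
        """Return the names of dataset-level fields that are still empty."""
        checks = {
            "describes": self.describes,
            "row_meaning": self.row_meaning,
            "files": self.files,
        }
        return [name for name, value in checks.items() if not value]


# Hypothetical column-only entry; all values are placeholders.
entry = DatasetEntry(
    describes="",
    row_meaning="",
    files={},
    columns=[ColumnEntry("example_column", "float", "placeholder description")],
)
print(entry.missing_context())
```

A check like `missing_context()` is only one possible way to flag a dictionary that lists columns but does not yet say what the dataset, its rows, or its files are.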