The Warning Framework
As of version 1.1 there is a Warnings Framework in place. With version 1.1.4 we created our own implementation, leaving out the command-line interface, adding some extra filtering options and making it more thread-proof. It posts a Warning whenever a non-fatal error occurs while processing a data page. You can set what is done with the warnings through the warnaction option when creating an HTMLtree, JSONtree or DataTreeShell object. warnaction is a string and can have the following values:
- "error": turn warnings into exceptions
  You probably do not want to use this one for most warnings, but it can be handy for severity 1 errors.
- "ignore": never print warnings
  This could be a good setting once your application is tested and working.
- "always": always print warnings
  This is a good setting while testing a new data_def.
- "default": print the first occurrence of a warning for each location where the warning is issued
  This is the default. If you send the output to a logfile this is a good setting.
- "module": print the first occurrence of a warning for each module where the warning is issued
- "once": print only the first occurrence of a warning, regardless of location
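The action names above match those of Python's standard warnings module. As a rough illustration only, the sketch below uses that standard module as a stand-in (it is not DataTreeGrab's own implementation, whose behaviour may differ in detail) to show how often a repeated warning is reported under a few of these actions. DemoWarning and the helper functions are made up for this example.

```python
import warnings


class DemoWarning(UserWarning):
    """Made-up category used only in this sketch."""


def emit_three():
    # The same warning is issued three times from the same source line.
    for _ in range(3):
        warnings.warn("value missing on data page", DemoWarning)


def count_reported(action):
    """Return how many of the three warnings are reported under `action`."""
    with warnings.catch_warnings(record=True) as reported:
        warnings.simplefilter(action, DemoWarning)
        emit_three()
    return len(reported)


def raises_under_error():
    """With "error" the first warning is raised as an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DemoWarning)
        try:
            emit_three()
        except DemoWarning:
            return True
    return False


counts = {action: count_reported(action) for action in ("always", "default", "ignore")}
```

Here "always" reports every occurrence, "default" reports only the first one from a given location, and "ignore" reports none.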
By default warnings are sent to sys.stderr. However, you can change this to any file or Queue object with the warngoal option when creating an HTMLtree, JSONtree or DataTreeShell object. A Queue object is handy when you are working with multiple threads, with your logger on its own thread at the end of the Queue.
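A logger thread at the end of a Queue could look roughly like the sketch below. It assumes Python 3's queue module, treats every queued item as opaque, and feeds the queue with placeholder strings instead of a real tree. The log_worker function and the None stop token are conventions of this example, not part of DataTreeGrab.

```python
import queue
import threading


def log_worker(warn_queue, sink, stop_token=None):
    """Move items from warn_queue to sink until the stop token arrives."""
    while True:
        item = warn_queue.get()
        if item is stop_token:
            break
        sink.append(item)


warn_queue = queue.Queue()
received = []
logger = threading.Thread(target=log_worker, args=(warn_queue, received))
logger.start()

# In a real script the queue would be passed as the warngoal option when
# creating the tree. Here two placeholder items stand in for whatever the
# library would post.
warn_queue.put("placeholder warning 1")
warn_queue.put("placeholder warning 2")
warn_queue.put(None)  # ask the logger thread to stop
logger.join()
```

In practice the worker would write to a logfile instead of a list.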
The Warnings are grouped in a hierarchical structure of UserWarning classes. These are a special case of exceptions:
- dtWarning: All warnings emitted by DataTreeGrab
  - dtDataWarning: Warnings on reading your data into the tree or when trying to extract data from an invalid tree.
  - dtdata_defWarning: All warnings emitted while processing a data page through a data_def
    - dtParseWarning: Warnings while parsing through the node_defs in a data_def
    - dtCalcWarning: Warnings while processing any of the Data Manipulation statements
  - dtUrlWarning: Warnings while extracting an url through DataTreeShell
  - dtLinkWarning: Warnings while processing the link functions through DataTreeShell
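Because the warnings form a class hierarchy, a filter on a parent class would normally also apply to its children. The sketch below illustrates this with Python's standard warnings module and local stand-in classes that only mirror some of the names above. It assumes that dtParseWarning derives from dtdata_defWarning and that both derive from dtWarning, and it does not exercise DataTreeGrab's own filter code.

```python
import warnings


# Local stand-ins; the real classes live in the DataTreeGrab module.
class dtWarning(UserWarning):
    pass


class dtDataWarning(dtWarning):
    pass


class dtdata_defWarning(dtWarning):
    pass


class dtParseWarning(dtdata_defWarning):
    pass


def errors_raised(filter_category):
    """Return the stand-in categories that raise when filter_category is set to "error"."""
    raised = []
    for category in (dtDataWarning, dtdata_defWarning, dtParseWarning):
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")                  # generic rule
            warnings.simplefilter("error", filter_category)  # specific rule, checked first
            try:
                warnings.warn("demo", category)
            except dtWarning as exc:
                raised.append(type(exc).__name__)
    return raised
```

A filter on dtdata_defWarning raises for dtdata_defWarning and dtParseWarning but leaves dtDataWarning alone, while a filter on dtWarning raises for all three.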
As of version 1.1.4, on starting an HTMLtree, JSONtree or DataTreeShell you can also supply a caller_id, which defaults to 0. If you have more than one tree in your script, it helps to identify the source of a warning. All warning filters are also added under this number, and for every tree only filters with this number or with number 0 are used. The caller_id of the tree issuing the warning is named in the warning text.
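The selection rule for filters can be pictured as follows. This is only a sketch of the rule as described above, using a hypothetical list of dictionaries; it does not reflect how DataTreeGrab stores its filters internally.

```python
# Hypothetical filter rules, each registered under a caller_id.
rules = [
    {"action": "error", "caller_id": 1},
    {"action": "ignore", "caller_id": 2},
    {"action": "default", "caller_id": 0},
]


def rules_for(caller_id, all_rules):
    """Keep the rules registered under this caller_id or under the shared id 0."""
    return [rule for rule in all_rules if rule["caller_id"] in (0, caller_id)]


tree_one_actions = [rule["action"] for rule in rules_for(1, rules)]
tree_two_actions = [rule["action"] for rule in rules_for(2, rules)]
```

A tree with caller_id 1 would see the "error" and "default" rules, a tree with caller_id 2 the "ignore" and "default" rules.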
Also all warnings now have a severity ID of 1, 2 or 4:
- 1 meaning probably fatal. These are mostly dtDataWarnings, but also a missing or invalid part of your data_def will be seen as severity 1
- 2 meaning the warning probably was caused by an error in your data_def
- 4 meaning something occurred, but it was very probably not serious and was caused by variations in your data page.
If you choose to use a queue as your warngoal every warning is put there as a three part tuple:
You can access the active warnings object for detailed handling through sys.modules['DataTreeGrab']._warnings. There are two functions to add filters and one to reset them:
- simplefilter(action, category=Warning, lineno=0, append=0, caller_id = 0, severity=0)
- filterwarnings(action, message="", category=Warning, module="", lineno=0, append=0, caller_id = 0, severity=0)
- resetwarnings(caller_id = 0)
These are similar to the functions in the Python warnings module, but with caller_id and severity added as filtering parameters.
As of version 1.3.1 you can also access the simplefilter function through the HTMLtree, JSONtree and DataTreeShell classes. There you cannot specify a caller_id, as the caller_id given on initializing the class is used.
For instance:
sys.modules['DataTreeGrab']._warnings.simplefilter("error", dtDataWarning, caller_id = caller_id, severity = 1)
will turn dtDataWarning warnings of severity 1 for the DataTree into exceptions. These can occur on DATAtree initialization and in the find_start_node, find_data_value and extract_datalist functions. For the DataTreeShell class they can occur in the init_data_def, init_data and extract_datalist functions.
Always add specific filters after initializing the DATAtree or DataTreeShell classes, because on initialization all filters for that caller_id are removed and the generic rule for that caller_id is added to the start of the list of filters. Before version 1.3.1 this also happened when the DATAtree was initialized from DataTreeShell on calling the init_data function. Now, if a caller_id already exists and warnaction is set to None, all filter rules are left as is. If the caller_id does not yet exist, None is replaced by the "default" warnaction. For the DATAtree classes an unspecified warnaction therefore defaults to None. The DataTreeShell class will still always reset the warning filters on initialization, defaulting to "default".
Besides using these warnings to detect problems, several functions return an errorcode as of version 1.3.1. These are the find_start_node and extract_datalist functions from the DATAtree classes and the init_data and extract_datalist functions from the DataTreeShell class. The errorcode is a value ranging from 0 to 7. As the exact values could change in the future, you should use the following constants to check:
- 0: DataTreeGrab.dtDataOK
- 1: DataTreeGrab.dtDataInvalid
- 2: DataTreeGrab.dtStartNodeInvalid
- 3: DataTreeGrab.dtDataDefInvalid
- 7: DataTreeGrab.dtNoData
The values 4 to 6 are reserved for future codes. As a nonzero value is interpreted as True, you can also evaluate the return value as a boolean. A True value means the function failed (or, for code dtNoData, returned no data).
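A check on such a return value might be sketched as follows. The constants are local stand-ins with the values listed above so that the example runs on its own; real code should use the constants from the DataTreeGrab module, since the numbers may change. The summarize function is made up for this example.

```python
# Stand-in values copied from the list above.
dtDataOK = 0
dtDataInvalid = 1
dtStartNodeInvalid = 2
dtDataDefInvalid = 3
dtNoData = 7


def summarize(code):
    """Classify a returned errorcode without comparing against raw numbers."""
    if not code:  # dtDataOK is 0 and therefore evaluates as False
        return "ok"
    if code == dtNoData:
        return "no data"
    return "failed"


results = [summarize(code) for code in (dtDataOK, dtStartNodeInvalid, dtNoData)]
```

Testing for dtNoData separately lets a script distinguish an empty result from a real failure.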
The DataTreeShell class also stores the current code in the DataTreeGrab.errorcode attribute. From initialization until a dataset is properly read into the tree it has a value of DataTreeGrab.dtDataInvalid. Besides the above errorcodes it can also contain minor errorcodes, added to the main state-code, that occur on data_def and data initialization and do not necessarily result in a failure. At present these are:
- 8: DataTreeGrab.dtSortFailed
- 16: DataTreeGrab.dtUnquoteFailed
- 32: DataTreeGrab.dtTextReplaceFailed
- 64: DataTreeGrab.dtTimeZoneFailed
- 128: DataTreeGrab.dtCurrentDateFailed
Use DataTreeGrab.dtFatalError to filter out the main state-code or use:
DataTreeGrab.check_errorcode(only_fatal = True, code = None, text_values = False)
By default it returns the main state-code between 0 and 7. Set only_fatal to False to get the full value. If you give code an integer value, it first checks whether the main state-code part of code matches the actual main state-code. If so, that state-code is returned, added to the result of ANDing the rest of code with DataTreeGrab.errorcode. If both checks come up empty, None is returned. If the main state-code part of code is the full 7, the actual main state-code is included in the return value.
If you want to use the value in a log you can set text_values to True. A list of strings is then returned, describing first the main state-code, followed by any minor codes included in the integer return value. Where None would otherwise be returned, an empty list is returned instead.
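Since the main state-code occupies the values 0 to 7 and each minor code is a separate power of two, a combined errorcode can be taken apart with bitwise operations. The sketch below assumes that DataTreeGrab.dtFatalError equals 7 (the mask for the main state-code) and uses local stand-in tables copied from the lists above. It is not the library's check_errorcode function, and the returned names are not the strings the library produces.

```python
# Stand-in tables copied from the lists above.
MAIN_CODES = {
    0: "dtDataOK",
    1: "dtDataInvalid",
    2: "dtStartNodeInvalid",
    3: "dtDataDefInvalid",
    7: "dtNoData",
}
MINOR_CODES = {
    8: "dtSortFailed",
    16: "dtUnquoteFailed",
    32: "dtTextReplaceFailed",
    64: "dtTimeZoneFailed",
    128: "dtCurrentDateFailed",
}
FATAL_MASK = 7  # assumption: the value of DataTreeGrab.dtFatalError


def split_errorcode(errorcode):
    """Return the main state-code name followed by the names of any minor codes set."""
    main = MAIN_CODES.get(errorcode & FATAL_MASK, "reserved")
    minors = [name for bit, name in sorted(MINOR_CODES.items()) if errorcode & bit]
    return [main] + minors
```

For example, an errorcode of 72 (0 + 8 + 64) splits into dtDataOK with the minor codes dtSortFailed and dtTimeZoneFailed.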