The Warning Framework

Hika van den Hoven edited this page May 5, 2017 · 13 revisions

Warnings, Errors and other messages

The Warning Framework

The python Warnings Class

As of version 1.1 there is a Warnings Framework in place. Since version 1.1.4 it is our own implementation, which leaves out the command-line interface, adds some extra filtering options and is more thread-safe. It posts a warning whenever a non-fatal error occurs while processing a data page. You can set what is done with the warnings through the warnaction option when creating an HTMLtree, JSONtree or DataTreeShell object. warnaction is a string and can have the following values:

  • "error": turn warnings into exceptions
    You probably do not want this for most warnings, but it can be handy for severity 1 errors.
  • "ignore": never print warnings
    This could be a good setting once your application is tested and working.
  • "always": always print warnings
    This is a good setting while testing a new data_def
  • "default": print the first occurrence of a warning for each location where the warning is issued
    This is the default. If you send the output to a logfile this is a good setting.
  • "module": print the first occurrence of a warning for each module where the warning is issued
  • "once": print only the first occurrence of a warning, regardless of location
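
The action names are the same as those of Python's standard warnings module. The sketch below uses that standard module, not DataTreeGrab's own implementation, to show how three of the actions differ. The helper name count_reported and the message text are made up for illustration, and it is an assumption that DataTreeGrab treats these actions the same way.

```python
import warnings


def count_reported(action, repeats=3):
    """Issue the same warning `repeats` times under one filter action.

    Returns the number of warnings that were actually reported, or the
    string "raised" if the action turned the warning into an exception.
    """
    with warnings.catch_warnings(record=True) as reported:
        # Standard-library filter; stands in for the warnaction option.
        warnings.simplefilter(action)
        try:
            for _ in range(repeats):
                warnings.warn("value not found on data page", UserWarning)
        except UserWarning:
            return "raised"
        return len(reported)


for action in ("always", "ignore", "error"):
    print(action, "->", count_reported(action))
```

With "always" every occurrence is reported, with "ignore" none is, and with "error" the first occurrence already interrupts the loop.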

By default warnings are sent to sys.stderr. However, you can redirect them to any file or Queue object with the warngoal option when creating an HTMLtree, JSONtree or DataTreeShell object. A Queue object is handy when you are working with multiple threads, with your logger running on its own thread at the receiving end of the Queue.
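
A logger thread at the end of such a queue could look roughly like the sketch below. It is written for Python 3 (module queue) and does not import DataTreeGrab: plain strings stand in for whatever the library puts on the queue, and the names drain_warnings, logged and the None stop marker are hypothetical choices for this example only.

```python
import queue
import threading


def drain_warnings(warn_queue, sink, stop_marker=None):
    """Move items from warn_queue to sink until stop_marker arrives."""
    while True:
        item = warn_queue.get()      # blocks until something is available
        if item is stop_marker:
            break
        sink.append(item)


warn_queue = queue.Queue()   # the object you would pass as warngoal
logged = []                  # stand-in for a real log file or logger

logger_thread = threading.Thread(target=drain_warnings, args=(warn_queue, logged))
logger_thread.start()

# Stand-in for warnings arriving from the worker threads.
warn_queue.put("first warning")
warn_queue.put("second warning")

warn_queue.put(None)         # shutdown signal chosen for this sketch
logger_thread.join()
print(logged)
```

Keeping all output on one thread means the worker threads never write to the log themselves, so their messages cannot interleave.
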
The warnings are grouped in a hierarchical structure of UserWarning classes, which are themselves a special kind of exception:

  • dtWarning: All warnings emitted by DataTreeGrab
    • dtDataWarning: Warnings on reading your data into the tree or when trying to extract data from an invalid tree.
    • dtdata_defWarning: All warnings emitted while processing a data page through a data_def
    • dtUrlWarning: Warnings while extracting a URL through DataTreeShell
    • dtLinkWarning: Warnings while processing the link functions through DataTreeShell
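
Because the categories form a hierarchy, a filter on a parent class should also match its subclasses. The sketch below illustrates this with the standard warnings module and locally defined stand-in classes that only mimic three of the names above; the real classes come from DataTreeGrab. The helper reported_categories is hypothetical, and it is an assumption that DataTreeGrab's category test works like the standard one.

```python
import warnings


# Local stand-ins that mirror part of the hierarchy listed above.
class dtWarning(UserWarning):
    pass


class dtDataWarning(dtWarning):
    pass


class dtUrlWarning(dtWarning):
    pass


def reported_categories(filter_category):
    """Report only warnings whose category matches filter_category."""
    with warnings.catch_warnings(record=True) as reported:
        warnings.simplefilter("ignore")                   # generic rule
        warnings.simplefilter("always", filter_category)  # specific rule, checked first
        warnings.warn("invalid tree", dtDataWarning)
        warnings.warn("url could not be built", dtUrlWarning)
        return [w.category.__name__ for w in reported]


print(reported_categories(dtWarning))      # parent class: both subclasses match
print(reported_categories(dtDataWarning))  # only the data warning matches
```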

As of version 1.1.4, when creating an HTMLtree, JSONtree or DataTreeShell you can also supply a caller_id, which defaults to 0. If you have more than one tree in your script, it helps to identify the source of a warning. All warning filters are also registered under this number, and each tree only uses filters with its own number or with number 0. The caller_id of the tree issuing the warning is named in the warning text.
All warnings now also have a severity ID of 1, 2 or 4:

  • 1 meaning probably fatal. These are mostly dtDataWarnings, but also a missing or invalid part of your data_def will be seen as severity 1
  • 2 meaning the warning probably was caused by an error in your data_def
  • 4 meaning something occurred, but it very probably was not serious and was caused by variations in your data page.

If you choose to use a queue as your warngoal every warning is put there as a three part tuple:

You can access the active warnings object for detailed handling through sys.modules['DataTreeGrab']._warnings. There are two functions to add filters and one to reset them:

  • simplefilter(action, category=Warning, lineno=0, append=0, caller_id = 0, severity=0)
  • filterwarnings(action, message="", category=Warning, module="", lineno=0, append=0, caller_id = 0, severity=0)
  • resetwarnings(caller_id = 0)

These are similar to the functions in the python Warnings Class, but with caller_id and severity added as filtering parameters.
As of version 1.3.1 you can also access the simplefilter function through the HTMLtree, JSONtree and DataTreeShell classes. There you cannot specify a caller_id, because the caller_id given on initializing the class is used.

For instance:

sys.modules['DataTreeGrab']._warnings.simplefilter("error", dtDataWarning, caller_id = caller_id, severity = 1)

will turn dtDataWarning warnings of severity 1 for the DataTree into exceptions. These can occur on DATAtree initialization and in the find_start_node, find_data_value and extract_datalist functions. For the DataTreeShell class they can occur in the init_data_def, init_data and extract_datalist functions.
Always add specific filters after initializing the DATAtree or DataTreeShell classes, because on initialization the generic rule for that caller_id is added to the start of the list of filters and all filters for that caller_id are removed. Before version 1.3.1 this also happened when the DATAtree was initialized from DataTreeShell on calling the init_data function. Now, if a caller_id already exists and warnaction is set to None, all filter rules are left as they are. If the caller_id does not yet exist, None is replaced by the "default" warnaction. For the DATAtree classes an unspecified warnaction therefore defaults to None. The DataTreeShell class will still always reset the warning filters on initialization, defaulting to "default".

Error codes

Besides using these warnings to detect problems, several functions return an errorcode as of version 1.3.1. These are the find_start_node and extract_datalist functions of the DATAtree classes and the init_data and extract_datalist functions of the DataTreeShell class. The errorcode is a value ranging from 0 to 7. As the exact values could change in the future, you should use the following constants to check:

  • 0: DataTreeGrab.dtDataOK
  • 1: DataTreeGrab.dtDataInvalid
  • 2: DataTreeGrab.dtStartNodeInvalid
  • 3: DataTreeGrab.dtDataDefInvalid
  • 7: DataTreeGrab.dtNoData

The values 4 to 6 are reserved for future codes. As a non-zero value is interpreted as True, you can also evaluate the return value as a boolean. A True value means the function failed (or, for code dtNoData, returned no data).
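
As a minimal sketch of this boolean check, the example below uses local stand-in constants with the values listed above instead of importing DataTreeGrab; real code should use the DataTreeGrab.dt* constants, since the numbers may change. The helper name summarize is hypothetical.

```python
# Stand-in values copied from the list above, for illustration only.
dtDataOK = 0
dtDataInvalid = 1
dtStartNodeInvalid = 2
dtDataDefInvalid = 3
dtNoData = 7


def summarize(code):
    """Return a short status string for a main state-code."""
    if not code:               # 0 (dtDataOK) is the only falsy value
        return "ok"
    if code == dtNoData:       # the call worked but produced no data
        return "no data"
    return "failed"


for code in (dtDataOK, dtStartNodeInvalid, dtNoData):
    print(code, "->", summarize(code))
```

Testing for dtNoData separately lets a caller tell an empty result apart from a real failure, which a plain boolean check cannot do.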

The DataTreeShell class also stores the current code in the DataTreeGrab.errorcode attribute. From initialization until a dataset is properly read into the tree it has the value DataTreeGrab.dtDataInvalid. Besides the above errorcodes it can also contain minor errorcodes, added to the main state-code, that occur on data_def and data initialization and do not necessarily result in a failure. At present these are:

  • 8: DataTreeGrab.dtSortFailed
  • 16: DataTreeGrab.dtUnquoteFailed
  • 32: DataTreeGrab.dtTextReplaceFailed
  • 64: DataTreeGrab.dtTimeZoneFailed
  • 128: DataTreeGrab.dtCurrentDateFailed

Use DataTreeGrab.dtFatalError to filter out the main state-code or use:
DataTreeGrab.check_errorcode(only_fatal = True, code = None, text_values = False)
By default it returns the main state-code, between 0 and 7. Set only_fatal to False to get the full value. If you give code an integer value, the function first checks whether the main state-code part of code matches the actual main state-code, and returns that main state-code plus the remaining bits of code ANDed with DataTreeGrab.errorcode. If neither check matches, None is returned. If the main state-code part of code is the full 7, the actual main state-code is included in the return value. If you want to use the value in a log, set text_values to True: a list of strings is then returned, describing first the main state-code and then any minor codes that would have been included in the integer return value. In that case the None result above corresponds to an empty list.
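
The way a combined value splits into a main state-code and minor codes can be sketched with plain bit operations. This is not the library's check_errorcode, only an illustration of the arithmetic: the constants are local stand-ins with the values listed above, dtFatalError = 7 is an assumption (the text only says it filters out the main state-code), and split_errorcode is a hypothetical helper.

```python
# Stand-in values for illustration; dtFatalError = 7 is assumed to be
# the mask covering the main state-codes 0 to 7.
dtFatalError = 7
MINOR_CODES = {
    8: "dtSortFailed",
    16: "dtUnquoteFailed",
    32: "dtTextReplaceFailed",
    64: "dtTimeZoneFailed",
    128: "dtCurrentDateFailed",
}


def split_errorcode(errorcode):
    """Split a combined value into (main state-code, minor code names)."""
    main_code = errorcode & dtFatalError
    minor_names = [name for bit, name in sorted(MINOR_CODES.items())
                   if errorcode & bit]
    return main_code, minor_names


# Main state-code 3 with two minor codes added to it.
print(split_errorcode(3 + 16 + 64))
```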