Chapter 4
Replace chapter name: Spam filter
By: Spam filter: an introductory classification model
Replace: How can Bayes help?
By: How can Bayes’ theorem help?
Replace section 4.2 name: The Training Data (conceptually, it may cause confusion about whether pre-processing covers the whole dataset or just the training subset after splitting it)
By: The Data // The Input Data // The Modelling Data
Define what corpus means, either in parentheses or in a footnote.
Key ideas about the pre-processing step could be clarified by adding a few images of what you are seeing, imagining or meaning, so that newcomers can follow how the data is transformed and how it should be read. Maybe this or this could help.
It should be explained how you start from a bunch of emails and transform them into a kind of tabular record of word-frequency counts, in addition to other features and metadata.
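As a rough illustration of the kind of walkthrough this could be, here is a minimal sketch using only the Python standard library. The three toy emails and the `tokenize` helper are invented for this comment and are not taken from the book's dataset or code:

```python
from collections import Counter
import re

# Hypothetical toy corpus (assumption: not the book's actual data).
emails = [
    "Win money now",
    "Meeting at noon",
    "Win a free prize now",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# One word-count mapping per email.
counts = [Counter(tokenize(email)) for email in emails]

# The sorted vocabulary plays the role of the table header.
vocabulary = sorted({word for c in counts for word in c})

# Each email becomes one row of word frequencies (0 if the word is absent).
table = [[c[word] for word in vocabulary] for c in counts]

print(vocabulary)
for row in table:
    print(row)
```

Each printed row corresponds to one email and each column to one word of the vocabulary, which is the tabular form a classifier can consume.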
Replace: probabilties together.
By: probabilities together.
In section 4.4, it may help to link back to the previous chapter's definitions of conditional probability and independence, to reinforce why a Naïve Bayes classification algorithm is used as the first approach (simplicity, speed of the naïve calculations, usual baseline, etc.). It could also be linked to appendix 4.9 and Alpha, so that readers understand why an extra explanation about this parameter is introduced.
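To make the link concrete, a small sketch along these lines might help. It assumes that Alpha refers to the additive (Laplace) smoothing parameter commonly used with Naïve Bayes; the word counts and function names are hypothetical and not the book's code:

```python
from collections import Counter

# Hypothetical word counts observed in spam training emails (assumption).
spam_counts = Counter({"win": 3, "money": 2, "now": 1})
vocabulary = ["win", "money", "now", "meeting"]

def word_likelihood(word, counts, vocab, alpha):
    """Estimate P(word | class) with additive smoothing controlled by alpha."""
    total = sum(counts[w] for w in vocab)
    return (counts[word] + alpha) / (total + alpha * len(vocab))

def message_likelihood(words, counts, vocab, alpha):
    """Naive independence assumption: multiply the per-word likelihoods."""
    result = 1.0
    for w in words:
        result *= word_likelihood(w, counts, vocab, alpha)
    return result

message = ["win", "meeting"]  # "meeting" never appeared in spam

# Without smoothing, one unseen word forces the whole product to zero.
unsmoothed = message_likelihood(message, spam_counts, vocabulary, alpha=0.0)
# With alpha = 1, the unseen word gets a small non-zero probability.
smoothed = message_likelihood(message, spam_counts, vocabulary, alpha=1.0)

print(unsmoothed, smoothed)
```

The product in `message_likelihood` is where the independence definition from the previous chapter is used, and the zero result for `alpha=0.0` shows why the parameter deserves its own explanation.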
Regarding accuracy as the indicator for model evaluation, it may be worth mentioning that it is only a first-approach measure and that other indicators are key, especially when working with unbalanced multiclass data.
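A short example could show why. The sketch below uses made-up labels (95 legitimate emails, 5 spam) and a trivial "model" that never predicts spam; the numbers are purely illustrative:

```python
# Hypothetical labels: 1 = spam, 0 = not spam (assumption: invented data).
y_true = [1] * 5 + [0] * 95
# A trivial "model" that always predicts "not spam".
y_pred = [0] * 100

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam correctly caught
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged as spam
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam that was missed

accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
# Guard against division by zero when a denominator is empty.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(accuracy, precision, recall)
```

Accuracy looks high here even though the model catches no spam at all, which is what precision and recall expose.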