Notes:
- Huge amount of data
- Frequent updates
- Unstructured content
- Un-indexable content (images, binary files, proprietary formats)
Notes:
- What might the technical challenges be?
- Expertise
- Authoritativeness
- Trustworthiness
Notes:
- Come up with examples of websites that rank highly on one of these scales.
- Spam
- Duplicate content
Notes:
- Come up with examples. How could you detect spam and duplicate content?
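
One possible starting point for the duplicate-content question is to compare documents by their overlapping word sequences ("shingles") and measure the overlap with Jaccard similarity. The sketch below is only an illustration under stated assumptions: the function names, the shingle size `k=3`, and the `0.5` threshold are hypothetical choices, not something these notes prescribe, and a real search engine would need a far more scalable approach.

```python
from __future__ import annotations


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle (or none if the text is empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0  # Two empty sets are treated as identical.
    return len(a & b) / len(a | b)


def is_near_duplicate(doc_a: str, doc_b: str, threshold: float = 0.5) -> bool:
    """Flag two documents as near-duplicates (threshold is an assumed value)."""
    return jaccard(shingles(doc_a), shingles(doc_b)) >= threshold


if __name__ == "__main__":
    a = "the quick brown fox jumps over the lazy dog"
    b = "the quick brown fox jumps over the lazy cat"
    print(jaccard(shingles(a), shingles(b)))
    print(is_near_duplicate(a, "completely unrelated text about search engines"))
```

Comparing every pair of documents this way does not scale to a web-sized index, which ties back to the "huge amount of data" challenge.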
- User intent
- Misspellings
Notes:
- How could you detect user intent?
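
As one way to open the discussion, queries are often grouped into navigational, transactional, and informational intent. The sketch below guesses the intent from keywords alone. The hint word lists and the function name `classify_intent` are assumptions made up for illustration. Real systems would more likely rely on signals such as click behaviour or learned models.

```python
import re

# Hypothetical hint words; a real system would learn these from data.
NAVIGATIONAL_HINTS = {"login", "homepage", "website", "www"}
TRANSACTIONAL_HINTS = {"buy", "price", "order", "download", "cheap", "coupon"}


def classify_intent(query: str) -> str:
    """Guess the intent of a search query with simple keyword rules."""
    lowered = query.lower()
    tokens = set(re.findall(r"[a-z0-9]+", lowered))
    if tokens & TRANSACTIONAL_HINTS:
        return "transactional"
    # A domain-like suffix suggests the user wants a specific site.
    if tokens & NAVIGATIONAL_HINTS or re.search(r"\.(com|org|net)\b", lowered):
        return "navigational"
    # Default: the user is probably looking for information.
    return "informational"


if __name__ == "__main__":
    for q in ["buy running shoes", "facebook login", "how do search engines work"]:
        print(q, "->", classify_intent(q))
```

Keyword rules like these break down on ambiguous or misspelled queries, which is a useful prompt for discussing what other signals a search engine has available.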