Unstructured Data

Defining Unstructured Data

Unstructured data is any data that has the following characteristics:

  • No predefined structures
  • No explicit data types
  • No labels, fields, or other annotations that identify individual data items

Caution

Unstructured data has historically been hard to process. Ideally, you will have specialized tooling in place that can process the unstructured data and then store the resulting information in a structured or semi-structured form.
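
As a rough illustration of that idea, the sketch below takes a piece of free text and derives a small structured record from it. The function name summarize_text and the fields of the record are hypothetical, chosen only for this example; real tooling for video, images, or audio would be far more specialized.

```python
import json
import re
from collections import Counter


def summarize_text(text: str, top_n: int = 3) -> dict:
    """Derive a small structured record from free text.

    The record layout is an assumption made for this example,
    not a standard schema.
    """
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {
        "char_count": len(text),
        "word_count": len(words),
        "line_count": len(text.splitlines()),
        # Most frequent words; ties keep first-seen order.
        "top_words": [word for word, _ in counts.most_common(top_n)],
    }


record = summarize_text("the cat sat\nthe cat ran")

# The record can now be stored as semi-structured JSON.
print(json.dumps(record, indent=2))
```

The raw text stays as it is; only the derived record has named fields and types, which is what makes it queryable later.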

Examples

  • Videos
  • Images
  • Audio Files
  • Large free text files

Storage

Typically, unstructured data is stored as-is in a file system or an object store.
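
A minimal sketch of what this can look like follows, with a local directory standing in for an object store. The helpers put_object and get_object and the hash-based key layout are assumptions made for this example, not the API of any particular product.

```python
import hashlib
import tempfile
from pathlib import Path


def put_object(root: Path, data: bytes) -> str:
    """Store raw bytes under a key derived from their SHA-256 hash."""
    key = hashlib.sha256(data).hexdigest()
    # Use the first two hex characters as a subdirectory (hypothetical layout).
    path = root / key[:2] / key
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return key


def get_object(root: Path, key: str) -> bytes:
    """Read the raw bytes back by key."""
    return (root / key[:2] / key).read_bytes()


# Usage: a temporary directory plays the role of the store.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    payload = b"raw bytes of an image, video, or audio file"
    key = put_object(store, payload)
    restored = get_object(store, key)
```

The store treats the data as opaque bytes: it imposes no structure, types, or fields on the content, which is why it suits unstructured data.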