Data Lake

- TL;DR: Start ingesting data now without being blocked by proper schema design
- Thinking about JOKR or starting a similar business: I can start adding content and building an application while giving a proper schema design time to emerge. Conceptually, this is a data lake.
- From my fav Books/Designing Data-Intensive Applications
- Collecting data in its raw form, and worrying about schema design later, allows the data collection to be speeded up (a concept sometimes known as a “data lake” or “enterprise data hub”).
	  id:: 6276fd7b-de7c-4c3c-a991-00517b8f4306
- Indiscriminate data dumping shifts the burden of interpreting the data: instead of forcing the producer of a dataset to bring it into a standardized format, the interpretation of the data becomes the consumer’s problem (the schema-on-read approach)
- This approach has been dubbed the sushi principle: “raw data is better”
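- A minimal sketch of schema-on-read, to make the idea concrete. Everything here is hypothetical and not from the book: the in-memory `lake` list stands in for real storage, and `ingest`, `read_products`, `Product`, and the field names (`name`, `title`, `price`, `price_cents`) are made up for illustration. The writer stores whatever it is given; the reader imposes a schema only when it consumes the data.

```python
import json
from dataclasses import dataclass
from typing import Iterator


def ingest(lake: list[str], record: dict) -> None:
    """Append the record as-is: no validation, no schema at write time."""
    lake.append(json.dumps(record))


@dataclass
class Product:
    """The schema the consumer cares about today (an assumption for this sketch)."""
    name: str
    price_cents: int


def read_products(lake: list[str]) -> Iterator[Product]:
    """Schema-on-read: interpret each raw record at consumption time."""
    for line in lake:
        raw = json.loads(line)

        # Producers were never forced to agree on a field name.
        name = raw.get("name") or raw.get("title")
        if name is None:
            continue  # not a product; the raw record still stays in the lake

        if "price_cents" in raw:
            cents = int(raw["price_cents"])
        elif "price" in raw:
            cents = round(float(raw["price"]) * 100)
        else:
            continue

        yield Product(name=name, price_cents=cents)


lake: list[str] = []
ingest(lake, {"name": "Oat milk", "price": 2.49})
ingest(lake, {"title": "Bananas", "price_cents": 129, "origin": "EC"})
ingest(lake, {"note": "supplier called"})  # unrelated data is accepted too

products = list(read_products(lake))
```

- The trade-off shows up in `read_products`: ingestion stays trivial, but every consumer has to carry the interpretation logic. Since the raw records are kept, that logic can be replaced once a proper schema emerges.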