2020-05-17
Understand three kinds of data
For a while now I have been trying to summarize the scattered notes from my five-year data engineering career. I was not a data scientist or analyst; instead, I built data pipelines, from ingestion and transformation to cleansing. My goal was to make sure the data could be safely transferred from the frontend to the offline store and converted from a raw format (usually text logs) into a self-explanatory data structure with reasonable data quality. It was painful when I first stepped in: I was bitten by pitfalls in the existing systems and by errors I made myself.