FOCUS ON CONTEXT AND DOCUMENTATION.
Earlier efforts have shown that the biggest barrier to data use is not access but understanding.
We have therefore focused on making the data products as useful and easily understood as possible.
To accomplish this, we are creating a “planet and moons” model in which related artifacts are generated and stored in the same place as the data. A few of the proposed “moons” include:
● An HTML data profile to surface data types, distributions, factor levels, and labels.
● Multiple file formats (CSV/Parquet/SAS7BDAT/Excel) to enable multiple toolkits.
● Business glossaries with human-readable descriptors.
● Governance documentation including data stewards, access policies, data lifecycle requirements, and contact information.
● Access logs to show who else is using the data. This mechanism will allow new product users to ask questions of people who have previously accessed the information.
● Completed analyses and results. This is intended to scale knowledge to expedite actionable interventions.
● Version-controlled code (SQL, Python) so advanced users can reproduce the product from source data systems.
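
To make the "planet and moons" idea concrete, the following is a minimal sketch of how a data product and a few of its artifacts could be written into the same directory. It is an illustration only, not the implementation described here: the function names (`publish_product`, `log_access`), the file names, and the example rows, glossary, and governance entries are all hypothetical. It uses only the Python standard library and writes a single CSV; other formats such as Parquet would sit alongside it.

```python
import csv
import html
import json
import tempfile
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def publish_product(root, name, rows, glossary, governance):
    """Write the data (the "planet") and its artifacts (the "moons") side by side."""
    product_dir = Path(root) / name
    product_dir.mkdir(parents=True, exist_ok=True)
    columns = list(rows[0].keys())

    # Planet: the data itself (CSV only in this sketch).
    with open(product_dir / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)

    # Moon: a small HTML profile listing label, observed types, and top levels.
    parts = [f"<h1>{html.escape(name)}</h1>", "<table>",
             "<tr><th>column</th><th>label</th><th>type</th><th>levels</th></tr>"]
    for col in columns:
        values = [row[col] for row in rows]
        type_names = sorted({type(v).__name__ for v in values})
        levels = Counter(str(v) for v in values).most_common(5)
        cells = [col, glossary.get(col, ""), "/".join(type_names),
                 ", ".join(f"{k} ({n})" for k, n in levels)]
        parts.append("<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in cells) + "</tr>")
    parts.append("</table>")
    (product_dir / "profile.html").write_text("\n".join(parts), encoding="utf-8")

    # Moons: business glossary and governance documentation.
    (product_dir / "glossary.json").write_text(json.dumps(glossary, indent=2), encoding="utf-8")
    (product_dir / "governance.json").write_text(json.dumps(governance, indent=2), encoding="utf-8")
    return product_dir


def log_access(product_dir, user):
    """Append a timestamped entry so later users can see who else used the data."""
    with open(Path(product_dir) / "access_log.csv", "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow([datetime.now(timezone.utc).isoformat(), user])


# Hypothetical example data, used only to exercise the sketch.
rows = [
    {"region": "A", "count": 12},
    {"region": "B", "count": 7},
    {"region": "A", "count": 3},
]
demo_dir = publish_product(
    tempfile.mkdtemp(), "example_product", rows,
    glossary={"region": "Region code", "count": "Records per day"},
    governance={"steward": "data.steward@example.org", "access_policy": "internal"},
)
log_access(demo_dir, "analyst_1")
artifact_names = sorted(p.name for p in demo_dir.iterdir())
print(artifact_names)
```

Keeping every artifact in one directory means that anyone who finds the data also finds its profile, glossary, governance record, and access log without a separate search. Completed analyses and version-controlled code would be added to the same location in the same way.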
Figure 1: A Boring Analytic Rectangle
Figure 2: "Planet and Moons" Model where artifacts are stored with data