The OpenEarth philosophy on data is that important data sets should not be handled on a project-by-project basis, but should be collected and made available in a project-superseding manner. OpenEarth intends all data sets to conform to a number of basic quality criteria:
data is not just numbers and meta-information, but consists of the raw data produced by the measuring equipment (e.g. volts) plus the processing scripts.
raw data should be stored in the OpenEarthRawData repository, which enables version control.
raw data should then be enriched with metadata and processed into useful data products (netCDF) using transformation scripts that are also kept under version control in a repository.
resulting data products should conform to the best open source standards available.
data products should be made easily available via web-based interfaces, and also via automated procedures for widely used data processing languages (MATLAB, IDL, Python, Fortran, C, Java).
data products are primarily meant for dissemination; raw data and scripts are primarily meant for archiving.
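The criteria above (raw data plus a versioned script yields a metadata-enriched product) can be sketched in a few lines of Python. This is a minimal illustration only, not the OpenEarth transformation scripts themselves: the sensor, the linear calibration constants, the file name `raw/sensor01.txt` and the revision label `r123` are all hypothetical. The optional `write_netcdf` helper assumes the third-party `netCDF4` package is installed.

```python
from dataclasses import dataclass, field

# Hypothetical linear calibration for an imaginary temperature sensor:
# temperature [degC] = GAIN_DEGC_PER_VOLT * volts + OFFSET_DEGC
GAIN_DEGC_PER_VOLT = 10.0
OFFSET_DEGC = -5.0


@dataclass
class DataProduct:
    """Processed values together with the metadata that describes them."""
    name: str
    values: list
    variable_attributes: dict = field(default_factory=dict)
    global_attributes: dict = field(default_factory=dict)


def volts_to_temperature(raw_volts):
    """Apply the (hypothetical) calibration to raw voltage readings."""
    return [GAIN_DEGC_PER_VOLT * v + OFFSET_DEGC for v in raw_volts]


def make_product(raw_volts, source_file, script_revision):
    """Enrich calibrated values with metadata, recording their provenance."""
    return DataProduct(
        name="sea_water_temperature",
        values=volts_to_temperature(raw_volts),
        variable_attributes={
            "standard_name": "sea_water_temperature",
            "units": "degree_Celsius",
        },
        global_attributes={
            "source": source_file,
            "history": f"processed with script revision {script_revision}",
        },
    )


def write_netcdf(product, path):
    """Write the product to a netCDF file (requires the netCDF4 package)."""
    from netCDF4 import Dataset  # third-party; imported only when needed

    with Dataset(path, "w") as ds:
        ds.createDimension("time", len(product.values))
        var = ds.createVariable(product.name, "f8", ("time",))
        var.setncatts(product.variable_attributes)
        ds.setncatts(product.global_attributes)
        var[:] = product.values


# Three made-up raw readings in volts.
product = make_product([1.0, 1.5, 2.0], "raw/sensor01.txt", "r123")
print(product.values)
```

Recording the raw file and the script revision in the product's metadata is one way to keep the disseminated product traceable back to the archived raw data and scripts.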
To achieve this, international standards are embraced, such as:
netCDF (self-describing, open source, widely-used file-format standard)
Fabric (can be used for loading data onto the OPeNDAP server)
The data collection procedure and the relation between those standards are explained in the OpenEarth Data Standards document, developed in the framework of the EU FP7 Project MICORE. Currently, data sets are being uploaded to the OPeNDAP production servers THREDDS (default) and HYRAX (incl. KML previews for Google Earth/Maps, but *.nc access not yet fully operational) and to the OPeNDAP test server THREDDS. Examples of open datasets from the internet that are available on OpenEarth include:
Numerous other datasets have been or are being uploaded continually in the MICORE and Building with Nature research programmes.
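As an illustration of automated access to data products on an OPeNDAP server, the sketch below builds a DAP2 request URL that asks for only a slice of one variable. The host name, dataset path and variable name are hypothetical placeholders, not actual OpenEarth addresses; the `/thredds/dodsC/` path segment follows the usual THREDDS convention. In DAP2 constraint syntax the slice is written `[start:stride:stop]`, with `stop` inclusive.

```python
def opendap_subset_url(dataset_url, variable, start, stop, stride=1,
                       response="ascii"):
    """Build a DAP2 request URL for variable[start:stride:stop].

    Indices are zero-based and `stop` is inclusive, as in DAP2 hyperslabs.
    `response` selects the response type suffix, e.g. "ascii" or "dods".
    """
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"{dataset_url}.{response}?{variable}[{start}:{stride}:{stop}]"


# Hypothetical dataset location; replace it with a URL from the server catalogue.
EXAMPLE_DATASET = "http://opendap.example.org/thredds/dodsC/demo/timeseries.nc"

url = opendap_subset_url(EXAMPLE_DATASET, "sea_water_temperature", 0, 9)
print(url)
```

Some servers expect the square brackets to be percent-encoded. Libraries with OPeNDAP support (for example the Python `netCDF4` package, when built with DAP enabled) can usually open the dataset URL directly and perform this subsetting behind ordinary array indexing.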
OpenEarth is not the only initiative to share and disseminate government-paid Earth science data freely on the web using open standards. We made an inventory of related initiatives.