Hamburg Open Science Metadata Transformations
Automated workflow for harvesting, transforming and indexing metadata using metha, OpenRefine and Solr. Part of the Hamburg Open Science “Schaufenster” software stack.
We at the State and University Library Hamburg are building a website that aggregates Open Access content from local universities. For this purpose we are building an automated workflow that harvests, transforms and indexes metadata using metha, OpenRefine and Solr, tied together with simple bash scripts.
Workflow:
1. Harvest metadata in different standards (Dublin Core, DataCite, …) from multiple OAI-PMH endpoints
2. Transform the harvested data with source-specific rules to produce normalized and enriched data
3. Load transformed data into a Solr search index (which serves as a backend for a discovery system)
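
The data flow of the three steps above can be illustrated with a minimal Python sketch. It is not the project's implementation: the actual workflow uses metha for harvesting, OpenRefine for the transformation rules and bash scripts as glue. The sample record, the source name `example-repo`, the output field names and the language mapping are all hypothetical and chosen only for illustration.

```python
import json
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH and Dublin Core
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Step 1 (stand-in): one harvested OAI-PMH record in Dublin Core.
# Hypothetical sample data; in the real workflow metha harvests the records.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header><identifier>oai:example.org:1</identifier></header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>  An Example   Title </dc:title>
      <dc:creator>Doe, Jane</dc:creator>
      <dc:date>2018-05-01</dc:date>
      <dc:language>ger</dc:language>
    </oai_dc:dc>
  </metadata>
</record>"""

# Hypothetical normalization table for language codes
LANGUAGE_MAP = {"ger": "de", "deu": "de", "eng": "en"}


def parse_record(xml_text):
    """Extract the identifier and a few Dublin Core fields from one record."""
    root = ET.fromstring(xml_text)
    record = {"id": root.findtext(f"{OAI}header/{OAI}identifier")}
    for field in ("title", "creator", "date", "language"):
        record[field] = [el.text or "" for el in root.iter(DC + field)]
    return record


def transform(record, source):
    """Step 2 (stand-in): normalize and enrich one record for a given source."""
    titles = [" ".join(t.split()) for t in record["title"]]  # collapse whitespace
    years = [d[:4] for d in record["date"] if d[:4].isdigit()]
    return {
        "id": record["id"],
        "title": titles[0] if titles else None,
        "creator": record["creator"],
        "year": int(years[0]) if years else None,
        "language": [LANGUAGE_MAP.get(code, code) for code in record["language"]],
        "source": source,  # enrichment: remember where the record came from
    }


def solr_payload(docs):
    """Step 3 (stand-in): serialize documents as a JSON array, the shape
    accepted by the /update handler of a Solr core for JSON content."""
    return json.dumps(docs, ensure_ascii=False, indent=2)


doc = transform(parse_record(SAMPLE), "example-repo")
print(solr_payload([doc]))
```

The sketch stops at building the payload and does not contact a Solr server. Keeping each step a separate function mirrors the workflow's separation of harvesting, source-specific transformation and indexing.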
Why OpenRefine?
Library staff without a technical background can use OpenRefine's graphical user interface to explore the data, create the transformation rules and check the results.