The Fourth Paradigm: Data-Intensive Scientific Discovery
The Fourth Paradigm: Data-Intensive Scientific Discovery is a collection of essays that expands on pioneering computer scientist Jim Gray's vision of a new, fourth paradigm of discovery based on data-intensive science, and offers insights into how that vision can be fully realized.
Introductions
Part 1: Earth and Environment
Part 2: Health and Wellbeing
Part 3: Scientific Infrastructure
Part 4: Scholarly Communication
Final Thoughts
Jim Gray on eScience: A Transformed Scientific Method
Excerpt
We have to do better at producing tools to support the whole research cycle—from data capture and data curation to data analysis and data visualization. Today, the tools for capturing data both at the mega-scale and at the milli-scale are just dreadful. After you have captured the data, you need to curate it before you can start doing any kind of data analysis, and we lack good tools for both data curation and data analysis. Then comes the publication of the results of your research, and the published literature is just the tip of the data iceberg. By this I mean that people collect a lot of data and then reduce this down to some number of column inches in Science or Nature—or 10 pages if it is a computer science person writing. So what I mean by data iceberg is that there is a lot of data that is collected but not curated or published in any systematic way.