GEOSPEX | geospatial analysis | urban planning | environmental issues

Getting a handle on GeoData- Part I

In the summer of 2014, ESRI launched their ‘open data’ portal, a bit of a jump-the-shark moment. Everything BIG DATA and OPEN DATA is all the rage, for many converging reasons, and has been for a while now.

[ ArcGIS Open Data Portal ]

From an academic perspective, this is all very exciting but challenging, in that it’s difficult to teach students new to GIS all the whats and hows of a system AND cover DATA sufficiently. Often students walk away with the notion that Geo Data = shapefiles and rasters. How do we get beyond all the file structures and formats, not to mention the system itself, AND convey the important dimensions and ramifications of DATA itself?

Solution #1: Rob Kitchin’s new publication, The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences

[ Data Revolution ]

Published in 2014, this is an ideal guide to the essentials: what DATA is; what we are currently doing with it that is fundamentally different from the past; and, finally, speculation on the ramifications of both BIG and OPEN DATA for information systems.

Broken into the chapters listed below, the book strikes me as the perfect outline for a complete overhaul of a DATA Lecture I’ve tried to sandwich between the Intro to Vector and Raster Model Lectures! Introductory GIS courses often don’t consider Geo Data in depth, much less DATA itself as a stand-alone lecture topic, so this is a bit of an unorthodox approach, but one whose time I think has come.

Kitchin’s TOC:

  • 01 Conceptualising Data
  • 02 Small Data, Data Infrastructures & Data Brokers
  • 03 Open and Linked Data
  • 04 Big Data
  • 05 Enablers and Sources of Big Data
  • 06 Data Analytics
  • 07 The Governmental and Business Rationale for Big Data
  • 08 The Reframing of Science, Social Science and Humanities Research
  • 09 Technical and Organisational Issues
  • 10 Ethical, Political, Social and Legal Concerns
  • 11 Making Sense of the Data Revolution

To get started, I’m planning to use the points from each chapter that strike a chord with me to form a narrative designed to live as a full-fledged lecture. This is Part 1 of several posts as I develop this Lecture over the next 2 weeks…

    KEY PT. 1

    The SCALE/SPEED of DATA today:

    “The scale of the emerging data deluge is illustrated by the claim that ‘[b]etween the dawn of civilization and 2003, we only created 5 EXABYTES of information; now we’re creating that amount EVERY TWO DAYS’ (Hal Varian, chief economist with Google, cited in Smolan and Erwitt 2012).”

    3,000 Years = Every 2 Days

    Here’s Hal:
    [ Hal Varian ]
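
To put Varian’s figure into more familiar units, here is a quick back-of-the-envelope conversion. It is only a sketch: it takes the quoted 5 exabytes per 2 days at face value, assumes decimal (SI) units where 1 EB = 10^18 bytes, and the variable names are my own.

```python
# Back-of-the-envelope conversion of the quoted claim: 5 exabytes every 2 days.
# Assumption: decimal (SI) units, so 1 EB = 10**18 bytes and 1 TB = 10**12 bytes.
EXABYTE = 10**18
TERABYTE = 10**12

bytes_per_period = 5 * EXABYTE            # the amount quoted by Varian
seconds_per_period = 2 * 24 * 60 * 60     # two days, expressed in seconds

bytes_per_second = bytes_per_period / seconds_per_period
exabytes_per_year = 5 * 365 / 2           # one 5 EB batch every 2 days, over 365 days

print(f"{bytes_per_second / TERABYTE:.1f} TB per second")
print(f"{exabytes_per_year:.1f} EB per year")
```

Under those assumptions the claim works out to roughly 29 terabytes every second, or a little over 900 exabytes a year.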

    KEY PT. 2

    Buzz vs. Reality

    “These new opportunities have sparked a veritable boom in what might be termed ‘data boosterism’; rallying calls as to the benefits and prospects of big, open and scaled small data, some of it justified, some pure hype and buzz.”

    KEY PT. 3

    DATA is NOT Neutral

    “While many analysts may accept data at face value, and treat them as if they are neutral, objective, and pre-analytic in nature, data are in fact framed technically, economically, ethically, temporally, spatially and philosophically. Data do not exist independently of the ideas, instruments, practices, contexts and knowledges used to generate, process and analyse them (Bowker 2005; Gitelman and Jackson 2013).”

    KEY PT. 4

    DATA is NOT random: it’s completely SELECTED

    “Data harvested through measurement are always a selection from the total sum of all possible data available- what we have chosen to take from all that could potentially be given. As such, data are inherently partial, selective and representative, and the distinguishing criteria used in their capture has consequence.”

    KEY PT. 5

    DATA is NOT facts or information per se, and in relation to computers (and GIS) needs to be PROCESSED

    “…from a computational position data are collections of binary elements that can be processed and transmitted electronically….data constitute the inputs and outputs of computation but have to be processed to be turned into facts and information (for example, a DVD contains gigabytes of data but no facts or information per se) (Floridi 2005).”
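
A tiny illustration of this point, as a sketch only: the four bytes below are made up, and the two interpretations are just two of many possible ones. The same binary elements yield completely different values depending on how they are processed, which is why the bytes on their own are not yet information.

```python
import struct

# Four raw bytes (hypothetical). On their own they carry no facts or information.
raw = bytes([0x00, 0x00, 0x20, 0x42])

# Processing choice 1: read the bytes as a little-endian 32-bit float.
(as_float,) = struct.unpack("<f", raw)

# Processing choice 2: read the very same bytes as a little-endian unsigned 32-bit integer.
(as_uint,) = struct.unpack("<I", raw)

print(as_float)  # 40.0
print(as_uint)   # 1109393408
```

Whether 40.0 is then a temperature, an elevation or a pixel value is something only further context (the metadata) can tell us.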

    KEY PT. 6

    Data Vary across 5 main attributes

  • Form- Qualitative vs. Quantitative
  • Structure- Structured, semi-structured or unstructured
  • Source- Captured, derived, exhaust, transient
  • Producer- Primary, secondary, tertiary
  • Type- Indexical, attribute, metadata.
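
One way to make these five attributes concrete for students is to write them down as a small record that describes a dataset. This is a sketch of my own, not something from Kitchin’s book: the class names, the `land_value` example and the way I classify it are all assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Form(Enum):
    QUALITATIVE = "qualitative"
    QUANTITATIVE = "quantitative"


class Structure(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


class Source(Enum):
    CAPTURED = "captured"
    DERIVED = "derived"
    EXHAUST = "exhaust"
    TRANSIENT = "transient"


class Producer(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    TERTIARY = "tertiary"


class DataType(Enum):
    INDEXICAL = "indexical"
    ATTRIBUTE = "attribute"
    METADATA = "metadata"


@dataclass(frozen=True)
class DataProfile:
    """Describes one dataset (or field) along the five attributes listed above."""
    name: str
    form: Form
    structure: Structure
    source: Source
    producer: Producer
    data_type: DataType


# Hypothetical example: a land value field in a parcel layer downloaded from an
# open data portal. We did not collect it ourselves, so for us it is secondary data.
land_value = DataProfile(
    name="land_value",
    form=Form.QUANTITATIVE,
    structure=Structure.STRUCTURED,
    source=Source.CAPTURED,
    producer=Producer.SECONDARY,
    data_type=DataType.ATTRIBUTE,
)

print(land_value)
```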

    KEY PT. 7

    Quantitative data is generally MORE common than qualitative data in GIS (my assertion). It is measured across 4 general categories:

  • Nominal > Categorical > unmarried vs. married
  • Ordinal > Rank order > low, medium, high
  • Interval > Measured along a fixed scale > temperature on the Celsius scale
  • Ratio > Scale possesses a true zero origin > exam score on a scale of 0-100
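
The practical consequence of these four categories is that each one permits different summaries. The sketch below uses made-up sample values and Python’s standard `statistics` module; the variable names are hypothetical and the choice of summary per category is the conventional one, not something specific to Kitchin.

```python
import statistics

# Hypothetical sample values, one list per category of measurement.
marital_status = ["married", "unmarried", "married", "married"]  # nominal
risk_rank = [1, 2, 2, 3, 3, 3]       # ordinal: 1 = low, 2 = medium, 3 = high
temps_celsius = [10.0, 20.0, 30.0]   # interval
exam_scores = [40, 80]               # ratio: a score of 0 really means "nothing"

# Nominal: categories can only be counted, so the mode is the usual summary.
most_common_status = statistics.mode(marital_status)

# Ordinal: order is meaningful, so a median makes sense. A mean would wrongly
# assume the gap between low and medium equals the gap between medium and high.
median_rank = statistics.median_low(risk_rank)

# Interval: differences are meaningful, so means and differences are fine,
# but 20 C is not "twice as hot" as 10 C because 0 C is an arbitrary zero.
mean_temp = statistics.mean(temps_celsius)
temp_difference = temps_celsius[1] - temps_celsius[0]

# Ratio: the scale has a true zero, so ratios are meaningful too.
score_ratio = exam_scores[1] / exam_scores[0]

print(most_common_status, median_rank, mean_temp, temp_difference, score_ratio)
```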

    KEY PT. 8

    Primary vs. Secondary vs. Tertiary COLLECTION processes

  • Primary > Generated directly by the researcher
  • Secondary > One person’s primary data can be used ‘secondarily’ by another
  • Tertiary > Derived data- counts, categories and statistical results.
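
A minimal sketch of how tertiary data falls out of primary data, assuming a made-up field survey (the records, field names and values are hypothetical):

```python
from collections import Counter
import statistics

# Hypothetical primary data: tree observations a researcher recorded in the field.
field_survey = [
    {"species": "oak", "height_m": 12.0},
    {"species": "maple", "height_m": 9.0},
    {"species": "oak", "height_m": 15.0},
]

# Tertiary (derived) data: counts, categories and statistical results
# computed from the original records.
count_by_species = Counter(row["species"] for row in field_survey)
mean_height_m = statistics.mean([row["height_m"] for row in field_survey])

print(dict(count_by_species))
print(mean_height_m)
```

If a colleague later reuses `field_survey` for a different study, the very same records become secondary data for them. The rows do not change; only the relationship between the data and the person using it does.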

    KEY PT. 9

    Indexical vs. Attribute vs. Metadata TYPES of data

  • Indexical > Data that enable identification and linking (unique identifiers, which are important in GIS)
  • Attribute > Data that represent aspects of a phenomenon but are not indexical in nature.
  • Metadata > “Data about data”.
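
For GIS students, the indexical type is easiest to see in a table join. The sketch below is an assumption-laden toy: the parcel IDs, land uses, assessed values and metadata entries are all invented, and a real GIS would perform the join with its own tools rather than plain Python.

```python
# Hypothetical parcel layer: "parcel_id" is the indexical field,
# "land_use" is an attribute describing the parcel.
parcels = [
    {"parcel_id": "P-001", "land_use": "residential"},
    {"parcel_id": "P-002", "land_use": "commercial"},
]

# Hypothetical separate table, keyed by the same unique identifier.
assessments = {"P-001": 250000, "P-002": 900000}

# Metadata: data about the data, not about the parcels themselves.
metadata = {
    "source": "hypothetical city assessor",
    "crs": "EPSG:4326",
    "updated": "2014-08-01",
}

# The indexical field is what lets the two tables be linked.
joined = [
    {**parcel, "assessed_value": assessments.get(parcel["parcel_id"])}
    for parcel in parcels
]

for row in joined:
    print(row)
```

Without the shared `parcel_id`, the two tables would have no reliable way to be linked, which is why unique identifiers matter so much in GIS.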

    KEY PT. 10

    Regardless of how you think about DATA, it’s at the bottom of the ‘Knowledge Pyramid’

    [ Knowledge Pyramid ]

    KEY PT. 11

    Databases and Data Infrastructures: where and how DATA is stored, organized and put to work. Not a benign process at all.

    “Data infrastructures host and link databases into a more complex sociotechnical structure. As with databases, there is nothing inherent or given about how such archiving and sharing structures are composed.”