
GEOSPEX | geospatial analysis | urban planning | environmental issues

Getting a Handle on GeoData: Part II

Following Part I, ‘Getting a Handle on GeoData’, this post is derived primarily from Rob Kitchin’s new publication, The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. If these posts are of interest, I strongly recommend getting the book: my culled points are chosen specifically to address issues that I foresee as valuable for an upcoming introductory lecture on GeoData, and they admittedly miss an awful lot of valuable material found in the book.

In Kitchin’s second chapter, Small Data, Data Infrastructures & Data Brokers, the stage is set by calling out ‘Small Data’ against the hyped, emerging backdrop of ‘Big Data’. A quick image search for ‘Big Data’ turns up visual analogies that fall generally into three types: Big Data as cities, Big Data as tubes, and Big Data as the world at large. Visually, there is a real lack of precision as to what Big Data actually is, beyond zeros and ones spanning across BIG THINGS. Kitchin takes on the vagueness of Big Data by situating it against data we already know well….

[ Big Data doing Big Things! ]

KEY PT. 12

We can know Big Data by what it is NOT (small data).

“Until [recently] the term ‘small data’ was rarely, if ever, used. Its deployment has arisen purely as the de facto oppositional term to so-called ‘big data’.”

“Small data may be limited in volume and velocity, but they have a long history of development, with established methodologies and modes of analysis, and a record of producing answers to scientific questions.”

“Big data generally capture what is easy to ensnare - data that are openly expressed (what is typed, swiped, scanned, sensed, etc.; people’s actions and behaviors; the movement of things)….”

Small Data are those map portals and ‘tailored’ spatial repositories we’ve known and used day in and day out for the last two decades.

Following Kitchin’s line of thought of understanding Big by way of Small, private actors like Facebook, Google, Wal-Mart, Amazon and Netflix do not particularly attend to ‘tailored’ data, that is, small data. They are in it for data derived from people’s (consumers’) actions and behaviors, and they especially see the value of A LOT of this Big Data. Tagging along, Data Brokers, Aggregators, Consolidators and Resellers repackage Big Data, creating not a ‘Data Commons’ of small data but proprietary, closed products. We know them, again vaguely, as Epsilon, Acxiom, Datalogix, Alliance Data Systems, eBureau, ChoicePoint, CoreLogic, Equifax, Experian, ID Analytics, Infogroup, Innovis, Intelius, Recorded Future, Seisint and TransUnion, etc.

Yeah… not the first stop for a GIS analyst or cartographer looking for open, well-managed spatial data!

KEY PT. 13

The Big Data hype has effectively masked criticism of the emerging Big Data industry.

“By gathering together large stores of small data held by public institutions and private corporations and mashing them together with big data flows, data brokers can produce various kinds of detailed individual and aggregated profiles that can be used to micro-target, assess and sort markets, providing high-value intelligence for clients.”

“Interestingly, given the volumes and diversity of personal data that data brokers and analysis companies possess, and how their products are used to socially sort and target individuals and households, there has been remarkably little critical attention paid to their operations.”

“At present, data brokers are generally largely unregulated and are not required by law to provide individuals access to the data held about them, nor are they obliged to correct errors relating to those individuals (Singer 2012b).”

KEY PT. 14

As a corollary to Big Data, the Open Data Movement has emerged as an increasingly effective approach to ‘opening’ closed data, often in the public sector and often targeting Small Data.

The Open Data Movement seeks to radically transform this situation [closed and/or proprietary data], both opening up data for wider reuse, but also providing easy-to-use research tools that negate the need for specialist analytic skills. The movement is built on three principles: openness, participation and collaboration (White House 2009).

In particular, attention has been focused on opening data that has been produced by state agencies (often termed public sector information – PSI) or publicly funded research.

[ OpenGovData ]

For Berners-Lee (2009), open and linked data should ideally be synonymous and he sets out five levels of such data, each with progressively more utility and value. His aspiration is for what he terms five-star (level five) data: a fully operational semantic Web.

[ 5 Star Open Data ]
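For readers who think better in code, here is a rough sketch of how the five-star ladder might be applied to a dataset you are evaluating. It assumes the commonly cited criteria for each star (open licence, structured data, non-proprietary format, URIs for things, links to other data); the `Dataset` class, its field names and the `star_rating` function are my own illustrative inventions, not anything from Kitchin or Berners-Lee.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (field names are illustrative)."""
    open_license: bool      # 1 star: on the web under an open licence, any format
    machine_readable: bool  # 2 stars: structured data, not a scanned image
    open_format: bool       # 3 stars: non-proprietary format, e.g. CSV
    uses_uris: bool         # 4 stars: things are identified with URIs
    linked: bool            # 5 stars: links out to other people's data


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: a level only counts if every level below it is met."""
    criteria = [
        ds.open_license,
        ds.machine_readable,
        ds.open_format,
        ds.uses_uris,
        ds.linked,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# Assumed example: a CSV of parcels on a city open data portal.
parcels_csv = Dataset(True, True, True, False, False)
# Assumed example: a scanned zoning map published as an openly licensed PDF.
scanned_map = Dataset(True, False, False, False, False)

print(star_rating(parcels_csv))
print(star_rating(scanned_map))
```

The cumulative check is a deliberate choice: each level builds on the ones beneath it, so a dataset that uses URIs but lacks an open licence earns no stars in this sketch.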

KEY PT. 15

Although the Open Data Movement has shown some successes, its politics, sustainability, utility and usability all remain significant, unresolved issues. In short, Open Data has its own hype problem, which can mask important and controversial ramifications of the movement.

At one level, the case for open and linked data is commonsensical… However, the case… is more complex, and their economic underpinnings are not at all straightforward.

Much more critical attention then needs to be paid to how open data projects are developing as complex sociotechnical systems with diverse stakeholders and agendas.