The Virtual Revolution – Part Three, finally we’re getting there
February 14, 2010
At last the BBC’s Virtual Revolution series is starting to deliver with Part Three – The Price of Free.
I get the feeling that this is ground that the presenter is much more confident on. Away from that pesky technical detail which for some reason she still characterises as West Coast techno-utopian and on to the developing sociology of the world wide web. I’m sorry but you can’t say that the body of the web is independent of its internet bones. But I’ll stop flogging that particular horse as I’ve dealt with it in parts one and two of this four parter.
The first half of the programme is a pretty decent historical analysis of the development of the commercial internet, from the faltering steps of the Dot.com boom/bust (enter Martha Lane Fox of lastminute.com) and Amazon’s winning model, through Google’s idealistic beginnings and on to the global trade in personal information.
The central position of this episode is that we don’t actually know what the current winning commercial model of ‘targeted advertising using mass surveillance of web activity in order to support free at the point of delivery services’ will cost in sociological terms in the long term. It’s a good and relevant question given the relative youth, relatively unregulated nature and global pervasiveness of the web, but one that you can pose about any commercial or even institutional activity.
Let’s have a look at that statement. The other big ‘free at the point of delivery’ services that we get are more often supplied by government (in the UK). A few examples being the police, the health service, the armed forces & the legal system. We pay generalised taxes to support those services and the government decides how to apportion that money to those services. We don’t currently pay an Army tax which goes up every time the UK fights a war and down when peace comes (that could really change the political dynamic of war fighting, no ?), nor do we pay an explicit police tax (though much of the UK’s policing is supported by locally raised taxation rather than generalised taxation), and we definitely don’t pay an NHS tax.
No, we pay income tax and VAT (purchase tax) that is raised by the government knowing about financial transactions that we as individuals choose to make. We accept that the services provided cost us money, and are willing to forgo some privacy in order that the money may be collected by an authority that is not partial or commercially oriented.
And that is the answer that this program seems to come up with; the bargain that we make with the commercial entity that is today’s web is ‘information for service and we, the service providers, will use the information however we want’. If internet users don’t know that this is the bargain that they are making they should, but at the end of the day targeted advertising is a form of taxation. The big issue with that transaction is that since the entities collecting the information are not governments accountable to electorates, they cannot be relied upon to treat the information with the respect that it deserves. Indeed as commercial organisations they cannot be relied upon to exist from one year to the next, so any regulation of data collection has a built-in trans-generational issue to get over as companies ‘inherit’ one another’s databases.
It’s perhaps interesting to note that the direct parallel of this argument, the mass surveillance of web traffic by governments, is one that is massively contentious. It is challenged by legislators and civil society alike and portrayed as the end of responsible government by many and the beginning of it by some.
Next week’s program is going all psyche major and looking at what a global shift in the ethics and understanding of privacy could mean. I’m going to set some homework – please read the Pew centre’s report on teenagers’ use of social networks.
The Ethics of Data Mining – Mediated Memory
January 9, 2010
I’m going to pose myself questions here rather than answer them. Self-indulgent I know, but hell it’s my party and I’ll cry if I want to
Digital data has some properties that could or should impact on ethics. I’m going to take a look at three of them this time;
It is non-corporeal, so possibly not as susceptible to the ravages of time as paper would be
It is transmissible, so probably not subject to physical location
It is a record of events that may be edited or erased leaving little or no evidence of those actions
The first two points are similar, in that they relate to inadvertent data loss, but relate to ethics in very different ways. The third is a very different quality.
Non-corporeality – The onwards march of time and technology makes specific media obsolete. That is as true for spoken language as it is for other media formats, but the loss of spoken languages is a large enough topic for a post on its own, so I’ll stick to physical media.
Ask yourself – ‘When did I last buy a 35mm photographic film or a 90 minute audio cassette ?’
For myself, it’d have to be a decade or more, and I owned a 35mm SLR camera until 3 years ago ! I just kept a stock of old film in a box in the fridge up to its use by date, and past in some cases. Now as I climb the technology ladder I have my music CDs as reference, but don’t need to touch them as my music is transferred from device to device with no apparent loss of quality. So long as I make those steps up the ladder while technology exists to bridge the gaps no data is lost. So here comes the first question.
Do we have an obligation to keep data in its original form and format, respecting the media that it was originally hosted in, do we have an obligation to retain the information contained in the data, or is the data disposable and only the effect of the data relevant ?
In many ways the existence of the institutions of ‘the museum’ and ‘the library’ answers this question from our ancestors’ point of view. Certainly in the UK, philanthropists saw the advancement of science and the education of the masses as a moral obligation, but what about the curation of data for historical rather than scientific purposes ?
I think that we can say that retained samples are a valuable weapon in the arsenal of scientific endeavour, without much doubt. Whether it be new species of animal in today’s world, samples of pathogens lost to science or the ability to re-examine old specimens with new techniques, a library of original, physical sample material is an essential part of science, but what about non-corporeal data ?
Audiophiles still see the crackle and hiss embodied in vinyl recordings as adding character and being more authentic than ‘clean’ digital renditions, but to me this would imply that the recording artist, as the author of their own material, would want a degraded recording. I really don’t think that’s true. If I were a musician I would want my recordings to be heard as played, not as recorded. But then the experience of listening to music is not the same as the experience of playing it, so a direct equivalence between the data as recorded and the data as experienced is going to be a tricky one and probably something to consider another day.
Can you imagine the curator of a museum in 1,000 years time carefully handling the mix tape that you made for wassername in ’88 ? But why not ? It’s an excellent piece of social history communicating universal feelings and allowing later generations to connect with past generations on an emotional level. No different from a birthday party invitation sent 2,000 years ago or a love letter written 4,000 years ago. But that assumes that the data can be read in hundreds of years time.
The non-corporeality of digital data does hide degradation introduced by copying and by natural stochastic processes, such as cosmic rays hitting storage media or radioactive decay. We should not consider data stored on digital media such as magnetic tapes, CDROMs or laserdisks as immune to degradation, indeed they are more prone to damage than paper in many circumstances. That’s not as surprising as it may at first seem since we have 4,000 years of paper technology under our collective belts, but less than 100 of electronic recording and experience using plastics. In my own lifetime I have seen data storage formats become obsolete (and so have you), but that’s not even considering MIME types.
Every new start-up seems to define their files in a new way. This is 100% understandable in the context of intellectual property rights and the advance of technology, but it also means that the failure of each of these companies will consign their file type to the rubbish bin. Effectively we are spawning and killing a new ‘language’ each time this happens and any data recorded in this language will need either translation into more common languages or the preservation of a Rosetta Stone for as long as that data might be preserved. In this context the decadal lifespan of CDROM and magnetic tapes starts to seem like a long-term issue and the churn of data formats the overwhelming problem. Since we cannot ethically restrict the proliferation of new languages, the best that we can hope for is that file types are translatable. Unfortunately translations almost always result in loss of data fidelity.
So the physical nature of storage media for digital data is less of a defining factor than we might at first believe, but before we reject its potential impacts there is the question of whether the original physical items need to be curated in the same way as any museum piece. Should we collect one of everything just so that the next generation has access to it in something close to context ? This question I will leave for the historians.
The non-corporeality of data may impact more through the potential for high-fidelity copying to multiple locations and this feeds neatly into our next topic;
Transmissibility – Not a new feature of data, after all the telegraph was transmitting non-corporeal data around the world in the 19th century, but perhaps I mean the ability to provide photo-realistic reproduction to anywhere almost instantly.
If we assume that we are not going to get significant data degradation through each copy (i.e. that the fidelity is retained) or that the original reference copy is still available, data becomes a commodity available on-demand. Our data networks make that possibility a reality, whether it be via broadcast media, sharing or sale through websites or via ‘direct’ network connections such as FTP or PPTP.
Generally we do not even think about whether we are receiving what is being sent, unless there is some obvious fault. Our technologies are reliable enough that, at the point of consumption, we consider the data as a good representation of the original. That is not to say that checks on data fidelity don’t happen in the background with most transmission mechanisms, they do, and a significant portion of bandwidth used on the internet is devoted to comparing sent to received.
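To make that background checking a little more concrete, here is a minimal sketch in Python of comparing sent to received. It is my own illustration rather than a description of any particular protocol: SHA-256 is a stand-in for whatever check a real transmission mechanism uses, and the function names and sample message are made up for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a SHA-256 fingerprint of the data (a stand-in for a real protocol's check)."""
    return hashlib.sha256(data).hexdigest()


def received_intact(sent_digest: str, received: bytes) -> bool:
    """Compare the sender's fingerprint with one computed from what actually arrived."""
    return digest(received) == sent_digest


# Hypothetical message; any bytes would do.
original = b"mix tape for wassername, 1988"
sent_digest = digest(original)

# Simulate a single flipped bit somewhere along the way.
corrupted = bytearray(original)
corrupted[0] ^= 0x01

print(received_intact(sent_digest, original))          # the faithful copy passes the check
print(received_intact(sent_digest, bytes(corrupted)))  # the damaged copy does not
```

The point of the sketch is only that the receiver never needs to see the original, just a small fingerprint of it, to decide whether the copy can be trusted.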
The question then becomes why are we happy to have copies of original data many times removed from the original when the original (or a much closer removal) is available ?
It is often said that we live in a media age, but most would consider that to refer to the number and availability of media channels, when in fact the most pervasive mediated experience that we have is with the data that makes everyday life possible.
Just to give an example; even 10 years ago I wouldn’t have dreamed of money as a mediated experience, yet in most senses it is just that. Very little cash actually flows through my wallet these days. I rarely visit a bank. I trust all those data transmissions that are running in the background to provide a very real outcome like food on the table. But why should I ? Well the honest truth in this case is that I don’t trust digital money any more than I trust hard cash. Both are copyable, both are stealable. Both are mediated experiences of wealth. To me they are no different, so I have no philosophical problem in a cashless economy. But that’s not the same as the ethics of error checking.
Engineers strive for high fidelity data transmission. It is a matter of honour and professional pride. Depending on the application it can be a matter of life and death.
Bankers (should) strive for high fidelity transactions. It should be a matter of honour and professional pride. Depending on the customer it can be a matter of life and death.
Journalists should strive for high fidelity transactions. It should be a matter of honour and professional pride. Depending on the story it could be a matter of life and death.
But past those three professions, high fidelity data transmission is mostly an aesthetic choice but not universally an ethical one. That’s why we have security certificates, passwords, ID checks and all that apparatus that at first sight looks Orwellian but, when you understand what it is compensating for, is much more depressing. It’s plugging the ethics gap between personal and professional.
Is it right that we load our communications technologies by sending multiple copies of data rather than consigning a single authoritative item of data to a secure store and simply referencing it from there ? Computer programmers working in groups do this already using applications collectively known as Version Control Software. Could we create a Version Controlled repository for all human knowledge ?
This brings us to digital memories as editable and erasable records;
In the Version Controlled world nothing is ever deleted without consensus. Edits are recorded so that if some piece of data was to be found to lead to a dead-end you can back-track to the last useful data and try a different approach.
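As a toy illustration of that idea (not how any real Version Control Software is built), here is a minimal Python sketch of a record that never deletes anything. The class name VersionedRecord and its methods are my own invention for the example. Back-tracking is itself recorded as a new edit, so even the dead-ends stay in the history.

```python
class VersionedRecord:
    """A toy append-only record: every edit is kept, nothing is overwritten."""

    def __init__(self, initial):
        self._history = [initial]

    def edit(self, new_value):
        """Record a new version on top of the existing ones."""
        self._history.append(new_value)

    def current(self):
        """Return the most recent version."""
        return self._history[-1]

    def backtrack(self, steps=1):
        """Go back `steps` versions by recording the older value as a new edit."""
        if steps < 1 or steps >= len(self._history):
            raise ValueError("cannot back-track that far")
        target = self._history[-1 - steps]
        self._history.append(target)
        return target

    def history(self):
        """Return the full, unedited trail of versions."""
        return tuple(self._history)


record = VersionedRecord("first idea")
record.edit("second idea")
record.edit("dead-end")
record.backtrack(1)        # return to the last useful data
print(record.current())
print(record.history())    # the dead-end is still on the record
```

Note the design choice: there is no delete method at all, which is exactly the property that makes the next question awkward.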
But what happens if a particular thread of knowledge leads to disaster/evil/daytime TV or whatever, what then ?
Without infinitely reproducible data a society can choose to forget. If data is infinitely reproducible we have to assume that a copy exists somewhere, even if it is only in a router cache that is not immediately accessible. If the potential exists for a copy to resurface, then ethically we have to consider that it will. Forgetting is not a choice that is practically open to us in the massively networked digital world.
Forget is the wrong word in some of these cases, ‘consign to the past and move on whilst retaining full knowledge’ could be a better way of putting it. For example, the Truth and Reconciliation commissions in Rwanda and SA are a way of accommodating unpalatable facts until they fade a little and merge with the background of history. Anti-Nazi laws in Germany are there to provide several generations space from the shared horror of WW2 and they will not be seen to be successful until at least 2050, when the children of Nazis have died, so providing a removal from the first person experience.
But there are problems with the version controlled world where threads of knowledge are prioritised and the flow of history consciously re-routed.
At a personal level we lose our sense of ourselves and our innate ability to put things behind us, and even the essential personal liberty of simply growing up. Let’s take a few examples to illustrate what I’m getting at;
Criminal offenses committed by persons under a certain age are usually dealt with differently to those committed by adults. The dividing line between child and adult is different under different legal systems, but it is present in the vast majority of cases. What is also present in most cases is a statute of limitations which means that offenses are deemed to have no relevance under law after a certain amount of time. The convergence of these two legal principles usually means that offenses committed as a child will be wiped ‘off the record’ after a relatively short period and the child allowed to go on with life as a reformed character having learned its lesson.
I see no reason why data should be treated any differently, yet if a copy of an old web page surfaces that contains embarrassing, or even harmful, personal information any person can act on it. The trope about ‘nothing to hide’ is idiotic. Everyone has something that they would rather wasn’t repeated ad nauseam in public whether it be bad fashion choices, awkward breakups, financial embarrassment or even a physical blemish. There is no small amount of debate on this, but one of the most interesting recent stories is this one from BBC News. How can Facebook ban you from deleting yourself from their platform ? To me this is a great piece of social commentary art.
Anyway. Too long. Move on.
Loving this app
July 4, 2009
I’m sure that there must be simply oodles of uses for this. I can’t for the life of me think of one
Buzztracker.org – geographic news search
I think that it’s the irrelevance of location of the news source that tickles me most. You are as likely to see an Australian paper reporting on Michael Owen’s transfer to Man U as you are to see a Chinese site talking about the coup in Honduras. Since the vast majority of these stories are syndicated news feeds, the thought that these maps of associations somehow equate to news rippling out from its origin is a stretch to say the least, since there is no significant time lag difference between when Sydney sees a European story and when Beijing sees a Latin American one. But it’s a nice art project and I love it for that.
Ethical Data Mining
June 14, 2009
I’m starting to think about where the line should be drawn when data mining web content, specifically content provided by individuals. I almost said private individuals there, but if you are posting on a publicly visible blog, comments board or whatever, then by definition private is no longer applicable. Or is it ?
What is the difference between government agencies putting together a profile on me from my electronic footprint and me doing it on someone else as part of a scientific research project, or indeed a third party doing it for commercial reasons (thinking of Phorm) ? What are the methodological and ethical differences ? Are there any ?
There are a couple that leap to mind with regards to government responsibilities and accountability, but I’m still thinking about this so no conclusions yet.
Veracity Values
March 2, 2009
How do you know what is true ?
As a third party, how do you know that something is objectively true as opposed to subjectively true ?
Can you have such a thing as a truth ‘Richter scale’ ?
There are technological tools out there that should be able to help us with verification, even before the semantic web breaks through. Text mining should be able to pull out the key words and phrases of a document. You need to find a set of rules to feed the miner, but apart from that the issues with text mining seem to me to be logistical these days. Feed those results into the relevant search engines and databases and you should be able to automate the evidence gathering process.
Rate the relevance and ‘authority’ of the source and you could derive a ‘veracity value’ for the document.
Place some security around that veracity value and you have a ‘veracity certificate’ that has links to the evidence and a sliding scale of verifiable-ness.
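Purely as a back-of-an-envelope sketch of what such a value might look like, here it is in Python. Everything in it is an assumption of mine: the Evidence fields, the 0 to 1 ratings, and the choice to weight each piece of evidence by relevance multiplied by authority. It says nothing about how the ratings themselves would be obtained, which is the hard part.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source: str       # where the evidence was found (hypothetical label)
    relevance: float  # 0..1, how closely it bears on the document's claims
    authority: float  # 0..1, how much the source is trusted
    supports: bool    # True if it backs the document, False if it contradicts it


def veracity_value(evidence):
    """Weighted share of the evidence that supports the document, from 0.0 to 1.0."""
    total = sum(e.relevance * e.authority for e in evidence)
    if total == 0:
        return 0.0  # no usable evidence, so nothing can be verified
    supporting = sum(e.relevance * e.authority for e in evidence if e.supports)
    return supporting / total


# Made-up evidence for illustration only.
gathered = [
    Evidence("journal database", relevance=1.0, authority=1.0, supports=True),
    Evidence("anonymous forum", relevance=1.0, authority=1.0, supports=False),
]
print(veracity_value(gathered))
```

A document with no evidence at all scores zero here rather than some neutral midpoint, which is a judgment call: unverifiable is treated the same as unverified.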
The problem comes in identifying objective vs subjective. For example; take two academic papers, one on the physical qualities of a new metal alloy the other on Mayan cultural artifacts and their relevance to modern day Mexico. They have equal numbers of citations, equally authoritative reviewers, equal external coverage in conference proceedings and equal numbers and quality of references. How does a reader know which is objective and which subjective in nature ?
That’s something that humans do all the time, and frequently get wrong, but that AI has yet to approach.
Perhaps we leave that to the human for the moment and leave the automation at veracity.
The business model writes itself, so if anyone wants to try it get in touch
