A Publication of the World-Information Institute
Konrad Becker / Felix Stalder [eds.], Studienverlag & Transaction Publishers, 2009.
ISBN 978-3-7065-4795-6
Information is useless if it cannot be found and it is not a coincidence that a search engine like Google has turned into one of the most significant companies of the new century. These engines are never just practical tools to deal with information overload. Such cognitive technologies embed political philosophy in seemingly neutral code.
Konrad Becker, Felix Stalder, editors of Deep Search.
When taking leave of the many new people I meet, they often ask me if I am on Facebook, or tell me to Google them. I reply that I am not on Facebook. What, not on Facebook? is the usual response. When I start to explain my concerns about privacy, or that I want my search to be my own and not recommended to me, an expression of shock appears on their faces. If I go on to ask whether they care that their data is sold to third parties and their searches shaped by recommendations, I am met with the retort: I have nothing to hide, so I think it's OK, and there is more security. For those of us who still want some control over how we search, find information and experience the internet, and who would like to understand the legislation that could clarify and protect our rights as citizens, Deep Search (2009) is still a relevant book. Even more so since December 4, 2009, when Google announced 'personalised search for everyone', combining PageRank with personalisation and altering both relevance and how we find and consume information.
Deep Search: The Politics of Search Beyond Google collates 13 texts that investigate the social and political dimensions of how we navigate the deep sea of knowledge. It addresses some key questions of contemporary society: What do we gain and what do we lose as a consequence of the mechanisation of communication? Where is the emancipatory potential of having access to such vast amounts of information? What are the dangers of our reliance on search engine algorithms? Although it may not be able to answer all of these questions, the book speculates on alternatives, mapping out the field surrounding the concept of search, an activity with which roughly 25% of the world is engaged on a daily basis. As the editors Konrad Becker and Felix Stalder state in an introductory note, these questions of culture, context and classification in information systems should not be ignored, since what is at stake is nothing less than how we, as individuals and institutions, come to find out about the world, not to mention what happens to our data.
This book takes us on an exploratory journey around the concept of search, information systems, legislation and speculative thinking on the subject, and attempts to explain how we search. Ultimately, one key question is whether we will let a corporation mediate our experience of the Internet. Google, which holds a near monopoly as the primary search engine for much of the western world, is portrayed here as an advertising company: at least 90% of its revenue comes from advertisers. 'Millions of internet users are, willingly or not, participating in this process by freely providing these companies with their profiles and attention, the currency of the Internet.'
Table of Contents
The editors have divided the book into four sections that outline the following categories: histories, liberties, power and visibility, although many chapters return to the theme of Google's near monopoly on the processes of search. Specific questions put forward in the introduction include search as a concept, questions of classification and the emancipatory potential of having access to vast amounts of information. Why this book, though, when we can Google all of the answers to our queries? Because, as the editors state in the introduction, it is crucial to acknowledge that the 'wilful design' of search algorithms is 'neither natural nor arbitrary', even if people tend to accept it uncritically. Search in itself is just one of the products that creates an environment in which access to individual users, identified through detailed, personal data profiles, can be monetized.
Throughout the book, Deep Search explains what happens to our data, who obtains it, what it is used for and, ultimately, whether we can control its dissemination. For many, Google helps us find things on the web and answers our queries, ostensibly for free. Google does not (yet) charge its users for search while supposedly providing them with the world's information. Nowadays commerce dominates the web, and advertising (AdWords) supports Google's other 'free' services. Advertisers pay, and the trade-off for users is that search remains free whether or not they click on the ads.
This year the internet turns 40. While Facebook surpasses Google in the number of links it attracts, Google tweaks its secret algorithms to rein in the content farms that game them, and the eG8 attempts to 'civilize' the internet, there is a more urgent need than ever for the ability to control our own data: deciding what we want to show, share, provide and, ultimately, keep to ourselves. We are the filter that needs autonomy over our data, to decide whether we want search terms recommended, our surfing personalised, and our privacy protected by the state, the EU, a company or not at all.
Search legislation, or lack thereof
Joris van Hoboken's article, 'Search Engine Law and Freedom of Expression – A European Perspective', provides clear insight into search regulation and ultimately stresses the central concern that not only freedom of expression but also access to, and retrieval of, information and ideas is a fundamental right.
New modes of suppression and breaches of privacy are now cause for concern. Van Hoboken begins the article by citing a few examples of court rulings in which information was removed from search indices. Yet most European legislatures have postponed dealing with the liability of search engines for nearly a decade. The legal complication in European law lies especially in copyright law, where not only private individuals but also governments have requested the removal of data and, on other continents, have won in court. Search engines have responded by claiming that webmasters who do not want to be included or found in the index can use the robots.txt exclusion instruction.
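For readers unfamiliar with it, the exclusion mechanism van Hoboken refers to is simply a plain-text file named robots.txt placed at the root of a website. A minimal example asking all crawlers to stay out of the entire site looks like this:

    User-agent: *
    Disallow: /

Compliance is voluntary: the file expresses a request rather than erecting a technical barrier, which is one reason it is arguably a thin basis for the claim that webmasters can simply opt out of being indexed.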
European search engine law is far from ideal and the situation needs to change, as current levels of access to information exist despite the law rather than because of it. Article 10 of the European Convention on Human Rights (ECHR) protects freedom of expression: the freedom to hold opinions and to receive and impart information and ideas without interference by public authorities and regardless of borders. Informed by democratic ideals, the search for truth and individual self-fulfilment, it encompasses not only freedom of speech but communicative freedom, the freedom to search: to gather, to look for, to access and to transmit. Search engines are the point of entry to the web, the dominant navigation tool and the means by which information, knowledge (and adverts) find their way to audiences.
Van Hoboken emphasises that legislation in Europe is troublesome because it is unclear. There is no legal framework under which search engines can provide services in a way that is consistent with freedom of expression. It is harder for competitors to enter the market because of uncertainty with regard to liability, making it difficult for small providers to establish themselves. In Europe, Google still dominates, partly because of the failure of rivals such as France's Quaero and Germany's Theseus; its huge legal department keeps that power in place, 'in fact, it may be a remarkable achievement that Google, with the help of a small army of lawyers, has been able to keep its search services running in Europe.'
Search engine law and policy also remain equivocal about restrictions on search engine logs and the transparency of access by third parties. User privacy in the context of search engines is primarily a concern related to free access to information, yet this rich collection of user data (IP addresses, unique cookie data, search queries, timestamps and user clicks), which records users' interests, activities and intentions, is not protected by European law; nor does the law protect users against the suppression of results or the lack of transparency. Google 'complies with valid legal requests for user data and gives no information about them as a matter of company policy'. In its 'Log retention policy FAQ' of March 2007, Google defends its data collection policies with a reference to the usefulness of the data for the prevention and prosecution of crime, thereby effectively asking for data retention obligations. The latest update from Google came in September 2008, when it announced that it would anonymize IP addresses in its search server logs after nine months.
‘Recommendation engines measure what people like me would do and tell me what that is, so I can then find out what people like me do, so I can become much more like a person like me. By telling me what people like me do, and encouraging me to be more like a person like me, they help me to become more typically one of my kind of person. And the more like one of my kind of person I become, the less me I am, and the more I am a demographic type.’ Douglas Rushkoff
Second Index
It’s not about the information in the world but the world’s users’ information
Felix Stalder and Christine Mayer, in 'Second Index: Search Engines, Personalisation and Surveillance', explain in clear terms how Google's search engine works and its 'art' of personalisation. They first ask us to rethink our search results: 'We are presented with the picture of the world made up of what someone else, based on proprietary knowledge, determines to be suitable to one's individual subjectivity.' Most people think that 'Google is a search engine, rather than a multimillion-dollar corporation making large profits from devising personalised advertising schemes.'
What many don't realise is that our data is being used and manipulated to create what is called the Second Index. Granted, Google doesn't advertise this, but it is key to understanding that what we are getting is neither neutral nor objective.
The first index, or layer, is built from all the public information that third parties place on the web; the second is created by the search engines themselves and is proprietary. Google combines the two not only to provide relevant search results but also to provide users' information to advertisers. With the acquisition of YouTube and Blogger, Google acquired not only our search queries but, along with them, our passwords, account activity and login data. It has merged all of this data, and now there is the added consequence of mobile phones: 'location data can also be accessed by third parties without a user's knowledge or consent and the victim may remain unaware of being tracked, thereby rendering ordinary mobile phones useful tools for personal surveillance.'
Stalder and Mayer then elaborate on the Second Index as being all about 'personalisation'. What happens with our data when it is used to shape a personality profile, or to present a person with criteria that are not necessarily their own? There is manipulation. Yet, as they go on to state, this is somewhat paradoxical: personalisation diminishes the autonomy of the individual user and at the same time enhances it, since it surfaces information that is otherwise hard to locate and improves the quality of the search experience. This personalisation adds a second layer of opacity on top of the already opaque general search algorithms.
They question what can be done with the Second Index, now and in the future. Search engines help expand Big Brother capabilities, yet the watching is not done by a central organisation; rather, 'the search engines provide the means for everyone to place themselves at the centre and watch without being watched (at least not by the person being watched)'. Yet another risk to consider: deleting means 'anonymizing' your data. Can this process be reversed by third parties or by Google?
Here lies the core of the problem: the naïve trust that users place in the system is misplaced. The system they trust is becoming more opaque instead of more transparent. Stalder and Mayer argue that we need collective means of oversight, a regulatory framework that applies legal and social pressure to force companies to grant consumers basic rights, along with open-source processes, heterogeneous value sets and the capacity to evaluate such systems. Right now there is no substantial restriction, and we need to safeguard our liberty and autonomy.
More generally, Deep Search makes the point that the focus should not be on the customer experience but on the citizen experience. We should not have to swap privacy for knowledge, connectivity and personalisation. Right now we are shaped by what we want before we even buy it: by studying past purchase history and using 'behavioural targeting', Google discovered early on that it was a market research tool for figuring out what you want to buy. Typing in words gives it your intent, and Google rates things by 'interestingness', using patterns of data to analyse and construct profiles of its users. AdWords forms the major business model, where a single person, with a single word, initiates a search as a consumer looking for something. In Google jargon this is 'narrowcasting', not broadcasting: speaking to each user as an individual, with the goal of filtering in the ads relevant to that consumer. Value is determined by how many eyeballs are looking at the content, and the product online is YOU!
Autonomy of search
If the web is to be an open network, easy to search, where we can go everywhere, should we not be concerned about our personal space within it and the data trails we leave behind? While searching for information, our curiosity is being exploited for business; we pay with our data, and the longer we are online, the more attention we spread around. Does this not defeat the whole point of the web, in which surfing could lead to chance encounters and searching to new ideas?
In her article, 'From Trust to Tracks – A Technology Assessment Perspective Revisited', Claire Lobet-Maris maps out the questions encircling 'replis identitaires' (self-centred identity politics), along with concepts of autonomy and democracy. What is the solution to competing with a monopoly? How can Google be made more transparent? She compares the internet, for example, to electricity: 'So to mainstream the Internet as a global public good, the internet must be regulated: states should play an active role fostering the transparencies of the patterns and metrics used by search operators in order to make their scripts as readable as possible.' These filters or scripts mediate our interactions, as well as our access to the information and knowledge that they structure.
She concludes that in fighting the privatisation of the public sphere there are three main issues to examine in relation to search engines: equity and respect for minorities, the diversity of the new public sphere, and the transparency of the regulation that supports its organisation. If we want control over our individual freedoms and protection of our privacy, we need the autonomy and power to exercise them: for example, to access the back end, manage our profiles and demand transparency from those who use our data. Taking inspiration from the so-called 'Maoist cleaners', a variety of technologies give people the opportunity to reset their profiles, remove outdated and prejudicial links and restore their intellectual rights. We also need the ability to delete our own data, and Lobet-Maris advocates the political empowerment of citizens, beginning with education, alongside innovation through investment in research programmes that enable citizens to manage and control their own tracks.
Where history meets the future
The book contains comprehensive explanations of PageRank and the Second Index, yet what seems to be missing is a technical account of how these Google algorithms actually work, which, even in layman's terms, would be of considerable value in understanding them. Granted, that is difficult because many of them are not transparent. Where, for example, is the 'how to' guide, accessible to non-techies, on installing software that prevents Google from retaining information, beyond deleting cookies? And even though there are many footnotes and comprehensive bibliographies, do I have to search online with Google, or can I find this information in books and other printed matter? Without naming names, what preventive measures could one take to be protected?
We supply and give away information freely every second as we work, live and search. But larger questions still loom, such as how to push back against Google and protect our information. As users we need autonomy in our search, rather than having a corporation control our freedoms and ontological structures. For example, we can use TrackMeNot to obscure our profile by flooding Google with decoy queries, use Scroogle, a proxy that does not retain our search data, place a robots.txt file on our sites to prevent their automatic indexing, or switch to alternative search engines. Other modes of resistance might include the concept of adding noise, where people add whatever junk they want to the web, sign in under other avatars or identities, or issue false search queries. As users of the web we need the right to decide what we do with our data, the freedom to delete content and user profiles, or to resort to 'suicide machines' to opt out of social communities.
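To make the 'adding noise' idea concrete: tools like TrackMeNot work by periodically issuing decoy queries so that a logged search history no longer reflects only the user's real interests. The following Python sketch illustrates that principle only; it is not the extension itself, and the decoy list, the interval and the assumption that Google will answer scripted requests (it may throttle or block them) are all illustrative.

    import random
    import time
    import urllib.parse
    import urllib.request

    # Illustrative decoy queries; TrackMeNot draws its terms from evolving public sources.
    DECOYS = ["weather patterns", "bicycle repair", "opera tickets", "soup recipes"]

    def send_decoy_query():
        # Build an ordinary-looking search URL for a randomly chosen decoy term.
        query = random.choice(DECOYS)
        url = "https://www.google.com/search?" + urllib.parse.urlencode({"q": query})
        request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        # The response is fetched and discarded; the point is only that the query gets logged.
        urllib.request.urlopen(request, timeout=10).read()

    if __name__ == "__main__":
        while True:
            send_decoy_query()
            # Wait a random interval so the noise does not form an obvious pattern.
            time.sleep(random.uniform(60, 600))

The design choice matters more than the code: the profile built from the log becomes unreliable not because the real queries are hidden, but because they are drowned in plausible-looking chaff.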
Personalisation disrupts public and private space. Our computers are public rather than private, as they conduct transactions with many computers across the globe. We trade away privacy every time we search; our data is saved and stored. Privacy has become a commodity in the experience of search, with Google tracking our browsing habits, scanning our emails and matching them with ads. The content is revealed to outside parties and becomes public. Google states that it needs this data to improve our experience. For the billions of users living in online commercial spaces, search engines and social media provide worldly interconnectedness, but at the cost of privacy. If our digital fingerprint is every bit as valuable as our privacy, the question is whether we want so many details available to so many people, now and, perhaps even worse, in the next ten years. Who might own this data in an unknown future?
Society of the Query
Google provides access, information and services while simultaneously relying on our complacency and monitoring us. Search is inherent to how we live, and our dependency on a single search engine is problematic, as is searching only within Chrome or Facebook. Turning the pages of Deep Search, it becomes clear that social relations, expressed in the form of links, and popular ones especially, are the determining factor in the competition for the top of the page: a popularity contest, in other words. In 2011 a high ranking in search results is contingent upon the number and quality of links to your website. That which is most visible holds the attention of the majority, and (in the Western world at least) it is mostly Google that finds it and Google that determines its visibility. Most people (95%) rely on the first page of their search results.
In his article 'Society of the Query: The Googlization of Our Lives', Geert Lovink focuses on the editor, or filter, for all this information in our society, arguing not that there should be just one but that there should be many: a 'distributed effort, embedded in a culture that facilitates and respects difference of opinions'. Lovink emphasises information and communication overload, the inability to keep up with personal email or chatter. But what is increasingly problematic is that 'the educated class is concerned that chatter has entered the hitherto protected domain of science and philosophy, when instead they should be worrying about who is going to control the increasingly centralized computing grid.' Not only that, but the migration of data to privately owned storage centres is increasing. 'Security and privacy of information are rapidly becoming the new economy and technology of control. And the majority of users, and indeed companies, are happily abandoning the power to self-govern their informational resources.'
With the advent of search engines we are reliving Guy Debord's 'Society of the Spectacle', except that what is now inherent in our daily lives is not only the rise of media but our own interaction with, and contribution to, these constantly renewed forms of media. Lovink states that no strategy has been devised for living in the age of the post-spectacle: as 'distributed actors' we spread our data willingly. Providing data and attention, the currency of the Internet, is something we are all involved in as users and to which we contribute freely. Only when privacy becomes a crucial issue for enough people will we take action. The monitoring of users' behaviour in order to sell data traffic and profiles to interested third parties should be our concern. Clearly, the dangers of relying on search monopolies, and of how they make use of our information, need not only to be articulated but also disseminated. We therefore need a public sphere that allows neither the market nor state surveillance to have the upper hand.
Do we now live in the society of links rather than the society of the query? Search as a mass medium is powerful, but it is also open to abuse. Understanding and questioning what is going on with our information, and simultaneously discovering new ways to protect it, are both worthwhile. Coming up with critical questions and new ways of rethinking the whole concept of search is on the agenda. We need to understand more about how search actually works, focusing not only on changing algorithms but on the development of alternative search engines that are not premised on popularity, link structure and hence PageRank. Should the aim not rather be the death of the popularity contest, better access to what are considered marginal sources, and control over our own identities as they appear on the web? The design of search plays an intrinsic role in how we experience our world, along with how we value the privacy and security of our information.
Lovink ends by stating that it is not the quality of the answers we receive that is so disheartening but our diminishing education and ability to think critically. We need to invent new ways to interact with data and to respond creatively and critically. Stop searching. Start questioning.
Thanks also to all at the Society of the Query (http://networkcultures.org/wpmu/query/) for their inspiration and ideas.
For those who may not know, but would like to know, how search works…
What language(s) and vocabularies do we use to articulate the objective of our search queries? How is this taxonomy to be derived when some 60-70% of the world is not online, is not seen, and is not yet indexed? An invisible world with no access. What is represented in search results is only what has been contributed or uploaded, linked or referenced. One might call it a form of 21st-century colonisation and balkanisation, this time in regard not to land as territory but to the language of search within the space of the Internet.
The eponymous PageRank algorithm was developed in 1998 by Brin and Page, and it works like this: Google crawls and indexes web pages, recording all of their hyperlink information; when a user enters a keyword query, matching pages are retrieved from that index and ranked. The more links point to a page, the more valuable that page is considered to be. Rather than simply counting links, PageRank interprets each link as a vote cast and weights it by the importance of the page casting it: a link coming from a node with a high rank has more value than a link coming from a node with a low rank. The concept of PageRank has its basis in the Science Citation Index (SCI), a form of academic hierarchy that has now been grafted on as the conceptual paradigm for the way we find information and for how that information is prioritised for us, designed by a monopoly, a corporation called Google.
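To make the voting metaphor concrete, here is a toy Python sketch of the published PageRank iteration over a tiny, invented link graph; the damping factor, the graph and the iteration count are illustrative assumptions, and Google's production system is of course far more elaborate and not public.

    def pagerank(graph, damping=0.85, iterations=50):
        """graph maps each page to the list of pages it links to."""
        n = len(graph)
        ranks = {page: 1.0 / n for page in graph}  # start with equal rank everywhere
        for _ in range(iterations):
            new_ranks = {}
            for page in graph:
                # A page receives a share of the rank of every page linking to it:
                # a vote from a high-ranked page is worth more than one from a low-ranked page.
                incoming = sum(ranks[src] / len(links)
                               for src, links in graph.items() if page in links)
                new_ranks[page] = (1 - damping) / n + damping * incoming
            ranks = new_ranks
        return ranks

    # Page A is linked to by both B and C, so it ends up with the highest rank.
    example_graph = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}
    print(pagerank(example_graph))

Even in this three-page example the 'votes' concentrate rank on the most-linked page, which is exactly the dynamic the review describes as a popularity contest.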