State of the Web – Fuck you, computer

Botocracy

In May of this year, with the internet fully in the grip of the UK EU referendum, hashtags used on Instagram showed that discussion was highly polarised between #Leave and #Remain. The high degree of ideological distance between the two camps indicated that each group functioned as a separate ‘echo-chamber’, in which they spoke mainly to their own membership. The Leave campaign had a much more coherent online identity, made better use of hashtags in general, and was simply more active in generating content, all of which may have contributed to their successes. In early June 2016, a study of Twitter content found similar biases: out of 1.5 million individual tweets, 54% were pro-Leave, and only 20% were pro-Remain. Those findings are interesting enough on their own, but what really sparked our interest was that a third of the sampled content was created by only 1% of relevant user accounts.

If you’re not familiar with the inner workings of Twitter: No-one has that kind of time. It is highly unlikely that all of those accounts were directly controlled by people, or even large groups of people, and much more likely that many were staffed by automated software robots, or ‘bots’: Simple computer scripts that simulate highly repetitive human activity. In fact, an independent analysis of the 200 Twitter accounts which most frequently shared pro-Leave or pro-Remain content found that only 10% of those accounts were likely to be human.
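
To make the "a third of the content from 1% of accounts" figure concrete, here is a minimal sketch of how such a concentration number could be computed from a list of tweet authors. The function name and the sample data are invented for illustration; this is not the method used by the study mentioned above.

```python
from collections import Counter

def share_from_top_accounts(authors, top_fraction=0.01):
    """Fraction of all posts written by the most active `top_fraction` of accounts.

    `authors` holds one entry per post (the account that wrote it).
    """
    authors = list(authors)
    counts = Counter(authors)
    if not counts:
        return 0.0
    # Always look at at least one account, even in tiny samples.
    n_top = max(1, round(len(counts) * top_fraction))
    top_posts = sum(n for _, n in counts.most_common(n_top))
    return top_posts / len(authors)

# Hypothetical sample: 100 accounts, one of which posts 50 times
# while the other 99 post once each.
sample = ["busy_account"] * 50 + [f"user_{i}" for i in range(99)]
share = share_from_top_accounts(sample)  # 50 of 149 posts, roughly a third
```

A share this lopsided does not prove automation on its own, but it is the kind of signal that prompts a closer look at the accounts doing most of the posting.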

The EU referendum is not the first time bots have been observed in democratic discussion. In the 2010 US midterm elections, bots were actively used to support certain candidates and hamper others. In 2012, Lee Jasper admitted to their use in a parliamentary by-election. In the 2012 Mexican elections, Emiliano Treré identified a more effective use of bots, calling it “the algorithmic manufacturing of consent”, and a form of ‘ectivism’ (which includes the creation of large numbers of false followers, a charge levelled at Mitt Romney during the 2012 US Presidential election). A very large ‘bot-net’ was also utilised in 2013 to produce apparent support for a controversial Mexican energy reform. Those bots may have gone entirely unnoticed had they not been operating too rapidly to successfully pose as human agents.
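
The "operating too rapidly" giveaway can be sketched as a simple rate check: flag any account that posts more than some number of times inside a short window. The thresholds below (10 posts in 60 seconds) and the function name are assumptions chosen for illustration, not figures from any of the cases above, and real detection is a much harder problem than this.

```python
from collections import defaultdict, deque

def flag_rapid_posters(posts, max_posts=10, window_seconds=60):
    """Return accounts that exceed `max_posts` within any `window_seconds` span.

    `posts` is an iterable of (account, unix_timestamp) pairs.
    """
    by_account = defaultdict(list)
    for account, ts in posts:
        by_account[account].append(ts)

    flagged = set()
    for account, stamps in by_account.items():
        stamps.sort()
        window = deque()  # timestamps inside the current sliding window
        for ts in stamps:
            window.append(ts)
            # Drop timestamps that have fallen out of the window.
            while ts - window[0] >= window_seconds:
                window.popleft()
            if len(window) > max_posts:
                flagged.add(account)
                break
    return flagged

# Hypothetical data: one account posting every 2 seconds,
# another posting every 10 minutes.
posts = [("bot", t) for t in range(0, 80, 2)]
posts += [("human", t) for t in range(0, 3000, 600)]
suspects = flag_rapid_posters(posts)
```

A bot that simply slows down slips straight past a check like this, which is part of why the accounts that get caught tend to be the clumsy ones.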

Bot-related tactics have not been confined solely to the generation of apparent support, but have also been used to drown out members of a campaign by rendering their hashtags useless. The challenge presented by bots is not the introduction of false information, but the falsification of endorsement and popularity. Political discussions around the popularity of a single issue are particularly vulnerable, as are financial markets, where value depends on confidence in a stock. During 2014, a bot-campaign elevated the value of tech-company Cynk from pennies to almost $5 billion USD in a few days. The company’s president, CEO, CFO, chief accounting officer, secretary, treasurer, and director were all the same individual: Marlon Luis Sanchez, Cynk’s sole employee. By the time Cynk’s stock-maneuver was discovered and its assets frozen, Sanchez had made no additional profit, but for the investors who had been caught in the scheme, the losses were real.

Bot network detection research is being conducted by various defence agencies (including DARPA), but the field is complex, constantly changing, and yet to prove itself effective. Meanwhile, the deployment of bots on social media is within the terms of service for most of the relevant platforms. As long as no additional crime is committed, their use is yet to face prosecution, and even in the case of Cynk no social media platform has assumed any kind of liability for their use.

The most active political users of social media are social movement activists, politicians, party workers, and those who are already fully committed to political causes, but recent evidence suggests that “bots” could be added to that list. Given the echo chamber effect, the fact that many online followers of political discourse are often not real users at all, and the steady decline in political participation numbers in many countries, bot use (while cheap to mobilise) may not have much power over the individual voter. Their deployment in the U.S. and Mexico has instead been largely targeted at journalists employed by mainstream media outlets. Politicians, activists, and party workers may all find democratic scrutiny harder to achieve if the ‘public mood’ or ‘national conversation’ is being mis-reported by journalists with a bot-skewed sense of online discussion. The 2015 Global Social Journalism survey shows that in 51% of cases, reporters from six countries, including the UK and US, “would be unable to do their job without social media”. In 2012, 38% of journalists spent up to two hours a day on various networks, but by 2015 that number had climbed to 57%. If unethical actors can unduly influence these avenues of online discourse, an increasingly vulnerable news-media may suffer from, and pass on, the political biases of anonymous others.

If voting is affected by media, written by reporters who live on the internet, the shape of which is determined by anonymous, innumerable, automated agents (which no-one can track), how do we proceed in pursuit of a fair democracy?

How We Sewed Our Own Straight Jacket

Google recently announced its initiative to improve the mobile browsing experience of all net users via its Accelerated Mobile Pages project, AMP.

AMP is a direct competitor to Facebook’s Instant Articles functionality but abstracted from the platform as a standard or protocol rather than within the context of Facebook’s walled garden.

I say walled garden because, of the 1.5 billion active Facebook users, 30% access the service through a mobile app of some sort. Unless you’ve specifically set it up not to, the app uses an internal browser. I have no figures to hand about how many people have changed their default Facebook browser to one of their own choosing. My instincts tell me that it’s very few. Why is this an issue? If you ask my mother or my youngest niece what the Internet is, they would most likely respond “Facebook”. Facebook is becoming a platform in its own right. And content creators, journalists and publishers alike, are treating it as such. I’ve heard figures from online media agencies that cite conversion rates of 50%, sometimes as high as 90%, on Facebook posts. That’s a lot of eyes on articles and for some publications that’s the difference between life and death. I get it.

The AMP proposition is a sub-set of the HTML standard which eschews all author-written JS (read: none at all), and with it any ads and embeds that do not come through AMP’s own tags. I’m not saying that the spirit with which this was suggested is bad per se, but that perhaps by following our knee-jerk reaction against the popularity of Facebook’s Instant Articles we’re going to accidentally create a tiered system akin to the one that net neutrality believers are still trying to fight. Why do I mention net neutrality? Because AMP suggests a selection of tags specifically for a small group of preferred vendors, with tags such as amp-twitter and amp-youtube. This codifies the web as it exists at the moment. Fine. For now. But what if the landscape changes? What if an unknown video streaming provider becomes the de facto media delivery service ahead of YouTube?

Oh wait, it can’t because who’s going to use it if it doesn’t work out of the box?

One of the beautiful things about the Open Web is the ability to make your own bad decisions about what technologies you use and to badly implement them however you see fit. That’s how people learn. It’s certainly how I learned. By picking apart code and stitching my own creations together from what I thought I had gleaned. Without this ability the web becomes static, inert and unchanging.

Some of us old internet dinosaurs used to have to wrangle the then-new markup language HTML 4.0, then later the better but still incredibly flawed XHTML 1.0 specification, before being presented with HTML5. HTML5 is great. A video is a video, audio is audio, and all of the old favourites such as iframes and objects and embeds still work with no muss or fuss.

Back in the days before broadband, when mobile telephones were small things that had monochrome screens and about 24 characters of space total, way before the iPhone would come along and change our lives forever, there *was* a mobile internet markup language. Wireless Markup Language. WML was a pared down and fairly ugly web technology which used the idea of cards. It was pretty unpleasant. Then mobile networks caught up, and we now have faster-than-broadband wireless speeds on our handsets. Our phones started to access the web as our desktops did. We were given CSS3 and its media queries to allow us to make all this look presentable on our pocket machines.

So why the need for AMP or Facebook Instant Articles? Because we’ve bloated the web with so much tracking and third-party JavaScript that even with 4G access pages take 8 seconds or more to load. It’s our own broken web and impatience that’s prompted Facebook and Google to try and fix it for us. But it’s as far from the Open Web as it’s possible to get. What we have are competing standards: one a proprietary initiative by a would-be platform that seeks to become the Internet, and another by a coalition of worried parties who want a language of whitelisted third-party service providers. At least that last one is Open Source and you can roll your own support if you have to.

So how do we go about solving this issue? Well, one way would be to speed up web page delivery. Stop commodifying the user quite so much. Do websites really need to know where you’ve been and what you’ve clicked? I would say not. If you’ve not helped your friends and family block tracking and ad software as a matter of course, you’re remiss in your responsibility to their security and online safety. Ads are potentially poisonous and have been the vector for a good number of high profile malware attacks.
If you create websites, push back against injecting more tracking. Write cleaner, more efficient code. Use fewer libraries, maybe switch from jQuery to Aerogel or use vanilla JS for more things if it reduces your bloat. Optimise your images and videos. Start your design phase with a mobile first methodology. Uglify your CSS and JS (add maps to this though: a) you still want to be able to use the developer tools to read your work, and b) you’re a good netizen and want people to read your output and be inspired).
From a user point of view you can install ad-blockers and tracker blockers like Adblock and Ghostery on your laptop.
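
If you want a rough idea of how much third-party script a page pulls in before you start pushing back, something like the following sketch will count it for you. It uses only Python’s standard library; the class and function names, the hosts, and the sample markup are all invented for illustration, and the "different host means third party" rule is a deliberately naive assumption (it treats your own CDN subdomain as third party, for instance).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ScriptAudit(HTMLParser):
    """Collects external <script src=...> URLs, split by first vs third party."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.first_party = []
        self.third_party = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if not src:
            return  # inline script, nothing is fetched
        host = urlparse(src).netloc  # empty for relative URLs
        if host and host != self.page_host:
            self.third_party.append(src)
        else:
            self.first_party.append(src)

def audit_scripts(html, page_host):
    parser = ScriptAudit(page_host)
    parser.feed(html)
    return parser.first_party, parser.third_party

# Hypothetical page served from example.org.
page = """
<html><head>
<script src="/js/app.js"></script>
<script src="https://tracker.example/t.js"></script>
<script src="//ads.example/a.js"></script>
<script>var inline = 1;</script>
</head><body></body></html>
"""
first, third = audit_scripts(page, "example.org")
```

Every entry in `third` is another request, and very possibly another tracker, that your reader’s phone has to deal with before the page is usable.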

There are Wi-Fi ad-blockers available for mobile devices too, and they will also speed your experience up. It’s up to us to keep the web free by not making the tracking of users profitable or useful.

Are there alternatives?

Yes. Sort of. A lot of this technology is in its infancy. So much so, in fact, that FBIA and AMP seem to have got the drop on Mozilla and other open source heavy hitters. One hopeful is the CPP.

This is an open note and I will be adding more points as I think of them.