The Twitter Languages of London

Last year Eric Fischer produced a great map (see below) visualising the language communities of Twitter. The map, perhaps unsurprisingly, closely matches the geographic extents of the world’s major linguistic groups. On seeing these broad patterns I wondered how well they applied to the international communities living in London. The graphic above shows the spatial distribution of about 470,000 geo-located tweets (collected and georeferenced by Steven Gray) grouped by the language stated in each user’s profile information*. Unsurprisingly, English is by far the most popular. More surprising, perhaps, are the very similar distributions of most of the other languages, with higher densities in central areas and a gradual spreading to the outskirts (I expected greater concentrations in particular areas of the city). Arabic (and Farsi) tweets are much more concentrated around the Hyde Park, Marble Arch and Edgware Road areas, whilst the Russian tweeters tend to stick to the West End. Polish and Hungarian tweets appear the most evenly spread throughout London.

Even though the maps represent close to half a million tweets, they are still based on a selective sample: they only include people who have a good location (either through GPS or a specific address) and who are connected to the internet. I expect the latter requirement will exclude many short-term visitors to London, and may explain why there aren’t so many hotspots around London’s landmarks (as is the case with Flickr, where people can upload georeferenced images when they get home). In spite of this, I think the information in these maps is useful as a basis for comparison with other cities, and it helps to reveal some of the finer patterns within the broad regions mapped by Fischer.

*This is slightly different from Eric Fischer’s method. He used Google’s translation tools to determine the language of each tweet, whereas I have taken the stated language of each user because I am more interested in what users feel their preferred language is. I often see English tweeters post in French, for example. Google also hasn’t quite mastered the slang or abbreviations that often crop up in Londoners’ tweets.
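
For anyone wanting to try something similar, the grouping step described in this footnote might look roughly like the Python sketch below. It is only an illustration and not the code used to make the maps: the record layout (a profile language code followed by longitude and latitude), the sample points and the function name `group_by_profile_language` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical records: (profile language code, longitude, latitude).
# Real data would come from the collected, georeferenced tweets.
tweets = [
    ("en", -0.1276, 51.5072),
    ("ar", -0.1650, 51.5150),
    ("en", -0.0900, 51.5050),
    ("ru", -0.1400, 51.5130),
    ("ar", -0.1700, 51.5170),
]


def group_by_profile_language(records):
    """Group tweet coordinates by the language stated in the user's profile."""
    groups = defaultdict(list)
    for lang, lon, lat in records:
        groups[lang].append((lon, lat))
    return dict(groups)


groups = group_by_profile_language(tweets)

# Number of tweets per stated language, ready to compare or plot.
counts = Counter({lang: len(points) for lang, points in groups.items()})
print(counts)
```

Each language's list of points could then be drawn as its own map layer, which is what allows the distributions to be compared side by side.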