Academics have used Twitter to create a map showing London’s linguistic diversity.
By analysing tweets published in London between March and August this year and plotting them on a map, the researchers were able to show the capital’s language clusters based on Twitter use.
The research, carried out by Ed Manley, a PhD student at University College London, and James Cheshire, a lecturer at UCL, shows a total of 66 different languages used in the capital.
English accounted for 92.5 per cent of the tweets, with Spanish the second most common language, followed by French, Turkish, Arabic and Portuguese.
Dr Cheshire wrote on his blog: “Towards the north, more Turkish tweets (blue) appear, Arabic tweets (green) are most common around Edgware Road and there are pockets of Russian tweets (pink) in parts of central London.
“The geography of the French tweets (red) is perhaps most surprising as they appear to exist in high density pockets around the centre and don’t stand out in South Kensington (an area with the Institut Francais, a French High School and the French Embassy).”
Other languages found include Basque, Haitian Creole and even Swahili, but Mr Manley said some tweets had to be excluded. He wrote: “…Tagalog, a language of the Philippines, which initially was identified as the 7th most tweeted language. On further investigation, I found that many of these classifications included just uses of English terms such as ‘hahahahaha’, ‘ahhhhhhh’ and ‘lololololol’.”
He added: “I don’t know much about Tagalog but it sounds like a fun language. Nevertheless, Tagalog was excluded from our analysis.”
The research was carried out by gathering more than three million tweets that were sent with a GPS location attached and then analysing them using Google’s translation tools. The results were then colour-coded by language to produce the map.
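As a rough illustration of that process, the Python sketch below takes tweets that already carry a location and a detected language, drops those made up only of laughter-style strings such as the ones mistaken for Tagalog, and assigns each remaining tweet a map colour. It is not the researchers’ code. The function names, the two-letter language codes, the grey fallback colour and the laughter heuristic are all assumptions made for the example, and the language detection step itself is not shown; only the four colours come from Dr Cheshire’s description.

```python
from collections import Counter

# Colours Dr Cheshire mentions for four languages. The two-letter codes are
# an assumption for this sketch; any language not listed falls back to grey.
LANGUAGE_COLOURS = {"tr": "blue", "ar": "green", "ru": "pink", "fr": "red"}


def looks_like_laughter(token):
    """Hypothetical heuristic: a long token built from at most two letters,
    such as 'hahahahaha', 'ahhhhhhh' or 'lololololol'."""
    word = token.lower()
    return word.isalpha() and len(word) >= 6 and len(set(word)) <= 2


def colour_points(tweets):
    """Return (lat, lon, colour) for every tweet that is not pure laughter.

    Each tweet is a (lat, lon, language_code, text) tuple whose language
    has already been detected by some other tool (assumed, not shown).
    """
    points = []
    for lat, lon, lang, text in tweets:
        tokens = text.split()
        if tokens and all(looks_like_laughter(t) for t in tokens):
            continue  # skip tweets consisting only of laughter-style tokens
        points.append((lat, lon, LANGUAGE_COLOURS.get(lang, "grey")))
    return points


def language_shares(tweets):
    """Return each language's percentage share of the given tweets."""
    counts = Counter(lang for _, _, lang, _ in tweets)
    total = sum(counts.values())
    return {lang: 100 * n / total for lang, n in counts.items()}
```

Passing the output of `colour_points` to any plotting tool would give a colour-coded scatter of tweets, while `language_shares` produces the kind of percentage breakdown quoted above.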
Since the data was gathered over the summer, Dr Cheshire wrote, “we can clearly see the many languages of the Olympic Park”.
Mr Manley added: “While languages you’d expect to score highly – such as Bengali and Somali – barely feature at all. Either people only tweet in English, or usage of Twitter varies significantly among language groups in London.”