Nikita Lumijoe

Refugees, media, and violence

Updated: Sep 16

As the number of the world’s displaced people reaches a historic high, it is important to reflect on the factors that influence displaced persons’ peaceful cohabitation with host populations around the world. In our research, we are examining whether media coverage of refugee-related issues helps explain civil violence against refugees. The image below helps to visualize the relationship between refugee population, civil violence, and media tone.


The illustration (GIF) is based on data from two projects: Political and Societal Violence By And Against Refugees (POSVAR) and the Global Database of Events, Language, and Tone (GDELT).

Countries on the map are shaded to reflect the refugee population as a percentage of the host country's population. Countries with a higher percentage are darker, and countries with a lower percentage are lighter.
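
As a rough illustration of the quantity behind the shading, the percentage can be computed as below. This is a minimal sketch: the function name and the example figures are hypothetical and are not taken from POSVAR or from our data.

```python
def refugee_share_percent(refugees: int, host_population: int) -> float:
    """Return refugees as a percentage of the host country's population."""
    if host_population <= 0:
        raise ValueError("host_population must be positive")
    return 100.0 * refugees / host_population


# Hypothetical figures, chosen only to illustrate the calculation.
example_share = refugee_share_percent(40_000, 50_000_000)
print(f"{example_share:.2f}%")
```

A darker shade on the map corresponds to a larger value of this percentage.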

POSVAR offers a unique collection of records of refugee-related violence around the world from 1996 to 2015. POSVAR codes violence against refugees in four categories: 0 (no violence); 1 (1-25 victims); 2 (26-99 victims); and 3 (100+ victims, or systemic violence). On the map, the size of the red dots corresponds to these values.
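
The four categories can be expressed as a simple mapping from victim counts. The sketch below is our own illustration, not POSVAR code: the function name is hypothetical, and the `systemic` flag is an assumed way to represent cases that fall into category 3 regardless of the victim count.

```python
def posvar_category(victims: int, systemic: bool = False) -> int:
    """Map a victim count to the 0-3 scale described above.

    `systemic` is an assumed flag for cases coded as category 3
    regardless of the number of victims.
    """
    if victims < 0:
        raise ValueError("victims cannot be negative")
    if systemic or victims >= 100:
        return 3
    if victims >= 26:
        return 2
    if victims >= 1:
        return 1
    return 0


# Boundary values of each category.
for count in (0, 1, 25, 26, 99, 100):
    print(count, posvar_category(count))
```

Because the scale is ordinal, the gap between categories 1 and 2 is not the same as the gap between 2 and 3.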

The last attribute of the illustration is the numbers, which show the average tone of global articles (print, broadcast, and web) about refugees per country per year. GDELT generates these values by using artificial intelligence algorithms to code media articles for “tone” based on the count of certain words from precompiled dictionaries (Leetaru, 2011). Words like “horrible” and “bad” decrease the tone score, whereas words like “wonderful” and “good” increase it (ibid.). GDELT operationalizes tone on a scale from -100 to 100, where 100 is extremely positive and -100 is extremely negative. The tone of most articles globally varies between -10 and 10 (GDELT, 2013).
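
To make the idea of dictionary-based tone concrete, here is a toy sketch. It is not GDELT's algorithm: the two word lists contain only the example words mentioned above, and the formula (positive minus negative matches, as a percentage of all words) is an assumption chosen so that the result falls between -100 and 100.

```python
import re

# Tiny illustrative dictionaries; real tone dictionaries are far larger.
POSITIVE_WORDS = {"wonderful", "good"}
NEGATIVE_WORDS = {"horrible", "bad"}


def tone_score(text: str) -> float:
    """Return (positive - negative) word matches as a percentage of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    return 100.0 * (positive - negative) / len(words)


print(tone_score("The conditions in the camp were horrible"))
```

Under this formula a long article with only a few emotionally loaded words scores close to zero, which fits the observation that most articles fall in a narrow band around 0.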

Media Tone and Violence Against Refugees

This work is ongoing, but we have found that the relative size of the refugee population is not always a good predictor of refugee-related violence. Intuitively, a higher proportion of refugees creates more opportunity for anti-refugee violence. Nevertheless, preliminary statistical modeling of the data shows that high rates of anti-refugee violence can sometimes occur in countries with a lower proportion of refugees.

For instance, in South Africa, where the average relative size of the refugee population (0.08%) is lower than the world average (0.34%), we observe the highest rates of anti-refugee civil violence (1.2, where the mean is 0.06). Here, the relative size of the refugee population alone cannot explain the civil violence against refugees, so there are other factors we need to explore.

From our team’s field observations, we know that public sentiment is a major driving force behind host populations’ responses to refugees. As we look for a way to measure this, we’ve turned to media sentiment as a potential proxy for public sentiment.

Statistical modeling of GDELT and POSVAR data shows that more negative media tone about refugees is associated with higher rates of anti-refugee violence, especially for lower-level, non-systemic violence. One example from the data is Europe, where rates of anti-refugee violence increased sharply in concert with a drop in the tone of articles about refugees.

Study Limitations

While the data provide some preliminary insights, there are limitations that we are exploring further. POSVAR is unique in its global coverage, but limited in its categorical classification of violence, which makes it difficult to disaggregate or accurately scale for population size. Additionally, it only covers 1996-2015, which limits how much data we can use from other datasets.

This time frame also restricts us to the original version of GDELT; since 2015, GDELT has been updated to use more precise algorithms for coding and sentiment, which we cannot take advantage of within the POSVAR time frame. GDELT also relies on geolocating the country of the event but does not give the origin of the source. In other words, an event that happened in South Africa and was described in the New York Times is geographically linked to South Africa and therefore influences the media tone score for South Africa. It is not currently possible to filter the GDELT data to reflect only local news sources on events.

Moving Forward

As we continue this work, we are focusing on a deeper case study of South Africa, which ranks high on a global scale for both civil and government violence against refugees in the POSVAR dataset. By looking at the articles that drive the average sentiment up or down in the South Africa subset, and where those articles come from, we will have a better sense of how to interpret both the GDELT and POSVAR data as potential ways to understand public sentiment.

Ultimately, we hope that this leads to quantifying at least one social factor that contributes to understanding how host communities receive and respond to refugees.


Leetaru, K. (2011). Culturomics 2.0: Forecasting large-scale human behavior using global news media tone in time and space. First Monday, 16(9).

GDELT (2013). Data Format Codebook V 1.03. URL: www.gdeltproject.org. Accessed: August 18, 2020.

