Thursday, October 31, 2013

Analysis of Georgia's Presidential Election

Last Sunday, Georgia held presidential elections that were generally praised as competitive, free, and fair by international observers (see, for example, the OSCE preliminary report, PACE statement, IRI press release, and NDI preliminary statement).* Analysis of the data was delayed by the CEC's initial decision to post only images of protocols and not the raw results in easily readable electronic form (it appears that full results are now posted on the official site along with the protocol images). JumpStartGE set up a crowdsourcing effort to convert the images to usable data, and data entry was completed a few days after the election.** The data include results from 3,655 polling stations in Georgia and 52 outside of the country.

Not only is it valuable to assess Georgia's data on their own, but it is also instructive to compare them with other elections. Several indicators in Georgia differ from elections held in the South Caucasus region this year that were assessed less favorably by the observer community.

The election protocols provide three time points for turnout: noon, 5pm, and final turnout.*** The distributions of the polling station data are displayed below. Turnout is near-normal in its distribution in all three time periods, with the mean shifting right and variance increasing over time. Notably absent is the "hump" in the right tail (which I pointed out as suspiciously present in Azerbaijan's data).

Distribution of Turnout, Georgia PECs
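For readers who want to reproduce this kind of summary, here is a minimal sketch of how turnout could be computed at the three reporting points. It is not the code used for this post: the station rows are made-up numbers, the column layout is an assumption, and the only rule taken from the analysis is that stations reporting more than 100% turnout are set aside.

```python
from statistics import mean, pvariance

# Hypothetical rows: (registered voters, voted by noon, voted by 5pm, final voters).
stations = [
    (1200, 180, 420, 560),
    (800, 150, 300, 390),
    (1500, 200, 500, 700),
    (400, 90, 200, 410),   # reports more voters than registered
    (950, 100, 310, 450),
]

def turnout_summary(rows):
    """Mean and variance of turnout at each time point, dropping stations over 100%."""
    kept = [r for r in rows if r[3] <= r[0]]
    summary = {}
    for label, col in (("noon", 1), ("5pm", 2), ("final", 3)):
        shares = [r[col] / r[0] for r in kept]
        summary[label] = (mean(shares), pvariance(shares))
    return summary, len(rows) - len(kept)

summary, dropped = turnout_summary(stations)
for label, (m, v) in summary.items():
    print(f"{label}: mean={m:.3f} variance={v:.4f}")
print("stations dropped for >100% turnout:", dropped)
```

The same per-station shares are what one would pass to a histogram routine to draw the distributions.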
Comparing turnout to candidate performance suggests moderate tendencies for Margvelashvili to perform better at higher turnout levels, and the opposite for Bakradze (Burjanadze's data suggest no trend). As I noted in the post about Azerbaijan, this outcome could be produced by legitimate or illegitimate mobilization, or other methods. However, the data do not reveal other markers of engineering, notably polling stations with perfect attendance and complete, or near-complete, support for a single candidate. Variation in performance is also reasonably wide, whereas it was much more limited for the leading candidate in Azerbaijan.**** In the Armenian case, the slope was more pronounced, and outcomes favored the pro-regime candidate. While Margvelashvili won a substantial victory, his performance varied across polling stations.

Proportion of the Vote by Turnout
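The "markers of engineering" check described above can be sketched as a simple filter. This is an illustration under assumptions, not the procedure used here: the 95% cutoffs, the row layout, and the station data are all hypothetical.

```python
# Hypothetical rows: (station id, registered, ballots cast, leader's votes, valid votes).
stations = [
    ("A", 1000, 480, 300, 470),
    ("B", 600, 590, 310, 580),   # very high turnout, but a contested result
    ("C", 250, 250, 249, 250),   # perfect turnout and near-unanimous support
    ("D", 900, 410, 400, 405),   # near-unanimous support, ordinary turnout
]

def flag_suspicious(rows, turnout_cut=0.95, share_cut=0.95):
    """Return ids of stations where high turnout and one-sided results coincide."""
    flagged = []
    for sid, registered, cast, leader_votes, valid in rows:
        if registered == 0 or valid == 0:
            continue  # nothing to compute for empty stations
        turnout = cast / registered
        share = leader_votes / valid
        if turnout >= turnout_cut and share >= share_cut:
            flagged.append(sid)
    return flagged

flagged = flag_suspicious(stations)
print("stations to inspect:", flagged)
```

A flagged station is only a candidate for closer inspection; small or special precincts can produce such results legitimately.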

Effects of Turnout, Invalid Ballots, and Polling Station Features
Regressing the results for the three main candidates on polling station-level explanatory variables can show how multiple features are associated with performance. I used the proportion of vote received by Margvelashvili, Bakradze, and Burjanadze as dependent variables, and turnout, the proportion of invalid ballots, the natural log of polling station size, and participation by "special voters" as explanatory variables. As noted elsewhere, turnout could affect performance because of legitimate mobilization or other factors. Ballot invalidation should not be associated with candidate performance as one would not expect it to be systematically related to a candidate but rather random errors by voters.***** Polling station size could matter in a couple of ways. Smaller polling stations are more amenable to pressure on voters as officials are more likely to know individuals and they are also more likely to be in village settings (or special precincts). However, polling station size could also serve as a proxy for rural/urban location which could be connected to legitimate variation in candidate support. Special voter participation is indicative of mobile voting, and these voters are potentially vulnerable to coercion. Like station size, an alternate explanation is that special voters also serve as a proxy for the elderly and disabled who are more likely to request special conditions (and these features could also be related to legitimate candidate support). In short, the variables may have more or less benign interpretations associated with them.
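To make the model specification concrete, here is a small sketch of an ordinary least squares fit with the four explanatory variables named above. Several things are assumptions: the station values are invented, the outcome is built from made-up coefficients (so the fit can be checked against them), and the solver is a plain textbook normal-equations routine. It covers only the standard OLS case, not the robust or tobit variants.

```python
import math

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical stations: (turnout, invalid share, registered voters, special-voter share).
stations = [
    (0.45, 0.010, 300, 0.02),
    (0.52, 0.020, 1200, 0.00),
    (0.61, 0.015, 800, 0.01),
    (0.38, 0.030, 1500, 0.04),
    (0.70, 0.005, 150, 0.00),
    (0.55, 0.025, 600, 0.05),
    (0.48, 0.012, 2000, 0.03),
    (0.66, 0.018, 450, 0.01),
]
TRUE_B = [0.50, 0.20, -1.00, 0.01, 0.00]  # made-up coefficients used to build the outcome

# Design matrix: constant, turnout, invalid share, ln(station size), special-voter share.
X = [[1.0, t, inv, math.log(size), sp] for t, inv, size, sp in stations]
y = [sum(b * x for b, x in zip(TRUE_B, row)) for row in X]  # synthetic vote share

coefs = ols(X, y)
for name, b in zip(["const", "turnout", "invalid", "ln(size)", "special"], coefs):
    print(f"{name:>9}: {b: .4f}")
```

Because the synthetic outcome has no noise, the fitted coefficients should match the made-up ones; with real protocol data one would also want standard errors, which this sketch omits.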

Because the data include outliers, I assessed the results in several ways (standard OLS with and without the outliers, robust regression, and tobit). The significance of the coefficients and their signs did not vary across the models (except in one case noted below). For Margvelashvili, final turnout and the log of polling station size are positively associated with performance; the proportion of invalid ballots is negatively associated with performance. Special voting is not statistically significant.

For Bakradze, turnout is negatively associated with performance and the proportion of invalid ballots is positively associated with performance. Polling station size is negatively associated with performance, but is not statistically significant in two of the models. In the Burjanadze models, only polling station size had a statistically significant coefficient, and it was negatively associated with performance.

The substantive effect is not particularly large for any of the coefficients in the assessment of Margvelashvili's vote (and the model included outliers). The first figure below shows the predicted effects of invalid ballots on the expected votes for Margvelashvili. While an increase in invalid ballots is associated with lower levels of performance, the upper end of the range is unlikely to occur. The mean outcome for invalid ballots at the polling station level was 1.8% (s.d. 1.7), with a range of 0 to 49.6% (The high end is an outlier worthy of further investigation).

Predicted Outcome for Margvelashvili, Varying Invalid Ballots

The second figure shows the predicted outcome for Margvelashvili as the natural log of polling station size varies. Logging flattens the results, but nevertheless, polling station size has a small effect on outcomes. I did not include a figure showing the effects of turnout on outcomes in this post because those effects are equally unimpressive substantively.
Predicted Outcome for Margvelashvili, Varying Polling Station Size
In short, while some features that could raise eyebrows are statistically related to candidate performance, their substantive effects are small.

Distribution of Digits
I have noted in previous posts that assessing the distribution of digits can be instructive in uncovering anomalous results. The distribution of the final two digits exceeds expectations from 1-1 on up, declining to around 4-0 where it returns to the expected range. The magnitude of the discrepancy (with 3% or so of the results at 1-1 whereas we would anticipate around 1%) is lower than in Azerbaijan (where 1-0 was over 7% of the results).

Distribution of the Last Two Digits
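A rough sketch of the tally behind a last-two-digits figure follows. The vote counts are invented, and the handling of counts below 10 (skipped here, since they have no second digit) is an assumption rather than something stated in the analysis. Under a uniform expectation, each of the 100 possible pairs should appear in about 1% of results.

```python
from collections import Counter

def last_two_digit_shares(counts):
    """Share of each last-two-digit pair among counts of 10 or more."""
    usable = [c for c in counts if c >= 10]
    tally = Counter(c % 100 for c in usable)
    return {pair: n / len(usable) for pair, n in tally.items()}, len(usable)

# Hypothetical candidate vote counts from polling station protocols.
votes = [110, 211, 311, 47, 5, 1011, 98, 310, 456, 11, 702, 89]

shares, n = last_two_digit_shares(votes)
expected = 1 / 100  # uniform expectation for each of the 100 pairs
for pair, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    tens, ones = divmod(pair, 10)
    print(f"{tens}-{ones}: {share:.3f} (expected about {expected:.2f})")
```

With only a dozen counts the shares are meaningless; the comparison against 1% only becomes informative with thousands of station-level results.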

The international community has praised the election process while noting areas for improvement. The initial assessment of data aligns with these observations. Many of the troubling signals from polling station data in other elections seem to be absent, or at a lower level of magnitude, in Georgia.
  • Turnout data approach a normal distribution, with the mean and variance changing over time in ways that reasonably conform with "normal" election processes.
  • While some candidates perform better/worse at higher levels of turnout, the data are widely dispersed and do not show suspicious results where perfect (or near-perfect) turnout and vote outcomes for a single candidate converge. Moreover, in the multivariate analysis, the substantive effect of turnout was small.
  • Other features, such as the proportion of invalid ballots and polling station size, are associated with performance for some candidates. But, the effects are substantively small.
  • The distribution of digits shows some evidence of anomalous outcomes, but the scale of the anomalies is not large.
Manipulation and fraud occur in most elections, and it would be an overstatement to indicate that data reveal no evidence of anomalies. However, the data do not present markers of large-scale, systematic fraud, suggesting that any problems that might have been present are likely to have been sporadic and not decisive.


* The organizations made recommendations for improvements and expressed concerns as well, but the overall sentiment was positive.
** I participated in the effort, entering data for around thirty polling stations.
*** I restricted this part of the analysis to polling stations reporting 100% or less turnout. Ten polling stations reported higher than 100% turnout. Two were overseas stations and two were small (under 50 voters). The remaining six deserve additional scrutiny. It is likely that absentee certificate use elevated turnout in at least some cases.
**** Recall that Ilham Aliyev received over 50% of the vote in every polling station. Margvelashvili's results are more widely dispersed.
***** We could come up with a causal story, relating voters more likely to cast invalid ballots (less educated and/or older) with a specific candidate. But, this causal chain requires a bit more evidence than can be mustered with polling station data.

Wednesday, October 16, 2013

A Few Thoughts on Azerbaijan's Presidential Election

As expected, President Ilham Aliyev was awarded a third term in office after Azerbaijan's Central Electoral Commission announced his convincing victory (officially garnering 84.55% of the vote). The OSCE issued a strong statement challenging the democratic qualities of the process, which the CEC deemed an "insult." Delegations from the CIS, Pakistan, and others indicated that the process was democratic. PACE statements came under scrutiny, especially on Twitter and among bloggers, with Rebecca Vincent's excellent Al Jazeera post noting the differences in professional credentials between OSCE and PACE delegations. Several other bloggers, including Arzu Geybullayeva and Katy Pearce and Farid Guliyev at the Monkey Cage, provided valuable accounts of the election process and its implications. A PR firm even weighed in at the Monkey Cage to express concerns about the lack of attention directed to Azerbaijan's "progress" in its elections.

In this post, I return to a few questions that I posed on election eve and provide a preliminary assessment of the data. OSCE reports, which tend to be based on the most thorough data collection efforts among international observation organizations,* suggest that fraud was evidenced among officials in the election apparatus, and also among citizens via vote buying and other forms of improper voting.

Turnout not only provides a sense of citizen participation, but can also provide some evidence of malfeasance. If ballot boxes are being stuffed, or voters are being directed in "carousels", turnout will be inflated. The official turnout at the national level was 72.31%, according to the CEC. The average turnout at the polling station level was 74.4%. Of the 5,492 polling stations, 68 (1.2%)  reported 100% turnout and 177 (3.2%) reported 95% or higher turnout. These results are consistent with the 2008 presidential elections.
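The official national figure and the polling-station average need not match. Assuming the national figure pools all voters and all registrations, it weights large stations more heavily, while the station-level average treats every station equally. The sketch below uses invented stations to show the difference, along with the share of stations at or above a turnout threshold.

```python
# Hypothetical stations: (registered voters, ballots cast).
stations = [(2000, 1300), (1500, 1000), (300, 290), (200, 200)]

def pooled_turnout(rows):
    """Total ballots over total registered voters (size-weighted)."""
    return sum(v for _, v in rows) / sum(r for r, _ in rows)

def mean_station_turnout(rows):
    """Unweighted average of station-level turnout."""
    return sum(v / r for r, v in rows) / len(rows)

def share_at_or_above(rows, cut):
    """Proportion of stations whose turnout is at least `cut`."""
    return sum(1 for r, v in rows if v / r >= cut) / len(rows)

pooled = pooled_turnout(stations)
average = mean_station_turnout(stations)
print(f"pooled turnout:       {pooled:.4f}")
print(f"mean station turnout: {average:.4f}")
print(f"share with 100%:      {share_at_or_above(stations, 1.0):.2f}")
print(f"share with >= 95%:    {share_at_or_above(stations, 0.95):.2f}")
```

When small stations report higher turnout than large ones, as in this toy data, the unweighted average sits above the pooled figure.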

Turnout Across Election Day Reporting Periods, CEC Data

The figure above shows the distribution of turnout across all polling stations at each reporting period on election day. At 10:00 a.m. and at noon, the distributions are bimodal, with the major modes centered around 20% and 40%, respectively, and minor modes around ten percentage points lower. As the day progresses, the distributions smooth out. The most interesting transformation is between 5:00 p.m. and the close of the polling stations at 7:00 p.m. The mean at closing is slightly higher than at 5:00 p.m., but variance increases and a "hump" appears close to 100%. Could all of these changes happen naturally? Many of them could, especially if voters continued to come to the polls throughout the day. My cursory scan of webcams in the late afternoon showed little activity, but I was not able to systematically assess traffic. In principle, voters could have continued coming to the polls, altering the distribution and generating polling stations with near-perfect attendance. But, the "hump" near 100% is suspicious as the tail should be much smaller.

Proportion of the Vote by Turnout, Aliyev and Hasanli

If we look at the distribution of votes for Ilham Aliyev and his main challenger, Camil Hasanli, alongside turnout at the polling-station level, Aliyev tends to perform better where turnout is higher. In an earlier post on Armenia's election, I noted that the relationship between turnout and performance was quite striking: the regime's preferred candidate secured a higher vote proportion at higher levels of turnout whereas the main challenger's totals were smaller at higher levels of turnout. The results are positively correlated for Aliyev (0.334, significant at the .001 level) and negatively for Hasanli (-0.050, significant at the .001 level). These outcomes could be explained by strong mobilization efforts or by illicit actions. It is especially notable that President Aliyev did not receive under 50% in any polling station.
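A correlation as small as -0.050 can still be statistically significant when the number of polling stations is large. The sketch below computes a Pearson correlation on invented data and then the usual t statistic, t = r * sqrt((n - 2) / (1 - r^2)), for the two reported correlations. Using n = 5,492 (the number of polling stations cited earlier) is an assumption about the sample behind those figures.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r = 0 with n observations (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical station-level turnout and vote share for one candidate.
turnout = [0.55, 0.62, 0.70, 0.81, 0.93]
share = [0.71, 0.69, 0.80, 0.84, 0.95]
r = pearson_r(turnout, share)
print(f"toy correlation: {r:.3f}")

# Reported correlations, with the assumed number of stations.
for reported in (0.334, -0.050):
    print(f"r={reported:+.3f}  t={t_statistic(reported, 5492):.2f}")
```

Under that assumption the t statistics come out at about 26.3 and -3.7, both beyond the two-sided .001 threshold of roughly 3.29 for large samples, even though the second relationship is substantively weak.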

Invalid ballots are related to turnout, especially if one adopts a more sinister interpretation of high turnout. If ballot boxes are being stuffed, they are not being stuffed with invalid ballots, but rather those that help the preferred candidate. Just over 40% of the polling stations in Azerbaijan reported no invalid ballots. While no research precisely identifies the "natural" level of invalidation in a democratic election,** it is not uncommon for voters to make mistakes rendering their ballots invalid. The absence of invalid ballots in 40% of polling stations is notable.
Election forensics rely on expectations about the distribution of digits in naturally occurring data, and assumptions about human behavior,*** to provide some insights into anomalous results. My colleague Fredrik Sjoberg posted the following figure on Twitter, noting that the distribution of last digits suggests human intervention. The general expectation is that the last digits should be uniformly distributed, but zeros are inflated in the data from Azerbaijan.
Fredrik Sjoberg's Last Digit Analysis via Twitter @fsjoberg
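One common way to formalize the "uniform last digit" expectation is a chi-square goodness-of-fit statistic. The sketch below is my own illustration on invented counts, not the method behind the figure above. With ten digit categories there are 9 degrees of freedom, and the 5% critical value is 16.919.

```python
from collections import Counter

CHI2_CRIT_DF9_05 = 16.919  # 5% critical value for 9 degrees of freedom

def last_digit_chi2(counts):
    """Chi-square statistic comparing last-digit frequencies with a uniform expectation."""
    tally = Counter(c % 10 for c in counts)
    expected = len(counts) / 10
    return sum((tally.get(d, 0) - expected) ** 2 / expected for d in range(10))

# Hypothetical vote counts: one perfectly even set, one with extra zeros appended.
uniform = list(range(100, 200))                       # each last digit appears 10 times
inflated = uniform + [10 * k for k in range(20, 70)]  # 50 extra counts ending in 0

for label, data in (("uniform", uniform), ("inflated zeros", inflated)):
    stat = last_digit_chi2(data)
    verdict = "reject uniformity" if stat > CHI2_CRIT_DF9_05 else "consistent with uniform"
    print(f"{label}: chi2={stat:.1f} -> {verdict}")
```

The uniform expectation is less reliable for very small counts, so in practice one might restrict the test to counts above some minimum size.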
If we look at the last two digits, the expectations and results are similar. The last digit combination of 1-0 is especially high given expectations (whereas we would anticipate 1-0 to appear in around 1% of the results, it instead appears around 7% of the time).

Distribution of the Last Two Digits
This outcome is likely driven, at least in part, by the marginal candidates who receive just a few votes in any given polling station. Eyeballing the data, one can notice some outcomes that appear to be peculiar, such as the frequency with which minor candidates receive the same results in some regions. For instance, if one looks at the data from District 1 in Nakhchivan, the consistency of the vote across many polling stations is notable, especially where the final three candidates receive 10, 11, and 6 votes. This is essentially an anecdotal presentation of some data, and it could be produced by chance or nature (humans seek out patterns, and randomness can produce outcomes that appear to be patterns). However, it is also notable that the results are consistent across several polling stations where the total number of votes varies - in other words, the primary variation in outcomes is the amount allocated to Hasanquliyev and Aliyev (Hasanli, the opposition candidate, receives few votes here). 

Polling Station Results for District 1 in Nakhchivan
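Rather than eyeballing, repeated result patterns for minor candidates can be counted directly. This is a minimal sketch with invented rows; the choice of which candidates count as "minor" and the repeat threshold are assumptions.

```python
from collections import Counter

# Hypothetical votes for the final three candidates at each polling station.
minor_results = [(10, 11, 6), (10, 11, 6), (3, 7, 2), (10, 11, 6), (5, 5, 1)]

def repeated_patterns(rows, min_repeats=2):
    """Return each vote pattern that appears in at least `min_repeats` stations."""
    tally = Counter(tuple(r) for r in rows)
    return {pattern: n for pattern, n in tally.items() if n >= min_repeats}

for pattern, n in repeated_patterns(minor_results).items():
    print(f"pattern {pattern} appears in {n} stations")
```

A repeated pattern is not proof of anything on its own, since low counts repeat by chance, but it identifies where to look more closely.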
Nakhchivan is a special place as I have noted before, and has produced unusual election results. But, a deeper spatial analysis will have to wait. My current map of Azerbaijan's districts that I developed for the last election is no longer accurate since some of the district areas have been re-drawn. I hope to have a chance to develop a new shapefile to display some of the data spatially, but it must wait for other projects to be completed first.

What are the take-aways from the preliminary data?
  • Election day turnout displays some unusual outcomes, notably the "hump" at the tail of the distribution. The distribution shows an elevated concentration of polling stations with perfect or near-perfect turnout.
  • Higher turnout is associated with a higher proportion of the vote for President Aliyev, and with poorer outcomes for Hasanli. This is consistent with mobilization and/or illicit methods.
  • Election forensics show that the last digit and last two digits are not distributed as anticipated (with the expectation being a uniform distribution). These outcomes are consistent with human intervention.
An important caveat with any assessment of election quality is that no single test constitutes a "smoking gun." However, unusual outcomes are consistent with the explanation offered by the OSCE's report that presents evidence of various forms of fraud.

* The OSCE generally has delegations large enough to gather samples across a country's territory, conducts training so that observer activities should be relatively consistent, and has a thorough questionnaire that provides the data used for the assessment. Observers are not deployed randomly, however, and the convenience sample may over-represent urban areas and problematic polling stations.
** Invalidation rates depend on many factors, including ballot design and the electoral system.
*** Most notably, the expectation is that humans are not especially talented at falsifying data. Instead of creating digits randomly, they fall into patterns that suggest manipulation. I have discussed Benford's Law and its applications in previous posts, and the basics apply here as well.

Thursday, October 10, 2013

OSCE Assessment of Azerbaijan's Presidential Election

At the OSCE press conference in Baku earlier today, the organization issued a negative report citing widespread problems on election day and during the vote counting process. Reports indicate that the press conference was disrupted by hecklers challenging the findings (including a candidate getting involved). Some of the OSCE's observations included:
  • Improper opening procedures in almost 20% of observed precincts.
  • Improper vote counting procedures in 58% of observed precincts.
  • Ballot box stuffing in 37 observed precincts.
  • Voters not checked for invisible ink in 19% of observed precincts, voters not inked in 11% of observed precincts, and voters "inked" by another station casting additional ballots in 7 observed precincts.
  • Voter lists or results protocols altered in 15 observed precincts.
Even observers from afar could find evidence of polling stations that did not seem to be following standard operating procedures. The screencap above is from a CEC webcam feed that I was watching, showing a ballot box open and ballots being counted 10 minutes after the polls closed. Many steps must be taken before ballot boxes may be opened and this commission (along with others I observed online) clearly took shortcuts.

In my assessment of the last presidential election (published in Comparative Political Studies), I noted that the data suggested evidence of "true" votes recorded by citizens, as well as the intervention of officials in manufacturing the results. Activities seemed to be decentralized, but resulted in the regime's preferred outcome. The process seems to have been even more uncoordinated during this election.

Tuesday, October 8, 2013

Re-Election Eve for President Aliyev

In just a little while, polls will open in Azerbaijan for an election that incumbent President Ilham Aliyev will win. Five years ago, I was in Baku for the 2008 election and my election eve observations are essentially unchanged from that time. The questions that will be interesting to resolve are:

1) What will be the level of recorded turnout (as well as its temporal and spatial distribution)?
2) How will results vary spatially for the losing candidates?
3) Will forensics tests reveal any anomalies, and if so, what types of anomalies?

My friend and blogger Arzu Geybullayeva (also Meydan TV) reported a couple of hours ago about a mobile app produced by the Central Electoral Commission that has received attention because it purportedly reveals the results of tomorrow's election (screencap below from Arzu's blog). Some reports have indicated that the app not only provides the national totals for candidates, but also detailed results about other aspects of the vote.

I have some lingering skepticism that this is a smoking gun for a couple of reasons. First, if the results were predetermined at this level of detail (especially if the additional reports are accurate), it would require substantial back-end "cleaning" efforts to ensure that results are "properly" reported tomorrow. This level of coordination did not seem to be present in the last presidential election. Second, it would be an inefficient way of winning. It is likely that President Aliyev will receive more legitimately cast votes than his opponents, especially due to the carefully controlled choice set. Tomorrow's vote will likely provide a strong baseline from which to manufacture any needed additional support in polling stations, district commissions, or at higher levels, to top off the win (using bureaucratic fraud, or other methods).

The CEC's Infocenter will be posting information on election day and afterward. Webcams are also online.

Thursday, October 3, 2013

20th Anniversary of a Game-Changing Constitutional Crisis

Twenty years ago, Russia faced a turning point in its nascent democracy. Following the collapse of the Soviet Union at the end of 1991, Boris Yeltsin embarked on economic reforms, putting aside political reforms for the future. Legislative institutions were still dominated by deputies elected during the Communist era, and they opposed most of Yeltsin's economic agenda. Over the course of 1993, conflict escalated between the executive and legislative branches* and encompassed not only economic decisions but basic questions of constitutional order and the rule of law.

From October 2-4, 1993, the crisis escalated: legislators occupied the Russian "White House", attempted to take control of the main television tower, and engaged in firefights in the streets of Moscow. On October 4, government forces attacked the White House with tanks and special troops, leading to a victory for Boris Yeltsin. But, the win was Pyrrhic for democracy, as Yeltsin set the precedent for extra-constitutional means to trump constitutional provisions.** He could have embarked on wide-ranging political reforms in the early days after the USSR's demise. But, Yeltsin did not recognize how important it was to create democratic political institutions and abide by them, nor did he recognize how political and economic institutions are intertwined in a democratic society.

RFE/RL has a retrospective of the main players and the implications of this tragic event.

* The Russian Constitutional Court notably supported the views of the legislature, even ruling unconstitutional measures proposed by Yeltsin in a public speech.

** At the time, the communist and nationalist affiliations of the legislators were "proof" that they were the "bad guys" and that Yeltsin, who defied Soviet authorities and helped bring about an end to the USSR, was the "good guy."