As the semester at Harvard continues to hurtle by – already next week at this time I’ll be in Moscow for another conference – the pressure mounts on ways for me to re-think aspects of the book manuscript. Having spent many months transcribing lengthy reports, themselves filled with charts, tables, and maps about Soviet advisers’ work in provincial Afghanistan, how do I boil it all down? How do I add my own historical interpretation of Soviet development in the Third World to the narrative embedded in those documents? How do I turn the quantitative information in these reports into something useful for the 21st-century reader without merely recapitulating the argument made in the pages of the otchyoty (reports) that advisers from Komsomol wrote? These are all important questions not only of narrative, but also of design – how to visualize information (whether from the database or the archive) and then tie it in with the argument you’re making in prose.
Hence, I’m glad that the assignment for this week’s installment of Digital History was to read Edward Tufte’s Envisioning Information, an early 1990s manifesto arguing for several principles of design in data visualization. Tufte, as his Wikipedia page instructs, has a deep background in the subject of how social scientists, architects, historians, and, well, anyone, can present information in charts, graphs, and maps more effectively. Nor is this just a boutique field. As Tufte’s dissections of 1990s-era New Jersey Transit and Metro North train timetables show, information design is something we run into every day, whether we’re commuters en route to the city or consumers at a restaurant. As Tufte has pointed out, many of the design flaws in the Columbia space shuttle (which disintegrated upon return to Earth in February 2003) could have been avoided based on technical data available at the time – but a combination of bureaucratese and poor use of PowerPoint made it near-impossible even for the most eagle-eyed of NASA readers to figure out what was really going on even after a piece of debris created a large, and eventually fatal, hole in the Columbia’s wing. “Convinced that the reports [on the damage to the shuttle] indicated no problem rather than uncertain knowledge,” Tufte argues, “high-level NASA officials decided that the Columbia was safe, and, furthermore, that no additional investigations were necessary.” It was, arguably, a design mistake that cost seven people their lives.
No matter how much we historians puff our chests, the stakes in our own graphical design aren’t quite so high as that. Still, how we present our data in charts, tables, and other graphics matters greatly for how much we can persuade others. And at a time when historians still face inter-disciplinary pressure from (for example) quantitatively oriented political scientists, whose snazzy R graphics dazzle – even as policymakers find much of the actual quant research almost worthless – it’s important that we find a way to make the case for the interesting-ness and public (policy) value of our research as forcefully as possible. And that means good visual design.
So, what lessons can I (for one) take from Tufte’s gorgeous little manifesto? For my own work, I’d offer two observations, the first having to do with how I design my own graphics, and the second a meta-observation on the data regime (to ink a pomo phrase) of Komsomol and the Soviet Union writ large as it sought to remake institutions in Afghanistan (and perhaps the Third World more broadly). With respect to the first point, I am particularly struck by Tufte’s thoughts in Chapter Three of Envisioning Information on “layering and separation,” particularly in the context of my own amateurish attempts at mapping and historical GIS. “Effective layering of information is often difficult,” writes Tufte. “An omnipresent, yet subtle, design issue is involved: the various elements collected together on flatland interact, creating non-information patterns and texture simply through their combined presence.” Maps, in other words, create unintentional visual noise due to the way they’re constructed of different data layers. The historian-cum-mapper ought to think carefully about basic design decisions, such as the choice of base layers or the way we represent data (shadings, points, etc.). Otherwise she or he risks losing the reader’s attention – the most valuable asset she or he has to earn or lose.
A couple of maps from my own dissertation – if I may be permitted to risk losing the aforementioned reader’s attention right away – viewed through my newly critical post-Tufte eyes, show what works and what doesn’t. While working on the D.Phil. at Oxford, I had the good fortune of learning how to use MapInfo, a useful GIS tool that did everything I asked of it. I could make base maps of South and Central Asia to show elevation changes over the region.
As far as Tufte’s design principles go, I don’t think this is terrible, but there are still several issues here. Because MapInfo’s other default schemes for displaying elevation were malfunctioning at the time, I chose the color palette above. It’s, well, vivid, but it has a number of problems. Because it displays points by elevation (rather than also taking aridity into account), one gets the impression that the parched deserts of southern Afghanistan and southeastern Iran are in fact quite green, when the Persian names (“Desert of Death,” “Place of Sand”) are probably more accurate. The same applies to the lands to the west of the Indus River in what is Pakistan today – they’re dry, too, as is the coast of Baluchistan (the coastal region along the Arabian Sea). And yet it all appears lush. Not good. We also run into some trouble with the places I’ve included in the map: everything gets equal weight as a dot, regardless of whether it’s a national capital (Kabul, Islamabad …), a minor provincial city (Zahedan, the capital of Iranian Sistan and Baluchistan), or a geographical feature (the Khyber Pass or the Bolan Pass). In terms of my stated ambition at the time that I drew the map – to depict this part of Asia as one arena rather than the chopped-up “place in between” it usually gets depicted as, thanks to 20th century area studies optics – the map isn’t a total failure. I wouldn’t go so far as saying that there’s a “failure to communicate” here, but the map above is what it is: part of a dissertation written by a historian who dabbles in GIS, rather than the other way around (or, better yet, an excellent historian who’s also pretty good at GIS).
Look elsewhere in the dissertation, and things get – if not ugly – then not exactly pretty when it comes to other maps. Take this example above, a map I was trying to make of Paktia Province, a historically unruly border province in eastern Afghanistan that plays a prominent role in the dissertation as a place where West German agronomists and foresters carried out all sorts of developmental interventions. Here, however, we get a loud, obnoxious clash of colors between the topographic base map (which I think works on its own to give a general picture of the region, if you exclude the whole problem of depicting aridity), and the bright red which I, for some reason I can no longer recall, selected as the overlay to depict the location of Paktia. While the reader gets the most basic point – Paktia lies on the Durand Line – the low level of transparency of the overlay obscures a more subtle, but also essential point: the province consisted of valleys dominated by small cities. One, Gardez, was relatively easy to get to from Ghazni or Kabul, but the other, Khost (the capital of Paktia then, since spun off into its own province), lay in its own valley that was easier to reach from Pakistan’s Tribal Areas. That’s a really obvious but important point when it comes to West German foresters’ assertion about the “natural” flow of Paktia’s valuable cedar trees to Kabuli, not Karachi, markets, but the well-intentioned reader has no way of figuring that out from the map, since the opacity of the red overlay obscures the local geography. We get a general (if also visually loud and obnoxious) picture of where Paktia Province is, but the map ironically falls into the same pitfall that marked so much of 20th century development – an obsession with administrative borders and administrative overlays in a way that makes the overall presentation blind towards geography and the constructed nature of some borders (but especially that of the Durand Line).
I won’t beat myself up too much for this problem now – the D.Phil. had to be submitted, I had other things to do, I hadn’t read Tufte then … – but it’s important to grow mindful of these design problems if one is seeking to write digital (and analog) scholarship that sparkles.
More speculatively, one might also apply these design-centric criticisms to some of the source material I’m working with: the reports that VLKSM (Komsomol) operatives wrote from the field. There’s plenty of great information in a lot of the reports they wrote – the ethnic breakdown of the Afghan Communist Party’s youth groups, the professional breakdown of new recruits to the organization, prices for foodstuffs in local markets in Afghan Badakhshan, and so on. The problem, however, is that the information is often presented in what Tufte calls “flatland” – black and white charts in which information is too often constrained into the prison cells of the spreadsheet. Take the following randomly-selected table from a VLKSM report on the situation in a DOYA (Democratic Organization of the Youth of Afghanistan) cell from Fayzabad, Afghanistan from 1985. Overlook for a moment the foreignness of the table being in Russian and you’ll see, well, not much popping out from the page on what is supposed to be a chart of the reasons why DOYA’s membership numbers were sinking in the province during that year. (The top row depicts the names of the months of the Solar Hijri calendar in Afghanistan, which has different names from its Iranian counterpart. The chart runs from Akrab, which roughly corresponds to October, to Hamal, which ends in mid-April.)
There’s some interesting raw data going on here: we have reasons ranging from “removed from the registry” to “sent to the Armed Forces” to “KhAD” (the Afghan secret police) to “recommended for the NPDA [sic] (the PDPA, the Afghan Communist Party)”, “died,” and, at the bottom, “received [into DOYA].” Yet the picture of what’s going on in the organization (a stagnation in recruiting figures, the fact that DOYA functioned as a funnel for young Afghans into state institutions like the Army, KhAD, and the Sarandoy, the Ministry of the Interior’s police force, not to mention the disturbing fact of nine children being killed) isn’t so clear. One can’t be too tough on the men who wrote these reports, operating as they were from ultra-remote locations in a war zone, seldom equipped with anything more than a pen and paper. The fact that the reports made it from the field to Kabul, and thence to Moscow, is impressive enough – one reason why I’m glad to work with these documents. Yet maybe we ought also to probe the ways in which VLKSM had become stuck in ineffective data design. What trends go undetected, or are insufficiently highlighted in the above presentation? Given the vast amount of paperwork that VLKSM churned out – a blessing and a curse – there’s no shortage of such charts to break down. Having read Tufte and (hopefully) acquired better design eyes now, it’s with added scrutiny that I’ll turn over the pages of these reports as I continue to slog through mucho writing, re-writing, and editing of what I hope is a project that not only reads well, but sets a high bar for “envisioning information.”