Tuesday, June 26, 2007

Professional writing in political science

Here is another useful article about writing in political science - reading highly recommended!

Professional Writing in Political Science: A Highly Opinionated Essay

This essay is a compendium of reactions to student writing over a long career, the kinds of ideas that appear as notes in critiques of numerous papers, articles, theses, and, especially, dissertations. It is a set of principles and guidelines for how to turn the product of political science research into something readable. Or, to put it in the negative, it is a set of guidelines for how not to have your work rejected because it is dragged down by the quality of your writing.

1. Attitude

Writing is hard. At its best it is bringing intelligence—to the limit of what we have—toward clarifying a world that is messy in its natural state. None of us do it as well as it could be done. It is a shame that writing is captured by the humanities and taught by those with humanistic inclinations, because the artsy idea that it is “expressive” gets in the way of understanding that it is the application of highly disciplined intelligence.

Writing is thinking. The idea that one might have good ideas but be unable to express them is, I think, wrong. If you can’t express an idea, you haven’t had it. When your prose is muck, it is delusion to believe that you have a clear understanding of a topic. If you’d had it, you could have written it.

Writing is hard also because what you know about your topic is radically different from what a reader knows and also from what he or she wants to know. To succeed in communication, you must figure out what the reader knows and wants to know. That means developing the discipline of reading your own words, from beginning to end, and simulating the reality in the reader’s mind of knowing only what can be known from your words, and in the order you have written them.

(Bad) writers often assert, “I know what I want to say and if the reader can’t figure it out, that is his or her problem!” Wrong. It is your problem. Readers who can’t follow an argument usually conclude that its author is not very smart. And, having been both author and reader, I think readers are right. So, if you have this attitude, that figuring out what your prose means is the reader’s problem, change it or prepare yourself for professional failure.

2. Structure: The Kosher Principle

Part of the rules of keeping kosher is that certain foods ought never to touch one another, that contact contaminates one or both. I have long thought that this idea is a useful principle for good scientific writing, that an exposition contains logically quite different features which ought never to touch one another because doing so contaminates their logical clarity in the reader’s mind.

I strongly suggest as a strategy for writing that the segments of a paper (problem, literature, theory, design, analysis) be written in kosher form: theory is never discussed in the same section, much less paragraph, as literature or design or analysis. Running them together confuses their logical structure in the minds of readers, a confusion probably caused by the even more dangerous confusion in the mind of the author.

Sometimes this confusion is intentional. Authors who know that they have no original theory like to write sections called “Theory” which are in fact literature review. Having many words to say about other scholars’ theories covers up the absence of original ideas in the current document.

While I don’t like the mechanical aspect of having sections named, “Problem,” “Literature,” “Theory,” and so forth, I do think that this initial structure ought to dominate the author’s conception of the writing task. These are logical requisites, each of which must be accomplished for success.

I take them up in order.

2.1 Problem

The first section of a paper is the most crucial because readers form initial judgments of the quality of a paper, proposal, or whatever on-line, and initial impressions are unlikely to change if a dull or confused introduction is later followed by brilliant writing in other sections. When I once served on the NSF panel, reading several hundred proposals a year, I compared notes with other panelists, all of whom agreed that we had made pretty solid “fund,” “no fund” decisions from reading page 1, which rarely were altered by finishing the proposal.

So if you don’t capture interest in the problem section, you’ve probably lost the game before the first inning starts. Your challenge is to present the problem, the big picture context of what your work is and why, briefly in a manner that teases the reader into wanting to know the details that come later. This is very hard to do, but its success is so critical that this should be your most careful prose, word for word getting more attention than anything else you write.

A surefire way to lose reader interest is to begin by writing about the literature. “The literature” is a boring (but necessary) part of scientific writing. But you don’t want to lead with boredom. And how many times do we need to read an author expressing amazement or, with the dishonesty of a mortician pretending grief, expressing misfortune that his or her particular topic has not received attention in the literature? If your justification for your work is a gap in the literature, what is to follow is certain to be tedious and trivial.

One of Mike MacKuen’s former colleagues (I don’t remember for proper citation) once put it perfectly, “This paper fills a much needed gap in the literature.” That’s what I think whenever I see an author berating a gap, that there is probably an excellent reason why all previous scholars have decided that this is a problem that deserves to be ignored.

And a matter of attitude: In real science we build on what came before. So a review that asserts that all previous work is trash, the work of scholars of below normal intelligence, puts me in the frame of mind of thinking that what is likely to follow is so bad that it can only be justified against a literature that is terrible. If your contribution is good, it ought to improve on understanding that is already strong. Compare “I have a better mousetrap” to “No other mousetraps are any good.”

Or, just follow the kosher principle: The problem section is for introducing the problem—and nothing else.

2.2 Literature


Graduate students are usually pretty good at reviewing the literature, which probably explains why they nearly always overdo it. The point of a lit review is not to prove how much you know—this is not an exam. It is to lay the foundation of what is known so that you can move on to what is new. As such it should be directed to that minimum necessary for the foundation, the focus being ideas and issues, not lists of authors, articles, and books.

Since excessive length is usually an issue in scientific writing, the lit review is an excellent place to achieve economy. When you review too much literature, you not only lengthen your work but probably also undermine the real goal of building the foundation for your own innovation.

And the kosher principle—nearly always violated: Once you are done reviewing the literature, stop. It doesn’t belong anywhere else in the paper. When it appears elsewhere, in the middle of theory, design, or analysis, it usually causes confusion. The problem is that the literature is a crutch graduate student authors turn to when they want to avoid writing about those other, more difficult, issues.

2.3 Theory and Model

A suggestion: Begin the theory section with the words “I have no theory,” which serve as a useful reminder that social research has no other purpose and that the present piece should not be written so long as the statement is true.




I have said that writing is hard. Writing theory is the hardest writing there is—and the most important. It is no wonder that theory sections are usually a mishmash of literature review and design, something, anything, to cover up the embarrassment of no theory.

Theory is usually written in the subjunctive mood, statements of abstract logical relationships. “Given condition x, then pattern y should follow.” The “should” is a logical statement, not an empirical one.

2.3.1 Censoring

Two kinds of self-censoring are common in writing social theory. Authors sometimes censor the theory itself, to tailor it to predict what will be observed in the study and nothing else. Second is operationism, a common disease of political science writing, which defines concepts in terms of the indicators which measure them.

Censoring I: Fitting Theory to the Study
To restrict a theory by limiting it to only that which will be observed in the present study is harmful, robbing the theory of most of its richness. To censor a theory to predict only what will be observed usually will make it so specific as to deprive it of the logic which drives it. Such a theory, if it really remains a theory at all, is so limited that we should care little whether or not it is true. It is a hallmark of bad social science to have theories which predict exactly what will be observed in the study. All who have done social research understand that there is probably dishonesty at work when things work too well.

Any decent theory will have empirical implications vastly beyond what any study can observe. Readers understand that. It isn’t a problem. So one develops a theory in its full richness and then asserts that some small fraction of empirical implications can and should be observed. All that is needed is a brief transition statement at the end which specifies that a small set of empirical implications is observable and will reflect, if only partially, on the truth of the explanation.

Censoring II: Operationism
The building blocks—the nouns as it were—of theory are concepts. Concepts are theoretical and abstract ideas, as general as the words which form them. In a more convenient world they would match one to one with a wonderful set of indicators. In the real world concepts imply a great deal more than can be measured with even the best possible indicator. In an old usage, the “epistemic” correlation is the idea that the match is imperfect and partial, r smaller than 1.0. Part of the research task is to optimize the fit of indicator to concept, going from totally invalid to, at best, very partially valid. Part of inference is to recognize the role that low epistemic correlation plays in findings. Concepts must be presented as ideas. To do the reverse, to assert that they are what the indicators measure, is a failed strategy of science from which no number of studies can ever lead to a theory. Keeping kosher implies simply no reference whatsoever to indicators in a theory section. The issue of whether the indicators fit the concepts needs explicit treatment in the design section.


2.4 Design

The problem to which a section on design is the solution is this: there must be a connection between what theories imply and what can be observed. If a theory is general enough to be worth proposing and worth testing, then it will have sweeping implications across studies of many types and for numerous indicators. The section on design is where the censoring that was inappropriate in developing the theory becomes appropriate. The reader needs to understand how the general theory comes to ground in a specific test.

Hence what is required is delimiting the portion of the theory that is subject to testing in the current work and then detailing how theory leads to empirical implications. This can be seen as two tasks, (1) fitting theory to the design of the study, and (2) fitting theoretical concepts to observable indicators. Both are creative choices made by the author and both are subject, like any assumption, to error.

Bad writing tends to treat design decisions as if they had somehow been dictated by necessity. The honest approach is to admit that they are your own decisions and that each is subject to skepticism from reasonable readers. The author’s task is to explain the logic that led him or her to those choices in order to bring the reader along. Readers are reasonable on average. They will accept difficult decisions about how to do things when the author lays out his or her logic. But the logic must be there.

It may be easier to say what a failed design section looks like. It is usually a list of variables, then followed by a regression. If the reader asks, as any reader should, what about those regression coefficients impinges on the truth of the theory, the usual answer is nothing. That is, what a regression implies is often only that the conditions necessary for software estimation were met, but the coefficients in the table imply nothing for the theory. The “logic” is something like, “I can do a regression and therefore the theory is true.”

2.4.1 Hypotheses

Hypotheses are the means by which the implications of the theory become translated into empirically observable facts. Good hypotheses will always have the attribute that failure implies that the theory is not true. They are worth testing for that reason only. Empirical expectations, what you know from knowledge of the data, should never be presented as hypotheses. If they are not logically linked to the theory, they are not worth testing.

Bad hypotheses usually test the author’s intuition. It needs to always be remembered that the quality of your intuition is of no interest to anyone but yourself. It just doesn’t matter if what you “think” is likely to be seen in the data ends up being seen unless that expectation is linked to theory.

Should you state formal hypotheses? I don’t have a hard-line position on this issue, but find that formal hypotheses, like unnecessary equations, often convey an attitude of pseudo-science instead of the real thing.

2.5 Analysis

Good analysis is hard writing. It needs to conquer the problem of planting expectations in the reader’s mind before he or she sees data and then working through the data presentation with some care. Authors always overestimate readers’ ability to comprehend results. Readers aren’t dumb; they just haven’t spent the thousands of hours the author spent on every detail of complex analysis. They need to be brought along.

If reviews of the literature are almost always too long, analysis is almost always too short. There are many steps, often skipped. For theoretically relevant variables, we need first to set up expectations for the size (if possible), sign, and significance of estimated coefficients. Then we need commentary on each, measuring what was observed against what was expected. Here untrained authors almost always overemphasize significance and underemphasize size. Significance is often not interesting. In large-N studies coefficients tapping relatively trivial effects will usually be significant, which measures the power of much data, not the importance of the phenomenon. For crucial effects we need more. It is often extremely useful to go beyond coefficients and talk about the size of effects in the units of the dependent variable.

In another sense amateur analysts often have the reverse problem, not taking significance seriously. When you can’t reasonably exclude the possibility that the true parameter is zero, it doesn’t make much sense to go on and on about a variable’s sign and size. Fundamentally, nonsignificance tells us that we have no reliable information about sign or size.
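The two points above, size and significance, can be kept apart with a few lines of arithmetic. What follows is a minimal sketch in Python and is not part of the original essay: the function names, the 1.96 multiplier for an approximate 95 percent interval, the textbook approximation for the standard error of a bivariate slope, and all of the numbers are assumptions chosen only for illustration.

```python
import math


def slope_std_error(residual_sd, x_sd, n):
    """Approximate standard error of a bivariate regression slope.

    Assumption: the textbook large-sample form, residual_sd / (x_sd * sqrt(n)).
    """
    return residual_sd / (x_sd * math.sqrt(n))


def summarize_coefficient(coef, std_error, x_change=1.0):
    """Report significance and size separately for one coefficient."""
    low = coef - 1.96 * std_error   # approximate 95% interval (assumed level)
    high = coef + 1.96 * std_error
    return {
        "z": coef / std_error,
        "interval": (low, high),
        # True only when zero can reasonably be excluded.
        "excludes_zero": low > 0 or high < 0,
        # Size of the effect in units of the dependent variable.
        "effect_in_dv_units": coef * x_change,
    }


# Hypothetical numbers: one small coefficient, two sample sizes.
COEF = 0.02
small_n = summarize_coefficient(COEF, slope_std_error(1.0, 1.0, 100))
large_n = summarize_coefficient(COEF, slope_std_error(1.0, 1.0, 1_000_000))

for label, result in (("N = 100", small_n), ("N = 1,000,000", large_n)):
    print(label, "| excludes zero:", result["excludes_zero"],
          "| effect in DV units:", result["effect_in_dv_units"])
```

With these made-up numbers the same coefficient cannot be distinguished from zero at N = 100 and is highly significant at N = 1,000,000, yet the effect in units of the dependent variable is equally small in both cases. Significance reports on the amount of data; size reports on the phenomenon.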

The rule for tables (below) is that both text commentary and tables must stand on their own. Thus prose like “Table x shows that the theory is true.” miserably fails. The statement passes all of the hard work of analysis to the reader. And it can’t be evaluated except by discontinuing reading and studying the table.

Not all coefficients are equal. Often the key test of the theory will hang on one or two coefficients, with everything else included just to get the specification right. Emphasis should reflect that. Key coefficients deserve much more attention than they usually get and the others less.

Fit vs. Coefficients: Amateur analysts usually emphasize model fit, forgetting that how well the data fit a statistical model has almost no bearing on whether or not a theory is true. It is useful to remember that analysis is testing theory. What matters is what the theory predicted and whether or not it happened. A similar issue arises with discussions of the relative explanatory power of variables. This question almost never impinges on the truth or falsehood of theory. It just isn’t relevant, and attention given to it can only detract from what analysis ought to do: test theory.

2.5.1 Tables: Presenting Linear Model Findings

Table design is important, and often done badly. It requires you to think about what the reader knows and wants to know from your work and then very carefully lay out the table to tell the story.

A beginning point is this: APSA and the journals we write for have official rules of table styles. You need to know them and it is wise to use them when you create the table (not sometime later). Violating the official table and figure styles is a good way to advertise your amateur standing. If you want readers to think, “This was written by a graduate student,” then by all means proceed with your (or Microsoft’s) favorite table design. If you want to be treated as a serious professional, don’t.

A Rule: Tables should always be composed so that a reader can pick one up and understand its content, without having read the text. That means it must be fully self-contained, depending on nothing that is explained only in the text. The opposite is also true; a reader should be able to skip the table and understand the analysis completely from the text.

Professional typesetting practice in recent years has moved toward simplicity and away from extensive use of highlighting, i.e., all the things that Microsoft likes to do. So minimize or eliminate entirely the use of bold and italic type for various table features. Forget the pretty tables Microsoft Word will design for you; they all violate professional standards of table formatting. Also, never use vertical rules.

Table Editors: Composing a good table is a very demanding task, one that exceeds the capabilities of ordinary word processors. That’s why we have table editors. They make it possible to do the difficult layout tasks that ordinary word-processing tools cannot handle. Don’t know how to use one? You are a professional author. Learn to use the tools of authorship or choose a profession for which you are better suited.

Title: The title should convey something to the reader about the logical role and meaning of the presentation, what is being tested and how. The reader is asking, “Why am I looking at these numbers?” and the title should answer that question. Titles tend to err on the side of being too short, of not saying enough for readers to figure out what the numbers mean.

Do not name tables for the statistical estimator employed, another sure sign of amateur standing. A title like “Negative Binomial Estimates of ...” tells the reader that you are mighty impressed that you know about the negative binomial and have kind of forgotten what substantive purpose the table serves.

The Stub
The stub is the leftmost column in which you name the indicators for which coefficients will be presented. The usual problem is that the names are too brief to convey what the indicator is. (And remember the rule about being self-contained: if the reader needs to page back to find out what some ambiguous name stands for, you have violated the rule and caused reader impatience.) Abbreviate nothing. And never ever ever use computer variable names to stand for concepts. These are personal code words that convey no meaning to readers.

Since interpreting an unstandardized coefficient requires us to know about the measurement of the indicators, more information is better than less. So instead of “Income,” which can be measured in several ways, use something more descriptive, for example, “Income (in $thousands)” or “Income (ANES categories).” For dummy variables it is useful to tell us which category is coded 1. So “Gender: Female” or “Concern for Election Outcome: Medium or High.” Good descriptive stubs are usually impossible to create without a table editor. Otherwise you just can’t force sufficient content into available spaces.
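For authors who build tables from statistical output, the stub rule can be enforced mechanically. The sketch below is only an illustration under stated assumptions: the variable names (`inc_k`, `female`, `concern_hi`), the coefficient values, and the function `format_rows` are all hypothetical, and the output is plain text, not a journal-ready table. The labels themselves are the examples given above.

```python
# Hypothetical computer variable names mapped to descriptive stub labels.
STUB_LABELS = {
    "inc_k": "Income (in $thousands)",
    "female": "Gender: Female",
    "concern_hi": "Concern for Election Outcome: Medium or High",
}


def format_rows(estimates, labels):
    """Return plain-text table rows that use descriptive stubs.

    A variable without a label raises KeyError, so a raw computer
    variable name can never leak into the finished table.
    """
    width = max(len(labels[name]) for name in estimates)
    rows = []
    for name, (coef, std_error) in estimates.items():
        rows.append(f"{labels[name]:<{width}}  {coef:>7.3f}  ({std_error:.3f})")
    return rows


# Made-up coefficients and standard errors, for illustration only.
estimates = {
    "inc_k": (0.012, 0.004),
    "female": (-0.150, 0.090),
    "concern_hi": (0.430, 0.110),
}

rows = format_rows(estimates, STUB_LABELS)
for row in rows:
    print(row)
```

The design choice is to fail loudly: if a label is missing, the table is not produced at all, which is preferable to printing a personal code word the reader cannot interpret.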

Notes: There will usually be material that the reader needs to know, which is not in the body of a table, and not important enough to go in the title. So you need notes, for example, “Estimates from a negative binomial regression model.” or a significance note beginning “* p”.

2.5.2 Figures

Figures are called “figures” (not graphs and not charts) and have captions beneath the picture.

Useful Advice for Excel Users: DO NOT CREATE FIGURES IN COLOR. Professional publication is in black and white only. It is very important to see what your figure is going to look like when you are creating it. If you use color, you will produce a perfectly sensible picture in its original color, which then becomes unreadable, for lack of ability to discern which lines or shapes are which, in black and white. This problem is the source of many, maybe even most, of the worst published figures. It takes a lot more work to recreate a B&W figure than it does to start from scratch with one.

How? Create a master graph, changing all elements to black and white, then save it as your default. From the figure worksheet, use Chart-Chart Type, select User-defined, click Add, then name it something like B&W line graph. Then you select that chart type as the first step when you are creating graphs, so everything is in black and white from the beginning. Then you use line styles to distinguish different lines, a combination of thickness and dashing.
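The same discipline carries over to tools other than Excel. As a minimal sketch, assuming nothing about any particular graphics package, the Python below assigns each line a black-only style built from a combination of thickness and dashing. The function name, the two widths, the three dash names, and the series names are all assumptions made for illustration; the resulting dictionaries would still have to be handed to whatever plotting software you actually use.

```python
from itertools import product

# Assumed style vocabulary: two thicknesses and three dash patterns.
LINE_WIDTHS = (1.0, 2.5)
DASH_PATTERNS = ("solid", "dashed", "dotted")


def black_and_white_styles(series_names):
    """Give every series a distinct black line style.

    Styles combine thickness and dashing; color is never used.
    Raises ValueError when there are more series than combinations,
    since such a figure could not be read in black and white.
    """
    combos = list(product(LINE_WIDTHS, DASH_PATTERNS))
    if len(series_names) > len(combos):
        raise ValueError("too many series to distinguish without color")
    return {
        name: {"color": "black", "linewidth": width, "linestyle": dash}
        for name, (width, dash) in zip(series_names, combos)
    }


# Hypothetical series names.
styles = black_and_white_styles(["Democrats", "Republicans", "Independents", "Others"])
for name, style in styles.items():
    print(name, style)
```

Starting from a style table like this one plays the role of the saved default chart type: everything is black and white from the first draft, so nothing has to be recreated later.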

Defining Lines: Graphic software will produce legends (but in color, why bother, all the lines will look the same—see above). Usually they are tacky looking, another sign of amateur standing. I employ text boxes and arrows to put the explanatory content right into the picture instead. It is a fair amount of work, but so is doing an R&R or a new submission because readers thought your product looked amateurish.

2.6 Conclusions?

The first question, not asked as often as it should be, is do you need a conclusion section at all. Most of the time I think the answer should be no.

A question is raised in theory, refined in design, and answered in analysis. To answer it again is to add excess length and to insult the reader’s intelligence by the implicit assumption that he or she wasn’t swift enough to catch the point the first time. Authors who repeat themselves are likely to have angry readers.

A conclusion is appropriate mainly, I think, when it is useful to make general observations that do not follow directly from the analysis. It might be a pattern not captured by any single analysis, but seen repeatedly. It might be some emergent conclusion when posterior beliefs differ from priors.

In any case, the conclusion is the second most important piece of an article for reader impact—after the problem statement. So it should never be a summary of things already written.

3. Other Issues

The usual view about perfection in all the details of a manuscript is that it is necessary for final publication only, that drafts can be imperfect in style, illustrations, and so forth. I think it is bad strategy. Although you might suffer mild embarrassment from flaws in print, it is the copy sent out for review on which success or failure as a professional will depend. Journal review is a difficult process for all concerned, the result often deeply dissatisfying. The implication is this: perfection in all its details must precede submission, not follow it. Referee judgment is unavoidably affected by small writing problems having nothing to do with the merit of the research. It is just human nature to assume that sloppiness or lack of professional knowledge anywhere is probably indicative of sloppiness and lack of professional knowledge everywhere.

Never use journal submission as a means to clean up an imperfect manuscript. The process is way too costly for that—to the profession and more importantly to you. That’s what friends are for.

I am reluctant to advise how to write, because different strategies work for different people. But unless it fundamentally violates how you work, a corollary point is that the achievement of perfection should begin with very early drafts. That way the small problems that are present are likely to get caught and fixed in revisions. If you write ugly in early drafts and then hope to achieve perfection all in one final revision, you are likely to fail. Small problems stand out against a clean background. If the draft is messy, you are likely to miss many of them.

Original article: www.unc.edu/~jstimson/Writing.pdf
