Archive | advice to Ph.D. students

Sushi and academia

“Jiro Dreams of Sushi” is a great documentary about a perfectionist sushi chef in Tokyo who earned three Michelin stars. Really worth watching.

In that documentary, a food critic says that great chefs have the following five qualities:

  1. They take their work very seriously and consistently perform at the highest level
  2. They aspire to improve their skills
  3. Cleanliness
  4. Impatience
  5. They want things their way

He also says that what makes a chef great is bringing all of these attributes together.

This reminds me of Gladwell, who argues in his book Outliers that a great deal of experience is often needed to be really successful. There is a certain air of perfectionism in the background, too. And interestingly, many great chefs or artists have learned their craft from scratch, even though they now do things that are really out of the box. The same holds for painters, for instance, like van Gogh or Picasso.

All this makes me wonder to what extent there is a connection to academia. Successful academics are very devoted, take their work very seriously and work very hard. They keep on learning, and they stay curious. They strive for perfection, they are ambitious. And it takes a long time and a lot of training until they are at that point, as it does for great chefs. (I still think that it really helps to be a good classical economist to do great work in behavioral economics, and that macroeconomists can benefit from micro theory and empirical skills.) So far so good.

As for impatience and stubbornness, I’m less sure.

There is some impatience involved, but then, what makes an academic do good work seems to be playing the long game and making sure that contributions are as good as they can get. Attention to detail is important, and so is not rushing. Now, if one thinks of impatience as being eager to move on, that may fit, and it may be related to academics often saying that they want time to do their research.

Last, stubbornness. Yes, academics sometimes want things their way, but what seems to be important is to strike a balance between that and what is useful to society and what the community values.

At the end of the day, I believe there is a connection to being a great chef, even when it comes to those last two qualities. Academics do science for others, and the same holds for preparing a great meal. And if impatience means that one is looking forward to perfecting and finishing a project before serving it to others, then that could fit too.

The role of economists, and plumbing

We’re sometimes accused of sitting in an ivory tower, feet up and writing in abstract terms about all kinds of things that are not directly relevant to the real world.

Well, there are academics like that, and there is certainly value in doing fundamental research that will be an important input to more applied work done by others. But there are also many academics who are not like that.

For instance, they do consulting, and I have the impression that companies and institutions attach great value to this. There is nothing wrong with that, I believe, as long as it stays within limits and they keep doing research and keep teaching. On the contrary, I see great potential for this to make them better teachers and researchers, because their teaching and research become more relevant to practice. Or they are involved in policy design and in designing institutions within the scope of projects financed by third parties.

In a recent essay, Esther Duflo from MIT has argued that attention to detail is not only interesting but really needed and useful. She suggests that economists should be more like plumbers. Worth reading, especially for Ph.D. students who are making up their minds about the direction they want to go in.

Versioning with Git

I have briefly described the benefits of using versioning software before. In a nutshell, this is what professional coders use to collaborate and to keep track of the changes they make to their code. Once you’ve set this up for your research projects, you usually don’t want to go back. See Gentzkow and Shapiro’s Practitioner’s Guide for some guidance. Highly recommended!

I personally have used SVN for this, but over the last few years Git has become more and more popular. I looked into it yesterday, and it seems to me that it is both more powerful than SVN and easier to use. See for instance here for yourselves.

The role of empirical work

I just came across a nice article by Dan Hamermesh in a recent issue of the AER. It was discussed by Einav and Levin in another interesting publication in Science related to big data.

Einav and Levin write:

Hamermesh recently reviewed publications from 1963 to 2011 in top economics journals. Until the mid-1980s, the majority of papers were theoretical; the remainder relied mainly on “ready-made” data from government statistics or surveys. Since then, the share of empirical papers in top journals has climbed to more than 70%.

Isn’t that remarkable? I certainly was under the wrong impression when I was a Ph.D. student in Berkeley and Mannheim and thought that it was all about theory and methods. Where does this come from? Maybe it was because one tends to see so much theory in the first year of a full-blown Ph.D. program, which is full of core courses in Micro, Macro and Econometrics, covering the foundation for doing good economic research. In any case, my advice to Ph.D. students would be to strongly consider working with real data as soon as possible. There is certainly room for theoretical and methodological contributions, but this should not mean that one never touches data. At least in theory 😉 everybody should be able to do an empirical analysis. And for this, one has to practice early on, even if one wants to do econometric theory in the end, because even then one should know what one is talking about. Or would you trust somebody who talks about cooking but never cooks himself? OK, I admit, this goes a bit too far.

Having said this, let me speculate a bit. My personal feeling is that one of the next big things, and maybe a good topic for a Ph.D., could be to combine structural econometrics with some of the methods that are now used and developed in data science (see the Einav and Levin article along with Varian’s nice piece). In Tilburg, for instance, we have a field course in big data, by the way, and another sequence in structural econometrics (empirical IO).

Ballet, van Gogh and behavioral economics

picture taken from http://commons.wikimedia.org/wiki/File:Vincent_Willem_van_Gogh_128.jpg


At the recent Netspar Pension Workshop I talked to Susann Rohwedder from the RAND Corporation. We talked about van Gogh and how he spent his youth in Brabant, not far away from Tilburg. The way he painted at that time can be described as relatively dark and gloomy and not nearly as amazing as what he produced later in his life in the south of France, with the exception of The Potato Eaters, probably. What dominates in this early work, arguably, is good craftsmanship. What I find remarkable is that he learned painting from scratch before moving on and developing something new.

Likewise, Picasso first learned painting from scratch, producing paintings that were well done but far more realistic than what he is known for now. Susann remarked that people often say the same about modern dance: one should first learn ballet, in order to get a good grip on technical skills, before moving on. Interesting.

This discussion made me realize that there is a strong commonality with my thinking about behavioral economics. There are many people who do research in behavioral economics without ever learning classical economics from scratch, and I always wondered why they do that. Standard economic theory is the simplest possible model we can think of, and it works just fine for many questions we may want to answer. There is of course lots to be gained by studying behavioral aspects of individual decision making, as recently demonstrated once more by Raj Chetty in his Ely lecture. But I think the best way to get there is to first fully understand classical economic theory and only then build on that. In passing, another thing that Chetty pointed out very nicely was that the best way to go about doing behavioral economics is probably not to point out where the classical theory is wrong (any model is wrong, because it abstracts from some aspects of economic behavior in order to focus on others) but to ask how we can use the insights from behavioral economics for policy making.

Brushing up the basics and online lectures in general

Yesterday, we had Mirko Draca over as a guest, also presenting in the economics seminar. Over dinner, he mentioned that there are two main lecture series that he would recommend when it comes to learning more about time series analysis and statistics in general. They are:

  1. Ben Lambert: A large series of short videos at the undergraduate and master’s level, including time series: https://www.youtube.com/user/SpartacanUsuals/playlists
  2. Joseph Blitzstein: His probability course at Harvard, which starts at the basics and then goes on to a lot of useful distributions and stochastic processes: https://www.youtube.com/playlist?list=PLwSkUXSbQkFmuYHLw0dsL3yDlAoOFrkDG

This reminded me of my wish to use online resources more actively myself. And I would like to encourage Ph.D. students especially to actively look for interesting content on the web. It seems to me that such web lectures tend to be underused and underappreciated, and that we usually don’t take the time to watch them as if they were real seminar talks or real lectures. That may be a mistake, and by making use of these resources ourselves, we may actually learn how to use the web more effectively when it comes to designing courses.

This is more broadly related to the challenges faced by universities, as described in a piece published by The Economist earlier this year.

But it also concerns conference visits. For example, most people don’t know that the plenary talks of many conferences are freely available on the internet. See here for some nice examples. All of them are highly recommended.

Correct and incorrect models

Today, somebody asked a question in my panel data econometrics class. The question concerned the assumption of strict exogeneity and whether it was violated in the example I had given before. I replied that yes, it could indeed be violated, but most of the time, in one way or another, a model will be mis-specified and assumptions will not hold in the strict sense. What I meant was that in some vague sense, the assumption was a good enough approximation (without me going into the details of my example, think of the correlation between the error term and the regressor as being almost zero).
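
For readers who have not seen it written down: in a linear panel model with an individual effect, strict exogeneity is usually stated as below. The notation (y, x, c, u, and the indices i and t) is my assumption here for illustration; it is not taken from the example in class.

```latex
y_{it} = x_{it}'\beta + c_i + u_{it}, \qquad
\mathbb{E}\left[ u_{it} \mid x_{i1}, \dots, x_{iT}, c_i \right] = 0,
\quad t = 1, \dots, T.
```

The condition says that the error in any period is mean-independent of the regressors in all periods, not just the current one. The "good enough approximation" reading is that this conditional mean is not exactly zero, but close to it.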

That made me think again of Milton Friedman, who argues in a famous essay that a model should be judged by its ability to predict counterfactual outcomes, or in his own words, “to make correct predictions about the consequences of any change in circumstances”. Sometimes, this is what we are after, and this is referred to as a positive approach (being able to make the right predictions)—as opposed to a normative one (where we can talk about welfare and how one can maximize it).

That sounds reasonable at first. But can we really make such a clear distinction? Can’t we simply see welfare as the outcome we would like to predict? Of course, we always need a model to make statements about welfare, but then it could also be that all models agree on the direction of the welfare effects of certain policy changes and only differ with respect to the exact magnitude. Therefore, I prefer to think of a model as a set of assumptions that are for sure wrong in the zero-one sense. But the question is how wrong, and that depends on the purpose the model is meant to serve. So, it’s a matter of degree. If the model allows me to make fairly accurate welfare statements (and I can be sure of that for whatever reasons; this is the red herring here), then I can apply Friedman’s argument that it’s good in his sense, but then I can even use it for welfare comparisons, so it serves a normative purpose. In a way, all this brings me back to an earlier post and in particular the part about Marschak.

PS on September 19, 2014: There are two interesting related articles in the most recent issue of the Journal of Economic Literature, in discussions of the books by Chuck Manski and Kenneth Wolpin, respectively. In these discussions, John Geweke and John Rust touch upon the risk of making mistakes when formulating a theoretical model, and how we should think about that.

Structural estimation in Stata

This one is for those who already know what they want to do, and it has to do with structural modeling. It’s about how to do this in Stata (of all places).

There are many reasons why you may want to use Stata for your empirical analysis, from beginning to end. Usually, you will use Stata anyway to put together your data set and also to do your descriptive analysis. It’s just so much easier than many other packages because many useful tools come with it. Plus, it’s a quasi industry standard among economists, so using it and providing code will be most effective.

So, if your structural model is not all that complicated, you can just as well estimate it in Stata.

Today, I want to point you to two useful guides for that. The first one is the guide by Glenn Harrison. This is actually how I first learned to program a simulated maximum likelihood estimator. It focuses on experiments and the situation you usually have there, namely choices between two alternatives. It’s a structural estimation problem because each alternative will generate utility, and the utility function depends on parameters that we seek to estimate.
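
To make the idea concrete, here is a minimal sketch in Python rather than Stata. It is not Harrison’s code and it is plain (not simulated) maximum likelihood. The CRRA utility function, the logit choice rule, the grid search and all function names are assumptions I make for illustration.

```python
import math

def crra(x, r):
    """CRRA utility of payoff x; r is the risk-aversion parameter (assumed r != 1)."""
    return x ** (1.0 - r) / (1.0 - r)

def expected_utility(lottery, r):
    """Expected utility of a lottery given as a list of (probability, payoff) pairs."""
    return sum(p * crra(x, r) for p, x in lottery)

def prob_choose_a(lot_a, lot_b, r):
    """Logit probability of choosing A, driven by the expected-utility difference."""
    diff = expected_utility(lot_a, r) - expected_utility(lot_b, r)
    return 1.0 / (1.0 + math.exp(-diff))

def log_likelihood(r, data):
    """Sum of log choice probabilities over (lot_a, lot_b, chose_a) observations."""
    ll = 0.0
    for lot_a, lot_b, chose_a in data:
        p = prob_choose_a(lot_a, lot_b, r)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += math.log(p) if chose_a else math.log(1.0 - p)
    return ll

def estimate_r(data, grid):
    """Crude estimator: the grid value of r with the highest log-likelihood."""
    return max(grid, key=lambda r: log_likelihood(r, data))

# Usage: one risky lottery (A) against a sure payoff (B), observed five times.
risky = [(0.5, 10.0), (0.5, 1.0)]
safe = [(1.0, 5.0)]
observations = [(risky, safe, False)] * 5  # the subject always picked the safe option
r_hat = estimate_r(observations, [0.0, 0.25, 0.5, 0.75])
```

In a real application you would replace the grid search by a proper optimizer and add an error-scale parameter, but the structure stays the same: parameters enter utility, utility enters choice probabilities, and choice probabilities enter the likelihood.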

Then, today I bumped into the lecture notes by Simon Quinn, which I found particularly insightful and useful if what you’re doing has components of a life cycle model. What I like particularly about his guide is that it explains how you would make some choices related to the specification of your model and functional forms.

Of course, there are also many reasons why you may not want to use Stata for your analysis. But in any case, it may not hurt to give it a thought.

Automating workflows

Writing an empirical paper involves—next to the actual writing—reading in data, analyzing it, producing results, and finally presenting them using tables and figures.

When starting a Ph.D., one typically imagines producing tables by means of lots of copy-pasting. But actually, I strongly advise you not to do that and instead to use built-in commands or add-ons that allow you to produce LaTeX (or LyX) tables. There are at least two good reasons for this. First, it’ll save you time fairly soon, maybe already when you put together the first draft of your paper, and at the latest when you do the first revision of that draft. The reason is that you will produce similar tables over and over again, because you will change your specification, the selection of your sample, or something else. And you will do robustness checks. The second reason is that automating the creation of tables will help you make fewer mistakes, which can come about when you paste results into the wrong cells or when you accidentally put too many or too few stars denoting significance next to the coefficient estimates.
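
To illustrate what such a command does under the hood, here is a small sketch in Python. It is only an illustration of the idea, not a replacement for the Stata add-ons. The function names, the input format and the 10/5/1 percent star thresholds are my assumptions.

```python
def stars(p_value):
    """Significance stars under the (assumed) 10/5/1 percent convention."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""

def latex_table(results):
    """Build a LaTeX tabular from {variable: (coefficient, std_error, p_value)}."""
    lines = [r"\begin{tabular}{lc}", r"\hline", r"Variable & Estimate \\", r"\hline"]
    for name, (coef, se, p) in results.items():
        # Coefficient with stars on one row, standard error in parentheses below.
        lines.append(f"{name} & {coef:.3f}{stars(p)} \\\\")
        lines.append(f" & ({se:.3f}) \\\\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)

# Usage with made-up numbers: write the result to a .tex file and \input it in the paper.
table = latex_table({"educ": (0.0812, 0.0123, 0.001), "exper": (0.0040, 0.0031, 0.20)})
```

Because the stars are computed from the p-values rather than typed by hand, they cannot end up next to the wrong coefficient when the specification changes.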

Here’s an example of one way to do it in Stata and LaTeX (I usually use Stata for organizing my data, matching data sets, producing summary statistics, figures, and so on). I think the way it’s done here is actually quite elegant. This post is also useful when you’re using LyX, by the way, because you can always put LaTeX code into a LyX document.

So far this is all about generating tables. But the underlying idea is that you organize everything so that you can press a button and the data set for your analysis is built from the raw data, then press a button and the analysis is run and the tables and figures are produced, and finally press a button and the paper is typeset anew. This is described very nicely in Gentzkow and Shapiro’s Practitioner’s Guide that I have already referred to in an earlier post. On the one hand, this is best practice because it ensures replicability of results, but on the other hand it will also save you time when you revise your paper, and believe me, you will likely have to do that many times.
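
The "press a button" idea can be sketched in a few lines of Python. The three steps below are toy stand-ins that I made up for illustration. In a real project each one would call your data-building script, your estimation script and your typesetting run.

```python
def build_dataset(raw_rows):
    """Step 1: turn raw records into the analysis sample (here: drop missing values)."""
    return [row for row in raw_rows if row is not None]

def run_analysis(sample):
    """Step 2: compute the results (here: just the sample size and mean)."""
    return {"n": len(sample), "mean": sum(sample) / len(sample)}

def write_table(results):
    """Step 3: format the results as one row of a LaTeX table."""
    return f"Mean & {results['mean']:.2f} & N = {results['n']} \\\\"

def run_all(raw_rows):
    """The single 'button': rerun every step from raw data to final output."""
    return write_table(run_analysis(build_dataset(raw_rows)))

# Usage: changing the raw data or any step only requires calling run_all again.
row = run_all([1.0, None, 2.0, 3.0])
```

The point of the design is that no step depends on anything you did by hand, so every number in the paper can be traced back to the raw data.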

Writing papers and theses

It’s August, which means that students are finishing up their research master’s or master’s theses. Here is some advice that I give most of them at one point or another, and I think Ph.D. students may also not be aware of all of the following. I’ll focus on the form for now, and will talk about the contents of a good paper at a later point in time.

Let’s start with the very basics. You want your paper to be pleasant and easy to read, and that starts with the layout. My usual advice is to use a font like Times with a size of 11pt, to change the spacing to 1.5 or double spacing, and to use margins of 3cm at the top and bottom, and 2.5cm on the left and right.

Footnotes are usually placed after the end of a sentence, after the full stop. And acronyms should be defined before being used. You can do this by writing out the acronym and putting it in parentheses right after that. From then on you can use it. Particular sections or figures you refer to should start with capital letters. So, you would say “in the previous section”, but “in Section 3”.

Equations should only be numbered when you refer to them. Also, when you have an equation that is a “displayed formula” (so it takes a whole line) and the sentence ends with that equation, then the equation should end with a full stop. When the text continues after the equation, there should sometimes be a comma, for instance because the equation uses something that is defined afterwards using the expression “where”.
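
Here is how these rules could look in LaTeX. The model is a made-up example, and I am assuming that the amsmath package is loaded, which provides \eqref and the starred equation environment.

```latex
% Assumes \usepackage{amsmath}; the model is a hypothetical example.
Consider the linear model
\begin{equation}
  y_i = x_i'\beta + \varepsilon_i, \label{eq:model}
\end{equation}
where $\varepsilon_i$ is an error term. Rearranging \eqref{eq:model} gives
\begin{equation*}
  \varepsilon_i = y_i - x_i'\beta.
\end{equation*}
```

The first equation is numbered because the text refers to it, and it ends with a comma because the sentence continues with “where”. The second one is not referred to, so it stays unnumbered, and it ends with a full stop because the sentence ends there.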

Overall, I think the best advice I can give is to be very careful so that the writing is of high quality. First of all, the English should be correct. There should not be any typos, and you should make extensive use of the spell checker. Then, the references should be in good order.

The following mostly applies to Ph.D. students in economics and related disciplines. When you’re writing papers, you should definitely use LyX or LaTeX together with BibTeX. Also, references to figures, tables, equations, sections, and so on should be set up as automatic cross-references so that when you change the structure of the document or insert a section or another figure, all the references are updated. This will save you a lot of time in the future, when you go through the 10th or so revision of a paper.

Generally, learn from others. In an earlier post I’ve already suggested that you should read papers in top 5 journals. Not all of them are well-written. But the chance that you get a well-written paper is higher than in other journals. Look at how introductions are structured, how the research is motivated. And spend a lot of time working out the arguments.

Andrew Chesher told me once, when I was visiting UCL as a Ph.D. student, that one may want to think about the following structure: this is what I’m doing > this is why I’m doing it and why it’s interesting > this is how I’m doing it > this is what I find. I think this is a great way to think about presenting research. He also said that academic papers should not have any superfluous text and that for every word one should ask oneself whether it’s really necessary. That way, one can make the text shorter and ultimately clearer.

Always make sure you use short, easy-to-understand sentences, mostly in the active voice, and that each paragraph roughly corresponds to one line of thought. But don’t be too mechanical.

Respect the reader by explaining well. Think of your reader as not being an expert on the topic you’re writing on, but as being smart and having a general education in economics. That way, you will not make the mistake of not explaining things that may be clear to you, but not to most readers.

And before I forget: many students write that “coefficients are significant”, but it should actually say that they are “significantly different from zero”.

If you want to learn more, have a look at my earlier post on the challenge of writing, where I also provide a reference to Silvia’s book. And if you’re interested in working some more on your writing, you may also want to consider having a look at a classic, the “Elements of Style”.