Archive | September 2014

Brushing up the basics and online lectures in general

Yesterday, we had Mirko Draca over as a guest, also presenting in the economics seminar. Over dinner, he mentioned that there are two main lecture series that he would recommend when it comes to learning more about time series analysis and statistics in general. They are:

  1. Ben Lambert: A large series of short videos at the undergraduate and master’s level, including time series: https://www.youtube.com/user/SpartacanUsuals/playlists
  2. Joseph Blitzstein: His probability course at Harvard, which starts at the basics and then goes on to a lot of useful distributions and stochastic processes: https://www.youtube.com/playlist?list=PLwSkUXSbQkFmuYHLw0dsL3yDlAoOFrkDG

This reminded me of my wish to use online resources more actively myself. I would also like to encourage Ph.D. students in particular to actively look for interesting content on the web. It seems to me that such web lectures tend to be underused and underappreciated, and that we usually don’t take the time to watch them as if they were real seminar talks or real lectures. That may be a mistake: by making use of these resources ourselves, we may also learn how to use the web more effectively when it comes to designing courses.

This is more broadly related to the challenges faced by universities, as described in a piece published by The Economist earlier this year.

But it also concerns conference visits. For example, most people don’t know that the plenary talks of many conferences are freely available on the internet. See here for some nice examples. All of them are highly recommended.

Correct and incorrect models

Today, somebody asked a question in my panel data econometrics class. The question concerned the assumption of strict exogeneity and whether it was violated in the example I had given before. I replied that yes, it could indeed be violated, but that most of the time, in one way or another, a model will be misspecified and assumptions will not hold in the strict sense. What I meant was that, in some vague sense, the assumption was a good enough approximation (without going into the details of my example, think of the correlation between the error term and the regressor as being almost zero).
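To make the "almost zero correlation" point concrete, here is a minimal simulation sketch. It is not the example from class: it assumes a plain cross-sectional regression y = beta*x + e with unit-variance x and e, and all names (`simulate_slope`, `rho`) and parameter values are hypothetical. Under these assumptions the OLS slope converges to beta + rho, so a tiny correlation gives a tiny bias, while a large one does not.

```python
import random

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_slope(rho, beta=1.0, n=20000, seed=0):
    """Draw (x, e) with correlation rho, build y = beta*x + e, return the OLS slope.

    Assumption for illustration: x and e are standard normal, so the
    probability limit of the slope is beta + rho.
    """
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        xi = z1
        ei = rho * z1 + (1 - rho ** 2) ** 0.5 * z2  # corr(x, e) = rho
        x.append(xi)
        y.append(beta * xi + ei)
    return ols_slope(x, y)

# Compare no, slight, and strong violation of exogeneity.
for rho in (0.0, 0.02, 0.5):
    print(rho, round(simulate_slope(rho), 3))
```

Whether a bias of the size produced by rho = 0.02 matters is exactly the "how wrong" question: it depends on what the estimate is going to be used for.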

That made me think again of Milton Friedman, who argues in a famous essay that a model should be judged by its ability to predict counterfactual outcomes, or in his own words, “to make correct predictions about the consequences of any change in circumstances”. Sometimes, this is what we are after, and this is referred to as a positive approach (being able to make the right predictions)—as opposed to a normative one (where we can talk about welfare and how one can maximize it).

That sounds reasonable at first. But can we really make such a clear distinction? Can’t we simply see welfare as the outcome we would like to predict? Of course, we always need a model to make statements about welfare, but then it could also be that all models agree on the direction of the welfare effects of certain policy changes and only differ with respect to the exact magnitude. Therefore, I prefer to think of a model as a set of assumptions that are for sure wrong in the zero-one sense. But the question is how wrong, and that depends on the purpose the model is meant to serve. So, it’s a matter of degree. If the model allows me to make fairly accurate welfare statements (and I can be sure of that for whatever reasons; this is the red herring here), then I can apply Friedman’s argument that it’s good in his sense, but then I can even use it for welfare comparisons, so it serves a normative purpose. In a way, all this brings me back to an earlier post and in particular the part about Marschak.

PS on September 19, 2014: There are two interesting related articles in the most recent issue of the Journal of Economic Literature, in discussions of the books by Chuck Manski and Kenneth Wolpin, respectively. In these discussions, John Geweke and John Rust touch upon the risk of making mistakes when formulating a theoretical model, and how we should think about that.

Structural estimation in Stata

This goes to the ones who already know what they want to do, and it has to do with structural modeling. It’s about how to do this in Stata (of all places).

There are many reasons why you may want to use Stata for your empirical analysis, from beginning to end. Usually, you will use Stata anyway to put together your data set and to do your descriptive analysis; it’s just so much easier than in many other packages because many useful tools come with it. Plus, it’s a quasi-industry standard among economists, so using it and providing code will be most effective.

So, if your structural model is not all that complicated, you can just as well estimate it in Stata.

Today, I want to point you to two useful guides for that. The first one is the guide by Glenn Harrison. This is actually how I first learned to program a simulated maximum likelihood estimator. It focuses on experiments and the situation you usually have there, namely choices between two alternatives. It’s a structural estimation problem because each alternative will generate utility, and the utility function depends on parameters that we seek to estimate.
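To illustrate what "choices between two alternatives" with a parametric utility function looks like as an estimation problem, here is a minimal sketch. It is not taken from either guide and is written in Python rather than Stata. Everything specific in it is an assumption made for illustration: CRRA utility, a logit choice rule, the lottery prizes, the true parameter value, and a crude grid search. It is also plain maximum likelihood, not simulated maximum likelihood, because this toy model has no unobserved heterogeneity to integrate out.

```python
import math
import random

def crra(x, r):
    """CRRA utility (assumed functional form); r is risk aversion, r != 1."""
    return x ** (1.0 - r) / (1.0 - r)

def expected_utility(lottery, r):
    """lottery = (prob_high, high_prize, low_prize)."""
    p, high, low = lottery
    return p * crra(high, r) + (1.0 - p) * crra(low, r)

def prob_choose_b(pair, r):
    """Logit probability of choosing alternative B over alternative A."""
    a, b = pair
    diff = expected_utility(b, r) - expected_utility(a, r)
    return 1.0 / (1.0 + math.exp(-diff))

def log_likelihood(r, pairs, choices):
    """Sum of log choice probabilities; choices[i] is 1 if B was chosen."""
    ll = 0.0
    for pair, chose_b in zip(pairs, choices):
        p = prob_choose_b(pair, r)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += math.log(p) if chose_b else math.log(1.0 - p)
    return ll

def simulate_choices(pairs, r_true, seed=0):
    """Generate artificial choices from the model itself (hypothetical data)."""
    rng = random.Random(seed)
    return [1 if rng.random() < prob_choose_b(pair, r_true) else 0
            for pair in pairs]

def estimate_r(pairs, choices, grid):
    """Grid-search maximum likelihood estimate of r."""
    return max(grid, key=lambda r: log_likelihood(r, pairs, choices))

# Made-up design: A is the safer lottery, B the riskier one, and the
# probability of the high prize varies from 0.1 to 1.0.
pairs = [((k / 10, 20.0, 16.0), (k / 10, 38.5, 1.0))
         for k in range(1, 11)] * 300
choices = simulate_choices(pairs, r_true=0.5)
grid = [i / 100 for i in range(0, 96)]  # r from 0.00 to 0.95
r_hat = estimate_r(pairs, choices, grid)
print("estimated r:", r_hat)
```

In Stata, the same log-likelihood would be written as a program and handed to the built-in maximizer instead of a grid search; the structure of the problem stays the same.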

Then, today I bumped into the lecture notes by Simon Quinn, which I found particularly insightful and useful if what you’re doing has components of a life cycle model. What I like particularly about his guide is that it explains how you would make some choices related to the specification of your model and functional forms.

Of course, there are also many reasons why you may not want to use Stata for your analysis. But in any case, it may not hurt to give it a thought.

How Apple’s business model just became even more beautiful

Last week we saw another one of Apple’s wonderfully crafted presentations. What a choreography! But besides learning how to present really well (just observe how a lot of information is conveyed in a way that makes it all look so simple and clear), there was something special going on.

First of all, what was it all about? New iPhone models, Apple’s move into payment services (aka Apple Pay), and the new Apple Watch. Now, at least to me, it seems that the watch is dominating the press coverage. But let’s think for a moment about what may be going on in the background.

The iPhone needed an upgrade anyway. Bigger screens (the competitors had this already, and customers were asking for it), better camera, faster processor, and new technology that allows one to use the phone for super convenient payments. Good move.

Then Apple Pay. How smart is that? Apple positions itself between the merchant and the customer: every time somebody wants to make a payment, Apple sends a request to the credit card company, and the credit card company then sends the money directly to the merchant. Apple is not involved in the actual transaction, has less trouble, and cashes in anyway. Customers benefit, and merchants will want to offer the service. Good move, with lots of potential.

Finally, the Apple Watch. When you read the coverage, you realize that the watch is actually not ready yet. Battery life is still an issue, and so is the interface. And maybe the design will still change. But there are four truly innovative features almost hiding in the background. First, it’s a fashion item, unlike all the other technical devices that are already on the market. Second, it has more technology packed into it, and third, it’ll have apps on it. Fourth, you can use it to pay, with Apple Pay.

So, what’s so special about this event? It’s all about network effects. I’ve worked on two-sided markets for a while, and there are three types of network effects that play a role. The first ones are direct network effects. These are the ones we know from Facebook: the more people are on Facebook, the more I like to be on Facebook. These play less of a role here. The second ones are indirect network effects. They arise because developing apps becomes more worthwhile for developers the more users will potentially download them. This is why Apple presented the Apple Watch now. From now until the product finally goes on sale, developers can build apps, which will in turn make the watch more attractive to consumers, so it will have positive effects on demand. But developers will already see that now and will therefore produce even more apps. Very smart, and all Apple has to do is provide the platform, the app store, and cash in every single time somebody buys an app. Finally, Apple Pay. Similar model. The more people use Apple Pay, the more merchants will use it, and this will make people buy more Apple devices so that they can use the service, and so on.

So, if you ask me, taken together this is a huge step for Apple. Not because the Apple Watch or the iPhone are particularly great, but because Apple’s business model is incredibly smart. Beautifully smart. And I haven’t even mentioned that sometime soon all the Apple mobile devices will be much better integrated with the operating system on their laptops and desktops. As they said in their own words, something “only Apple can do”.

Automating workflows

Writing an empirical paper involves—next to the actual writing—reading in data, analyzing it, producing results, and finally presenting them using tables and figures.

When starting a Ph.D., one typically imagines producing tables by means of lots of copy-pasting. But actually, I strongly advise you not to do that and instead to use built-in commands or add-ons that allow you to produce LaTeX (or LyX) tables. There are at least two good reasons for this. First, it’ll save you time fairly soon, maybe already when you put together the first draft of your paper, but at the latest when you do the first revision of that draft. The reason is that you will produce similar tables over and over again, because you will change your specification, the selection of your sample, or something else. And you will do robustness checks. The second reason why one wants to automate the creation of tables is that it will help you make fewer mistakes, which can come about when you paste results into the wrong cells or when you accidentally put too many or too few stars denoting significance next to the coefficient estimates.
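To show the idea behind such commands, here is a minimal sketch of a table generator, written in Python purely for illustration (in Stata you would use an existing add-on rather than write this yourself). The function names, the star thresholds (1%, 5%, 10%), and the numbers in the example are all hypothetical placeholders, not results from any paper. The point is that stars and cell placement are computed from the estimates, so they cannot be mistyped.

```python
def stars(p_value):
    """Significance stars under an assumed convention: *** 1%, ** 5%, * 10%."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""

def latex_table(results, column_names):
    """Build a LaTeX tabular from estimation results.

    results maps a variable name to one entry per column; each entry is
    either (coefficient, standard_error, p_value) or None if the variable
    is not in that specification.
    """
    lines = [
        "\\begin{tabular}{l" + "c" * len(column_names) + "}",
        "\\hline",
        " & ".join([""] + column_names) + " \\\\",
        "\\hline",
    ]
    for variable, cells in results.items():
        coef_row = [variable]
        se_row = [""]
        for cell in cells:
            if cell is None:
                coef_row.append("")
                se_row.append("")
            else:
                coef, se, p_value = cell
                coef_row.append(f"{coef:.3f}{stars(p_value)}")
                se_row.append(f"({se:.3f})")
        lines.append(" & ".join(coef_row) + " \\\\")
        lines.append(" & ".join(se_row) + " \\\\")
    lines += ["\\hline", "\\end{tabular}"]
    return "\n".join(lines)

# Placeholder numbers for illustration only.
example_results = {
    "education": [(0.082, 0.011, 0.001), (0.075, 0.012, 0.001)],
    "experience": [None, (0.021, 0.010, 0.040)],
}
table = latex_table(example_results, ["(1)", "(2)"])
print(table)
```

Writing the returned string to a .tex file and pulling it into the paper with \input means that rerunning the analysis updates the table without any copy-pasting.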

Here’s an example of one way to do it in Stata and LaTeX (I usually use Stata for organizing my data, matching data sets, producing summary statistics, figures, and so on). I think the way it’s done here is actually quite elegant. This post is also useful when you’re using LyX, by the way, because you can always put LaTeX code into a LyX document.

So far this is all about generating tables. But actually, the underlying idea is that you organize everything so that you can press a button and the data set you will use for the analysis is built from the raw data; then you press a button and the analysis is run and the tables and figures are produced; and finally you press a button and the paper is typeset anew. This is described very nicely in Gentzkow and Shapiro’s Practitioner’s Guide that I have already referred to in an earlier post. On the one hand, this is best practice because it ensures replicability of results. On the other hand, it will also save you time when you revise your paper, and believe me, you will likely have to do that many times.