Causation and the Arrow of Time
Alexander R. Pruss
November 21, 2002
“We are always already thrown into concrete factual circumstances, facing possibilities that we need to come to grips with. By choosing some we exclude others, thus making them no longer possible. What we are thrown into is the past and present, and the possibilities loom ahead of us, though we may try to turn our back on them. The future is the home of the possibilities while the present and past define the circumstances in which we make our choices, circumstances we can no longer affect.”
Or so we might say, and there is something right about this way of talking. Our basic conception of ourselves as agents is radically temporally asymmetric. And so we adopt asymmetric metaphors for time: we talk of time having a direction, as flowing from the future, through the present, to the past, closing possibilities that once were open; or perhaps we think of time as a measure of our own movement from the past to the future. Such metaphors are of very limited value. To think of time as having a direction spatializes time; to think of time as flowing or of ourselves as moving in time is to invite the question of how fast it is flowing or how fast we are moving (at one second per second, perhaps, one might quip, thereby inviting the question of why it cannot flow at two seconds per second instead).
Although the metaphors are flawed, we need to come to grips with an inextricable asymmetry between past and future in them. As agents, we deliberate about the future, while the past we can only bewail or rejoice over. We know the future not only through theoretical knowledge, but also through the intentional knowledge that the agent has of the effects she intends to produce. We think a life that started in years of sorrow but ended in great joy is preferable to one that started with a great joy and ended in years of sorrow. We rejoice that a just-completed pain is already over more intensely than we do over the fact that a soon-to-start pain is not yet here. If we were somehow to know that someone will, no matter what we do, commit a great crime, we would find it unjust to punish her before she commits it, even if we thought justice required that she be punished for the crime had she already committed it.
If all these temporal asymmetries turned out to be merely a matter of how we see ourselves and the world, if they were something we project onto the world, we would be disappointed. We do not want to live in a bifurcated world, with our human concerns on the one side and our knowledge of the way the world really is on the other. We would have to swallow the incentive-destroying pill of realizing that our lives as agents are based on false beliefs about the way things really are.
So we want to find this temporal asymmetry in reality, either as an asymmetry in time itself or, if we cannot manage that, in the way things in time are arranged. I will argue, based negatively on the failure of alternative accounts and positively on our intuitions about how notions of past and future should be modified given relativity theory, that the best account of the asymmetry seems to be that the direction of time supervenes on the direction of the majority of the causal relations in our universe. Like Kant, then, I take causal priority to under-gird temporal priority.
But first let us turn to science to see if we can locate an important temporal asymmetry. Some disappointments. The Newtonian laws of mechanics on which classical thermodynamics is built are symmetric under time reversal. What this means is that if we take a system obeying Newtonian laws of motion and construct the time-reversal of the system by considering it as a sequence of time-slices and rearranging them in a backwards order (imagine taking a film and reversing the order of the frames), what we get is a system that still obeys the Newtonian laws. It seems, then, that if we look at Newtonian mechanics, we cannot see any principled way to distinguish one half of time as “future” and another half as “past”. Nature at its base then does not seem asymmetric in the way in which it does to agents.
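The film-reversal point can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration and not part of the argument: the force law (a unit-mass harmonic oscillator), the step size, and the use of the velocity Verlet integrator, chosen because it is itself time-reversible. The sketch runs the system forward, then starts from the final state with the velocity flipped and applies the very same update rule; the positions it produces are the forward film played backwards.

```python
def accel(x):
    # Hypothetical force law for illustration: unit-mass harmonic oscillator.
    return -x


def verlet_step(x, v, dt):
    # One velocity Verlet step; the same rule is used in both directions.
    a = accel(x)
    x_new = x + v * dt + 0.5 * a * dt * dt
    a_new = accel(x_new)
    v_new = v + 0.5 * (a + a_new) * dt
    return x_new, v_new


def run(x, v, dt, steps):
    # Return the list of positions (the "film") and the final state.
    film = [x]
    for _ in range(steps):
        x, v = verlet_step(x, v, dt)
        film.append(x)
    return film, x, v


# Forward run from position 1.0 at rest.
forward, x_end, v_end = run(1.0, 0.0, 0.01, 500)

# Time-reversed run: same law, final position, velocity with flipped sign.
backward, x_back, v_back = run(x_end, -v_end, 0.01, 500)

# The reversed run retraces the forward film frame by frame, in reverse order.
retraced = all(abs(a - b) < 1e-9 for a, b in zip(reversed(forward), backward))
print(retraced)
```

Nothing in the update rule marks one of the two runs as the "real" one, which is the sense in which the dynamics alone gives no way of labelling one direction as future.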
What about non-classical physics? Schrödinger’s equation is just as time-reversal symmetric as Newtonian mechanics. True, on some interpretations of quantum mechanics there is a mysterious phenomenon of “collapse” in which a system in a so-called “mixed state” (for example the state that Schrödinger’s cat has before we’ve looked in on it) non-deterministically collapses into a “pure state” (for example, the cat’s being alive). This kind of a collapse may be an asymmetric process. However, collapse is not only deeply mysterious in itself, but it is not clear how exactly it could by itself ground the intuitive asymmetries involving agency and valuation.
There are also other non-classical phenomena, dealing with time-invariance violations implied by some elementary particle experiments and by some theories of black-hole evaporation. However, again, it is not clear how these phenomena, apparently so far removed from our experience, would be relevant to our intuitions. Of course, it might turn out in some surprising way that they are.
Things are more hopeful, however, if we look at the second law of thermodynamics, a classical law, which does claim that entropy statistically tends to increase with time, unless of course one is already at equilibrium. This is obviously not a time-reversal symmetric law: if we look at a system satisfying the second law of thermodynamics, its entropy will generally be increasing and so the time-reverse of the system will generally have entropy decreasing, contrary to the second law. So perhaps we have found a temporal asymmetry in nature itself, as we wanted. Of course there is a little bit of a mystery here, too: Classical thermodynamics thinks of gases as consisting of little particles flying around and colliding under time-reversal symmetric Newtonian laws. The most promising accounts of how the entropic asymmetry emerges here are that although the laws are symmetric, the boundary conditions are not. If the universe has a random low-entropy state at one end of its time-line, then it is highly probable that entropy will increase as we move away from that end of its time-line.
But even if this is so, what does it have to do with the asymmetries that are humanly important to us? Can we ground the asymmetry of our deliberating about the future and not about the past in the fact that entropy increases from past to future? Do we consider future pains worse than past ones because there is more entropy in the future?
According to Lawrence Sklar, two kinds of answers have been supplied showing how to connect the entropic arrow of time with considerations of importance to us. The first is that our intuition that the future is in some way open comes from the fact that we can have much less sure knowledge of the future than we can of the past, and that this is somehow due to the increase of entropy. Of course, this is not so satisfactory if we think that our intuition about the future’s openness is an intuition about our agency, our power to bring about future effects, rather than about our knowledge of the future. This entropy-based account might explain why we have the intuition, but it is unclear that it could validate it. But what is worse is that the main account of how increase in entropy is bound up with knowledge of the past, as given by Reichenbach, was based on the idea that we consider regions of surprisingly low entropy to be traces of the past—e.g., a footprint on a beach—whereas as John Earman has pointed out, we use as traces of the past not only areas of low entropy, but any areas that have an unexpected level of entropy, whether too low or too high.
The second kind of story about how entropy increase might ground our ordinary intuitions of an arrow of time is David Lewis’s account of the asymmetry in counterfactuals.
There are plenty of true counterfactuals where the antecedent reports a state of affairs in the past of a non-actual state of affairs reported by the consequent—we can call these “non-trivial forwardtracking counterfactuals.” “Were I to drop this glass, it would fall.” The falling would, of course, happen after the dropping. On the other hand, counterfactuals whose consequent reports a non-actual event in the past of the antecedent are almost always false—we can call these “non-trivial backtracking counterfactuals.” It is plausible to say that there is no event now such that were this event to occur, then it would have been the case that my clothes had caught on fire yesterday (which, suppose, they did not). But there are plenty of events such that were they to happen, my clothes would catch on fire tomorrow (e.g., I could today pay someone to set them alight tomorrow).
Note that it is tempting to think the paucity of non-trivial backtracking counterfactuals is a logically trivial claim, following from the fact that were there an event A such that were A to happen, my clothes would have caught on fire, then were A to happen, my clothes would both have caught on fire and not have caught on fire. But that is fallacious: the factual claim that my clothes did not catch on fire has no business being in the counterfactual world where A happens. This would be what I call the Fallacy of Illicit Counterfactual Transference, which Richard Gale illustrates by the comedian’s joke: I hate broccoli. It would be terrible if I liked it, because then I would eat it.
There is much plausibility in saying that the asymmetry between non-trivial forwardtracking counterfactuals, many of which are true, and non-trivial backtracking counterfactuals, almost all of which are false, is capable of grounding many intuitive temporal asymmetries. If there are non-trivial true forwardtracking counterfactuals, then I can deliberate about the future in a familiar decision-theoretic way: “Were I to do A, X would occur; were I to do B, Y would occur; I prefer X to Y, so I should do A”, and such deliberation about the future can make me think of the future as open, that is dependent on my choice which is up to me. But the lack of non-trivial backtracking counterfactuals makes it impossible for me to deliberate about the past. The past is part of my thrownness: it is not something I can do anything about.
One might even try to ground some of our value-theoretic asymmetries in this. If we think, as Lewis does, that counterfactual relations are what is behind causation, then the counterfactual asymmetry implies a causal one: one can cause effects in the future but not in the past. A future pain that is certain is worse than a past pain that is certain in that normally the future depends on us to some extent, and while our impotence to remove a pain from the past is no evil at all but simply a consequence of the fact that we cannot affect the past, our impotence to prevent a future pain is particularly galling, because in general we can affect the future. Likewise, an evil followed by a good might be thought better than a good followed by an evil, because past events are sometimes causes of future ones and we might, perhaps illicitly, think this is so here: and certainly it is better that an evil should cause a good than that a good should cause an evil—since causing a good is itself a good over and beyond the caused good and causing an evil is an evil over and beyond the caused evil. Finally, punishment should be caused by the evil deed, and hence should follow it.
But if a counterfactual arrow of time would help us ground our intuitions about temporal asymmetry at least to some extent, now we need to ask why there is such an arrow. Lewis thinks he can get the counterfactual asymmetry out of the observation that even with perfectly time-reversal symmetric deterministic laws of nature, given the kind of contingent distribution of matter that our universe has, we are going to have a counterfactual asymmetry. The asymmetry is not to be found in every possible world: it won’t be found in a world consisting of one billiard ball, for instance. But allegedly it is to be found in ours.
I do not have time to go into the details of Lewis’s account. The important thing about the account is that counterfactual relations are taken to supervene on the distribution of matter in our space-time and the space-times of other possible worlds. The distribution of matter in our space-time is asymmetric, an asymmetry that is supposed to yield the above-mentioned asymmetry in counterfactuals. Unfortunately, it has now been definitively shown by Adam Elga (forthcoming) and myself (forthcoming), using independent counterexamples, that Lewis’s account of counterfactuals does not have the asymmetry Lewis claims for it, and hence that Lewis’s account of counterfactuals is indeed false.
While I do not have time to show why Lewis’s account fails (though we can talk about it during the discussion), I can say one thing. Obviously, to derive an asymmetry of time from the distribution of matter in our space-time, this distribution must be relevantly asymmetric. Lewis sees this as an overdetermination asymmetry: a present event A is overdetermined by spread-out future events, each of which has the property that were that event to happen, then of nomic necessity A would happen. But there is no similar overdetermination of a present event by past events. Thus, when a stone splashes into the water, the stone’s falling into the water is overdetermined by the circular waves moving outward from the impact, any one of the waves being nomically sufficient to reconstruct the splash, while it is caused by a single isolated past event, namely the stone’s being thrown.
However, Lewis is simply wrong. Take the above example. Let B be any one of the circular waves. Lewis claims that it is nomically impossible that B happen without A given appropriate background conditions. But this is just false. In fact, it is nomically impossible that B happen with A happening, if we keep the background fixed and remove the other waves. The reason it is impossible is in simple energy considerations: event A carries an amount of energy roughly equal to the total energy of all the waves; hence, wave B does not by itself contain most of the energy of A. Thus, it is nomically impossible that A should be followed only by B. To see this more clearly, imagine the original situation run backwards. The waves, including the time reverse of wave B converge, and their energy helps to lift the stone out of the water, creating the time-reverse of the splash-in. If, however, we had only the time-reverse of wave B converging on the stone, there would not have been sufficient energy to lift the stone. Hence, wave B does not determine the stone’s splashing in. Thus, Lewis’s overdetermination fails.
The counterfactual asymmetry under-girds, in Lewis’s system, a causal asymmetry, since a cause, at least in the simplest case, is an event A such that there is an event B so that were A not to have happened, B would not have happened. If Lewis’s account of counterfactuals and their asymmetry were correct, causation would be an approximately asymmetric relation, and so causation would have an asymmetry. Thus, there would be a single common root of temporal and causal asymmetries: the counterfactual asymmetry.
Lewis’s system is only one of at least three possible accounts relating the asymmetry in time to the asymmetry in causation. On Lewis’s system, the two have a common ground: the asymmetry in counterfactuals. Alternately, one might think with Hume that the asymmetry in time is prior to the asymmetry in causation, or with Kant that the asymmetry in causation is prior to the asymmetry in time. Both Hume’s and Kant’s views can plausibly be used to explicate a time-reversal asymmetric counterfactual conditional, since there are accounts of counterfactuals that inherit an asymmetry from the asymmetry of causation. For a very simple one, take Stalnaker’s account that it is true to say that were A to happen B would happen providing at the world closest to ours at which A happens, B also happens. This unfortunately fails to give a counterfactual asymmetry, for some of the same reasons that Lewis’s story about the arrow of time fails. But modify the account by saying that we are only interested in those worlds where A happens and where everything not causally posterior to A or non-A is kept fixed as in the actual world. This will give us a counterfactual asymmetry, though there are admittedly other technical problems with Stalnaker’s account. And, as I have argued above, a causal and counterfactual asymmetry can together ground many of the intuitive asymmetries. Thus, the Humean and Kantian accounts have a significant payoff.
The Humean story has roughly this shape: if A’s are always temporally followed by B’s, then A’s count as causing B’s. Simultaneous causation is already a serious problem here, but suppose we can get around it. There are other serious and familiar difficulties here. Consider the Kantian worry: How do we know which event is in our past and which is in our future? Does the past smell or feel different from the future? We might say that the past is what we remember and the future what we prognosticate about, and the deliverances of these two faculties are phenomenologically different. But then it seems that either the order of time supervenes on us, on which faculty we use for which events, and this would undercut the objectivity of the arrow of time, or else it is just a coincidence that we prognosticate about the objective future and remember the past, because we could imagine temporally reversed humanoids who do the opposite, and so we do not know that what we remember is past and what we prognosticate about is the future.
What about the Kantian story? The simplest version is that event A counts as temporally prior to event B provided that A is causally prior to B. This also runs into obvious problems. Can there not be simultaneous causation? Must every past event be causally prior to every future one? Could there not be two events, one in the past of the other, which have no causal relation between them whatsoever?
These are serious problems. Fortunately, there is a solution to them, ironically inspired by the work of Adolf Grünbaum, that great enemy of the Kantian account. Here, then, is the start of the positive Kantian account. First simplify and assume a Newtonian or Leibnizian world, depending on whether one wishes to be a substantivalist or not about time. Based on the laws of nature, we can get the idea of a space-time structure. The laws of nature require a temporal and a spatial structure. Admittedly, they may not require an ordered temporal structure. But they give rise to a structure that has a betweenness relation, a relation R that can be satisfied by a triple of events, A, B and C, and which one reads by saying that B is strictly between A and C. Specifically, the simplest formulation of Newtonian laws requires a time-sequence with a betweenness relation r defined on it satisfying some simple axioms that end up expressing the fact that the time-sequence is topologically isomorphic to the real line or a sub-interval of it. (The relation r generates a topology on the time-sequence defined by saying that a subset U of the time-sequence is a neighborhood of a point t provided that there are points a and b such that r(a,t,b) and such that every point t* with r(a,t*,b) is also contained in U.) And then, of course, one can extend the relation r from a relation on the time-sequence to a relation R on the collection of possible events, where R(A,B,C) holds providing r(tA,tB,tC) holds, where tE is the time of event E.
The topological isomorphism of the time-sequence with the real line or a subset of the real line is not enough to define an order either on the collection of events or on the time-sequence itself. There are precisely two total orderings < of the time-sequence subject to the constraint that if a<b<c then r(a,b,c), since given one topological isomorphism f of the time-sequence onto a sub-interval of the real line, which isomorphism can be used to define an ordering on the time-sequence, we can also give a second isomorphism g defined by g(a)=-f(a).
If we want < to be the relation of temporal priority, we need to find a natural way of choosing one of these two relations; this is the problem of the arrow of time. Unfortunately, given time-reversal symmetric laws, this seems quite challenging. But it is here that the Kantian can give a plausible answer. In its simplest form, the answer is that we should choose that ordering < of the time-sequence which guarantees that if tE is the time of event E, then if A is causally prior to B, then either tA<tB or tA=tB. Assuming that there is a pair of non-simultaneous events one of which is causally prior to the other, this condition uniquely determines which of the two possible total orderings is the one that gives the relation of temporal priority.
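The selection rule just stated can be sketched in a few lines. The sketch makes assumptions that are not in the text: the time-sequence is represented by an arbitrary real coordinate (one of the isomorphisms f, whose sign carries no meaning), the two candidate orderings are represented by the signs +1 and -1, and the event names and the helper names `choose_orientation` and `earlier` are hypothetical.

```python
def choose_orientation(coord, causal_pairs):
    """Pick the ordering that agrees with every non-simultaneous causal pair.

    coord maps each event to an arbitrary time coordinate f(t).
    causal_pairs lists (cause, effect) pairs.
    Returns +1 (order by f) or -1 (order by -f).
    """
    signs = set()
    for cause, effect in causal_pairs:
        diff = coord[effect] - coord[cause]
        if diff > 0:
            signs.add(+1)
        elif diff < 0:
            signs.add(-1)
        # diff == 0 is simultaneous causation and constrains nothing.
    if len(signs) != 1:
        raise ValueError("no ordering agrees with all causal pairs")
    return signs.pop()


def earlier(coord, sign, a, b):
    # Temporal priority under the chosen ordering.
    return sign * coord[a] < sign * coord[b]


# Hypothetical events; the coordinate happens to run "the wrong way".
coord = {"strike": 3.0, "flame": 1.0, "glow": 1.0}
pairs = [("strike", "flame"), ("flame", "glow")]

sign = choose_orientation(coord, pairs)
print(sign, earlier(coord, sign, "strike", "flame"))
```

The error branch covers both failure cases of the simple rule: no non-simultaneous causal pair at all, or causal pairs that point in opposite directions.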
To review, in the Newtonian setting we proceed as follows. Newtonian-type laws require a time-sequence with a betweenness relation on it, so that the time-sequence is topologically equivalent to a sub-interval of the real line. The topological equivalence can be used to define precisely two total orderings on the time-sequence. We can use the direction of causation to choose which ordering is the correct temporal priority ordering. Assuming causation always goes in the same direction or is simultaneous, and that not all causal relations are simultaneous, this gives us a well-defined temporal priority relation. What is crucial about this account is that we do not attempt to construct a time-sequence having a betweenness relation on it from causal relations. The time-sequence and its topology and betweenness relation are presupposed by our empirically verified laws of nature, even if these laws are time-reversal symmetric.
One weakness of this account is that it relies on the assumption that all causal relations go in the same direction, i.e., that we cannot have non-simultaneous events A and B which are simultaneous with events A* and B*, respectively, but which are such that A is causally prior to B while B* is causally prior to A*. For if we had such events, then the above definition would require the time of A to be prior to the time of B while also requiring the time of B*, which is the same as the time of B, to be prior to the time of A*, which in turn is the same as the time of A, and that is a contradiction. Intuitively, one could derive the assumption that causal relations all go in the same direction from the claim, presented as a conceptual truth, that there is no backwards causation. But while the claim that there is no backwards causation had much intuitive plausibility earlier, this plausibility is lost once one sees the arrow of time as supervening on the direction of causation.
If all we have objectively given is an unordered time-sequence with a betweenness relation and a bunch of causal relations between events, and if the temporal order is to be constructed out of this, there really is little reason to think that all the causal relations have to go in the same direction. It seems conceptually possible to have non-simultaneous events A and B which are simultaneous with A* and B*, respectively, but which are such that A is causally prior to B while B* is causally prior to A*. Of course, one will have to avoid causal loops, but this can be done if events A and B are appropriately causally isolated from events A* and B*.
What are we, then, to do to derive the direction of time from the direction of causation if we have no conceptual guarantee that all causal relations go in the same direction? Well, we could just hope that we’re lucky enough to live in a universe that has all causal relations going in the same direction. Do we have sufficient empirical evidence for this assumption, though? This is not at all obvious. Remember that the laws of physics seem to be all interpretable in a causally neutral manner; indeed, they must be thus interpretable insofar as they are time-reversal symmetric. And, further, do we want to say that time would have no order were the direction of causation not completely uniform?
To take care of these worries, we might simply define the direction of time as the direction of the majority of the cases of causation. Remember that the only thing we need to do is to choose between two opposite ways of ordering our time-sequence. Suppose we have some method, either rigorous and mathematical or rough and intuitive, of counting up pairs of events (e.g., an appropriate measure on the set of pairs of events or something like that). Then, if < is one of the two ways of ordering our time-sequence, we can define a set of “matched-pairs” with respect to this ordering, i.e., of ordered pairs (A,B) of events (or maybe of possible events?) such that A is causally prior to B and A<B. We can then say that the order of time is that ordering, of the two available possibilities, which has a larger set of matched-pairs associated with it—namely, that ordering which agrees with more of the causal relations between events. This would allow one to define the direction of time even if there are some exceptions to the claim that all causal relations point in the same direction.
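The majority rule can be sketched as follows. The sketch assumes the crudest possible measure, simply counting pairs, where the text leaves open what the appropriate measure on pairs of events would be. The coordinate, the event names, and the function names are hypothetical, and a tie is reported as leaving the direction undefined.

```python
def matched_pairs(coord, causal_pairs, sign):
    # Count (cause, effect) pairs with cause < effect under the given ordering.
    return sum(
        1 for cause, effect in causal_pairs
        if sign * coord[cause] < sign * coord[effect]
    )


def majority_orientation(coord, causal_pairs):
    """Return +1 or -1 for the ordering with more matched pairs, None on a tie."""
    plus = matched_pairs(coord, causal_pairs, +1)
    minus = matched_pairs(coord, causal_pairs, -1)
    if plus == minus:
        return None
    return +1 if plus > minus else -1


# Hypothetical events on an arbitrary time coordinate.
coord = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}

# Three causal pairs run one way; one isolated pair runs the other way.
pairs = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "C")]

print(matched_pairs(coord, pairs, +1), matched_pairs(coord, pairs, -1))
print(majority_orientation(coord, pairs))
```

Simultaneous causal pairs count for neither ordering, so they do not affect the outcome.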
More needs to be said, of course, about how the Kantian account extends to the relativistic case. For simplicity, work in the context of special relativity. I would suggest that we start again with an unordered space-time structure that relativistic laws of nature presuppose. We now need to define the arrow of time on that space-time structure. Again, there are precisely two appropriate ways of doing this, and to see which relation is the right one all we need to do is to define the relation for a single pair of space-time points. But we can do this simply by choosing some pair of non-simultaneous localized events one of which is causally prior to the other, and requiring that the space-time point corresponding to the causally prior event count as temporally prior to the space-time point corresponding to the causally posterior event. Again, this presupposes that causation always goes in the same direction, but maybe as before we can define the relation that agrees with the greater number of causal relations as the direction of temporal ordering.
And now let me give a positive observation that suggests that something like the Kantian approach which takes causal order to be prior to temporal order is right. We accept, either as true by definition or as always contingently true or at least as contingently true in most cases, the principle that one cannot causally affect a past event—the above constructions used this to define the order of time. Formulating this principle in a Newtonian world is unproblematic: The principle just says there is an ordering on the time line such that given any point x in space-time, events at x cannot causally affect what happens in the region of space-time whose time coordinate is less than that of x.
But how do we formulate this principle in a relativistic space-time, since the notion of the time-coordinate is now relative to the frame of reference? The most plausible formulation is: There is an appropriate ordering in space-time such that given any point x in space-time, events at x cannot causally affect what happens in the backwards half light cone centered on x, where “backwards” is defined relative to the ordering.
But why are we talking of light cones at all? What is it about light cones, rather than, say, pyramids or half-spheres, that makes them physically significant? By definition, the light cone at x is the set of all points between which and x a causal influence traveling at the speed of light or less can propagate. And I take it that this is only significant because of the relativistic principle that all physical causal influences propagate at the speed of light or less. So what makes light cones important is that they are defined in terms of regions of space-time accessible via causal relations. But once we see causality as implicit behind the very notion of a full light cone, it becomes natural to define a backwards half light cone at x as the set of points of space-time from which a causal influence can be originated that can affect what happens at x. Since it is because of causality anyway that light cones are significant, cognitive economy suggests we define direction of time in this modalized causal way. The notion of a light-cone presupposes causal concepts, and hence supposing that the notion of a backwards half light-cone does so as well adds no additional cost (unless one thinks, as Grünbaum does, that we have available to us a symmetric concept of a causal connection rather than of causality; however, I take the concept of causation to come into science from our ordinary usage and its paradigmatic cases, and we do not have an ordinary concept of a symmetric causal connection).
This causal approach answers the question of why it is so natural for us to think of the backwards light cone of an event as the relativistic analog of the past of the event, and it guides us in how we should modify the concept of the past of an event should relativity theory in turn get replaced by some other theory. As long as that theory had a recognizable space-time, the past of an event would be the set of points in that space-time events at which could affect the given event.
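The reason the relativistic formulation has to speak of half light cones and not of time coordinates can be checked numerically. The following is a sketch only, under assumptions not in the text: one spatial dimension, units in which the speed of light is 1, the standard Lorentz boost, and hypothetical sample points. Note that the sketch uses the purely geometric test for cone membership and takes the choice of which half counts as “backwards” as an input (the `sign` argument); supplying that choice is exactly the job the causal account is meant to do.

```python
import math


def boost(event, v):
    # Standard Lorentz boost of (t, x) to a frame moving at velocity v, c = 1.
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))


def in_backward_cone(y, x, sign=+1):
    # Geometric test: y is timelike or null separated from x and on the
    # side that the chosen ordering (sign) labels as earlier.
    dt = sign * (x[0] - y[0])
    dx = x[1] - y[1]
    return dt > 0 and dt * dt >= dx * dx


origin = (0.0, 0.0)
inside = (-2.0, 1.0)    # timelike separated from the origin
outside = (-1.0, 2.0)   # spacelike separated from the origin

v = -0.8
origin_b, inside_b, outside_b = boost(origin, v), boost(inside, v), boost(outside, v)

# Cone membership is the same in both frames.
print(in_backward_cone(inside, origin), in_backward_cone(inside_b, origin_b))
print(in_backward_cone(outside, origin), in_backward_cone(outside_b, origin_b))

# The bare time coordinate of the spacelike point changes sign under the boost.
print(outside[0] < 0, outside_b[0] < 0)
```

The point outside the cone has an earlier time coordinate in one frame and a later one in the other, so “has a smaller time coordinate than x” cannot serve as a frame-independent notion of the past of x, whereas membership in the backwards half light cone can.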
Final objection: If the laws of nature are time-reversal symmetric, how do we know which direction causation in fact points in, and hence how do we know what is past and what is future? Does not the above account of what, ontologically, the arrow of time is make it impossible to know what is past and what is future?
Response: If all the causal relations point in the same direction, all we need to do is to be able to tell for any one case of non-simultaneous causation which direction it points in. In the more general case, things are a little more difficult, but on the assumption that causal interactions almost all point in one direction, something like this approach may still work. (An argument for this assumption that I am not too sure of: There is some reason to think that most causal interactions point in the same direction at least locally. For, otherwise, it would seem rather likely that there could be causal loops, while there can’t.) If I intentionally bring it about that an event E should happen, then my act of willing is causally prior to E. If we can find even one case where the willing and the event are non-simultaneous, then we have enough to define an arrow of time. But I can find such a case. If I set a fireplace ablaze, I will that the fireplace be ablaze, and there is a temporal lag between my willing and the fireplace being ablaze—at first only the match is on fire, then only a few pieces of wood, and so on. My willing the fireplace to be ablaze was causally prior to the fireplace’s being ablaze, and the two were not simultaneous. This knowledge is sufficient for me to know which direction the arrow of time is pointing.
I am grateful to John Earman, Richard M. Gale and John Norton for discussions on some of these topics.
Short bibliography on the arrow of time
Bennett, Jonathan (1984), “Counterfactuals and temporal direction”, Philosophical Review, 93, 57–91.
Elga, Adam (forthcoming), “Statistical mechanics and the asymmetry of counterfactual dependence”, Philosophy of Science (supp. vol., PSA 2000).
Grünbaum, Adolf (1974), Philosophical Problems of Space and Time, 2nd edition, Dordrecht/Boston: Reidel.
Kutach, Douglas (forthcoming), “The entropy theory of counterfactuals”, Philosophy of Science.
Lewis, David (1979), “Counterfactual dependence and time’s arrow”, Noûs, 13, 455–476.
Price, Huw (1997), Time’s Arrow and Archimedes’ Point: New Directions for the Physics of Time, New York / Oxford: Oxford University Press.
Sklar, Lawrence (1993), Physics and Chance: Philosophical Issues in the Foundations of Mechanics, New York / Cambridge: Cambridge University Press.
Pruss, Alexander R. (forthcoming), “David Lewis’s Counterfactual Arrow of Time”, Noûs.