# Quantum mechanics and gravity as preclusion principles of four-dimensional geometries

###### Abstract

The goal of this paper is to employ a "preclusion principle" originally suggested by Rafael Sorkin in order to construct a relativistically covariant model of quantum mechanics and gravity. Space-time is viewed as geometry rather than dynamics, and "unwanted" histories in that geometry are precluded.

1. Introduction

Rafael Sorkin, in his work on the interpretation of quantum mechanics (see [1]), has proposed an idea of preclusion, where global correlations of events that do not match the probability prescribed by quantum mechanics are systematically "precluded". A toy example of preclusion would be replacing the statement "a die has probability 70 percent of falling on side A and 30 percent of falling on side B" with a list of the "precluded" behaviors: we can say that if we throw the die 100 times, then whenever the frequency of side A or of side B deviates from the prescribed probabilities by more than some fixed tolerance, that particular pattern is precluded. According to this model, all of the non-precluded behaviors of the die are equally likely, while all the precluded ones are equally forbidden, regardless of their actual probabilities. We can envision this model as a collection of parallel universes, with exactly one universe for every non-precluded behavior.
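The die toy model can be made concrete with a short sketch. The function name and the particular thresholds (between 5 and 9 occurrences of side A in 10 throws) are illustrative choices of mine, not anything prescribed by Sorkin's proposal:

```python
import itertools

# Toy "preclusion" rule for a die restricted to two sides, A and B.
# Instead of assigning probabilities, we enumerate every possible history
# of n throws and preclude those whose count of side A falls outside a
# tolerance window around the nominal 70 percent.

def allowed_histories(n=10, min_a=5, max_a=9):
    """Return the list of non-precluded histories of n two-outcome throws."""
    allowed = []
    for history in itertools.product("AB", repeat=n):
        count_a = history.count("A")
        if min_a <= count_a <= max_a:  # within tolerance: not precluded
            allowed.append(history)
    return allowed

histories = allowed_histories()
# Every non-precluded history is realized in exactly one parallel universe,
# and all of them are treated as equally real.
```

Note that the all-A history (10 out of 10) is precluded even though, probabilistically, it would merely be unlikely; this is exactly the replacement of probability by preclusion described above.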

While I don't know for sure, to the best of my knowledge Sorkin was using the preclusion principle dynamically: he imagined the universe to be dynamically growing, obtaining more and more precluded states as that growth continues. In this work, on the other hand, I would like to apply his principle to the space-time manifold the way it is imagined in classical gravity: a static picture of a four-dimensional object, where time is solely a part of the geometry and there is no room for any kind of evolution, either classical or quantum mechanical. I will do that by claiming that "wrong" behavior observed by any observer, regardless of their reference frame, is equally a ground for preclusion. Thus, the natural selection that selects the "right" quantum correlations would be frame-independent.

According to my model, there is no such thing as an observer. Fields are well defined throughout space-time, and they are all observed. However, they are adjusted in such a clever way that if some observer "decides" not to observe the fields in a particular region, he would statistically observe the global correlations of the fields on the boundary of that region to be the ones expected from quantum mechanics. There is no dynamical law that would make that happen. Instead, we simply "preclude" all the behaviors of the fields for which this is not approximately the case.

While the rigorous work is still ahead, a qualitative argument is made that the "preclusions" can be made self-consistent if we impose a lower bound on the entropy density of the universe, so that decoherence always occurs on a certain large scale. This might be considered a side benefit, since it implies the third law of thermodynamics in a natural way.

This should be contrasted with Bohm's Pilot Wave model. On the one hand, in that model there is a precise evolution equation for the beables, from which the desired quantum correlation follows as a result. At the same time, however, that guidance equation is written in a specific frame, and thus violates the covariance principle. In our work, on the other hand, we have made the laws less "rigid" by allowing more than one behavior; in fact, we allow all non-precluded behaviors. This gives more flexibility, allowing our version of dynamics to be simultaneously true in all frames. The price to pay, which is the lack of determinism, can be dealt with by imagining uncountably many parallel universes in which every single non-precluded behavior is realized.

Finally, I apply the preclusion principle to gravitation. I will illustrate how that principle can be used to develop two conflicting and competing models of gravity: a view originally expressed by Dyson (see [2]), according to which, as long as the graviton has never been observed, gravity can be viewed as a solely classical phenomenon (section 5), and, in parallel, a model where gravity IS a quantum phenomenon and, in particular, is quantized by means of path integrals on a causal set (section 6). The reason I pursue two opposite views is that I believe in intellectual pluralism. As long as there is at least one person on each side of a table, you can't conclusively prove the validity of one view over the other. This means that the more views a theory accommodates, the "safer" that theory is.

The way I support Dyson's view is as follows: since the philosophy of my work is that specific physics is NOT dynamically generated but rather is selected out through the preclusion of all other physics, I don't have to invent any microscopic dynamics of gravity that would "generate" Einstein's equation on the macroscopic level. Instead, I simply "preclude" any macroscopic behavior where Einstein's equation doesn't happen to hold; the way I make sure I only preclude macroscopic violations, and not microscopic ones, is to postulate an "approximate" Einstein equation whose coefficients are only approximately equal to their standard values, with a well-defined tolerance of approximation.

On the other hand, the way I allow the possibility of quantization of the gravitational field is as follows. According to this work, all fields, gravity and otherwise, are well defined throughout space-time. The quantization of any particular field does NOT imply that it fluctuates. Rather, it is the statement that if an observer chooses to ignore its behavior inside a certain region, the correlations of the given field on the boundary of that region should match the quantum mechanical prediction. Since the pattern is pre-designed this way and is observer-independent, none of the "quantized" fields are uncertain. This means that we are free to include gravity in our list of "quantized" fields while still having a well-defined geometry.

The only way in which our version of Dyson's model (section 5) differs from our version of quantum gravity (section 6) is that in the former case the constraint involves the approximate validity of the classical Einstein equation, while in the latter case the constraint involves a correlation with path integrals of IMAGINED fluctuations over various regions. Neither of these constraints implies that the original gravitational field is uncertain. Thus, while I pursue both the quantum-gravity and the quantum-less-gravity views, unlike what one might expect, in both cases I have well-defined, non-fluctuating geometries, some of which are being precluded.

In section 2 of this paper I will restrict myself to a toy model with absolute time. While doing so, I will introduce my own version of Sorkin's preclusion principle, specifically designed in a way that will be suitable for the introduction of relativity in the rest of the paper. Then, in section 3, I will introduce general relativistic covariance, but I will ignore the dynamics of the gravitational field, whether Einstein's equation or the action for gravity; I will focus on the behavior of other fields on a fixed-curvature background. Then, in section 4, I will show how to extend section 3 to the fermionic case. One key ingredient is a summary of paper [9], where I defined Grassmann variables and fermionic fields as literally existing outside path integration; this will take up the first half of section 4. The other half of section 4 is devoted to "localizing" fermions into world paths, which follows Bohm's tradition of using position beables for fermions and field beables for bosons. Finally, sections 5 and 6 will be devoted to gravity. Section 5 will use the preclusion principle to substantiate Dyson's view of gravity, according to which gravity exists only classically. Section 6, on the other hand, will illustrate how the preclusion principle can provide a solid ground for the causal set approach to quantizing gravity. Finally, section 7 will summarize my views of what has been done so far and my ideas about possible directions for future research.

2. Toy model: non-relativistic quantum mechanics in absolute time

As we have said, we would like to follow the idea expressed by Rafael Sorkin in his work on the quantum measure, which deals with forbidding low-probability histories. Essentially, that means that we forbid all the global histories in which the global correlation of events does not indicate a desired probability. Thus, if we have a die whose probability of landing on one side is 70 percent, we replace that statement with the statement that we forbid all global histories of the behavior of the die in which the ratio between its two outcomes differs from 70 percent by more than some allowed margin. Since we have abandoned the notion of probability, all the patterns within the margin are equally allowed, despite the fact that their probabilities differ. Thus, it is not a "true" probabilistic process; however, it is globally tailored in such a way that an ignorant observer would still conclude that it is one. We can further hypothesize the existence of parallel universes in which every single "allowed" history is realized. This hypothesis restores determinism.

An advantage of the theory is this: since we no longer believe in probability but rather in the ruling out of "unwelcome" patterns, the fact that the set of "unwelcome" patterns happens to coincide with the one we would expect from the classical version of probability theory is a mere coincidence. Thus, we can replace the criterion for the exclusion of histories with anything else we like, which doesn't have to resemble probability at all. In particular, we can replace a "classical" probabilistic correlation with a quantum mechanical one, without the burden of calling it a probability or explaining how it works. Thus, the "exclusion" theory has two separate benefits. Apart from the fact that it replaces the notion of probability with a well-determined list of parallel universes, it also allows complex probabilities to make just as much sense as real ones.

Let us now try to implement this in quantum mechanics. Since I consider a situation where many particles live in absolute time, we can pass to configuration space, so that the different coordinates of a point in configuration space correspond to the coordinates of the different particles that co-exist "simultaneously" in time, where, due to absolute time, the notion of "simultaneously" is well defined. The potential corresponding to the interaction of the particles with each other in ordinary space will correspond to a fixed potential imposed from outside on the configuration space. In the configuration-space picture, we have a single particle living in a multidimensional space and interacting with that potential.

According to my model, there are many parallel universes. In each of the parallel universes the particle is always localized and moves along a well-defined trajectory, so that each of the "allowed" trajectories is realized in one of the parallel universes. Now we have to come up with a criterion for ruling out forbidden trajectories. In order to avoid the mathematical difficulties of defining the probabilities of patterns over an infinite time interval, we will make sure that the time interval is finite. Suppose we have a trajectory of the particle that is to be tested for whether or not it is allowed. We are going to test it by taking every possible sub-interval of time. In each case, we will imagine some other curve co-existing with the original one, where the original curve is the beable while the imagined curve is quantum mechanical. I will then use the usual rules of quantum mechanics to compute the probability amplitude of the imagined curve, given an initial condition matching the beable. This would be

(1)

and the probability density is

(2)
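Since the quantum amplitude of Eq. (1) is not reproduced here, the interval-by-interval testing procedure can still be sketched with a purely classical stand-in density. The function names, the Gaussian diffusive toy density, and the cutoff `p_min` are all my own hypothetical choices, not the paper's actual formulas:

```python
import math

# Schematic scan over all sub-intervals [t_a, t_b] of a finite history,
# checking that the realized displacement is not "precluded".  As a
# stand-in for the quantum probability density of Eq. (2) we use a
# classical diffusive toy density.

def toy_density(dx, dt, sigma2=1.0):
    """Gaussian density for displacement dx over time dt (toy model)."""
    var = sigma2 * dt
    return math.exp(-dx * dx / (2 * var)) / math.sqrt(2 * math.pi * var)

def is_precluded(traj, dt=1.0, p_min=1e-3):
    """traj: list of positions sampled every dt.  Preclude the whole
    history if ANY sub-interval has density below the cutoff p_min."""
    n = len(traj)
    for a in range(n):
        for b in range(a + 1, n):
            if toy_density(traj[b] - traj[a], (b - a) * dt) < p_min:
                return True
    return False

smooth = [0.1 * k for k in range(10)]      # slowly drifting trajectory
wild = [0.0, 50.0, 0.0, 50.0]              # huge jumps in one time step
```

Here the slowly drifting trajectory survives the scan over every sub-interval, while a single huge jump suffices to preclude the entire history, mirroring the global character of the constraint.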

We would now like to impose a constraint of the form

(3)

We may or may not want to make this criterion stronger by separately imposing probability cutoffs for each individual time interval

(4)

Here, the cutoff time scale is a function of the lowest attainable temperature, defined in such a way that as the lowest possible temperature goes to zero, the cutoff scale approaches infinity. The fact that I still have restrictions for time intervals greater than the cutoff implies that measurements still occur on these scales, even if we don't have a complex system. At the same time, the fact that the postulate is applicable not only to the cutoff interval itself but also, for example, to twice that interval implies that the fields should somehow be adjusted in such a way that two measurements over successive cutoff intervals and one measurement over the doubled interval produce the same result. For example, we can have a double-slit experiment where the separation between the emission of a particle and either of the two slits corresponds to one cutoff interval, and the separation between these slits and the screen also corresponds to one cutoff interval. In this case, the constraint applied to the single interval tells us that, to compute the probabilities of the particle reaching the screen, we have to simply add the probabilities, rather than the probability amplitudes, of the particle going through either of the two slits; on the other hand, if we apply the constraint to the doubled interval, then we are forced to go back to adding probability amplitudes rather than the probabilities themselves.

The only way to reconcile the two is, of course, decoherence (see [13]). If there are enough random processes going on between the emission and the detection of the particles, the interference pattern will become chaotic and the cross terms will average to zero, which means that, due to random phase shifts, the probabilities predicted by sums of amplitudes will coincide with the sums of the actual probabilities. In order for this decoherence phenomenon to ALWAYS happen whenever we are dealing with scales larger than the cutoff, we have to have some lower bound on the "entropy density", so to speak, so that on the cutoff scale we always have enough accumulated entropy for the decoherence to occur. This, of course, is equivalent to the third law of thermodynamics which, based on empirical observations, demands a nonzero lower bound on the absolute temperature. According to this paper, these "empirical observations" are explained by the preclusion of the universes where they don't hold. Or, more precisely, it is possible to have a non-precluded universe with zero absolute temperature, but in that case we have to "preclude" any possible experiments (including the double-slit one) that would pose a problem – in other words, we would have some trivial universes, such as pure vacuum.
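The random-phase mechanism behind this reconciliation can be checked numerically: averaging the two-slit probability over random environmental phases kills the interference cross term, so summed amplitudes reproduce summed probabilities. The amplitude values are illustrative:

```python
import cmath
import random

# Double-slit toy: a1 and a2 are the amplitudes through the two slits.
# Without an environment, the probability is |a1 + a2|^2 (interference).
# With random environmental phase kicks, averaging |a1 + e^{i theta} a2|^2
# over theta washes out the cross term, leaving |a1|^2 + |a2|^2.

random.seed(0)
a1, a2 = 0.6, 0.8  # illustrative amplitudes with |a1|^2 + |a2|^2 = 1

coherent = abs(a1 + a2) ** 2  # maximal constructive interference: 1.96

samples = [abs(a1 + cmath.exp(1j * random.uniform(0, 2 * cmath.pi)) * a2) ** 2
           for _ in range(200_000)]
decohered = sum(samples) / len(samples)  # approaches |a1|^2 + |a2|^2 = 1.0
```

The decohered average sits near 1.0 rather than 1.96: exactly the situation in which adding probabilities and adding amplitudes give the same answer.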

Let us now go back to the scales smaller than the cutoff. An important question to ask is this: suppose we have a complex system that lives inside a region much smaller than the cutoff; will the decoherence effects that produce the measurement phenomenon still occur, despite the fact that we have only explicitly postulated that they do on scales greater than the cutoff? The answer to this, which is vital to the validity of the theory, is yes. Suppose a particle interacts with some complex system between points A and B, where the interval from A to B is much smaller than the cutoff. Consider a third point, C, such that the intervals both from A to C and from B to C are greater than the cutoff. We would then expect the appropriate quantum correlations both between A and C and between B and C. The "appropriate quantum correlation between A and C" would be defined based on Schrödinger's equation on that interval. The crucial step in the argument is that Schrödinger's equation will include the decoherence resulting from the interaction with the complex system between A and B. This means that, on the one hand, we are NOT asked to make sure that A and B correlate in the appropriate way; but AT THE SAME TIME we are asked, between B and C, to arrange our "memories" in such a way that at point C we "remember" that the appropriate correlation between A and B did occur. However, due to the fact that the interval from B to C exceeds the cutoff, we cannot do anything "illegal" between B and C; in particular, we are not allowed to "change memories". This means that we had better have the appropriate correlation between A and B, despite the fact that we were not required to by the constraint. The only objection to this argument is that, by going "backward in time" from C to B, one can see that different scenarios at B might correspond to the same "memory" at C. The answer to this objection is that, by considering many different choices of C, the restrictions on B become stricter and stricter, in order to meet the constraints for all possible choices of C at the same time.
While this is guesswork rather than a theorem, qualitatively it is very plausible that in this way we would effectively restrict the behavior between A and B to what is expected on that interval, despite it never being an official constraint.

It is important to note that the "memory" mechanism is decoherence; thus, the above argument applies only to the situation where an interaction with a complex system occurs between A and B. This means that we can have it both ways: on the one hand, we avoid the Zeno effect if we don't have a complex system; on the other hand, we claim that if we do have a measurement apparatus in the form of a complex system, we will produce both the quantum measurement phenomenon and the Zeno effect on arbitrarily small time scales, as long as they are large enough for us to "fit" that complex system into them. The only difference in the predictions of this theory is that, if there are no complex systems, we would still have "localization" on time scales greater than the cutoff. However, since, as we said earlier, the cutoff can be very large, we can always dismiss any experimental evidence of quantum effects occurring on astronomical scales by claiming that the cutoff is even larger.

It is important to note that this approach is "torn" between Bohm's and Everett's views. Bohm claims that the particle moves along a single trajectory, which he refers to as a position beable, while Everett claims that we have a wave function alone and that each of the peaks produced by decoherence represents a parallel universe. In this paper, we are sitting between the two chairs: on the one hand, we have postulated that the particle's behavior is well defined at every instant of time, which is Bohm-like; on the other hand, we never imposed its equation of motion, as Bohm did, and instead replaced it with preclusion, which in some sense might be seen as Everett-like. This might also be viewed as an "incomplete" Bohm model: we can interpret Bohm's guidance equation as a "preclusion" of every single trajectory except one; we, on the other hand, left more than one trajectory un-precluded. This raises a question: can we retain all of the conclusions of Bohm's model despite being "incomplete"?

According to Bohm's Pilot Wave model, a particle evolves according to the guidance equation, which, roughly speaking, means that it moves in the direction of the gradient of the phase of the wave function. If decoherence of the wave function occurs, it will split into several branches, and the particle will end up in one of them. Since, according to decoherence theory, the branches will not overlap in the future, the particle will stay inside one of the branches; this is a consequence of the fact that its direction of motion is parallel to the gradient of the phase. Since in our case we no longer postulate an equation to this effect, our model makes a much weaker statement regarding the behavior of the beable than the Pilot Wave model does. So what we would like to ask ourselves is whether or not there is enough of an overlap to still retain the conclusion.
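The guidance rule can be sketched numerically. With hbar = m = 1, the Bohmian velocity is the imaginary part of psi'/psi; for a plane wave exp(ikx) this reproduces v = k. The grid and wave number below are illustrative assumptions of mine:

```python
import cmath

# Bohmian guidance velocity v = (1/m) * Im(psi'/psi), with hbar = m = 1,
# estimated by central differences on a one-dimensional grid.  For the
# plane wave psi(x) = exp(i k x) the exact answer is v = k everywhere.

def guidance_velocity(psi, dx):
    """Return Im(psi'/psi) at interior grid points (central differences)."""
    v = []
    for j in range(1, len(psi) - 1):
        dpsi = (psi[j + 1] - psi[j - 1]) / (2 * dx)
        v.append((dpsi / psi[j]).imag)
    return v

dx = 0.005
k = 2.0
psi = [cmath.exp(1j * k * (j * dx)) for j in range(2001)]
v = guidance_velocity(psi, dx)  # interior values sit near k = 2.0
```

Replacing this exact velocity rule with preclusion is precisely the weakening discussed in the text: the trajectory is no longer forced to follow the phase gradient, only forbidden from deviating from it too often.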

Consider again the example of a die: if we throw it infinitely many times, there is 100 percent probability that at least once we will get the same side 100 times in a row, which by itself would be considered a low-probability scenario. From this we see that low-probability scenarios are allowed, as long as they occur a limited number of times. From this point of view, it is possible for the particle to jump from one branch to another, as long as it doesn't do so too often. However, it is important to note that if the particle jumps from branch A to branch B and then stays in branch B, then this amounts to a lot more than one low-probability event. After all, we can choose among many different instants in time when the particle resided in branch A, and we can likewise choose among many instants when it resided in branch B, and no matter which choice we make, we get a new example of a low-probability correlation. In fact, if we imagine that the particle stayed in each of these branches for equal periods of time, then fifty percent of the choices of pairs of events will come from two different branches. Clearly, if 50 percent of the pairs have low probability, the whole history will be forbidden. On the other hand, if we consider a history in which the particle jumps from branch A to branch B and then returns to branch A right away, then, as long as the period of time it resides in branch B is short enough, most pairs of events will come from the same branch, branch A; thus the whole history would have high enough probability to be one of the allowed ones.

Thus, the difference between the outcomes of the Pilot Wave model and ours is that according to the Pilot Wave model the particle is dynamically dictated to stay inside the branch it happened to get into, while according to our model the particle can jump back and forth between branches, as long as there is one branch in which it stays most of the time. Since, by its nature, any classical observation neglects microscopic processes, "most of the time" is more than enough as far as classical phenomena are concerned. Thus, we can take that weaker statement and still conclude that the decoherence effects do occur, just as we were hoping.
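The pair-counting argument above is easy to quantify: for a history of branch labels, count the fraction of time-pairs whose labels disagree. An even split between branches puts roughly half of all pairs across branches, while a brief excursion leaves that fraction small. The sample histories are illustrative:

```python
# Fraction of time-pairs (t_i, t_j) whose branch labels disagree.  If the
# particle splits its time evenly between branches A and B, about half of
# all pairs are cross-branch ("low probability") and the history is
# precluded; a single brief excursion to B leaves most pairs intact.

def cross_branch_fraction(labels):
    n = len(labels)
    pairs = n * (n - 1) // 2
    cross = sum(1 for i in range(n) for j in range(i + 1, n)
                if labels[i] != labels[j])
    return cross / pairs

even_split = ["A"] * 50 + ["B"] * 50   # half the time in each branch
brief_jump = ["A"] * 98 + ["B"] * 2    # short excursion to branch B
```

For `even_split` the cross-branch fraction is about one half, so the history is forbidden; for `brief_jump` it is only a few percent, so the history can remain among the allowed ones.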

3. Relativistic bosonic field in fixed gravitational background

Throughout this section we will use the following definitions:

DEFINITION: Let p and q be two elements of a Lorentzian manifold. We say that p ≺ q, and also that "p is in the causal past of q", if and only if one can travel from p to q without going faster than the speed of light. Whenever either p ≺ q or q ≺ p, we say that p and q are causally related.

DEFINITION: Let S and T be two subsets of the manifold. We say that S ≺ T if we cannot find s ∈ S and t ∈ T such that t ≺ s, while at the same time there exists at least one pair of points s ∈ S and t ∈ T such that s ≺ t.

DEFINITION: Let p and q be two points on the manifold. The Lorentzian distance between p and q is defined to be the length of the shortest geodesic that connects p and q, which is given by

(5)
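In flat Minkowski space the preceding definitions take a simple computable form. The following sketch is my own illustrative code, assuming units with c = 1, signature (+, -, -, -), and points represented as tuples (t, x, y, z):

```python
import math

# Flat-Minkowski versions of the causal order and the Lorentzian distance
# used in the definitions above.  This is a special-relativistic sketch,
# not the general curved-manifold definition.

def prec(p, q):
    """p is in the causal past of q: q - p is future-directed, non-spacelike."""
    dt = q[0] - p[0]
    dr2 = sum((qi - pi) ** 2 for pi, qi in zip(p[1:], q[1:]))
    return dt >= 0 and dt * dt >= dr2

def causally_related(p, q):
    return prec(p, q) or prec(q, p)

def lorentzian_distance(p, q):
    """Proper time along the geodesic joining p and q (0 if not causally related)."""
    if not causally_related(p, q):
        return 0.0
    dt = q[0] - p[0]
    dr2 = sum((qi - pi) ** 2 for pi, qi in zip(p[1:], q[1:]))
    return math.sqrt(dt * dt - dr2)
```

For example, the origin precedes (2, 1, 0, 0) with proper separation sqrt(3), while two simultaneous, spatially separated events are not causally related at all.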

We will now generalize the approach described in the previous section to the case of a space-time with fixed curvature. Naively, we could simply replace points in time with points in space-time. However, this approach does not work. Consider, for example, a strong positive charge that attracts two negative charges to it. If we single out the locations of the two negative charges, we would get an improbable event: two negative charges close together without any positive charge to hold them. This means that we have to be careful to take into account all the "relevant information". Relativistic invariance, however, provides guidance as to what the "relevant information" is: namely, it is the past light cone. However, if we were to consider the entire light cone, we would be able to come up with scenarios in which yesterday has the wrong correlation with today, which is deemed okay because the day before yesterday "fixed" it. It is true that, in the example of the previous section, we did just that when we confirmed that it is possible for a particle to jump from branch A to branch B of the decoherence pattern as long as it returns to branch A right away. But at the same time we also remarked that this should not be happening consistently. And if we are to take the entire light cone into account, we might as well have a CONSISTENT pattern of two consecutive days not matching each other because of some other event in the past. The way to address this issue is to cut the light cone with a surface. If we had only one event, then defining the surface in terms of the Lorentzian distance to that event would imply that the surface hyperbolically approaches the light cone, giving us some unpleasant singularities. But since a lot of quantum mechanics problems deal with systems of size more than one point, we may assume that such is always the case, which allows us to view a single event as two different events spaced very closely together.
In this case, we can define a surface in terms of the minimum of the Lorentzian distances to these two events. It is easy to see that this surface is compact. We will now generalize to the case where the number of these points can be either finite or infinite. We will call the set of these points S:

DEFINITION: Let a Lorentzian manifold be given, and let S be some set of events on that manifold. The past shadow of S of a given order is defined to be

(6)

Likewise, the future shadow of S of a given order is defined to be

(7)
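Since the defining formulas are stated abstractly, here is an illustrative 1+1-dimensional flat-space sketch of the minimum-Lorentzian-distance idea: a point lies on the past shadow of S of a given order if its minimal positive Lorentzian distance to the causally later elements of S equals that order. The membership test and all names are my own assumptions:

```python
import math

# "Cut the light cone with a surface" in 1+1 Minkowski space: membership
# test for the past shadow of a finite event set S, defined via the
# minimum Lorentzian distance to the elements of S (illustrative only).

def tau(p, q):
    """Lorentzian distance in 1+1 flat spacetime (0 if spacelike)."""
    dt, dx = q[0] - p[0], q[1] - p[1]
    return math.sqrt(dt * dt - dx * dx) if abs(dt) >= abs(dx) else 0.0

def in_past_shadow(p, S, tau0, tol=1e-9):
    """Does p lie on the past shadow of S of order tau0?  Take the minimum
    positive Lorentzian distance over elements of S causally after p."""
    causal = [q for q in S if q[0] >= p[0] and tau(p, q) > 0]
    if not causal:
        return False
    return abs(min(tau(p, q) for q in causal) - tau0) < tol

S = [(0.0, -1.0), (0.0, 1.0)]  # two closely "spaced" events on the t = 0 slice
```

With two events rather than one, the min-distance surface stays away from the light-cone singularities mentioned above, which is the motivation for treating a single event as a closely spaced pair.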

I will now define the probability amplitude of the transition from the field configuration on the earlier surface to the field configuration on the later surface to be

(8)

This definition matches what is standardly done in quantum field theory. When we compute a set of propagators for some set of fields and some set of source currents, we are implicitly assuming that the field and the source are two separate interacting fields. We then compute the probability amplitude associated with specific values of the source, namely, that field having non-zero values at certain fixed points and zero values everywhere else. To do that, we integrate over everything that is not specified, namely, the remaining field degrees of freedom. Since the source field is just another field, there is no reason for it to receive "beneficial treatment" when it comes to the general interpretation of quantum mechanics. At the same time, if we treat all fields as "source fields", there would be no "dummy indices" left to integrate over; on the other hand, if none of the fields are sources, then everything is a dummy index, and hence there is no meaningful information left. So, instead of limiting the "observables" to one specific set of fields, we decide to limit the observables to a specific set of points; at that set of points, all fields are source fields, while everywhere else all fields are dummy indices. Since shortly, mimicking the procedure of the previous section, we will be considering all possible choices of such sets of points, our overall procedure treats all points on an equal footing. This is different from what is done with source currents, since in that case the source field is assumed to be defined everywhere – in particular, outside of the selected points it is assumed to be zero. In our case, on the other hand, we are no longer assuming that the fields vanish outside of the fixed set of points. That is what allows us to treat all fields as "sources" while still having degrees of freedom to integrate over.
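The "integrate over the dummy indices" logic has a finite-dimensional caricature: take two coupled Gaussian "field values", treat one as observed and integrate the other out, and the observed variable inherits an effective Gaussian weight. The coefficients and all names are illustrative choices of mine:

```python
import math

# Two coupled Gaussian field values phi1, phi2, with weight
# exp(-(a*phi1^2 + b*phi2^2 + 2*c*phi1*phi2)/2).  Integrating out the
# unobserved phi2 leaves an effective Gaussian weight for phi1 with
# coefficient a - c^2/b (here 2 - 1/3 = 5/3).

a, b, c = 2.0, 3.0, 1.0

def marginal_coeff_numeric(n=20000, lim=10.0):
    """Numerically integrate out phi2 at two values of phi1 and infer the
    effective quadratic coefficient from the ratio of the marginals."""
    h = 2 * lim / n

    def marg(phi1):
        s = 0.0
        for k in range(n + 1):
            phi2 = -lim + k * h
            s += math.exp(-(a * phi1 ** 2 + b * phi2 ** 2
                            + 2 * c * phi1 * phi2) / 2)
        return s * h

    # marg(phi1) is proportional to exp(-a_eff * phi1^2 / 2), so
    # a_eff = 2 * ln(marg(0) / marg(1)).
    return 2 * math.log(marg(0.0) / marg(1.0))

a_eff = marginal_coeff_numeric()  # analytic answer: a - c^2/b = 5/3
```

The observed variable plays the role of the "source" points, and the integrated-out variable plays the role of the dummy indices; the essential point is that the marginal is still well defined without the unobserved value ever being specified.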

The probability density will be

(9)

Here, we attach the label A because we will later introduce versions B and C as alternative ways of defining the probability. Finally, we would like to impose a constraint on the global correlation, which is the space-time analogue of the constraint imposed in the previous section:

CONSTRAINT 1 VERSION A: Let the space-time be a d-dimensional manifold. Then, for every choice of finitely many points,

(10)

This, however, might prove to be too restrictive, given that the Lorentz group is non-compact, so there are infinitely many reference frames we have to satisfy simultaneously, where a reference frame roughly corresponds to a choice of the points above. Since I don't know one way or the other whether it is possible to simultaneously satisfy all of these constraints, I will write an "easier" version of the above constraint, version B. So, if in future work it turns out that the "harder" version (version A) is too restrictive, the easier version will be used; if it turns out that the harder version is not too restrictive, then the harder one will be used.

In designing the easier version I will appeal to the following observation: while from the fundamental physics point of view all frames are equivalent, in practice, in any given region of space-time, fields propagate with limited velocity relative to each other. This is reflected in the following: on the one hand, fields are locally smooth, while on the other hand, fields can vary by arbitrarily large amounts in the vicinity of the light cone, despite the fact that the Lorentzian distance is arbitrarily small in that region. The flip side of this argument is this: suppose we have a wave packet whose shape doesn't change in its own reference frame (this, of course, never happens, due to the nature of the wave equation, but it is good enough for a simple toy model to illustrate my point). Then, in a frame moving arbitrarily close to the speed of light with respect to the wave packet, the wave packet will undergo Lorentz contraction to an arbitrarily small size; this means that, as the wave packet passes a given point, the fields change arbitrarily fast. Due to the fact that the Lorentz group is non-compact, most wave packets should be moving arbitrarily close to the speed of light with respect to any given frame. This means that fields should fluctuate arbitrarily fast in any given frame. The fact that this is not happening implies that the fields have "picked" a frame with respect to which they do not move too close to the speed of light.

Even though this sounds like a violation of relativity, based on what we have just said, it is a simple consequence of the fields having bounded derivatives. Thus, it can be enforced with the following Lorentz-covariant constraint:

CONSTRAINT 2: The contraction of the partial derivative of each field with itself, as well as any other contractions of partial derivatives with themselves, are all bounded by some fixed large number.
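On a sample grid this constraint becomes a finite-difference check. The field, the grid spacing, and the bound B below are illustrative assumptions of mine:

```python
import math

# Finite-difference check of Constraint 2 in 1+1 flat spacetime: the
# contraction (d phi/dt)^2 - (d phi/dx)^2 must stay below a fixed bound B
# everywhere on the grid (forward differences, signature +-).

def violates_bound(phi, dt, dx, B):
    """phi: 2-D list, phi[i][j] sampled at t = i*dt, x = j*dx."""
    nt, nx = len(phi), len(phi[0])
    for i in range(nt - 1):
        for j in range(nx - 1):
            d_t = (phi[i + 1][j] - phi[i][j]) / dt
            d_x = (phi[i][j + 1] - phi[i][j]) / dx
            if abs(d_t * d_t - d_x * d_x) > B:
                return True
    return False

dt = dx = 0.1
# A right-moving lightlike wave sin(t + x): the contraction vanishes, so
# the bound is respected even though the field itself varies rapidly.
grid = [[math.sin(i * dt + j * dx) for j in range(50)] for i in range(50)]
```

A field oscillating rapidly in time alone, by contrast, produces a large contraction and is caught by the bound, which is the covariant way of saying that its "velocity relative to the ether" is too extreme.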

Now, going back to designing an easier version of Constraint 1, we will do the following: we will let ourselves "shift" a shadow by a very small amount in the time direction defined by the reference frame of that shadow. In particular, we will let the displacements vary in different parts of the shadow, as long as they are all bounded by some small number. We will only consider the cases where the probability stays in the specified range regardless of the shift, while discarding all other choices of points. In light of what we have said regarding fast variations in some frames, this means that we only consider the points whose common "reference frame" doesn't move too fast with respect to the "ether".

First, let us provide some definitions:

DEFINITION: Let T be a set of points and let U be some set. We say that U is a "shifted past shadow" of T with a given tolerance if the following is true:

a) Any element of U is to the future of at least one element of the past shadow of T, but is not to the past of any of the elements of that shadow.

b) If an element of U is to the future of an element of the past shadow of T, then the Lorentzian distance between them is less than some small constant.

c) If a curve connects an element of the past shadow of T to an element of T, is everywhere either timelike or lightlike, and is directed to the future, then it passes through exactly one element of U.

We can likewise define a shifted future shadow:

DEFINITION: Let T be a set of points and let U be some set. We say that U is a "shifted future shadow" of T with a given tolerance if the following is true:

a) Any element of U is to the past of at least one element of the future shadow of T, but is not to the future of any of the elements of that shadow.

b) If an element of U is to the past of an element of the future shadow of T, then the Lorentzian distance between them is less than some small constant.

c) If a curve connects an element of the future shadow of T to an element of T, is everywhere either timelike or lightlike, and is directed to the past, then it passes through exactly one element of U.

Finally, we write an "easier" version of constraint 1:

CONSTRAINT 1 VERSION B: Let the space-time be a d-dimensional manifold. Then, for every choice of finitely many points,

(11)

As was previously said, the implication of our ability to select a "shifted" shadow is that, if the fields fluctuate a lot, then the variations of the probability as we shift the shadow will be too large to fit into any specified range; hence such a choice of points would not provide any information of interest. Thus, the above constraint effectively applies only to reference frames in which the variation is small.

Finally, we will show an alternative procedure to "version B" that has the same effect: a floating lattice. This we will call version C. In this approach, we note that in order for the path integral to be rigorously defined, we need to employ some kind of discretization anyway. So we will take advantage of that and use a discretized shadow surface as a replacement for the continuum-based "shifted" shadow we have just introduced. The key factor in both cases is that neither the "shifted" shadow nor the "discretized" shadow coincides with the original shadow, but both can be regarded as approximations of it when the fields do not fluctuate too fast in the reference frame defined by the shadow. Hence, if we ignore the issue of infinitely many degrees of freedom, we can claim that if we have one of the two we don't need the other. The advantage of the "discretized" shadow option is that we do have to confront the infinity issue at some point anyway. The advantage of the "shifted" shadow, on the other hand, is to make the point that the above approach is, in spirit, continuum-based, and that the infinity issues are unrelated to it. So, to meet both agendas, I will keep both approaches as far as this paper is concerned.

DEFINITION: Let be a set of points. Its past -submanifold is the union of its past -shadows corresponding to all

DEFINITION: Let , , and be some finite sets of points, and suppose . We say that is the discretized past -shadow of and is the discretized past -lattice based on if the following statements are true:

a) Every element of is causally after at least one element of the past -shadow of and causally before at least one element of

b) Every single element of the past -shadow of is to the past of at least one element of . Likewise, every single element of is to the future of at least one element of

c) If the volume of the past -submanifold of is , then the number of elements of is less than and greater than

d) If and are two elements of and the Lorentzian distance between them is greater than , then there are at least points that are to the future of and to the past of (where my definition of past and future includes the requirement that they are timelike-separated from and )

e) If , then there exists that is to the causal past of such that the Lorentzian distance between and is less than , but at the same time we have MORE than points that are part of the shadow of and are both to the causal future of and to the causal past of . This criterion assures that the reference frame defined by the discretized shadow approximates the one defined by the actual shadow, and rules out the issue of lightcone singularities.

We will likewise introduce a definition of the "future" discretized shadow:

DEFINITION: Let be a set of points. Its future -submanifold is the union of its future -shadows corresponding to all

DEFINITION: Let , , and be some finite sets of points, and suppose . We say that is the discretized future -shadow of and is the discretized future -lattice based on if the following statements are true:

a) Every element of is causally before at least one element of the future -shadow of and causally after at least one element of

b) Every single element of the future -shadow of is to the causal future of at least one element of . Likewise, every single element of is to the causal past of at least one element of

c) If the volume of the future -submanifold of is , then the number of elements of is less than and greater than

d) If and are two elements of and the Lorentzian distance between them is greater than , then there are at least points that are to the future of and to the past of (where my definition of past and future includes the requirement that they are timelike-separated from and )

e) If , then there exists that is to the causal future of such that the Lorentzian distance between and is less than , but at the same time we have MORE than points that are part of the future shadow of and are both to the causal past of and to the causal future of . This criterion assures that the reference frame defined by the discretized shadow approximates the one defined by the actual shadow, and rules out the issue of lightcone singularities.

We will now define another "modified" version of the probability. Since we are employing discreteness anyway, in the computation of the probability amplitude we will use only the degrees of freedom associated with elements of rather than the entire past submanifold. This will allow us to avoid the infinity problem. We will replace vector fields with functions that correspond to path integrals of the vector fields along geodesics connecting relevant pairs of points. Gravity will be replaced by the partial ordering that determines the causal relations between these points. In papers [6], [7], [8] and [10] we have explored ways of rewriting the Lagrangian in terms of the above coordinate-free quantities.

Since vector fields are now functions of pairs of points, we cannot, strictly speaking, demand that the "fluctuated" version of a vector field agrees with the original one "only" at the discretized shadow. Rather, our restriction should be that the two-point functions agree for any given pair of points if AT LEAST ONE of the elements of that pair lies on the discretized shadow.

DEFINITION: Let and be finite subsets of a Lorentzian manifold , and suppose . If is some real-valued function on the set of pairs of points of , then is the set of all other functions such that for all , and for any point , regardless of whether is an element of or not.
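The restriction just defined amounts to a membership test over pairs of points. The following is an illustrative sketch under my own naming (not notation from the paper): two two-point functions are identified when they agree on every pair at least one of whose members lies in the distinguished subset `S` (the discretized shadow).

```python
# Hypothetical helper: check whether two two-point functions f and g
# agree on every pair (p, q) with at least one member in S.
def agree_on_shadow(f, g, points, S):
    return all(f(p, q) == g(p, q)
               for p in points for q in points
               if p in S or q in S)

points = [0, 1, 2, 3]
S = {0, 1}
f = lambda p, q: p + q
g = lambda p, q: p + q + (1 if (p, q) == (2, 3) else 0)  # differs only off-shadow
assert agree_on_shadow(f, g, points, S)        # the pair (2, 3) is never checked
assert not agree_on_shadow(f, g, points, {2})  # now (2, 3) is checked
```

The point of the toy example is that fluctuations confined to pairs entirely off the shadow leave the restricted data unchanged.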

We then define the transition amplitude in the following way:

(12) |

and

(13) |

We now define our ”version C” of the probability:

CONSTRAINT 1 VERSION C: Let be a d-dimensional manifold. For every number , if we have points ,

(14) |

4. Modifications for the fermionic case

Adapting the above approach to the fermionic case is tricky because, if we are to assume the existence of a fixed distribution of the fermionic field, as was done for the bosonic case, this requires defining Grassmann numbers outside of integration. This I have already done in [9]. In that paper, I defined Grassmann numbers to be elements of a space equipped with a commuting dot product (), an anticommuting wedge product () and a measure that takes both positive and negative values. The vector space is multidimensional with unit vectors , , etc., and these are ordered. The relation between the products is for (which means for ). We further define , , , , etc. We also have a measure that satisfies

(15) |

In the previous paper, it was shown that in this case

(16) |

as long as the on the left-hand side is expressed in terms of wedge products.
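The algebra just described can be illustrated with a small sketch. This is a generic finite-dimensional Grassmann algebra with a Berezin-type integral standing in for the signed measure, assuming the standard conventions rather than the paper's exact construction:

```python
# Illustrative sketch: a Grassmann element is a dict mapping an ordered
# tuple of generator indices to a real coefficient; the wedge product
# picks up a sign from reordering the generators.

def wedge(a, b):
    """Anticommuting wedge product of two Grassmann elements."""
    out = {}
    for ka, ca in a.items():
        for kb, cb in b.items():
            if set(ka) & set(kb):
                continue  # repeated generator: theta_i ^ theta_i = 0
            # bubble sort the merged indices, tracking the permutation sign
            sign, idx = 1, list(ka + kb)
            for i in range(len(idx)):
                for j in range(len(idx) - 1 - i):
                    if idx[j] > idx[j + 1]:
                        idx[j], idx[j + 1] = idx[j + 1], idx[j]
                        sign = -sign
            key = tuple(idx)
            out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def berezin(a, n):
    """Berezin-type integral over generators 1..n: keep the top coefficient."""
    return a.get(tuple(range(1, n + 1)), 0)

theta1 = {(1,): 1.0}
theta2 = {(2,): 1.0}
assert wedge(theta1, theta2) == {(1, 2): 1.0}
assert wedge(theta2, theta1) == {(1, 2): -1.0}  # anticommutativity
assert wedge(theta1, theta1) == {}              # nilpotency
assert berezin(wedge(theta1, theta2), 2) == 1.0
```

The asserts check the defining properties (anticommutativity and nilpotency) that any such construction must reproduce.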

Furthermore, in that paper we defined the spinor field in a more geometric way. In particular, we considered two scalar fields, and (one for the particle and one for the antiparticle), and four non-orthonormal vierbeins, , , and . We assume that in the reference frame defined by these non-orthonormal vierbeins, our spinor field is given by . From these four non-orthonormal vierbeins, we use the Gram-Schmidt process to derive four orthonormal ones: . We then rotate the spinor from the u-based coordinate system to the v-based one. This gives us a complete spinor all four of whose components are non-zero. Thus, we have a function that takes four non-orthonormal vierbeins and two real scalar fields and returns a spinor field. We will then define a measure on the domain of in the following way:

(17) |

After having done these modifications, we will then do the following replacement:

(18) |

Here, are just constant vectors; they are NOT unit vectors corresponding to any variable. Thus, they do NOT imply that we are going back to the view of as independent variables.

From this point on we can simply copy everything that was done in the previous section, taking these modifications into account.

However, there is one more, unrelated, thing that we might want to change for fermions. As we know from Bohm's model, while field beables are more convenient for bosons, position beables are more convenient for fermions. Among the reasons is the fact that fermions, rather than bosons, are the main ingredients of the molecules that make up the objects that are localized in space. Personally, I have not decided for sure whether I would like to turn field beables into position beables when it comes to fermions. But in case I do, I can potentially impose a constraint which demands that, while my field is well defined at every single point in space, it is close to outside of a vicinity of some set of curves.

CONSTRAINT 3: We still have the fields , and throughout space-time. But the set on which and are non-zero looks like a set of piecewise continuous, piecewise differentiable curves whose number is no greater than . While these curves are timelike at every point, they go both forward and backward in time. If I orient them in a particular way, these curves have the following property: each of these curves starts at a point that is not to the future of any other point, and ends at a point that is not to the past of any other point (in other words, it starts at a "past" surface of space-time and ends at a "future" surface of space-time). The total length of each curve is bounded by (by making greater than the separation of the past and future surfaces of the universe, but not too much greater, I assure that most of each curve is taken up by future-directed pieces). On the future-directed piece of each curve , while on the past-directed piece of each curve . The future-directed pieces are interpreted as particles, while the past-directed pieces are interpreted as antiparticles. Their junctions are interpreted as pair creation and pair annihilation.

There is no curve that represents a photon that produces pair creation or is produced by pair annihilation. We view both electrons and photons similarly to classical physics: electrons are localized, while the electromagnetic field that they produce fills the entire space. This agrees with the traditional approach to the Pilot Wave model, where position beables are used for fermions while field beables are used for bosons. However, while pair creation or annihilation does not have photonic lines attached to it, there has to be a correlation between the strength of the electromagnetic field and the frequency of these creation/annihilation events, which I hope will be reproduced once the model is investigated numerically.

PLEASE NOTE: the curves just described are NOT given in advance. Rather, I am saying that a given field configuration is allowed if we can FIND a set of curves FOR THAT SPECIFIC FIELD CONFIGURATION. In other words, there is no a priori structure on the manifold in the form of curves, and so no symmetries have been violated.

It is easy to see that if both and are small, then all four components of the vector produced by rotation of from the non-orthonormal to the orthonormal frame will still be small; in other words, will be small regardless of the values of .

Thus, what we see above is that we are making sure that the spinor field is small unless we are close to at least one pair of points on a segment of a curve. The reason we need more than one point is that, due to the fact that the Lorentzian distance is in the vicinity of a light cone, the whole space-time would meet the criterion of "close enough" if it were based on a single point. For that same reason, we impose a constraint that the Lorentzian distance between these pairs of points should be greater than or equal to . By imposing two separate inequalities for and we have also made sure that is small away from the vicinity of future-directed segments, while is small away from the vicinity of past-directed segments.

Two things are worth noting:

1) While and are outside of a specific set of curves, can be anything we like throughout the whole space. This means that we can use to answer the question of how the space attains its manifold structure, and we can do so while still believing that fermions are localized in space. We will talk more about this in section 6 on causal set theory.

2) By constraining our curves to start at a point that is not causally after any other point and to end at a point that is not causally before any other point, we have essentially made sure that more than 50 percent of each curve is taken up by the particle region rather than the antiparticle one. Furthermore, by limiting the length of each curve by we have found a way of making the ratio of the antiparticle part of each curve to the particle part as small as we like, by making sufficiently small. This can be an explanation of why our universe is matter dominated.

5. Possible validity of Dyson's view that gravity does not have to be quantized

Let us now see how I can apply my model of quantum mechanics to support Dyson's view (see [2]) that, since gravitons have never been observed, we do not have to quantize gravity. The alternative of quantizing gravity will be explored in the next section.

We will start with the bosonic case, where we will not have to worry about black hole singularities produced by point particles, and then we will move on to the fermionic case.

As we have seen in the above sections, we have a set of parallel universes, in each of which we have a manifold with a well defined metric, on which all fields, both fermionic and bosonic, are likewise well defined. We then imposed some relativistically invariant, albeit global, constraints that determined whether each particular parallel universe is "allowed" or "forbidden". We then claimed that if we are living in one of the "allowed" parallel universes, we would observe the expected quantum effects.

Now, since in each parallel universe all fields are well defined, we might as well impose another constraint on whether or not a universe is "allowed": namely, that Einstein's equation is approximately satisfied. We cannot say that Einstein's equation is satisfied exactly. After all, by the Bianchi identities,

(19) |

but at the same time, due to the quantum fluctuations

(20) |

But due to the fact that quantum processes are small, and thus have small gravitational field, we can still say that Einstein’s equation holds approximately. By approximately I mean the following:

(21) |

where is a small perturbation to , where by ”small” I mean that it satisfies constraints

(22) |

Thus, by making and of the magnitude of the gravitational fields of classical objects, albeit very small, we will assure that there is no local correlation between the fluctuation of the gravitational field on the microscopic level and any of the microscopic processes that occur. In other words, quantum mechanical particles do not gravitate, which supports Dyson's view that there is no such thing as quantum gravity, or the graviton.

This raises another question: what if we have a large body of mass which explodes, and particles fly out in many different directions, very far away from each other? Then each individual particle, being very far from all the other particles, will not gravitate. At the same time, if their final distribution is regarded as spherically symmetric, we can draw a very large sphere that encompasses all of these particles, and we would expect the gravitational field on that sphere to be described by the Schwarzschild solution. This is something we cannot deny, since it can potentially be supported by experiments, albeit hard-to-arrange ones. To answer this question, let us introduce some names. We will call the actual energy-momentum tensor, and we will call the "apparent energy-momentum tensor". Thus, by the Bianchi identity, the apparent energy-momentum tensor is conserved while the actual one is not. Now, back when the mass had not yet spread out, we agree that, due to its density being sufficiently large, both the apparent and the actual energy-momentum tensors were large, and they were approximately equal to each other. Then, after the mass spread out, the actual energy-momentum tensor became very small, which means that the apparent one no longer needs to have any local correlation with the actual one. However, due to the fact that the apparent energy-momentum tensor USED to approximate the actual one when they both WERE large, the conservation of the apparent energy-momentum tensor demands that it look like some kind of distribution of the latter throughout space. True, that distribution does not have to correlate with the specific behavior of the particles being spread out. But since we cannot measure the behavior of these particles, as long as this distribution obeys the conservation law, there is no way we can "get caught" violating general relativity.

Now one thing to address is how we avoid the black hole singularity of the gravitational field produced by fermions, if we take the view that they are localized onto curves. The way I avoid this is that when I write for the fermionic field, I replace and with and , which are "blurred out" versions of the original fields; that is, they are non-zero in a close vicinity of a curve. They are defined as follows:

DEFINITION: Let be a point on a manifold . Let (the index stands for particle) be the set of pairs of points such that both fall on the same future-directed piece of a curve and both are lightlike separated from . If contains no elements, then . On the other hand, if it does contain some elements, then we let be a subset of that minimizes the Lorentzian distance between the pairs of elements (with probability 1 it will be a one-element set, but I do this for the sake of generality). Let be the Lorentzian distance between the two elements of any pair belonging to . Then

(23) |

, where is some function whose value is outside of a vicinity of .

The reason we define the vicinity of a curve in terms of two points rather than one is that, due to the fact that the Lorentzian distance is in the lightcone of a single point, we would have been able to fill the whole space. On the other hand, by considering two points defined by the intersection of the lightcone coming from with the curve representing the path of a fermion, we make sure that the point is very close to the curve in the reference frame defined by that curve, which is what we want.

We will now give the same definition for the antiparticle. We do this by cutting and pasting the definition above, replacing with , "future-directed segment" with "past-directed segment", with , and with :

DEFINITION: Let be a point on a manifold . Let (the index stands for antiparticle) be the set of pairs of points such that both fall on the same past-directed piece of a curve and both are lightlike separated from . If contains no elements, then . On the other hand, if it does contain some elements, then we let be a subset of that minimizes the Lorentzian distance between the pairs of elements (with probability 1 it will be a one-element set, but I do this for the sake of generality). Let be the Lorentzian distance between the two elements of any pair belonging to . Then

(24) |

As far as the vierbeins are concerned, they will be left unchanged, since I have no reason to expect them to be singular in the vicinity of the curve:

(25) |

The important thing to notice is that the fermionic action should be based on rather than , and AT THE SAME TIME rather than for gravity. If we were to use for the fermionic action, we would likely run into the famous problem from classical physics of why the electron does not explode due to same-charge repulsion. At the same time, if we were to use for gravity, we would end up with a black hole. But by using for the fermionic action and for gravity, we have basically said that while the gravitating matter of a fermion is distributed over some small region, its charge is all concentrated at the (pointlike) center. This means that neither same-charge repulsion nor a black hole occurs.

6. Possible quantization of gravity through the causal set approach

We will now explore the alternative in which we do quantize the gravitational field. While there are many different models of quantum gravity, and it is certainly worthwhile to explore the possible implications of our approach for each of them, for the purposes of this paper we will limit ourselves to causal set theory. The reason I made this choice is that a causal set is a discretized space-time whose microscopic structure is manifestly Lorentz invariant outside of path integration. Since this paper is built on the concept of the existence of well defined, relativistically invariant field beables at every point, if we are to discretize our space-time at all, we had better do so without violating the relativistic invariance that we strive to preserve. This makes the causal set approach an ideal one.

A causal set, first introduced by Rafael Sorkin (for reviews see [3] and [4]), is any set with a partial ordering , where we interpret its elements as physical events and the partial ordering as causal relations between these events. If , this means that is in the causal past of . The set is assumed to be discrete with respect to the partial ordering. That is, if , then there are only finitely many elements that satisfy . No other structure, including a coordinate system, is assumed. This was motivated by an observation of Hawking, who showed that, given a Lorentzian manifold, we can describe its metric, up to Weyl scaling, on the basis of causal relations alone. In the discrete scenario, the Weyl scaling can be defined by a simple count of the points in any given region. Thus, for the discrete case, causal relations give us complete information about the metric. The motivation for a causal set is that it is a way to discretize space-time without violating Lorentz covariance. If we try, for example, to discretize space-time using a lattice, then the edges of the lattice will be different from the diagonals, making one direction "better" than the other. On the other hand, in the case of a causal set, since the only structure is the causal relation, which is manifestly covariant, covariance holds by definition.
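As a toy illustration of the picture just described (my own minimal setup, assuming 1+1 dimensional Minkowski space), a causal set can be generated by randomly scattering points and keeping only the order relation between them; counting points then plays the role of volume:

```python
# Minimal sketch: sprinkle points in a unit region of 1+1 Minkowski
# space and record the relation x < y iff y lies inside the future
# lightcone of x. The pair (points, relation) is the causal set.
import random

def sprinkle(n, seed=0):
    rng = random.Random(seed)
    return [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(n)]  # (t, x)

def precedes(p, q):
    dt, dx = q[0] - p[0], q[1] - p[1]
    return dt > abs(dx)  # timelike separated and future-directed

pts = sprinkle(100)
relation = {(i, j) for i in range(len(pts)) for j in range(len(pts))
            if precedes(pts[i], pts[j])}

# Transitivity, a defining property of a causal (partial) order,
# holds automatically for the lightcone relation:
for (i, j) in relation:
    for k in range(len(pts)):
        if (j, k) in relation:
            assert (i, k) in relation
```

Note that no coordinate grid survives into `relation` itself: only the order data does, which is the sense in which the construction is manifestly covariant.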

In papers [6], [7], [8] and [10], we proposed a way to define fields and their Lagrangians on a causal set. Scalar fields are defined in the usual way, as complex-valued functions. Vector fields are defined in terms of real-valued functions on the set of pairs of points of a causal set (in the case of a manifold we interpret them as the path integral of the vector field along the geodesic that connects the two points). The gravitational field is defined in terms of the causal structure itself, by appealing to Hawking's observation. And finally, spinor fields are described in terms of four vector fields and two scalar fields in the same way as described in the previous sections of this paper, as well as in [9] and [10]. Since vector fields are defined in terms of real-valued functions on pairs of points, this ultimately implies that fermions are described in terms of a combination of real-valued functions on single points ( and ) together with real-valued functions on pairs of points (the ones corresponding to ). This means that if we put together the scalar field, the gauge field, the fermionic field, and gravity, we have an action of the form

(26) |

where are real-valued functions on single points and are real-valued functions on pairs of points. The explicit expression for can be found in these papers.

However, while the actions were defined, one question was not addressed: what to do with these actions? After all, since the causal relations are viewed as the gravitational field, they are essentially a dummy index we are integrating over, so no causal relations are given in advance. Since causal relations are the only defining feature of the topology, this means that any pair of elements of a causal set is a priori just as far from, or just as close to, each other as any other pair of elements, which means that their propagators, if they are defined at all, should be identical. This is where the approach used in this paper comes to the rescue: we have different parallel universes, in each of which the causal relations are given. Our only task is to determine which of these universes are allowed and which are forbidden. As we perform the task of determining that, we will of course have to compute some version of the propagators of the fields. But we will have no trouble doing that, since for each particular universe we are testing, we will simply assume the causal relations of that particular universe. In other words, the gravitational field will be a causal set version of the relativistically covariant beable discussed previously, in much the same way as the other fields. We then have to introduce fluctuations away from this beable configuration everywhere outside a given set of fixed points and their -shadow.

We now have to adapt what was done before to the causal set approach. First of all, when we were defining the shadow we were using the Lorentzian distance. So we have to define the Lorentzian distance for a causal set. We simply adapt the definition of Lorentzian distance given in [11] and [12]. First consider flat Minkowski space and two timelike-separated points in that space. We can rotate the coordinate system so that these two points lie on the -axis, with coordinates and . If we have an arbitrary future-directed curve that connects them, then its length is

(27) |

The second equal sign in the above equation is based on the assumption that the curve is future-directed. Thus, while it is not true that the length of every single curve that connects and is less than the Lorentzian distance between them, it is true of future-directed curves. For example, if we did not impose the constraint that the curve is future-directed, then by going a million light years into the future and back we would have ended up with a curve of arbitrarily large length; but if we require the curve to be future-directed, this is not possible.

Now, in [11] and [12] they simply carried the above statement over to the discrete case of causal sets. In this case, a future-directed curve is replaced by a future-directed chain of points, . Selecting the longest possible curve is replaced by selecting the chain that has the largest possible number of points. Apart from the above, there is a side benefit: they are automatically assured that the chain of points does, in fact, approximate a curve, since by making sure that it has the largest possible number of points they have also made sure that no points have been "skipped", which means that the points are spaced as densely as possible.

Numerical studies were done in [11] and [12], where it was confirmed that in the case of a Poisson distribution of points on a Lorentzian manifold there is, in fact, a close correlation between the Lorentzian distance between two events and the length of the longest chain of points that connects them. Thus, from now on I will simply assume that for a causal set the Lorentzian distance is defined in the following way:

(28) |

DEFINITION: Let be a Lorentzian manifold, and let be some set of events on that manifold. The past shadow of of order is defined to be

(29) |

Likewise, the future shadow of of order is defined to be

(30) |

The definition of the "shifted" shadow can likewise be translated into the causal set context:

DEFINITION: Let T be a set of points and let U be some set. We say that U is the "shifted past -shadow" of T of tolerance if the following are true:

a) Any element of is to the future of at least one element of but is not to the past of any of the elements of the above.

b) If there is a sequence of points where and , then we have , where stands for the shortest possible discretized distance in a causal set (i.e., the Lorentzian distance between any points for which there is no satisfying )

c) Suppose we have a sequence of points where is an element of and is an element of . Then either for some k, or else there exists satisfying for some k.

We can likewise define the shifted future shadow:

DEFINITION: Let T be a set of points and let U be some set. We say that U is the "shifted future -shadow" of T of tolerance if the following are true:

a) Any element of is to the past of at least one element of but is not to the future of any of the elements of the above.

b) If is any sequence of points satisfying and , then we have , where stands for the shortest possible discretized Lorentzian distance of a causal set and is some small constant

c) Suppose we have a sequence of points where is an element of T and is an element of . Then either for some k, or else there exists satisfying for some k.

We have explicitly written the shadows as dependent on the causal relation, since the latter is viewed as the gravitational field; thus, the fact that we use the gravitational field in defining what we mean by a shadow is crucial, in that we use those same shadows in evaluating whether or not a given gravitational history is allowed or forbidden.

The other thing we have to define is the restriction of the fields to and its shadow. The difficulty is that vector, spinor and gravitational fields are defined in terms of pairs of points rather than single points. We do that as follows:

DEFINITION: Let be some subset of . If is some causal relation on , then is the set of all causal relations such that for all , and for any point on the manifold, regardless of whether is an element of or not.

DEFINITION: Let be some subset of . If is some real-valued function on the set of pairs of points of , then is the set of all other functions such that for all , and for any point on the manifold, regardless of whether is an element of or not.

THEOREM: If then

PROOF: If then for any , . This is equivalent to which by definition means .

THEOREM: If then

PROOF: Suppose . Then for any , for all . But since , we also know that for any , for all . Thus, putting these two together, we get for all q. Thus we have shown that , which means that whenever . But from the previous theorem we know that, since , we have . Thus, we can interchange and in the previous statement and say . Putting these results together, we have

Thus, a causal set version of A will be

For the bosonic case, we have

(31) |

Now if we include fermions we will get

(32) |

Here it is understood that, while formally depends on all the fields, it actually only depends on the ones that are part of the definition of the fermion. It is simply that, since both fermionic and bosonic fields are now defined in terms of functions on both and , I have formally written as depending on all of them in order to save space.

We can likewise define a "discretized shadow" on a causal set. Of course, a causal set is discrete to start with. So we will replace the word "discretized" with the words "coarse grained", where "coarse graining" is a well known operation on a causal set that involves the selection of a subset of points that is "dense" enough to approximate any other point of the causal set.
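Coarse graining as just described can be sketched as random subsampling with the induced order. This is a minimal illustration of the operation, not the constructions of the references:

```python
# Hedged sketch: keep each element of the causal set independently with
# probability p, and restrict the causal relation to the survivors.
import random

def coarse_grain(elements, relation, p, seed=0):
    rng = random.Random(seed)
    kept = {e for e in elements if rng.random() < p}
    induced = {(a, b) for (a, b) in relation if a in kept and b in kept}
    return kept, induced

elems = list(range(6))
rel = {(i, j) for i in elems for j in elems if i < j}  # a total order as a toy
kept, induced = coarse_grain(elems, rel, 0.5)
# The induced relation only involves surviving elements and is still a
# partial order (restriction of a partial order is a partial order).
assert all(a in kept and b in kept for (a, b) in induced)
```

The density conditions in the definitions below then select which such subsamples count as acceptable coarse grainings.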

DEFINITION: Let be a subset of a causal set . Its past -subset is the union of its past -shadows corresponding to all

DEFINITION: Let , , and be some subsets of a causal set , and suppose . We say that is the coarse-grained past -shadow of and is the coarse-grained past -lattice based on if the following statements are true:

a) Every element of is causally after at least one element of the past -shadow of and causally before at least one element of

b) Every single element of the past -shadow of is to the past of at least one element of . Likewise, every single element of is to the future of at least one element of

c) If the number of points contained in the past -submanifold of is , then the number of elements of is less than and greater than

d) If and are two elements of , and the number of elements of the causal set that are to the future of and to the past of is greater than (this places a lower bound on the volume of the space bounded by the two lightcones, and hence on the Lorentzian distance), then there are at least points that are to the future of and to the past of (where my definition of past and future includes the requirement that they are timelike-separated from and )

e) If , then there exists that is to the causal past of such that and are connected by at least one future-directed chain with fewer than points (i.e., an upper bound on the Lorentzian distance), but at the same time we have MORE than points that are part of the shadow of and are both to the causal future of and to the causal past of . This criterion assures that the reference frame defined by the coarse-grained shadow approximates the one defined by the actual shadow, and rules out the issue of lightcone singularities.

We will likewise introduce a definition of the "future" coarse-grained shadow:

DEFINITION: Let be a subset of a causal set . Its future -subset is the union of its future -shadows corresponding to all

DEFINITION: Let , , and be some subsets of a causal set , and suppose . We say that is the coarse-grained future -shadow of and is the coarse-grained future -lattice based on if the following statements are true:

a) Every element of is causally after at least one element of the future -shadow of and causally before at least one element of

b) Every single element of the future -shadow of is to the future of at least one element of . Likewise, every single element of is to the past of at least one element of

c) If the number of points contained in the future -submanifold of is , then the number of elements of is less than and greater than

d) If and are two elements of , and the number of elements of the causal set that are to the past of and to the future of is greater than (this places a lower bound on the volume of the space bounded by the two lightcones, and hence on the Lorentzian distance), then there are at least points that are to the past of and to the future of (where my definition of past and future includes the requirement that they are timelike-separated from and )

e) If , then there exists that is to the causal future of such that and are connected by at least one future-directed chain with fewer than points (i.e., an upper bound on the Lorentzian distance), but at the same time we have MORE than points that are part of the shadow of and are both to the causal past of and to the causal future of . This criterion assures that the reference frame defined by the coarse-grained shadow approximates the one defined by the actual shadow, and rules out the issue of lightcone singularities.

Now we apply the same tricks to introduce versions A, B and C of constraint 1 for causal sets, which we denote 1* to distinguish them from the manifold case:

CONSTRAINT 1* VERSION A: Let be a causal set. If we define the measure to be integer-based, defined in terms of a simple count of the relevant multiplets of points, then the following statement is true: For every number , if we have points ,

(33) |

CONSTRAINT 1* VERSION B: Let be a causal set. If we define the measure to be integer-based, defined in terms of a simple count of the relevant multiplets of points, then the following statement is true: For every number , if we have points ,

(34) |

CONSTRAINT 1* VERSION C: Let be a causal set. If we define the measure to be integer-based, defined in terms of a simple count of the relevant multiplets of points, then the following statement is true: For every number , if we have points ,