Barry Salt: Films and Graphs

It would be nice if 'dramatic tension' could be quantified, so that we could get on with the important job of analysing its relation to all the visible features of movies. It IS something that is conceivably possible, but far away at present. In the meantime, we are just messing about with what little can be got from measuring the shot lengths in films. Nowadays, the only measurement of shot lengths that matters is in hours, minutes, seconds and frames. In the Cinemetrics system, the number of frames is reduced to a duration measured in tenths of a second, but this involves an irregular rounding up and down process from the original 24 frames per second at which motion pictures are still shot.
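The irregularity of the conversion from frames to tenths of a second can be seen with a few lines of arithmetic. This is only a sketch: the rounding rule (round half up) and the function names are assumptions made for illustration, not a description of how Cinemetrics actually does it.

```python
FPS = 24  # frames per second, as stated in the text


def frames_to_tenths(frames):
    """Convert a frame count to tenths of a second, rounding half up (assumed rule)."""
    # frames / 24 seconds equals frames * 10 / 24 tenths; integers avoid float error
    return (frames * 10 + FPS // 2) // FPS


def frames_per_tenth(max_frames):
    """Group the frame counts 1..max_frames by the tenth of a second they round to."""
    groups = {}
    for frames in range(1, max_frames + 1):
        groups.setdefault(frames_to_tenths(frames), []).append(frames)
    return groups
```

Calling `frames_per_tenth(12)` puts frames 2 and 3 into the same tenth, but frames 6, 7 and 8 into another, so some tenths absorb two frame counts and others three. That unevenness is the reason for sometimes going back to frames.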

Mostly this does not matter, but sometimes it is better to return to using the original measurement of shot lengths in terms of their length in film frames. There are also some minor problems in measuring the actual length of a shot which depend on the type of transition used by the film-makers between one shot and the next. The majority of shot transitions are straight cuts, which are unproblematic. But

Using the Lines

Once the lengths of the shots in a film have been measured, they are represented in the first place by a string of numbers. One can look along this string of numbers and spot any interesting regularities, but it is sometimes easier to see these from a graph in which the lengths of vertical lines are proportional to the shot lengths. It is also easier to get the

As always, what is of most interest is the relation between the content of the film and its form. Thirty years ago I started investigating how the cutting rate within a film scene varied depending on the type of action within it. This is quite easy to show, and does not require any mathematics beyond counting and averaging. You can see it in all editions of my Film Style and Technology:

The other thing one can study is the shot length frequency distribution for a film. The most basic way of doing this requires no more mathematics than simply the counting of the number of shots in the film whose lengths fall within a series of fixed intervals -- say from zero to one second, greater than one second to two seconds, and so on -- and then drawing a bar chart

Figure 1

It so happens that a class interval of one second works quite well for films with an ASL in the region from about 5 seconds to about 12 seconds, which is where the vast majority of films made from the 'thirties into the 'fifties dwell. So for Casablanca I used a class interval of one second.

The first row interval, numbered 1, has the total number of shots (which is 23), with lengths between zero seconds and up to and including 1 second. That is, it includes lengths 0.1 seconds, 0.2 seconds, 0.3 seconds, and so on. The second bin or class interval contains a count of the number of shots greater than one second, and up to and including shots with exactly two seconds length. That

Figure 2

(The 'More' column at the right end of the graph represents the total number of shots with lengths greater than 50 seconds.)
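The counting described above, with each class interval open at the bottom and closed at the top and a final 'More' class for everything beyond 50 seconds, might be sketched as follows. The function name and its parameters are hypothetical, and the lengths are assumed to be given in seconds.

```python
import math


def shot_length_histogram(lengths, width=1.0, cutoff=50.0):
    """Count shots in class intervals (0, w], (w, 2w], ... up to cutoff.

    Returns (counts, more): counts[i] is the number of shots with
    i*width < length <= (i+1)*width, and more counts shots longer than cutoff.
    """
    n_bins = math.ceil(cutoff / width)
    counts = [0] * n_bins
    more = 0
    for length in lengths:
        if length > cutoff:
            more += 1
        else:
            # ceil puts a shot of exactly 2.0 seconds into the (1, 2] class
            index = max(math.ceil(length / width) - 1, 0)
            counts[index] += 1
    return counts, more
```

With `width=2.0` the same function gives the two second class intervals. For very narrow intervals it is safer to work in whole frames, since floating-point division can misplace a shot that sits exactly on a class boundary.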

This graph is a bit jagged and lumpy, and we can smooth this out by using a class interval of width two seconds, as in the following graph.

Figure 3

For films with a really short ASL, of the kind that have emerged in recent decades, things are slightly different. Take Shoot 'em up (2007), with an ASL of 1.64 seconds.

Figure 4

This distribution has a smooth profile, but it is not very informative, with 1484 shots, almost half the total, in the first interval containing those shots between 0 and 1 second. If we change the time measurement from seconds to frames, and make the class interval 8 frames (or a third of a second) wide, it looks like this:

Figure 5
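A distribution of this kind can be tabulated directly from shot lengths held as whole numbers of frames. The sketch below assumes 24 frames per second and labels each class by its upper edge; the function names are made up for the illustration.

```python
FPS = 24  # frames per second, as stated in the text


def seconds_to_frames(seconds):
    """Convert a length in seconds to the nearest whole number of frames."""
    return round(seconds * FPS)


def frame_histogram(frame_lengths, width=8):
    """Count shots in frame classes (0, 8], (8, 16], ... keyed by upper edge."""
    counts = {}
    for frames in frame_lengths:
        upper = -(-frames // width) * width  # round up to a multiple of width
        counts[upper] = counts.get(upper, 0) + 1
    return dict(sorted(counts.items()))


def modal_class(counts):
    """Return the upper edge of the class containing the most shots."""
    return max(counts, key=counts.get)
```

Setting `width=1` gives the frame-by-frame distribution, and `modal_class` picks out the tallest column.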

As well as providing more precise information about how many shots there are with each length, the shape, though equally smooth, shows the dive from the mode at 16 frames to the origin, which is characteristic of film shot length frequency distributions. Now if we go for broke, and decrease the class interval to one frame, we get:

Figure 6

This graph has a much more jagged shape than the previous one, but it also shows what was not visible before, which is the way the distribution dives towards the origin. This can be made clearer with an enlarged view of the beginning of this graph from zero up to nine frames.

Figure 7

I have added the theoretical values derived from the Lognormal distribution that best fits the actual distribution of shot lengths for this film. Although the fit is not perfect in this region, you can see the way the actual values show the same sort of approach to the origin as the theoretical curve. The way the curve approaches the origin asymptotically is particularly characteristic of the Lognormal distribution, and it
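Theoretical values of this kind can be computed with nothing beyond the standard library. The text does not say how the best-fitting Lognormal distribution was found; this sketch assumes the usual estimate, taking the mean and standard deviation of the logarithms of the shot lengths, and the function names are hypothetical.

```python
import math


def fit_lognormal(lengths):
    """Estimate mu and sigma as the mean and standard deviation of ln(length)."""
    logs = [math.log(x) for x in lengths]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    """Probability that a lognormal length is at most x; zero at the origin."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def expected_counts(n_shots, edges, mu, sigma):
    """Theoretical number of shots in each class interval (edges[i], edges[i+1]]."""
    return [
        n_shots * (lognormal_cdf(hi, mu, sigma) - lognormal_cdf(lo, mu, sigma))
        for lo, hi in zip(edges, edges[1:])
    ]
```

Passing `edges=[0, 1, 2, ..., 9]` with lengths in frames gives one theoretical value per frame, which can be set beside the actual counts in the region near the origin.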

If one wants to compare the shape of two distributions with closely similar ASLs, one can interleave them on the graph, as in this comparison of Shoot 'em up and Derailed (2002).

Figure 8

The resemblance is very close, which is not surprising since the median for both distributions is 1.04 seconds, though there is a small difference in their ASLs (1.59 seconds for Derailed, and 1.64 seconds for Shoot 'em up).
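Two films can share a median while differing in ASL because the ASL is a mean, and a few long shots pull a mean upwards without moving the median. A small sketch with a hypothetical helper:

```python
from statistics import mean, median


def asl_and_median(lengths):
    """Return (ASL, median) for a list of shot lengths in seconds."""
    return mean(lengths), median(lengths)
```

For the invented lengths `[0.5, 1.0, 1.0, 1.5, 6.0]` the median is 1.0 second but the ASL is 2.0 seconds, the single long shot accounting for the whole difference.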

To put a number on the similarity of the two distributions, we can calculate the Pearson correlation coefficient between them, which is 0.992, indicating how closely they match.
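The text does not say which numbers were correlated. A natural reading, assumed here, is that the two lists are the counts of shots in matching class intervals for the two films. The function below is a plain implementation of the Pearson coefficient, with a hypothetical name.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of counts."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)
```

Both histograms must use the same class intervals, and if the films differ greatly in their number of shots it may be better to convert the counts to percentages first, although the coefficient itself is unaffected by such scaling.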

Going Slower

Looking at the slow end of film cutting rates, this is what we get from graphing Sunset Blvd., which has an ASL of 15.5 seconds, with a one second class interval.

Figure 9

This has a pretty rugged outline, which is not too surprising, given the low number of shots in each interval. So let us try a class interval of two seconds.

Figure 10

That is a little bit better, but more importantly it is starting to get the characteristic shape we expect from film shot length distributions. So let's try an interval of 4 seconds.

Figure 11

I like that shape or profile; it looks like Derailed or Shoot 'em up in their 8 frame class interval incarnation, and also the two second interval graph of Casablanca.
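Going from a one second class interval to two and then four seconds needs no new counting: adjacent columns are simply added together. A sketch, with a hypothetical function name:

```python
def merge_bins(counts, factor):
    """Combine each run of `factor` adjacent class counts into one wider class."""
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]
```

Applied to counts made with a one second interval, `merge_bins(counts, 2)` gives the two second version and `merge_bins(counts, 4)` the four second one. Any 'More' column should be kept apart and carried over unchanged.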

Moving upwards, Panic in the Streets (1950) has an ASL of 24.3 seconds, and in this case its distribution shown with a one second class interval looks like this:

Figure 12

This is a lot more jagged a shape than that for Sunset Blvd., and an obvious idea is to see if the isolated peaks at 17 to 18 seconds, and 34 seconds have any significance. An examination of how the shots of these lengths occur in the film with respect to what is going on in them shows no relation to the lengths of the lines of dialogue being spoken in them,

Even when the class intervals are widened to 4 seconds, the distribution does not smooth out that much, unlike Sunset Blvd.:

Figure 13

In other words, as we get towards, and past, an ASL of 20 seconds, we are in a region where any close resemblance to any standard probability distribution is vanishing, as I have often said. To rub this point in, I recall the distribution for La signora senza camelie (1953), which has an ASL of 59.4 seconds, and is shown here with a class interval of ten seconds:

Figure 14

Even with such a large interval, the usual characteristic shape is only vaguely there, and the distribution is quite jagged.

Looking For Significance

My search for a reason for the excessive number of shots with lengths of 17 to 18 seconds in Panic in the Streets failed, but one failure is not enough to stop looking for reasons for such peaks in shot length distributions. Perhaps in musicals there might be specially favoured lengths in the way that musical numbers are edited, related to the regular structure of the pieces of popular music being

My first candidate is Good News (1947). The shot length distribution for this film with 2 second class intervals is:

Figure 15

The class of shots with lengths of 39 or 40 seconds sticks out, but looking down the table of lengths for this film, and seeing where they occur in the film, shows that they nearly all appear separately, and mostly in dialogue scenes. (I find that the best way to study this matter is to use a table of the shot lengths written down in order in a spreadsheet in

Another try with Singin' in the Rain (1952) failed in the same way, and indeed when simply playing a DVD of the film it is apparent that the editor was again not cutting the musical numbers in any rigid way at the end of each chorus, or the end of each phrase, or in even multiples of the bar length. However, a final try with Anchors Aweigh (1945) did just slightly

Figure 16

The column representing shots of length between 14 and 15 seconds looks a bit too big, and indeed from shot 421 to shot 426 there occurs the following sequence of lengths:

14.1, 13.5, 27.3, 14.2, 13.5, 14.4.
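Finding where the shots of a suspiciously popular length fall in the running order is a matter of scanning the table of lengths. The sketch below uses hypothetical function names and only the six lengths just quoted.

```python
def shots_in_class(lengths, lower, upper, first_shot=1):
    """Return the shot numbers whose lengths satisfy lower < length <= upper."""
    return [
        first_shot + i
        for i, length in enumerate(lengths)
        if lower < length <= upper
    ]


def gaps_between(shot_numbers):
    """Distance in shots from each listed shot to the next one listed."""
    return [b - a for a, b in zip(shot_numbers, shot_numbers[1:])]


run = [14.1, 13.5, 27.3, 14.2, 13.5, 14.4]  # shots 421 to 426, as quoted above
in_peak = shots_in_class(run, 14, 15, first_shot=421)  # [421, 424, 426]
spacing = gaps_between(in_peak)  # [3, 2]
```

Small gaps mean that the shots in the peak class are bunched together in one scene rather than scattered through the film, which is the cue to go and look at what is happening on screen at that point.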

But these shots cover a conversation in the studio cafeteria between Gene Kelly and Kathryn Grayson, not a song. However, after failing to find what I was looking for in about a score of musical numbers in all three films, I got lucky with just one song in Anchors Aweigh, which is the song "The Charm of You" sung by Frank Sinatra to Pamela Britton in a Mexican-type restaurant. This runs

26.7, 6.3, 5.7, 6, 6.8, 6.6, 7.5, 6.7, 12.1, 8.6, 23.9.

The first shot covers the whole 32 bars of the first chorus of the song, then the subsequent shots go to different angles on the pair of lovers for each subsequent 8 bar line over two more choruses, and then the shots lengthen out to cover the last chorus with three shots, which depart from the 8 bar pattern. If you check with the Cutting data in the Cinemetrics database, you

It is quite possible that if one examines musical numbers in recent decades, where the kind of pop music used is very different to that of the 'forties, and the cutting is much faster, then one might find a greater regularity in the cutting of such numbers. The observations about a scene in Wedding Crashers (2005) on page 438 of the paper Attention and the Evolution of Hollywood Film, published in

The moral of my story is that you can do plenty, and see what is going on, by staying close to the data, without gussying it up with excessive manipulation.

Barry Salt, 2013