CAUTION: this post should only be read by those with a deep and abiding interest in the minutiae of obtaining data, which may only be me. Most of it is common sense delivered in an unnecessarily verbose manner. You have been warned.
I want to describe a failed effort because it illustrates a particular sort of error that has come up before in this thread and elsewhere. While I'm using one of my methods as an example, some of the points apply equally well to smearograms.
This was my second, and most recent, attempt to get vertical-only data from the NW roofline using a method that is not best suited for this task. The first attempt failed because the chosen threshold values didn't correspond well once descent began, so the acquired point drifted a couple of pixels from the roofline into the sky. The method worked very well for the horizontal data, which matched the NIST data more closely than I would have guessed. There are specific reasons why the horizontal was good and these verticals were bad. Knowing why helps instill confidence that this process is not some mumbo-jumbo that is randomly correct from time to time, and leads to correct data in the future.
In this last run, the problem is horizontal motion affecting the vertical measurement on the roofline edge. The problem occurs when a measurement is being made in one dimension and motion in the other dimension bleeds over. Generally, the (x, y) coordinates of an image don't correspond to real (x, y, or z) coordinates and orthogonality in real 3-space is not preserved in the image plane. There will be at most one image axis that can be aligned with a real axis, and apparent motion in the other image axis will be a combination of motion in two or even all three real axes. In a more technical sense, an image represents the output of a many-to-one mapping from R3 to I2 such that orthogonal vectors in the real scene correspond to non-orthogonal vectors in the image plane. There's a simple name for this type of mapping, but it eludes me. Well, it's a projection, in any case.
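For anyone who wants to see the non-orthogonality in numbers, here is a minimal sketch. It assumes an idealized orthographic camera (no perspective), and the azimuth and elevation angles are made-up illustration values, not measurements of any WTC7 camera position. The function name `image_of` is hypothetical.

```python
import math

def image_of(vec, azimuth_deg, elevation_deg):
    """Orthographic projection of a real (x, y, z) vector onto image (u, v).

    Assumption: the camera is level (no roll), pointed along the given
    azimuth and tilted up by the given elevation. Perspective is ignored.
    """
    a = math.radians(azimuth_deg)
    e = math.radians(elevation_deg)
    # Unit vectors spanning the image plane, expressed in real coordinates.
    right = (-math.sin(a), math.cos(a), 0.0)
    up = (-math.sin(e) * math.cos(a), -math.sin(e) * math.sin(a), math.cos(e))
    u = sum(p * q for p, q in zip(vec, right))
    v = sum(p * q for p, q in zip(vec, up))
    return (u, v)

AZ, EL = 40.0, 10.0  # hypothetical viewing angles, for illustration only

# Two orthogonal real directions: one horizontal, one vertical.
horizontal = image_of((1.0, 0.0, 0.0), AZ, EL)
vertical = image_of((0.0, 0.0, 1.0), AZ, EL)

# A nonzero dot product means the two image vectors are not orthogonal.
dot = horizontal[0] * vertical[0] + horizontal[1] * vertical[1]

# Apparent tilt of the real horizontal line relative to the image x axis.
slope_deg = math.degrees(math.atan(horizontal[1] / horizontal[0]))

# Many-to-one: displacement along the viewing direction leaves (u, v) unchanged.
a, e = math.radians(AZ), math.radians(EL)
forward = (math.cos(e) * math.cos(a), math.cos(e) * math.sin(a), math.sin(e))
same_pixel = image_of(forward, AZ, EL)

print("image of real vertical:  ", vertical)
print("image of real horizontal:", horizontal)
print("dot product in image:    ", dot)
print("apparent roofline tilt (deg):", slope_deg)
```

In this toy setup the real vertical lands exactly on the image vertical, while the real horizontal picks up a vertical component that grows with the elevation angle. That is the same qualitative situation as described here, though the actual numbers depend on the real camera geometry.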
In the WTC7 videos, true vertical aligns very well with image vertical, a fortunate condition when combined with a relatively small optical axis elevation angle. It means that, at least for moderate displacements, approximate vertical motion can be derived from pixel motion by application of a multiplicative scale factor. At the NW corner roofline, true horizontal comes through in the image as a small but not insignificant angle:

Below, I've exaggerated the vertical by 5x and drawn a straight green line to represent the average apparent slope of the roofline. Yellow lines are drawn vertically to indicate where a couple of one-pixel smearograms or a 2D range of pixels could slice through this area:

The magenta points of intersection represent vertical locations obtained via smearogram and they're obviously at different pixel y values for the intact, static building. Since only differences are used, this is OK and, if the motion is strictly vertical, it will be OK all the way. However, if there's any horizontal motion, the points of intersection will change, registering higher or lower values depending on whether the motion is to the left (shown below) or right. Yes, I cheated and moved the lines instead of the image:

The error can be of the order of the signal if there's more horizontal motion than vertical, and all error if only horizontal. A horizontal measurement, however, is not affected if the building drops a little:
(A slightly different presentation to avoid collinearity)
This is why the horizontal data would've been good even if the building dropped or sagged a little (it did in the first milliseconds of descent but I haven't validated that data - it doesn't match NIST - and there are other reasons it could be bad). Not to mention the corner edge is a nice dark line and the roofline edge is ragged and riddled with compression artifacts.
A 2D method is likewise affected by this no matter if it's tracking an edge which cuts through the region or a fully enclosed feature. The edge gets shifted up and down as the feature moves side to side, but it's also exacerbated by any non-uniformities in the edge, something that wouldn't affect an enclosed feature. This is what bit me on this last run.
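The bleed-over for a straight edge sampled at a fixed column or row can be put into a few lines. This is only a sketch of the geometry, assuming a perfectly straight edge that translates rigidly in the image; the slope of 0.1 is a made-up value, not the measured roofline slope, and the function names are hypothetical.

```python
def apparent_vertical_shift(dx_px, dy_px, roofline_slope):
    """Change in the y value where a fixed image column crosses a sloped edge.

    The edge is modeled as y = y0 + slope * (x - x0). Translating it by
    (dx_px, dy_px) and reading it at the same column gives dy - slope * dx.
    """
    return dy_px - roofline_slope * dx_px

def apparent_horizontal_shift(dx_px, dy_px, edge_lean=0.0):
    """Change in the x value where a fixed image row crosses a near-vertical edge.

    The edge is modeled as x = x0 + lean * (y - y0). A truly vertical edge
    has lean 0, so a vertical drop does not register at all.
    """
    return dx_px - edge_lean * dy_px

SLOPE = 0.1  # hypothetical apparent roofline slope (pixels of y per pixel of x)

# Pure vertical motion: the reading is the true motion.
print(apparent_vertical_shift(0.0, 1.5, SLOPE))

# Pure horizontal motion: the vertical reading is all error.
print(apparent_vertical_shift(2.0, 0.0, SLOPE))

# Corner edge, building drops 3 px while moving 0.5 px sideways:
# the horizontal reading is unaffected by the drop.
print(apparent_horizontal_shift(0.5, 3.0))
```

The asymmetry between the two functions is the whole point: the roofline has a nonzero slope in the image, the corner edge essentially does not.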
To get verification, I always output an animation from the extraction showing the placement of the point on the feature through the frames examined. If my eyes don't agree, the data is discarded. If I can't do any better visually than the output, I accept it to within the accuracy of eyes - a couple of pixels. So, if I state an error of +/-2 pixels, it's safe but the data could be accurate to +/- 0.1 pixels in actuality. Failures always have a reason; whether I can be bothered to trace it depends on whether it will be informative to find out why. Sometimes it's obvious, as it is here in the first 30 frames:

The horizontal wandering causes artificial vertical motion to be registered.
Real horizontal motion isn't the primary source of error here but undoubtedly does contribute. Here, virtual horizontal motion is induced by fluctuations in the video image, making different locations of the roofline more strongly associated with the threshold criteria, so the point wanders horizontally along the roofline. Observe that the point traverses a path largely confined to the apparent slope of the roofline. The first second of vertical data looks like this:

The peak to peak motion is 0.6px ~ 4 inches, but I know some of it is bad and I don't know how much. Astute readers will likely also observe that a snap upward of 4 inches in two frames is mighty unlikely. The remedy is the subject of another rambling post.
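This is not the remedy, just a rough diagnostic sketch for spotting this kind of failure without relying on eyeballs alone: fit a straight line to the tracked (x, y) positions and see whether the path hugs the roofline slope. The track below is synthetic, the slope of 0.1 is a made-up value, and the function name is hypothetical. A fitted slope close to the apparent roofline slope with a correlation near 1 suggests the point is sliding along the edge, although genuine motion in that same direction would look identical, so it's a flag and not proof.

```python
import math

def track_slope_and_correlation(xs, ys):
    """Least-squares slope of y on x, plus the Pearson correlation of the track."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0 or syy == 0.0:
        # No spread in x or y: slope and correlation are not meaningful.
        return (0.0, 0.0)
    return (sxy / sxx, sxy / math.sqrt(sxx * syy))

# Synthetic track: a point wandering back and forth along an edge of slope 0.1.
ROOFLINE_SLOPE = 0.1  # hypothetical
xs = [0.0, 0.8, -0.5, 1.6, 2.4, 1.1, 3.0, 2.2]
ys = [ROOFLINE_SLOPE * x for x in xs]

slope, r = track_slope_and_correlation(xs, ys)
print("fitted slope:", slope)
print("correlation: ", r)
```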