Video Cram (Compression)

-(improving on SPIHT-Haar)- we note pixel-quad x- and y-tilt tend together, resolution is order-dependent, and unsent pixel-bits make lossy-video, maybe with significant compressible redundancy (*)

[Topically related to Progressive Image Resolution; and Fully Interleaved Scanning]

* (The Haar Function was defined as one-dimensional, but we recognize the obvious two-dimensional extension, better known as the H-Transform in astronomy, which however tends to be lossy by design, compounding precision of quad-summations ... We also note that triangular-arrayed pixels would be better suited for triads than squares are for quads: only the triad-average and two subdifferences would be measured, no next-order twist;- but they would also involve more-than-orthogonal-quadrature integer calculations....)

Considering—

The 'ideal' bandwidth-constrained video front-end would have stacked-3-color pixels atop instantaneous sum-and-difference transform-processing and successive-approximation (top-down) bit-slice compression-transmission, so that picture-motion itself would be realtime, with lossless definition....

Various approximations may suffice for technological applications: quad-adjacent RGBG-pixels (or RGBY), residue-retention at the compression-transmission level, 3D-and-motion-estimation at the picture-level (top-pixel-level), pixel-compression by subdivision-partitioning (rather than omnidirectional), adjacent-value-prediction, etc....

CONTEMPORARY BACKGROUND: (digital image development)

HDDV-HDTV digital video has been deemed comparable to 35mm celluloid film, but suffers various fixed-grain pixel artifacts:

  • off-registration washout adjacent to on-registration sharpness; jagged stairstepped edges; digital-stepped 'dove-walking' wherein objects are very apparently 'jogging' in pixel-size steps and 'breathing' fading in-and-out as image-pixel-registration is crossed...
  • maximum definition-resolution is about 1-arcmin. (which is about 1/5² of fovea-resolution at 12-arcsec.)...
  • temporal correlation is very poor at the pixel-level as the camera moves (which is necessary for subpixel-interleaving fill-in): it would seem better to not process for temporal correlation at the pixel-level...
  • luminance tends multiplicative by its light source over large areas, more than additive as occurs along boundary edges: it would seem better to use a logarithmic-like intensity value of pixels (compare also angular-lighting of round edges)...
  • color-luminance is actually green: red and blue appear luminant because they sit partway in ocular-green-receptivity...
  • implementation suffers theory-artifacting: wide pixels are approximately square-functions of a scene where in natural fact atomic-point-functions would be better resolutioned by like-implementing in the camera (non-multilenticular, moving)....

PARTIAL REMEDIES - SESQUI-PIXEL: resolution in drawn graphics ("offset-Nyquist")

A partial remedy is to take all images in double-resolution and compute on-registration and off-registration as alternate images with equal resolution-- though it still results in double-vision with alternating pixel-widths, breathing, artifacts....

A better approach slides off-registration a quarter-pixel, thickening pixel lines equally, relatively partially-filling adjacent pixels-- which, as a simple 'averaging-mechanism', resolves twice-as-many pixels, thinner-than-double-width, somewhat-thicker-than-single-width.

For cgi computer-generated-imagery, a different criterion improves HDDV by taking advantage of "digital smoothing,"- a specialized concept better equalizing HDDV to 35mm, its touted film-equivalent, by spreading each single pixel onto an adjacent pixel, to a "digital quarter step": The base value of this method is that the smoothest-moving line of constant width by adjacent-pixel amplitudes (1,x) and (x,1) is about x ~ 0.60 (*), cf 153/255 ... Its successive offset half-steps appear equal; its apparent line thickness is roughly a sesquipixel, a half more than single-pixel, but half as jumpy or discontinuous ("half-moon-jogging"), and it still contains a hint of fine resolution and motion, and is not as smeared -nor 'breathing'- as alternately straddling pixels, which occur as the extreme in general pixel sampling ...

* (Display Gamma adjusts this, as well as room-brightness, color sensitivities; and vertical and horizontal differ slightly by trace-overlap, and RGB/RGBG pixel placement, yet both are close about the median. Tolerance is apparently tight as unevenness is noticeable at ±10%, in either case: a third, of the web-standard six-cubed 8-bit color-scheme quantum of document-browsers.)

By comparison, on-pixel alignment exhibits alternating-thickness 'breathing': where lines alternately cover one and two pixels, the half-bright double-wide lines are single-width-equivalently bright at about ~0.70, cf 179/255, with fine-detail washout.

(Appraising the two results together, pixel-system-gamma is 2.00, or that is, the original-receptor pixel-system-gamma is 0.50, square root, equivalencing pixels as independent, orthonormalized vectors:-- A "digital box" pixel, uniform, slid to the halfway position, needs 0.50 = 0.71², as in the second result; Slid to the quarter position, needs 0.25/0.75 = 0.58², as in the first result. The 2-D sesquipixel roughly equivalences to spreading each original-definition pixel to [0.75 | 0.43 | 0.43 | 0.25], added gamma-correctly to the x-,y-,xy-adjacent pixels; and thence moving fine half-steps horizontally, vertically, diagonally, by column and row alternations.)

[2012 Note: On newer flat-panel screens, 136/255 sesquipixel, 162/255 pixelwash, looks almost smoothest: gamma may be 1.0]
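
The coverage-to-amplitude arithmetic above can be checked numerically; a minimal sketch, assuming the ideal display gamma of 2.0 discussed above (drive amplitude = square root of light coverage; helper names illustrative):

    # Sesquipixel arithmetic sketch, assuming display gamma 2.0:
    # drive amplitude for a fractional light coverage is sqrt(coverage).
    GAMMA = 2.0

    def amplitude(coverage, full=1.0):
        """Relative drive amplitude (0..1) producing 'coverage' of 'full' light."""
        return (coverage / full) ** (1.0 / GAMMA)

    # Quarter-pixel slide: the line covers 3/4 of the main pixel, 1/4 of the next;
    # adjacent amplitude relative to the main pixel:
    x_quarter = amplitude(0.25, 0.75)   # sqrt(1/3) ~ 0.58, cf the ~0.60 (153/255) above
    # Half-pixel slide (straddling): both pixels at half coverage:
    x_half = amplitude(0.50, 1.00)      # sqrt(1/2) ~ 0.71, cf the ~0.70 (179/255) above

    print(round(x_quarter, 2), round(255 * x_quarter))   # 0.58, 147
    print(round(x_half, 2), round(255 * x_half))         # 0.71, 180

The remaining gap to the observed 153/255 and 179/255 is within the ±10% tolerance noted above.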

HYPERBOLIC SPATIAL-INTENSITY GAMMA: (despeckling the image)

  • Indexed-color images are known for speckly posterized faces: what should be smooth across single-quantum color-increments is instead stepped, jumpy, resulting in apparent speckles.... One solution is to resolve any single-quantum step between adjacent pixels as widespread-average grading: not a step but a tail crossing the dither, -adding its value over the range to the next occurrence,- ... thus entirely removing the speckle while also sloping the smooth surface to the next step, which may or may not be a despeckled single-quantum. Hardware-technically, speckling is removed by least-significant-quantum LSQ smoothing: hyperbolic spatial-gamma ... double-quantum steps would also be smoothed for gamma, but their spatial spread is single-pixel-width narrow (whence the full hyperbola). (See the sketch after this list.)
  • Webpage-image-generation software then needs to support this smoothing by avoiding representing sharp edges as single-quantum steps.
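
A minimal 1-D sketch of this least-significant-quantum smoothing, assuming the renderer can hold fractional (higher-precision) output values; only single-quantum steps are spread into ramps, larger steps being kept as real edges:

    def despeckle_1d(levels):
        """Spread single-quantum steps between flat runs into gentle ramps."""
        out = [float(v) for v in levels]
        i = 0
        while i < len(levels):
            j = i
            while j + 1 < len(levels) and levels[j + 1] == levels[i]:
                j += 1                                    # [i..j] is a maximal flat run
            left = levels[i - 1] if i > 0 else levels[i]
            right = levels[j + 1] if j + 1 < len(levels) else levels[j]
            # smooth only single-quantum steps; larger steps are real edges
            dl = levels[i] - left if abs(levels[i] - left) == 1 else 0
            dr = right - levels[j] if abs(right - levels[j]) == 1 else 0
            a = levels[i] - dl / 2.0                      # boundary value toward the left run
            b = levels[j] + dr / 2.0                      # boundary value toward the right run
            n = j - i + 1
            for k in range(n):
                out[i + k] = a + (b - a) * (k + 0.5) / n  # linear ramp across the run
            i = j + 1
        return out

    print(despeckle_1d([0, 0, 1, 1, 2, 2]))   # a monotone ramp, no visible quantum steps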


    [under reconstruction]


    Another approach samples 4x8- or 16x16-subpixels in near-golden-ratio-interleaved order, to be displayed pointwise-subpixelwise... (4x8 uses 1,3-steps, fully correlatively prime, and 16x16 uses 5,9-steps, pushing nearer the middle each step; the golden ratio ensures that successive steps fill-in with the same ratio and also tend to fill nearer more-previous points sooner than the more-preceding).
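
    The interleaving can be sketched generically; a minimal sketch, assuming a single near-golden stride coprime to the block size (the 1,3- and 5,9-step pairings above are the per-axis variant; names illustrative):

        # Interleaved subpixel scan order via a stride coprime to the block size,
        # chosen near the golden-ratio fraction so each new sample lands far from
        # recent ones and the block fills in evenly at every stage.
        from math import gcd

        PHI = (1 + 5 ** 0.5) / 2

        def scan_order(w, h):
            n = w * h
            step = round(n / PHI)
            while gcd(step, n) != 1:                # nudge to the nearest coprime stride
                step += 1
            return [((k * step) % n % w, (k * step) % n // w) for k in range(n)]

        order = scan_order(16, 16)                  # 256 cells, stride 159 ~ 256/phi
        assert len(set(order)) == 256               # every subpixel visited exactly once
        print(order[:4])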

    SPIHT compression vs. the new world:

    SPIHT emits zeros for x- and y-tilts until either's top bit is nonzero, but x-y angle correlates the top bits over 2/π ... it would seem better to put out a single zero for both until one flags nonzero-'begin' ...

    At the fixed-bandwidth lower limit, as for video, SPIHT suffers resolution-waffling, where part of the picture has an extra bit, part does not, and part between varies ... it would seem better to time-interleave the quad-selection processing order ...

  • SPIHT uses a list of "found" coefficients,- a rudimentary compression putting similarly compressible pixels nearer in the list ... but arithmetic coding does not need this: it needs only to know the compressibility for each pixel ... thus possibly eliminating the pregrouping and listing, in favor of flagging visited coefficients in situ.
  • SPIHT-Haar computes sum-and-differentials, when that is redundant: larger differential implies larger sum and may reduce to base and differentials for fewer bits into compression;-- differences would be added-back by half to the base to estimate the average.
  • SPIHT-Haar progression is by quad-processing of summations, leaving differentials intact; but the contribution to the bit-total by the one summation is smallish, spread over four ... furthermore the purpose in computing the next-higher quad is the commonality expected, especially in magnified objects ... Therefore it may be effectual to use each higher-quad differential too: to estimate its corresponding finer subdifferentials where similar or of same sign; and sum-and-difference pair-chaining is capable of full zeroing, a complete two-legged-ladder process. (NB. Higher-order-differential-chaining is not useful because higher-order differences are short-range unbalanced-precision edges.)
  • SPIHT-Haar x- and y-tilts are cartesian coordinates, where polar would be more efficient: in particular, for any given tilt-slope, the horizontal co-sine-components follow a 'square-root-law', loading both equally to 0.71 at 45° and not separating them by even a full bit until 63.4°, --whence x- and y-tilt tend together, 2:1,-- 8% regainable efficiency. (NB. In the above-suggested bit-per-polar-tilt scheme, vertical lines dominate because horizontals slant in common standing-perspective.) See the sketch after this list.
  • (Contemporary SPIHT-Haar depictive documentation neglects maximal magnitudes, or scaling, in representing large subdifferences, and won't be further discussed here until that algorithm is resolved.)
  • Subvariant SPIHT-waddling alternates x-y-axes: thus does not put an xy-twist-difference in its fourth quad-position: whence it is more realistic compression: It maintains equal precision in all differences but applies more to non-square-rectangular pixelation;- In particular, the alternating coordinate 1.41-stretching also compensates somewhat the coordinate polar-tilt-codependency.
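
    A quick numeric check of the square-root-law figures in the tilt item above:

        import math

        # x- and y-components of a unit tilt at angle theta:
        for theta in (45.0, math.degrees(math.atan(2))):
            x, y = math.cos(math.radians(theta)), math.sin(math.radians(theta))
            print(round(theta, 1), round(x, 2), round(y, 2))
        # 45.0: both 0.71, loaded equally;
        # 63.4: 0.45 vs 0.89 -- only here a full bit (2:1) apart
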
  • DEEP BACKGROUND (raster-scan video, television committee):

    NTSC-defined 4-bit 16-level monochrome reached 7-bit significance with angular density and temporal noise dither: range, amplitude, smoothing. Color used lower frequency information in two more bands, Y*-R, Y*-B orthogonal, at the expense of partial high-frequency signal in the luminance Y band. And it fit a 6MHz channel, 30 fps × 525 lines × 760 dx (B&W px; color px; vestigial sideband px; pre-PLL FM sound vx; guard band xx); Quantization resolution was less noticeably sensitive on the dark end (**).

    * (bandwidth-reduced Y-luminance)

    ** (signal amplitude inversion catches RF spikes as less-noticeable black streaks instead of white)

    High-SNR signal-noise-ratio cable, satellite, DVD technologies have increased the potential and actual resolution tenfold, signal levels to 4-5 bits (e.g. QAM16/QAM32), pixel quantity 8× (esteemed commercial-35mm-film-equivalent, but film has its own improvements); deriving 6-7 bits from density (dither diminishes as SNR improves, and is inaccessible in most digital coding schemes *, but modulation schemes utilize the noise reduction for signal-correction robustness).

    * (An exception is OQAM64/OQAM128, Offset quad-interstitially compatible to QAM16/QAM32; cutely called, OQAM's shaver.)

    But the technological shift from monotonic-amplitude analog to digital required revised methods of signal error detection-erasure-correction;- especially, digital signal coding required "smoothing-soothing" of code-errors that would otherwise result in irrelevant, pictorially unrelated temporal and spatial optical discompositions that looked more like TV-"ghosting" patching-in overriding channel discontent than TV-"snow" or motion aberrations. Simple 'save' remedies involved stalling (repeating the whole prior image) or spotwise dark-outs (reduced-brightness image retention). But ideal smoothing-soothings were something like reduced-spatial-resolution "blur" and reduced-amplitude-resolution "snow";-- the blur was new and less noticeable than "snow" as its next image would restore detail. This led to the selection of the sum-and-difference transform "blur" and bit-slicing "snow", where the channel could be bandwidth-truncated (as NTSC is bandwidth-fixed) and signal frames would each contain the most significant image-bits filled to the allotment.

    RESOLUTION:

    The nominally ideal video imagery is a faster-than-seen shower of photons averaging to the original picture scene. Television's raster-scan put up an image-average display, stiffly similar, by lines alternating interleaved; and cinematograph's shuttering put up an array of near-simultaneous flashes; both with noticeable flicker, despite television's energy retention at individual pixels (which gave vidicon cameras streaking). An ideal digital-time image would put up pixels in a "pseudorandom" pattern; however, disparate pixel-addressing has little correlation among adjacent samples,- which correlation would be used arithmetically, simply to estimate neighbor pixels, non-adjacent when jumped. Consequently the advancing television technology reverted to cinematograph-like framed "progressive scan",-- though computationally only a large frame-memory is needed to nearly reproduce one from the other.

    Ideally also, photons are not pointwise bunched but faster quantum-refreshed, allowing for 'catching' flicker on the periphery.

    (Nevertheless, Because the usual image viewing brightness photon shower is dense and rapid, pseudorandom works equally well on small scale, spotwise, as for whole images: An equivalent might then be a prime-ratio interleaving raster-scan in pixel groups, approximating golden-ratio area-fill, e.g. 7x5-steps in 16x16-blocks ... retaining some local correlation, a few levels up, and timewise;- and might thus also adapt high-resolution to lower-bandwidth subsampling and non-microlensed pixelation, camera and, receiver: present possibility.)

    The next major application of image resolution is in third-dimensional travel, into the image, as with computer-generated imagery; this gave rise to the Haar approach (the Haar Transform, useful as an approach for characterizing common image-source business): Consider a single pixel of given luminance: travel into its depth requires resolving its subpixels. Haar wavelets do this, appending subdifference Δx,Δy tilts and ΔΔxy saddle-twist, and third-dimension Δt and compound double and triple subdifferences, for motion compression.

    Haar is used in astronomy where telescope lens and receptor systems have equal resolution, adjacent pixels are optically matched and usually spanned by single stars; but other applications, especially computer-generated imagery, e.g. text/html document forms, where images are registered to pixel lines, should better differentiate subpixels directly by the smaller subpixel value and reconstruct the larger remainder ... we'll designate this Haar-0 (zero), though a later scheme, switched compression, shows these are virtually the same.

    COMPRESSION:

    Localized video-differentiation is highly efficient because objects have local features, but integration on partial data and broadcast noise is unsatisfactory, needs to be locally restabilized to keep integration on-path, and is essentially two-dimensional,- which last means it must go right to code. Most improvement schemes do progressive-imaging lowpass "subbanding" followed by highpass detailing. Early schemes tried to be NTSC-compatible by stuffing digital data into the lower-order bits as analog-tending-digital codes, "gray snow", half-quantum-offset keying doubling the NTSC-spec. 16 levels ... tending invisible to analog. In fully digital, non-analog-compatible coding, the Haar transform is progressive on closed pixels, -replacing full summation computations with over-pixel differentials till all coefficients are differentials, and so pyramidally restabilizing each sub-quad ... Haar remains accurate to register adjacent edges between quads while SPIHT performs top-down successive-approximation.

    THE HAAR TRANSFORM: (2-D spatial, still images)

    The progressive effect (video):

    * (SPIHT is usually described as starting at much deeper detail than just one top pixel.)

    DIGITAL PRECISION, REPRESENTATION: 1-D CASE:

    Sum and difference increase the resulting representational precision of data necessary to recover the original by an additional bit per; but only one of the two one-bits is needed, as odd(a+b) = odd(a-b) is redundant information, conveniently in the difference coefficient, truncating the sum to average, at the original precision, so that progression likewise stays at the original precision up the pyramid. At reconstruction, the odd(a+b) bit is recovered from odd(a-b). Haar leaves a trail of differences increased by one bit, unto the top average; and recovery is progressive and successive to lossless.
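
    A minimal sketch of this 1-D bookkeeping, with the truncated floor-average carrying the original precision and the difference carrying the extra bit (helper names illustrative):

        def haar_pair(a, b):
            """Forward: floor-average (original precision) plus full difference."""
            return (a + b) >> 1, a - b          # the average drops odd(a+b)

        def unhaar_pair(s, d):
            """Inverse: the dropped bit of a+b is odd(d), since odd(a+b) = odd(a-b)."""
            a = s + ((d + (d & 1)) >> 1)        # a = ((2s + odd(d)) + d) / 2
            return a, a - d

        for a in range(16):
            for b in range(16):
                assert unhaar_pair(*haar_pair(a, b)) == (a, b)   # lossless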

    DIGITAL PRECISION, REPRESENTATION: 2-D CASE:

    Four-way sum-and-differentials increase the resulting representational precision by two additional bits per, but can be truncated a bit: the H-transform was defined without truncation, but all four LSB's are equal, and (a+b+c+d) = (a+b-c-d) + (a-b+c-d) - (a-b-c+d) mod 4, as -3d = d mod 4; and so also the 2-D Haar sum-average can be maintained at constant precision up its pyramid: p-total 2-bit-truncated at the original precision, p-tilts at one bit more, and p-twist holds the full two-bits-more precision; reconstruction simplifies to p-total-LSB#1 = oddsum(p-tilts-LSB#1's, p-twist-LSB#1). Haar leaves a slightly uneven field of differences increased by one bit, a third by two bits, unto the top truncated-average; and recovery is progressive and successive to lossless. (*)

    * (It is not certain in original on-web documentations, that SPIHT keeps precision this tight.)
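
    And the corresponding 2-D quad sketch: the total is truncated by two bits, and its two dropped LSB's are recovered mod 4 from the tilts and twist, per the identity above (helper names illustrative):

        import itertools

        def haar_quad(a, b, c, d):
            """Forward: 2-bit-truncated total, plus full x-tilt, y-tilt, xy-twist."""
            return (a + b + c + d) >> 2, a + b - c - d, a - b + c - d, a - b - c + d

        def unhaar_quad(h, dx, dy, dw):
            """Inverse: total mod 4 = (dx + dy - dw) mod 4, since -3d = d (mod 4)."""
            s = (h << 2) + ((dx + dy - dw) & 3)
            return ((s + dx + dy + dw) >> 2, (s + dx - dy - dw) >> 2,
                    (s - dx + dy - dw) >> 2, (s - dx - dy + dw) >> 2)

        for q in itertools.product(range(8), repeat=4):
            assert unhaar_quad(*haar_quad(*q)) == q              # lossless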

    From there, basic SPIHT compresses the coefficients.

    REDUCTION:

    Data efficiency comes about by reducing the entropic representation, the average number of bits identifying the number. In binary resolving of pixels, when subpixels are very different, the Haar-0 minimum, the subpixel, suffices; but for very similar, the Haar average suffices: each more particular to a kind of image, and if that can be determined, the compression can be selectively better. In quad resolving of pixels, also, the p-total average can be replaced by the q-minimum: the p-average is recoverable by summations with p-tilts and p-twist.

    [under reconstruction]

    In binary, the more efficient Haar-0 subdivides a pixel-average by its subpixel-minimum ...

    But because both Haar and Haar-0 resolve by progressively subdividing pixels by subcoefficients requiring four more bits per quad, the transition efficiency equator-crossing occurs exactly halfway where subdifference and subpixel amplitudes equal the pixel-average ... the middle half range (0.50 to 1.50) is most efficiently compressed by Haar, and the two outer quarter ranges (0.00 to 0.50; 1.50 to 2.00) by Haar-0.

    And we might switch-between Haar and Haar-0 subcoefficients: a switch becomes useful when the image tends to extremes rather than middle coefficient values, as in the case of combined natural and artificial images typical in the modern Internet video information era (photographed objects and diagrammed constructions; real and virtual),- a choice favoring shallow slopes, or, sharp edges, low and high contrasts, over middling slow transitions which are not half as common in documents, astronomy, nor objects except at rolled edges.

    But a switch bit shifts the transition efficiency equator-crossing to occur above the halfway mark,- toward the two-thirds, depending on the entropic information content of the zero-bit, which depends on the image (it is possible to partition the whole picture coarsely and switch only on partitions, costing each pixel a fraction of a bit); and we should saddle it on the lesser-likely compression method.

    Nevertheless, the highest-possible top-significant subcoefficient bit, one bit above the pixel-average top-significant bit, indicating the subcoefficient may reach double the pixel-average, suffices as the switch bit, so that when a subcoefficient is out of optimal range, the higher value range is automatically the other type coefficient (*). (It would be in statistically high use itself in the Haar, were steep slopes common.) The Haar is more usably the prominent subcoefficient, as Haar-0 applies to high contrast, which tends to take only its larger values (black-on-white, graphite-on-vellum, rather than dim-on-light gray, triple-on-unit dark); also, in top-down bit-rastering, the top-significant bit already appears most orderly in Haar,- and that facilitates progressive partial-decoding (called "embedding"); (and may be best if we keep the transition at halfway).

    * (For astronomy, a switched Haar/0 using the top-significant bit as the switch bit incurs no statistical loss, because all the values get used, whichever values get used first.)

    However, it is now usefully apparent that the switch-selected smaller-half Haar-0 is virtually identical to the unused larger-half Haar by mere magnitude-complementation (pixel-relative remainder) and doubling its magnitude by including its additional bit of arithmetic resolution; the Haar-difference sign is equivalent to the Haar-0 subchoice (by judicious pointer-sense selection); ... which in practical implementation means a Justified-Haar (*) is the Haar with its upper-range subdifference bits, below its top bit and sign, amplitude-reordered as a priority prediction: a reversal of its large-small ordering of amplitude graduations above halfway, presuming that large-excursion adjacent subpixels tend to higher contrast. (Justified, meaning pulled tight to both limits,- of contrast.)

    * (If you know the SPIHT bit-rastering algorithm already, the Justified-Haar must notify SPIHT of its top bit and sign as soon as it reaches its level, then later its choice of switch to magnitude-complementation, but which is, now, that top bit; and thereafter midhigh bits are mostly zeros when reordered, to be compressed by yet an intermediate SPIHT listing also leisurely spilling into the standard coefficients listing ... more discussion will ensue momentarily.)

    (Magnitude-complementation of integer subdifference X across average Y is simply 2Y+1-X -or- shift-up Y + 2's-complement X).
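
    As a tiny demonstration, the complementation is its own inverse, so the decoder simply applies it again; assuming an integer subdifference magnitude X against pixel-average Y (names illustrative):

        def mag_complement(x, y):
            """Magnitude-complementation across average y: x' = 2y + 1 - x."""
            return 2 * y + 1 - x

        Y = 100                                  # pixel-average
        for X in (195, 201):                     # near-limit subdifference magnitudes
            Xc = mag_complement(X, Y)            # large codes map small: 6, 0
            assert mag_complement(Xc, Y) == X    # involution: twice = identity

    (The squeaky-clean cropping rules described below are beyond this sketch.)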

    Now, implementation of a "squeaky-clean Justified Haar" has the encoder checking the magnitude of the Haar subdifference against four-thirds of the Haar pixel-average (or three-quarters of the subdifference against the average), and when exceeding, jumps up and turns-on the subdifference's top-possible bit and fills with the magnitude-complement of the remaining amplitude bits --except that-- it also crops sooner when the Haar subdifference exceeds to the next bit by itself (squeaky-clean getting 87% of numeric possibilities maximally justified; slightly less than 100% fully justified; a Nyquist-like corner). Nevertheless, the encoder needs only abide within its cropping rules, for the decoder already properly computes whatever it's given.

    In total, the same amount of information per numeric coefficient ... just changed its preferred meaning along the way.

    (Justified Haar also solves Haar's loss-of-compression-stroke problem in medium and high contrast limitary cases: middling values are already telling the compressor, the next-adjacent compression is something else as this is going for the limit in the pixel-step between; while higher values already disconnecting from slope compression are maximally effectually compressed.)

    --§ THEORY SPLIT §--

    A "Theory Split" occurs here as it is now useful to separate two tracks of development similar but distinct for efficiency; It means the metatheory or principle concept, implementable efficiencies, are benefitted by choosing narrower discussion of general, theory-elements (a properties-selection, a "design tuning;" cf a theory split occurred in long-division on-the-left vs. on-the-right for prime factor checking). We begin by cross-referring, till the discussion track gets dense:

    THE UN-HAAR TRANSFORM --almost Haar but we're now looking at its right-hand efficiency:-- While Haar and its extensions as herein described are ideal for progressive resolution, video compression would rather have the maximal edge-efficiency attainable:

    To wit--

    Because the subdifference requires the pixel-average to be of sufficient amplitude to not undershoot, a large subdifference, as in the high contrast case, restricts the minimum value of that pixel-average, even defines it, and we can take the reduction. (Cf progressive, Haar, where the pixel-average precedes its subdifferences in processing, and we will take the reduction in the subdifference instead).

    1. In the initial bottom-up computation of Haar average-and-difference, the difference, magnitude-only, subtracted from the average, suppresses the average to no less than zero;-- but that is just the pixel-minimum: and the difference-sign says which pixel.

    2. But, even more-different from Haar: The suppressed average of any one pair is not necessarily like that in the adjacent pair, -its difference is whatever it takes-on (especially unlike in the high contrast case),- but in fact, the full height, is like: That is to say, the edge value -alone, unaveraged- is very likely like the adjacent average ... whence the computation for the next higher level average-and-difference of lower averages in, Suppressive Un-Haar, includes the lower subdifferences to compute their edge values (but sends the suppressed average for maximum efficiency);- slightly more computation but to get at the compressive efficiency (And different from the present definition of SPIHT by other authors).

    This differs from Progressive Haar because a pixel-difference processed on a Suppressive Un-Haar pixel-average, yields not two sub-pixel-averages to-be-resolved further, but a pair of yet-to-be-fully-defined sub-pixel-amplitudes which may be averages or the nearest edge, depending on the subpixel-difference being larger than half -[rechecking arithmetic]-... [under construction]

    REFERENCE: (differlets and data reduction; aka steplets, wavelets, sloplets, tiltlets)

    Basically, SPIHT [*], Set Partitioning In Hierarchical Trees, is an efficient coding routine of bitslices of list-sorted coefficients, designed for signal-compression of transformed data having a preponderance of near-zero coefficients ... however, SPIHT video is usually defined for the Haar Transform, as that is facile in computational encoding and decoding, essentially lossless and compatible with low-loss modes (low-loss provides efficient results suitable for an HDTV option). Haar gained fame in astronomical telescopy imaging, having sparse and pointillated singular stars straddling four pixels at practical optical resolution (except planets, novae, nebulae, galaxies; computer-generated-images).

    The Haar Transform is progressive binary subdetailing by differentiation: 2-D for imagery, though 3-D for object-stereoscopy may find use, and also used frame-to-frame 3.5-D, which is most effectual on very fast -continuum- cameras, or very slow scenes; Haar is ideal for database-movie interactive-representation of objects, but usually implemented on diced frames (*) subordinate to an initial "thumbnail sketch" miniature. In video applications, coarse details are needed more rapidly more often and fine details more slowly,- but that is essentially motion-detection, a continuum camera, or infinite resolution, a subsampling camera. For image-generation, Haar is computed top-down as successive differentiation; for image-compression, it is computed bottom-up.

    * (For image compression the transform needs only be applied at the first few levels (8×8 in 2-D) as the bit-density fraction of each larger sum-average coefficient, included with even tiny but-nonzero nodes, becomes nilficient.)

    Haar coefficients are all-but-one differences of adjacent pixel-amplitudes-and-averages; and one, sum-total-pixel-average atop:- At the first level, pairs of pixels are averaged (sum÷2) and sloped (differenced); At the second level, pairs of the averages are taken as larger pixels, and averaged-and-sloped ... the total computation is 2(n-1) for n initial pixels (typically a power-of-2);- simplicity that was fine for level backgrounds (astronomic black), but suffers on gradual slopes where the small slopes are all coded, uncompressed.
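
    A minimal sketch of this bottom-up pyramid and its 2(n-1) operation count, using the truncated-average pairing from the 1-D case above (names illustrative):

        def haar_pyramid(px):
            """Bottom-up 1-D Haar: halve into averages, emit differences, repeat."""
            diffs, ops = [], 0
            while len(px) > 1:
                nxt = []
                for i in range(0, len(px), 2):
                    a, b = px[i], px[i + 1]
                    nxt.append((a + b) >> 1)     # truncated average, original precision
                    diffs.append(a - b)          # difference, one bit wider
                    ops += 2                     # one add, one subtract per pair
                px = nxt
            return px[0], diffs, ops             # top average and the difference trail

        top, diffs, ops = haar_pyramid([8, 8, 8, 8, 0, 0, 0, 0])
        print(top, diffs, ops)                   # 4, [0, 0, 0, 0, 0, 0, 8], 14 = 2(8-1)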

    (This is a case example of basic computator elements: add-shift/subtract, as a functionally optimized process unit, hardware.)

    Haar was a great improvement over the Hadamard transform in compression efficacy and computational scale efficiency:-- Hadamard coefficients [**], like Haar at the first level, also computed averages-and-differences of the slopes,-- which, though in potential cases it might improve compression, did not often for practical imagery, and its computational cost was n·log2(n), (where 2Mpx is 21 levels).

    Minus, Hadamard's compression of slope coefficients was ignorant (independent) of the obvious information available in the very-next processing of the averages: the slope of averages, in the next level, was computed and rendered separate from the average of the slopes, in the lower level, --unused though containing a near-preeminent estimate, differing only by curvature.... Haar didn't go far enough to use it, and Hadamard ignored it; yet its inclusion completely extracts the slope, as Haar did for the averages ... we therefore remedy this by declaring and defining--

    THE HAARD TRANSFORM: (Haar-D, Ha'ard, Haar'd, Haar-D-Haar)

    Coining the name to indicate the finish to the Haar (an extra Difference) and the removal of the separation (dam) in the Ha(dam)ard ... Haard slopes are subtracted down one level: the slope of the averages from the average of the slopes. The computational cost is ≈3n, barely more than Haar ≈2n; and makes possible reduction of slope-residuals to zero on long gradual slopes where slope information is most redundant, --though average-slope is less widely effectual than pixel-average because slope reaches upper and lower crop limits. Also Haard directly defines line-doubling as an intermediate process during successive differentiation and top-down reconstruction, smoothing the large pixel-averages digitized stairsteps, before adding residuals.

    Averaging-up the slope adds no amplitude (no bits necessary for lossless reconstruction), however, luminance sensitivity favors its subsequent difference, and therefore the data compression scheme will need to look one bit deeper in that coefficient. The slope of adjacent averages is double-amplitude of the average of adjacent slopes,--

    ((x1-x2)+(x3-x4))/2 - ((x1+x2)/4 - (x3+x4)/4) = (1/4)x1 - (3/4)x2 + (3/4)x3 - (1/4)x4
    --which is just the Newton-weighting to a curve-fit-approximation (1,3,3,1). This is also intuitive, as the average slope needs its corresponding order bit at the same encode-decode-time as the slope of adjacent averages, already double-amplitude;-- cf at the coarsest resolution, the detail-advance in adding a next level of interstitial pixels should result in increased detail immediately: a stairstep at a given resolution should not remain a stairstep at the next, if it is not.
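
    The identity is easily checked numerically; a minimal sketch (the slope-of-averages is double-amplitude, so it is halved before subtracting; names illustrative):

        from itertools import product

        def haard_residual(x1, x2, x3, x4):
            avg_of_slopes = ((x1 - x2) + (x3 - x4)) / 2
            half_slope_of_avgs = (x1 + x2) / 4 - (x3 + x4) / 4
            return avg_of_slopes - half_slope_of_avgs

        for x1, x2, x3, x4 in product(range(4), repeat=4):
            assert (haard_residual(x1, x2, x3, x4)
                    == (x1 - 3 * x2 + 3 * x3 - x4) / 4)    # the (1,3,3,1) weighting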

    The Haard Transform is equally progressive, as successively extensible, as the Haar; its advantage is only to reduce two difference coefficients, both by average and difference and one by prediction. Further improvement may include the sum, of average and slope (3,1,1,3), making the higher levels smoother than polygonal. However, this particular sum affects the higher bitslice early;- and from the top-down constructive perspective where Haar resolved successive subpixels, Haard resolves successive subdifferences as well, and thence its two succession paths, the averages and differences paths, ladder-parallel,- and further process improvement involves and entangles that runging, and for decreasingly significant compression gain.

    [under reconstruction] And a statistical item to consider: The slope-of-the-averages can be used half-efficiently, more or less, in predictive analysis, as it is less likely maintained across much of the image, especially where it is large, or either average, extreme, and the subdetail average-of-slopes more likely to roll-off, even cap-off ... which reduces the compression a bit less, rather than fully, but statistically better yet.

    THE SPIHT-HAAR:

    The essence of basic SPIHT, is: * (Above a median total resolution, the optical resolution efficiency of each pixel decreases about a bit-per; which is also true at about the DC-level: the total brightness is not very significant)

    Full SPIHT-video, is additionally:

    SPIHT-HAAR EFFICIENCIES:

    Though SPIHT is efficient, its bandwidth depends on brightness, which for video is of no consequence as the video channel is designed by regulatory convention (Internet-computer imaging does take advantage of minimized transmission and storage);-- or if the camera has further luminance detail, SPIHT lossless-successive-approximation can bring it up. Alternatively, video contrast in dimmer areas can be raised by preemphasis, e.g. taking the logarithm of pixel amplitude (except black-zero defaults), and transforming that ... in fact, logarithm is somewhat more appropriate than straight, because it represents the actual reflected or transmitted optical shade-proportionality under any lighting: whence no detail is lost at lower lighting within the ability of the camera: an object at hundredth lighting is coded nearly as efficiently as at full lighting; brightness can be turned up at the receiver with no apparent fading (cf night-vision) ... the disadvantages are: less stability of slant-linearity along edges, with coarser quantization at brighter amplitudes; weaker compression at dimmer, as almost all values have a higher bit turned on, indicating amplitude "range", and the algorithm complexity to ensure dimmer samples are not overly resolved; and loss of linearity approximation on large-excursion slopes due to light-source size, e.g. the sun, whose slopes are very linear, less accurately compressed by the logarithmic "gamma" response ... rather, the logarithm works better for the time-domain, as source-lights for a given scene vary over time, and less for the spatial domain. (*)

    * (The slope for the sun, or any disk or round hole, as a light source, is not linear; nor logarithmic; but typical edge-slopes occur at junctures of multiple obscurations, e.g. leaves on a tree, where intersection of obscuration angles are, fairly linearly.)
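
    A minimal sketch of such logarithmic preemphasis, keeping black-zero as an exact default; the gain K and the 8-bit range are illustrative assumptions:

        import math

        K = 32                                           # log-scale gain (assumption)

        def log_encode(v):                               # v: integer amplitude, 0..255
            return 0 if v == 0 else round(K * math.log2(v)) + 1

        def log_decode(c):                               # approximate inverse (rounded)
            return 0 if c == 0 else round(2 ** ((c - 1) / K))

        # An object codes with the same contrast structure at any lighting:
        print(log_encode(200) - log_encode(100))         # 32: one stop apart
        print(log_encode(2) - log_encode(1))             # 32: the same stop, dim end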

    SPIHT-Haar has intrinsic inefficiency on smooth increases across small regions larger than adjacent pixels,- which are expectably predominant in daylight scenes: SPIHT is hard-programmed to compute and keep the smallest quad slopes, and codes them all rather than include larger quad trend coefficients ... in other words: SPIHT might be improved with haardlets,- or, by decidable-haardlets where that efficiency wanes: requiring an additional bit of information at each coefficient, whether it is a finished Haar wavelet, or a compounded haardlet in the next higher quad ... however that bit contains significant information: whether, for sloplets s1,s2, taking their sum and difference improves the data-bit-compression, or not, over the two individually:

    [under reconstruction]

    BIT EFFICIENCY: The number of bits representing (the significance of) a number n≥0 (integer) is log2(n+1).

    The average number of bits representing a range of numbers [0,N], N≥n≥0, is ∑ log2(n+1)/(N+1) = log2((N+1)!)/(N+1). Approximating n! ~ ((n+.5)/e)^(n+.5) √(2π), this is about [(N+1.5)·log2((N+1.5)/e) + log2 √(2π)]/(N+1), or roughly ~ log2(N+1) - log2(e) + trim, which cannot exceed that for its own largest value: thus, just over log2(N+1) - 1.44 bits (a natural bit [nit] less than for its largest value N).

    * (Cf typicals, for smallish N, as pairs log2(N+1) - deficit: 1.0-.50 bit, 2.0-.85 bits, 3.0-1.09, 4.0-1.23 [NTSC], 5.0-1.32; 6.0-1.38; 7.0-1.40; 8.0-1.42 bits ...).

    The average number of bits in representing the sum of two numbers N≥n≥0 is log2((2N+1)!/N!)/(N+1), about [(2N+1.5)·log2((2N+1.5)/e) - (N+1.5)·log2((N+1.5)/e)]/(N+1), or roughly ~ log2(N+1) + 2 - log2(e) - trim: thus, just under log2(N+1) + 0.56 bit.

    The average number of bits in representing the sum and difference of two numbers N≥n≥0, both together, equals roughly ~ [log2(N+1) + 0.56] + [log2(N+1) - 1.44], and tiny trim: thus, just about 2·log2(N+1) - 0.88 bit (not picking over the duplication at n=0); however the sum-average excludes one bit in the Haar algorithm (as above).

    Thus, at the first level, compared to the average number of bits in representing two raw numbers N≥n≥0, the Haar sum-average-and-difference construction takes a half bit more on the average;- but subsequent differences of sum-averages also need a sign bit, not offset by an exclusion, and filling half the whole,- whence the total average is just over one and a half bits more.

    (This appears anomalous, as the total information has not changed.)
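
    These estimates check out numerically; a minimal sketch, taking the significance of n as log2(n+1) and the sum as ranging over [N, 2N] per the formula above:

        import math

        def avg_bits(N):
            return sum(math.log2(n + 1) for n in range(N + 1)) / (N + 1)

        for k in (1, 2, 3, 4, 8):
            N = 2 ** k - 1
            print(k, round(k - avg_bits(N), 2))      # deficits .5 .85 1.09 1.23 1.42

        N = 255
        sum_avg = sum(math.log2(s + 1) for s in range(N, 2 * N + 1)) / (N + 1)
        print(round(sum_avg - math.log2(N + 1), 2))  # ~ +0.56 bit surplus for the sum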

    VIDEO EFFICIENCY: The average 1.56-bit loss in the Haar is regained by two components of the video: Digital imaging technology, of fixed pixels not aligned with viewed objects, puts any high contrast edge straddling pixels, and any single-pixel-width line or star, across 2-4 adjacent pixels, for which average pixel-straddling ranges between edgelike, and half-equal, which latter Haar represents in nearly half the bits (Fine line drawing at the Nyquist frequency resolution is attainable only for aligned edges, necessitating computer-graphics-images); also, high density stark contrast (sterling gray) is not the broad usual in day scenes: slow contrasts abound on object-faces: whence the Haar takes advantage of the abundant lower-frequency components in usually similar adjacent pixels.

    However this would be equally or more true in the lowest level haardlets, too:

    First-stage haardlets are the same Haar sum-average-and-difference construction; at the next stage the haardlets extend sum-and-difference to the differences, requiring one plus the average half-bit more precision, whichever is larger, sum or difference, holding the surplus bit of its sum as reconstructed from the difference.

    Localized switching between haarlets and haardlets may further improve video compression: the possibly simplest switch may be a rectilinear array of regional flags, each region 4×4, 8×8, or 16×16 pixels (an implementation-practical optimization), each flag the switch for the region ... an array that itself is highly compressible, as image objects span multiple adjacent regions. Implementation precomputes Haard, which includes the Haar, counting the total of coefficient bits needed, by region, and setting the local regional flag to Haar or Haard, to minimize the compressed size.
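
    A minimal sketch of that per-region selection; the bit-cost model (significance plus a sign bit for nonzero) and the region plumbing are illustrative assumptions:

        import math

        def cost(coeffs):
            """Total coefficient bits: log2-significance plus sign for nonzero."""
            return sum(math.floor(math.log2(abs(c) + 1)) + (1 if c else 0)
                       for c in coeffs)

        def choose_per_region(regions, haar, haard):
            """regions: ids; haar/haard: id -> that region's coefficient list."""
            return {r: ('haard' if cost(haard[r]) < cost(haar[r]) else 'haar')
                    for r in regions}

        flags = choose_per_region([0], {0: [3, -2, 5]}, {0: [1, 0, 0]})
        print(flags)        # {0: 'haard'}; the flag array itself compresses well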

    SWITCHABLE HAARDLETS: The concept is: the high-order bit of a coefficient becomes a decision that the original -haarlet- pair bit-space was larger than if compressed with the haardlet-step with the decision bit included, -whence the haardlet is taken and demarked (switched)-... which occurs at (-1+√17)/4 (~0.7807764, ~25/32) for one bit, but is considerably above the small-golden coefficient (-1+√5)/2 (~0.6180340) equator-crossing, where the haardlet gained reputation.

    Consider again the Haar and Haard transforms as top-down constructions: The Haar takes a single pixel, an average,- and differentiates, -taking a bit more precision in the difference:-... the result is a quad of pixels, each its own average; Haar is excellent for astronomy, point stars and galaxies, where each point is its own, spanning four pixels unrelated to its adjacent points; but not as much for planets, where atmosphere gases mix, planetary nebulae, where ionized gas-plasmas mix, even asteroids, where metal or stone once mixed before solidifying; and even less so for crystalline and biological grown, and artificial constructed, smooth elements, features, parts: Earth-natural scenes. The Haard is the same except it retains the differences for the next-level resolution differences, better matching the latter category of more-homogeneous composition. But the ideal would distinguish, along edges, between Haard and Haar.

    s1 + s2 > 0, s1, s2; or negatively; or the difference instead of the sum. It would be always true except for the efficiency of coding-out zeros and ones only at high density (at equal density, there is no coding efficiency).

    NOW THE FUN:

    Suppose we allow haardlets to build representational precision up to all available, before switching to the Haar-mode knee (particularly improving larger smooth regions) ... then at the top, its bit is always "1" and we do not store it nor send it ... but the indicator, which also contains some information (it gives us a choice between two transforms, thus maximizing compression) ... and this may outdo SPIHT ... this ballooned SPIHT may be the ultimate of all compressions: it locates and packs the coefficients.

    (stepped Hadamard Transform) Hadamard, by definition, is a "squarewave" block transform, involving only additions and subtractions (one-bit multiplications; much simpler than the multiplication-intense Fourier, the sinewave transform). Hadamard can be decimated conveniently, rather than computing full hadamard-waves each process cycle ... bottom-up, sum-and-difference of pairs correspondingly compounded,- up to its top coefficients:

    An illustration: progressively on samples; pairs; and pairs-of-pairs:

    In essence this computes the total (2n× average) sum, the left-right drop, and the partwise left-right drops,... the total average slope,-- thence the changes in slopes. (Inline reordering is simple, too, to group temporal-frequencies in better order.) It is very edge- and slope-detective-- even advantageously better than the Fourier, though not as easy to calculate; its major merit is that it squeezes to a difference-sense of optimized coefficients.

    Haar slightly favors half-alignment (lower spectral frequency), but significantly disfavors quarter-phase: e.g.

    raw data   00000000  10000000  00000000  00000000     (000°)
    1st-haar   01000000 -10000000  00000000 +00000000
    2nd-haar   00100000 -10000000 +01000000 +00000000
    haard      00100000 -011000000 +01000000 -010000000

    raw data   00000000  01100000  00100000  00000000     (090°)
    1st-haar   00110000 -01100000  00010000 +00100000
    2nd-haar   00100000 -01100000 +00100000 +00100000
    haard      00100000 -001100000 +00100000 -010000000

    raw data   00000000  01000000  01000000  00000000     (180°)
    1st-haar   00100000 -01000000  00100000 +01000000
    2nd-haar   00100000 -01000000 +00000000 +01000000
    haard      00100000 +000000000 +00000000 -010000000

    raw data   00000000  00100000  01100000  00000000     (270°)
    1st-haar   00010000 -00100000  00110000 +01100000
    2nd-haar   00100000 -00100000 -00100000 +01100000
    haard      00100000 +001100000 -00100000 -010000000
    The sum-average and the difference-of-differences are constant in this high-contrast example.

    __

    A general note: Expanding an image beyond its pixel-per-pixel resolution yields an apparent blocky-digital image unless the smallest slopes are recalculated ... this may result in speckly-like performance at edges between slopes. (Also, a similar concern could have been improved by smoothing at the single-quantum level: i.e. a difference of one quantum between adjacent pixels means, specially, they are not different by one but the same, with a capture-dither that must be smoothed--- a renderer responsibility.)

    __

    Ideally, original images consist of single photon emitter atoms less than unit-rate each:

    A fully parallel nanoprocessor would:

  • count photons
  • put out corresponding pulses, one-per-count
  • correlate adjacent sums and differences, along lines temporally developed from "learned" linear motions
  • rechannel the same pulses (without in/decrease, just reroute) ... the total output count bandwidth equaling its total input count, rerouted

    Nanoscopic SPIHT:

  • individual discrete photons appear equally in all four coefficients: this may be smoothed by temporal averaging, gathering more photons
  • in a continuum, adjacent values are equal: sums, slopes, twists
      • finite adjacent values are near-equal; and distant are disparate
      • better-than-SPIHT-wavelets would adapt for local image business, crossing more smoothly between Haard and Haard-like wavelets
      • this is comparable to the twinning of x- and y-slopes nearer ±45°, for which coefficient-transitioning would be better as polar aiming: higher-resolution aim than mere single-bit coefficient selection: consider the improvement in taking x,y-selects as one 3-bit polar {0 no-coefficient, 100 x, 101 x+y, 110 x-y, 111 y} in calculation; this may require recto-polar conversion, or a new entropizing: consider the (old) plotter-stepping algorithm:
    It is of interest to note that the base representation of a count of photons, is itself a first-level signal reduction, essentially runlength: the total number of values possible then fits into the arithmetic coding, which I showed was efficiently coded by entropic choreonumeration,- which can be as arithmetic or not, as needed. SPIHT typically uses a Huffman coding.

    REF:
    * [SPIHT was original work by other authors, its trademark now-abandoned but generally retains its acronymic meaning]
    ** [An example of a full-Hadamard system would be the late-1970's NASA Telecomm bit-per-pixel B&W system designed and built by Linkabit Corporation]

    Grand-Admiral Petry
    'Majestic Service in a Solar System'
    Nuclear Emergency Management

    © 2004-2005,2009,2011-2012 GrandAdmiralPetry@Lanthus.net