# cbloom rants

## 12/17/2014

### 12-17-14 - PVQ Vector Distribution Note

So, PVQ, as in "Pyramid VQ", is just a way of making a VQ codebook for a unit vector that has a certain probability model (each value is independent and Laplacian, all with the same Laplacian).

You have a bunch of values; you send the length separately, so you're left with a unit vector. Independent Laplacian values are not uniformly distributed on the unit sphere, so you don't want a normal unit vector quantizer; you want this one, whose codebook is the set of integer points with sum |x_i| = K (the scaled squares, i.e. the L1 "pyramid") projected onto the sphere.
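To make the "scaled squares" picture concrete, here is a minimal brute-force sketch. It is an illustration only, not code from any codec, and the names `pvq_codebook` and `to_unit` are made up for this note. It lists every integer vector whose absolute values sum to K, then scales each one onto the unit sphere. It is only practical for tiny N and K.

```python
from itertools import product
from math import sqrt

def pvq_codebook(n, k):
    """Every integer vector of dimension n whose absolute values sum to k."""
    return [v for v in product(range(-k, k + 1), repeat=n)
            if sum(abs(x) for x in v) == k]

def to_unit(v):
    """Scale a lattice point onto the unit sphere (Euclidean length 1)."""
    length = sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

codebook = pvq_codebook(2, 2)
print(len(codebook))          # 8 points: the K=2 diamond in 2D
for v in codebook:
    print(v, to_unit(v))
```

In 2D the K=2 codebook is the eight integer points on a diamond; after normalization they are unit vectors, but they are not evenly spaced in the way a codebook designed for a uniform distribution on the sphere would be.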

Okay, that's all fine, but the application to images is problematic.

In images, we have these various AC residuals. Let's assume 8x8 blocks for now for clarity. In each block, you have ACij (ij in 0-7 and AC00 excluded). The ij are frequency coordinates for each coefficient. You also have spatial neighbors in the adjacent blocks.

The problem is that the AC's are not independent - they're not independent either in frequency or spatial coordinates. AC42 is strongly correlated to AC21 and also to AC42 in the neighboring blocks. They also don't have the same distribution; lower frequency-index AC's have much higher means.

In order to use Pyramid VQ, we need to find a grouping of AC's into a vector, such that the values we are putting in that vector are as uncorrelated and equally distributed as possible.

One paper I found that I linked in the previous email forms a vector by taking all the coefficients in the same frequency slot in a spatial region (for each ij, take all ACij in the 4x4 neighborhood of blocks). This is appealing in the sense that it gathers AC's of the same frequency subband, so they have roughly the same distribution. The problem is there are strong spatial correlations.

In Daala they form "subbands" of the coefficients by grouping together chunks of ACs that are in similar frequency groups.

The reason why correlation is bad is that it makes the PVQ codebook not optimal. For correlated values you should have more codebook vectors with neighboring values similar. eg. more entries around {0, 2, 2, 0} and fewer around {2, 0, 0, 2}. The PVQ codebook assumes those are equally likely.

You can, however, make up for this a bit with the way you encode the codebook index. It doesn't fix the quantizer, but it does extract the correlation.

In classical PVQ (P = Pyramid) you would simply form an index to the vector and send it with an equiprobable code. But in practice you might do it with binary subdivision or incremental enumeration schemes, and then you need not make all codebook vectors equiprobable.

For example in Daala one of the issues for the lower subbands is that the vectors that have signal in the low AC's are more probable than the high AC's. eg. for a subband that spans AC10 - AC40, {1,0,0,0} is much more likely than {0,0,0,1}.

Of course this becomes a big mess when you consider Predictive VQ, because the Householder transform scrambles everything up in a way that makes it hard to model these built-in skews. On the other hand, if the Predictive VQ removes enough of the correlation with neighbors and subband, then you are left with a roughly evenly distributed vector again which is what you want.

## 12/16/2014

### 12-16-14 - Daala PVQ Emails

First, I want to note that the PVQ Demo page has good links at the bottom with more details in them, worth reading.

Also, the Main Daala page has more links, including the "Intro to Video" series, which is rather more than an intro and is a good read. It's a broad survey of modern video coding.

Now, a big raw dump of emails between me, ryg, and JM Valin. I'm gonna try to color them to make it a bit easier to follow. Thusly :

cbloom
ryg
valin

And this all starts with me being not very clear on PVQ so the beginning is a little fuzzy.

I will be following this up with a "summary of PVQ as I now understand it" which is probably more useful for most people. So, read that, not this.

(also jebus the internet is rindoculuos. Can I have BBS's back? Like literally 1200 baud text is better than the fucking nightmare that the internet has become. And I wouldn't mind playing a little Trade Wars once a day...)

Thanks for writing! Before I address specific points, maybe you can teach me a bit about PVQ and how you use it? I can't find any good resources on the web (your abstract is rather terse). Maybe you can point me at some relevant reference material. (the CELT paper is rather terse too!)

Are you constructing the PVQ vector from the various AC's within a single block? Or gathering the same subband from spatial neighbors? (I think the former, but I've seen the latter in papers)

Assuming the former -

Isn't it just wrong? The various AC's have different laplacian distributions (lower frequencies more likely) so using PVQ just doesn't seem right.

In particular PVQ assumes all coefficients are equally likely and equally distributed.

In your abstract you seem to describe a coding scheme which is not a uniform length codeword like traditional PVQ. It looks like it assigns shorter codes to vectors that have their values early on in some kind of z-scan order.

How is K chosen?

Hi,

On 02/12/14 08:52 PM, Charles Bloom wrote:

> Thanks for writing! Before I address specific points, maybe you can teach me a bit about PVQ and how you use it? I can't find any good resources on the web (your abstract is rather terse). Maybe you can point me at some relevant reference material. (the CELT paper is rather terse too!)

I'm currently writing a longer paper for a conference in February, but for now there isn't much more than the demo and the abstract I link to at the bottom. I have some notes that describe some of the maths, but it's a bit all over the place right now: http://jmvalin.ca/video/video_pvq.pdf

> Are you constructing the PVQ vector from the various AC's within a single block? Or gathering the same subband from spatial neighbors? (I think the former, but I've seen the latter in papers)
>
> Assuming the former -

Correct. You can see the grouping (bands) in Fig. 1 of: http://jmvalin.ca/video/spie_pvq_abstract.pdf

> Isn't it just wrong? The various AC's have different laplacian distributions (lower frequencies more likely) so using PVQ just doesn't seem right.
>
> In particular PVQ assumes all coefficients are equally likely and equally distributed.
>
> In your abstract you seem to describe a coding scheme which is not a uniform length codeword like traditional PVQ. It looks like it assigns shorter codes to vectors that have their values early on in some kind of z-scan order.

One thing to keep in mind is that the P in PVQ now stands for "perceptual". In Daala we are no longer using the indexing scheme from CELT (which does assume identical distribution). Rather, we're using a coding scheme based on Laplace distribution of unequal variance. You can read more about the actual encoding process in another document: http://jmvalin.ca/video/pvq_encoding.pdf

> How is K chosen?

The math is described (poorly) in section 6.1 of http://jmvalin.ca/video/video_pvq.pdf Basically, the idea is to have the same resolution in the direction of the gain as in any other direction. In the no prediction case, it's roughly proportional to the gain times the square root of the number of dimensions. Because K only depends on values that are available to the decoder, we don't actually need to signal it.

Hope this helps,

Jean-Marc

Thanks for the responses and the early release papers, yeah I'm figuring most of it out.

K is chosen so that distortion from the PVQ (P = Pyramid) quantization is the same as distortion from gain quantization. Presumably under a simple D metric like L2.

The actual PVQ (P = Pyramid) part is the simplest and least ambiguous. The predictive stuff is complex. Let me make sure I understand this correctly -

You never actually make a "residual" in the classic sense by subtracting the prediction off.

You form the prediction in transformed space. (perhaps by having a motion vector, taking the pixels it points to and transforming them, dealing with lapping, yuck!)

The gain of the current block is sent (for each subband). Not the gain of the delta.

The gain of the prediction in the same band is used as coding context? (the delta of the quantized gains could be sent).

The big win that you guys were after in sending the gain seems to have been the non-linear quantization levels; essentially you're getting "variance adaptive quantization" without explicitly sending per block quantizers.

The Householder reflection is the way that vectors near the prediction are favored. This is the only way that the predicted block is used!? Madness!

(presumably if the prediction had detail that was finer than the quantization level of the current block that could be used to restore within the quantization bucket; eg. for "golden frames")

On 03/12/14 12:18 AM, Charles Bloom wrote:

> K is chosen so that distortion from the PVQ (P = Pyramid) quantization is the same as distortion from gain quantization. Presumably under a simple D metric like L2.

Yes, it's an L2 metric, although since the gain is already warped, the distortion is implicitly weighted by the activity masking, which is exactly what we want.

> You never actually make a "residual" in the classic sense by subtracting the prediction off.

Correct.

> You form the prediction in transformed space. (perhaps by having a motion vector, taking the pixels it points to and transforming them, dealing with lapping, yuck!)

We have the input image and we have a predicted image. We just transform both. Lapping doesn't actually cause any issues there (unlike many other places). As far as I can tell, this part is similar to what a wavelet coder would do.

> The gain of the current block is sent (for each subband). Not the gain of the delta.

Correct.

> The gain of the prediction in the same band is used as coding context? (the delta of the quantized gains could be sent).

Yes, the gain is delta-coded, so coding "same gain" is cheap. Especially, there's a special symbol for gain=0,theta=0, which means "skip this band and use prediction as is".

> The big win that you guys were after in sending the gain seems to have been the non-linear quantization levels; essentially you're getting "variance adaptive quantization" without explicitly sending per block quantizers.

Exactly. Not only that but it's adaptive based on the variance of the current band, not just an entire macroblock.

> The Householder reflection is the way that vectors near the prediction are favored. This is the only way that the predicted block is used!? Madness!

Well, the reference is used to compute the reflection *and* the gain. In the end, we're using exactly the same amount of information, just in a different space.

> (presumably if the prediction had detail that was finer than the quantization level of the current block that could be used to restore within the quantization bucket; eg. for "golden frames")

Can you explain what you mean here?

Jean-Marc
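For reference, a small sketch of the Householder step being discussed, under assumptions that are mine and not confirmed by the emails: the reflection axis is taken to be the prediction's largest-magnitude component, the vectors are made-up numbers, and gain and theta quantization are left out entirely. `householder_normal` and `reflect` are hypothetical helper names.

```python
from math import copysign, sqrt

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def householder_normal(r):
    """Normal v of the reflection that sends r to -s*|r|*e_m."""
    m = max(range(len(r)), key=lambda i: abs(r[i]))  # assumed axis choice
    s = copysign(1.0, r[m])
    v = list(r)
    v[m] += s * sqrt(dot(r, r))
    return v, m, s

def reflect(x, v):
    """Apply the Householder reflection x - 2*v*(v.x)/(v.v)."""
    scale = 2.0 * dot(v, x) / dot(v, v)
    return [a - scale * b for a, b in zip(x, v)]

pred = [3.0, 4.0, 0.0]        # prediction for a band (made-up numbers)
cur = [2.5, 4.5, 0.5]         # current block's band (made-up numbers)
v, m, s = householder_normal(pred)
print(reflect(pred, v))       # prediction lands on a single axis
z = reflect(cur, v)
cos_theta = -s * z[m] / sqrt(dot(z, z))
print(z, cos_theta)           # component m carries the angle to the prediction
```

After the reflection the prediction sits entirely on axis m, so the current vector's component along that axis encodes how close it is to the prediction (the angle), and only the other N-1 components are left to describe as a shape.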

So one thing that strikes me is that at very low bit rate, it would be nice to go below K=1. In the large high-frequency subbands, the vector dimension N is very large, so even at K=1 it takes a lot of bits to specify where the energy should go. It would be nice to be more lossy with that location.

It seems that for low K you're using a zero-runlength coder to send the distribution, with a kind of Z-scan order, which makes it very similar to standard MPEG.

(maybe you guys aren't focusing on such low bit rates; when I looked at low bit rate video the K=1 case dominated)

At 09:42 PM 12/2/2014, you wrote:

> (presumably if the prediction had detail that was finer than the quantization level of the current block that could be used to restore within the quantization bucket; eg. for "golden frames")

Can you explain what you mean here?

If you happen to have a very high quality previous block (much better than your current quantizer / bit rate should give you) - with normal mocomp you can easily carry that block forward, and perhaps apply corrections to it, but the high detail of that block is preserved.

With the PVQ scheme it's not obvious to me that that works. When you send the quantized gain of the subbands you're losing precision (it looks like you guys have a special fudge to fix this, by offsetting the gain based on the prediction's gain?)

But for the VQ part, you can't really "carry forward" detail in the same way. I guess the reflection vector can be higher precision than the quantizer, so in a sense that preserves detail, but it doesn't carry forward the same values, because they drift due to rotation and staying a unit vector, etc.

Some more questions - Is the Householder reflection method also used for Intra prediction? (do you guys do the directional Intra like H26x ?) How much of this scheme is because you believe it's the best thing to do vs. you have to avoid H26x patents? If you're not sending any explicit per-block quantizer, it seems like that removes a lot of freedom for future encoders to do more sophisticated perceptual optimization. (ROI bit allocation or whatever)

On 03/12/14 02:17 PM, Charles Bloom wrote:

> So one thing that strikes me is that at very low bit rate, it would be nice to go below K=1. In the large high-frequency subbands, the vector dimension N is very large, so even at K=1 it takes a lot of bits to specify where the energy should go. It would be nice to be more lossy with that location.

Well, for large N, the first gain step already has K>1, which I believe is better than K=1. I've considered adding an extra gain step with K=1 or below, but never had anything that was really worth it (didn't try very hard).

> It seems that for low K you're using a zero-runlength coder to send the distribution, with a kind of Z-scan order, which makes it very similar to standard MPEG.
>
> (maybe you guys aren't focusing on such low bit rates; when I looked at low bit rate video the K=1 case dominated)

We're also targeting low bit-rates, similar to H.265. We're not yet at our target level of performance though.

> Is the Householder reflection method also used for Intra prediction? (do you guys do the directional Intra like H26x ?)

We also use it for intra prediction, though right now our intra prediction is very limited because of the lapped transform. Except for chroma which we predict from the luma. PVQ makes this particularly easy. We just use the unit vector from luma as chroma prediction and code the gain.

> How much of this scheme is because you believe it's the best thing to do vs. you have to avoid H26x patents?

The original goal wasn't to avoid patents, but it's a nice added benefit.

> If you're not sending any explicit per-block quantizer, it seems like that removes a lot of freedom for future encoders to do more sophisticated perceptual optimization. (ROI bit allocation or whatever)

We're still planning on adding some per-block/macroblock/something quantizers, but we just won't need them for activity masking.

Cheers,

Jean-Marc

Hi,

Just read your "smooth blocks" post and I thought I'd mention one thing we do in Daala to improve the quality of smooth regions. It's called "Haar DC" and the idea is basically to apply a Haar transform to all the DCs in a superblock. This has the advantage of getting us much better quantization resolution at large scales. Unfortunately, there's absolutely no documentation about it, so you'd have to look at the source code, mostly in od_quantize_haar_dc() and a bit of od_compute_dcts() http://git.xiph.org/?p=daala.git;a=blob;f=src/encode.c;h=879dda;hb=HEAD

Cheers,

Jean-Marc
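A toy illustration of why a Haar transform over DCs can give "better quantization resolution at large scales". This is not Daala's `od_quantize_haar_dc()`; it is a single-level 2x2 orthonormal Haar on four made-up DC values with one shared step size, and `haar2x2` and `quantize` are names invented for the sketch.

```python
def haar2x2(a, b, c, d):
    """Orthonormal 2x2 Haar on four values; it is its own inverse."""
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2)

def quantize(x, step):
    """Plain uniform scalar quantizer: round to the nearest multiple of step."""
    return round(x / step) * step

dcs = (103, 103, 103, 103)    # four neighboring block DCs (made-up values)
step = 16

direct = [quantize(v, step) for v in dcs]
via_haar = haar2x2(*[quantize(c, step) for c in haar2x2(*dcs)])
print(direct)      # each DC is off by 7
print(via_haar)    # each DC is off by 1
```

In a flat region all the energy lands in the one sum coefficient, whose value is twice a single DC, so the same step size resolves the shared level twice as finely per level of Haar. In a busy region the difference terms are nonzero and the advantage shrinks, which matches the complaint that one DC quantizer cannot suit both cases.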

Yeah I definitely can't follow that code without digging into it. But this : "much better quantization resolution at large scales." is interesting. When I did the DLI test : http://cbloomrants.blogspot.com/2014/08/08-31-14-dli-image-compression.html something I noticed in both JPEG and DLI (and in everything else, I'm sure) is : Because everyone just does naive scalar quantization on DC's, large regions of solid color will shift in a way that is very visible. That is, it's a very bad perceptual RD allocation. Some bits should be taken away from AC detail and put into making that large region DC color more precise. The problem is that DC scalar quantization assumes the blocks are independent and random and so on. It models the distortion of each block as being independent, etc. But it's not. If you have the right scalar quantizer for the DC when the blocks are in a region of high variation (lots of different DC's) then that is much too large a quantizer for regions where blocks all have roughly the same DC. This is true even when there is a decent amount of AC energy, eg. the image I noticed it in was the "Porsche640" test image posted on that page - the greens of the bushes all color shift in a very bad way. The leaf detail does not mask this kind of perceptual error.

Two more questions - 1. Do you use a quantization matrix (ala JPEG CSF or whatever) ? If so, how does that work with gain preservation and the Pyramid VQ unit vector? 2. Do you mind if I post all these mails publicly?

On 11/12/14 02:03 PM, Charles Bloom wrote:

> 1. Do you use a quantization matrix (ala JPEG CSF or whatever) ? If so, how does that work with gain preservation and the Pyramid VQ unit vector?

Right now, we just set a different quantizer value for each "band", so we can't change resolution on a coefficient-by-coefficient basis, but it still looks like a good enough approximation. If needed we might try doing something fancier at some point.

> 2. Do you mind if I post all these mails publicly?

I have no problem with that and in fact I encourage you to do so.

Cheers,

Jean-Marc

ryg: Don't wanna post this to your blog because it's a long comment and will probably fail Blogger's size limit.

Re "3 1/2. The normal zig-zag coding schemes we use are really bad."

Don't agree here about zig-zag being the problem. Doesn't it just boil down to what model you use for the run lengths? Classic JPEG/MPEG style coding rules (H.264 and later are somewhat different) 1. assume short runs are more probable than long ones and 2. give a really cheap way to end blocks early. The result is that the coder likes blocks with a fairly dense cluster in the first few coded components (and only this is where zig-zag comes in) and truncated past that point.

Now take Fischer-style PVQ (original paper is behind a paywall, but this: http://www.nul.com/pbody17.pdf covers what seems to be the proposed coding scheme). You have two parameters, N and K. N is the dimensionality of the data you're coding (this is a constant at the block syntax level and not coded) and K is the number of unit pulses (=your "energy"). You code K and then send an integer (with a uniform model!) that says which of all possible arrangements of K unit pulses across N dimensions you mean. For 16-bit ACs in an 8x8 block so N=63, there's on the order of 2^(63*16) = 2^1008 different values you could theoretically code, so clearly for large K this integer denoting the configuration can get quite huge.

Anyway, suppose that K=1 (easiest case). Then the "configuration number" will tell us where the pulse goes and what sign it has, uniformly coded. That's essentially a run length with *uniform* distribution plus sign.

K=2: we have two pulses. There's N*2 ways to code +-2 in one AC and the rest zeros (code AC index, code sign), and (N choose 2) * 2^2 ways to code two slots at +-1 each.

And so forth for higher K.

From there, we can extrapolate what the general case looks like. I think the overall structure ends up being isomorphic to this:

1. You code the number M (<=N) of nonzero coefficients using a model derived from the combinatorics given N and K (purely counting-based). (K=1 implies M=1, so nothing to code in that case.)
2. Code the M sign bits.
3. Code the positions of the M nonzero coeffs - (N choose M) options here.
4. Code another number denoting how we split the K pulses among the M coeffs - that's an integer partition of K into exactly M parts, not sure if there's a nice name/formula for that.

This is close enough to the structure of existing AC entropy coders that we can meaningfully talk about the differences. 1) and 2) are bog-standard (we use a different model knowing K than a regular codec that doesn't know K would, but that's it). You can view 3) in terms of significance masks, and the probabilities have a reasonably simple form (I think you can adapt the Reservoir sampling algorithm to generate them) - or, by looking at the zero runs, in terms of run lengths. And 4) is a magnitude coder constrained by knowing the final sum of everything.

So the big difference is that we know K at the start, which influences our choice of models forthwith. But it's not actually changing the internal structure that much!

That said, I don't think they're actually doing Fischer-style PVQ of "just send a uniform code". The advantage of breaking it down like above is that you have separate syntax elements that you can apply additional modeling on separately. Just having a giant integer flat code is not only massively unwieldy, it's also a bit of a dead end as far as further modeling is concerned.

-Fabian
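The counting in that breakdown can be checked numerically. The sketch below is an added illustration with made-up function names. It compares a first-coefficient recursion against the four-step structure, using the fact that splitting K pulses over exactly M nonzero slots in order is an ordered partition (a "composition"), of which there are C(K-1, M-1).

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_recursive(n, k):
    """Pick the first coefficient, then count what is left for the rest."""
    if n == 0:
        return 1 if k == 0 else 0
    return sum(count_recursive(n - 1, k - abs(i)) for i in range(-k, k + 1))

def count_structured(n, k):
    """Sum over M nonzeros: 2^M signs * C(n,M) positions * C(k-1,M-1) splits."""
    if k == 0:
        return 1
    return sum(2 ** m * comb(n, m) * comb(k - 1, m - 1)
               for m in range(1, min(n, k) + 1))

for n, k in [(4, 1), (4, 2), (8, 3), (16, 4)]:
    print(n, k, count_recursive(n, k), count_structured(n, k))
```

For K=1 this gives 2N and for K=2 it gives 2N + 4*(N choose 2), matching the two hand-counted cases in the email.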

cbloom: At 01:36 PM 12/2/2014, you wrote:

> Don't wanna post this to your blog because it's a long comment and will probably fail Blogger's size limit.
>
> Re "3 1/2. The normal zig-zag coding schemes we use are really bad."
>
> Don't agree here about zig-zag being the problem. Doesn't it just boil down to what model you use for the run lengths?

My belief is that for R/D optimization, it's bad when there's a big R step that doesn't correspond to a big D step. You want the prices of things to be "fair".

So the problem is cases like :

    XX00
    X01
    0

vs

    XX00
    X001
    000
    00
    0

which is not a very big D change at all, but is a very big R step.

I think it's easy to see that even keeping something equivalent to the zigzag, you could change it so that the position of the next coded value is sent in a way such that the rates better match entropy and distortion.

But of course, really what you want is to send the positions of those later values in a lossy way.

Even keeping something zigzagish you can imagine easy ways to do it, like you send a zigzag RLE that's something like {1,2,3-4,5-7,8-13} whatever.
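A tiny sketch of what that lossy run-length idea could look like. The bin edges are the ones tossed out in the email; everything else is assumed for illustration, including the midpoint reconstruction and the names `run_to_bin` and `bin_to_run`.

```python
# Run-length bins from the email: {1, 2, 3-4, 5-7, 8-13}
BINS = [(1, 1), (2, 2), (3, 4), (5, 7), (8, 13)]

def run_to_bin(run):
    """Encoder side: send only which bin the zero-run length falls in."""
    for index, (lo, hi) in enumerate(BINS):
        if lo <= run <= hi:
            return index
    raise ValueError("run length outside the illustrative bins")

def bin_to_run(index):
    """Decoder side: hypothetical choice of the bin midpoint as the run."""
    lo, hi = BINS[index]
    return (lo + hi) // 2

for run in (1, 4, 5, 12):
    b = run_to_bin(run)
    print(run, "-> bin", b, "-> decoded run", bin_to_run(b))
```

Short runs are kept exact while long runs only cost as much as their bin index, so moving a late coefficient by a slot or two no longer changes the rate.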

ryg: Actually the Fischer (magnitude enumeration) construction corresponds pretty much directly to a direct coder: from the IEEE paper, l = dim, k = number of pulses, then number of code words N(l,k) is

    N(l,k) = sum_{i=-k}^k N(l-1, k-|i|)

This is really direct: N(l,k) just loops over all possible values i for the first AC coeff. The remaining uncoded ACs then are l-1 dimensional and <= k-|i|. Divide through by N(l,k) and you have a probability distribution for coding a single AC coeff. Splitting out the i=0 case and sign, we get:

    N(l,k) = N(l-1,k) + 2 * sum_{j=1}^k N(l-1,k-j)
           =: N(l-1,k) + 2 * S(l-1,k)

which corresponds 1:1 to this encoder:

    // While energy (k) left
    for (i = 0; k > 0; i++) {
        assert(i < Ndims); // shouldn't get to Ndims with leftover energy
        int l = Ndims - i; // remaining dims
        int coeff = coeffs[i];

        // encode significance
        code_binary(coeff == 0, N(l-1,k) / N(l,k));
        if (coeff != 0) {
            // encode sign
            code_binary(coeff < 0, 0.5);
            int mag = abs(coeff);
            // encode magnitude (multi-symbol)
            // prob(mag=j) = N(l-1,k-j) / S(l-1,k)
            // then:
            k -= mag;
        }
    }

and this is probably how you'd want to implement it given an arithmetic back end anyway. Factoring it into multiple decisions is much more convenient (and as said before, easier to do secondary modeling on) than the whole "one giant bigint" mess you get if you're not low-dimensional.

Having the high-dimensional crap in there blows because the probabilities can get crazy. Certainly Ndims=63 would suck to work with directly. Separately, I'd expect that for k "large" (k >= Ndims? Ndims*4? More? Less?) you can use a simpler coder and/or fairly inaccurate probabilities because that's gonna be infrequent.

Maybe given k = AC_sum(1,63) = sum_{i=1}^63 |coeff_i|, there's a reasonably nice way to figure out say AC_sum(1,32) and AC_sum(33,63). And if you can do that once, you can do it more than once. Kind of a top-down approach: you start with "I have k energy for this block" and first figure out which subband groups that energy goes into. Then you do the "detail" encode like above within each subband of maybe 8-16 coeffs; with l<=Ndim<=8 and k<=Ndim*small, you would have reasonable (practical) model sizes.

-Fabian
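Here is a runnable sketch of that significance / sign / magnitude loop, written to check the claim rather than to be a coder. It collects the probability of each decision as an exact fraction instead of feeding an arithmetic coder, which is not modeled. `N` and `S` follow the notation in the email; `decision_probs` and `total_probability` are names invented for the sketch.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def N(l, k):
    """Number of integer vectors of dimension l with sum(|x_i|) == k."""
    if l == 0:
        return 1 if k == 0 else 0
    return sum(N(l - 1, k - abs(i)) for i in range(-k, k + 1))

def S(l, k):
    """sum_{j=1..k} N(l, k-j): codewords left after a positive next coeff."""
    return sum(N(l, k - j) for j in range(1, k + 1))

def decision_probs(coeffs):
    """Probability of every decision the loop makes for this vector."""
    k = sum(abs(c) for c in coeffs)
    probs, i = [], 0
    while k > 0:
        l = len(coeffs) - i                      # remaining dims
        c = coeffs[i]
        p_zero = Fraction(N(l - 1, k), N(l, k))
        probs.append(p_zero if c == 0 else 1 - p_zero)   # significance
        if c != 0:
            probs.append(Fraction(1, 2))                 # sign
            mag = abs(c)
            probs.append(Fraction(N(l - 1, k - mag), S(l - 1, k)))  # magnitude
            k -= mag
        i += 1
    return probs

def total_probability(coeffs):
    total = Fraction(1)
    for p in decision_probs(coeffs):
        total *= p
    return total

vec = (0, 2, -1, 0)
print(N(len(vec), 3), total_probability(vec))   # 88 codewords, each 1/88
```

The product of the per-decision probabilities comes out to exactly 1/N(l,k) for any valid vector, so with an exact arithmetic coder the factored scheme costs the same log2(N(l,k)) bits as sending one flat index.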

cbloom: No, I don't think that's right. The N recursion is just for counting the number of codewords, it doesn't imply a coding scheme. It explicitly says that the pyramid vector index is coded with a fixed length word, using ceil( log2(N) ) bits. Your coding scheme is variable length. I need to find the original Fischer paper because this isn't making sense to me. The AC's aren't equally probable and don't have the same Laplacian distribution so PVQ just seems wrong. I did find this paper ("Robust image and video coding with pyramid vector quantisation") which uses PVQ and is making the vectors not from within the same block, but within the same *subband* in different spatial locations. eg. gathering all the AC20's from lots of neighboring blocks. That does make sense to me but I'm not sure if that's what everyone means when they talk about PVQ ? (paper attached to next email)

ryg: On 12/2/2014 5:46 PM, Charles Bloom {RAD} wrote:

> No, I don't think that's right. The N recursion is just for counting the number of codewords, it doesn't imply a coding scheme. It explicitly says that the pyramid vector index is coded with a fixed length word, using ceil( log2(N) ) bits. Your coding scheme is variable length.

I wasn't stating that Fischer's scheme is variable-length; I was stating that the decomposition as given implies a corresponding way to encode it that is equivalent (in the sense of exact same cost). It's not variable length. It's variable number of symbols but the output length is always the same (provided you use an exact multi-precision arithmetic coder that is, otherwise it can end up larger due to round-off error).

log2(N(l,k)) is the number of bits we need to spend to encode which one out of N(l,k) equiprobable codewords we use. The ceil(log2(N)) is what you get when you say "fuck it" and just round it to an integral number of bits, but clearly that's not required.

So suppose we're coding to the exact target rate using bignum rationals and an exact arithmetic coder. Say I have a permutation of 3 values and want to encode which one it is. I can come up with a canonical enumeration (doesn't matter which) and send an index stating which one of the 6 candidates it is, in log2(6) bits. I can send one bit stating whether it's an even or odd permutation, which partitions my 6 cases into 2 disjoint subsets of 3 cases each, and then send log2(3) bits to encode which of the even/odd permutations I am, for a total of log2(2) + log2(3) = log2(6) bits. Or I can get fancier.

In the general case, I can (arbitrarily!) partition my N values into disjoint subsets with k_1, k_2, ..., k_m elements, respectively, sum_i k_i = N. To code a number, I then first code the number of the subset it's in (using probability p_i = k_i/N) and then send a uniform integer denoting which element it is, in log2(k_i) bits. Say I want to encode some number x, and it falls into subset j. Then I will spend

    -log2(p_j) + log2(k_j) = -log2(k_j / N) + log2(k_j) = log2(N / k_j) + log2(k_j) = log2(N)

bits (surprise... not). I'm just partitioning my uniform distribution into several distributions over smaller sets, always setting probabilities exactly according to the number of "leaves" (=final coded values) below that part of the subtree, so that the product along each path is still a uniform distribution. I can nest that process of course, and it's easy to do so in some trees but not others meaning I get non-uniform path lengths, but at no point am I changing the size of the output bitstream.

That's exactly what I did in the "coder" given below. What's the value of the first AC coefficient? It must obey -k <= ac_0 <= k per definition of k, and I'm using that to partition our codebook C into 2k+1 disjoint subsets, namely

    C_x = { c in C | ac_0(c) = x }

and nicely enough, by the unit-pulse definition that leads to the enumeration formula, each of the C_x corresponds to another PVQ codebook, namely with dimension l-1 and energy k-|x|. Which implies the whole thing decomposes into "send x and then do a PVQ encode of the rest", i.e. the loop I gave.

That said, one important point that I didn't cover in my original mail: for the purposes of coding this is really quite similar to a regular AC coder, but of course the values being coded don't mean the same thing. In a JPEG/MPEG style entropy coder, the values I'm emitting are raw ACs. PVQ works (for convenience) with code points on an integer lattice Z^N, but the actual AC coeffs coded aren't those lattice points, they're (gain(K) / len(lattice_point)) * lattice_point (len here being Euclidean and not 1-norm!).

> I need to find the original Fischer paper because this isn't making sense to me. The AC's aren't equally probable and don't have the same Laplacian distribution so PVQ just seems wrong.
>
> I did find this paper ("Robust image and video coding with pyramid vector quantisation") which uses PVQ and is making the vectors not from within the same block, but within the same *subband* in different spatial locations. eg. gathering all the AC20's from lots of neighboring blocks. That does make sense to me but I'm not sure if that's what everyone means when they talk about PVQ ? (paper attached to next email)

The link to the extended abstract for the Daala scheme (which covers this) is on the Xiph demo page: http://jmvalin.ca/video/spie_pvq_abstract.pdf

Page 2 has the assignment of coeffs to subbands. They're only using a handful, and notably they treat 4x4 blocks as a single subband.

-Fabian
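The partition argument is easy to check with numbers. This is an added sketch with a hypothetical helper name; the subset sizes in the last example are what you get by splitting the dimension 3, K=2 codebook (18 vectors) on its first coefficient x = -2..2.

```python
from math import isclose, log2

def bits_via_partition(subset_sizes, j):
    """Cost of coding 'which subset' (p_j = k_j/N) then 'which element'."""
    total = sum(subset_sizes)
    p_j = subset_sizes[j] / total
    return -log2(p_j) + log2(subset_sizes[j])

# 6 permutations of 3 values, split into 3 even + 3 odd.
print(bits_via_partition([3, 3], 0), log2(6))
# An arbitrary, lopsided split of the same 6 values costs the same.
print(bits_via_partition([1, 5], 1), log2(6))
# PVQ codebook of dimension 3, K=2, split by the first coefficient.
sizes = [1, 4, 8, 4, 1]
print([bits_via_partition(sizes, j) for j in range(len(sizes))], log2(18))
```

Whatever subset the value falls in, the two costs add up to log2 of the total count, which is the point of the argument: the tree of decisions changes the number of symbols, not the number of bits.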

cbloom: Ah yeah, you are correct of course. I didn't see how you had the probabilities in the coding.

There are a lot of old papers I can't get about how to do the PVQ enumeration in an efficient way. I'm a bit curious about what they do. But as I'm starting to understand it all a bit now, that just seems like the least difficult part of the problem.

Basically the idea is something like - divide the block into subbands. Let's say the standard wavelet tree for concreteness -

    01447777
    23447777
    55667777
    55667777
    8..

Send the sum in each subband ; this is the "gain" ; let's say g_s

g_s is sent with some scalar quantizer (how do you choose q_s ?) (in Daala a non-linear quantizer is used)

For each subband, scale the vector to an L1 length K_s (how do you choose K_s?)

Quantize the vector to a PVQ lattice point; send the lattice index

So PVQ (P = Pyramid) solves this problem of how to enumerate the distribution given the sum. But that's sort of the trivial part. The how do you send the subband gains, what is K, etc. is the hard part. Do the subband gains mask each other?

Then there's the whole issue of PVQ where P = Predictive. This Householder reflection business.

Am I correct in understanding that Daala doesn't subtract off the motion prediction and make a residual? The PVQ (P = predictive) scheme is used instead? That's quite amazing. And it seems that Daala sends the original gain, not the gain of the residual (and uses the gain of the prediction as context).

The slides (reference #4) clear things up a bit.
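The "scale to L1 length K, quantize to a lattice point" steps can be sketched as below. This is a simple largest-remainder rounding, not the search any real encoder uses. The gain is taken as the Euclidean length and is left unquantized, the band values are made up, the input is assumed nonzero, and `pvq_quantize` and `reconstruct` are hypothetical names.

```python
from math import floor, sqrt

def pvq_quantize(x, k):
    """Round x (assumed nonzero) to an integer vector with sum(|y_i|) == k."""
    l1 = sum(abs(v) for v in x)
    scaled = [abs(v) * k / l1 for v in x]        # L1 length is now k
    y = [floor(s) for s in scaled]
    left = k - sum(y)                            # pulses still to place
    order = sorted(range(len(x)), key=lambda i: scaled[i] - y[i], reverse=True)
    for i in order[:left]:                       # largest remainders first
        y[i] += 1
    return [yi if v >= 0 else -yi for yi, v in zip(y, x)]

def reconstruct(y, gain):
    """Decoder side: lattice point scaled to Euclidean length 'gain'."""
    length = sqrt(sum(v * v for v in y))
    return [gain * v / length for v in y]

band = [4.0, -1.0, 0.5, 0.0]                     # one subband (made-up values)
gain = sqrt(sum(v * v for v in band))
y = pvq_quantize(band, 4)
print(y)                                         # [3, -1, 0, 0]
print(reconstruct(y, gain))
```

The reconstructed vector always has exactly the transmitted gain, whatever K is; K only controls how finely the direction is resolved.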

ryg: That means your points are on a sphere. You do a reflection that aligns your prediction vector with the 1st AC coefficient. This rotates (well, reflects...) everything around but your block is still a unit vector on a sphere. Important note for this and all that follows: For this to work as I described it, your block and the prediction need to be in the same space, which in this context has to be frequency (DCT) space (since that's what you eventually want to code with PVQ), so you need to DCT your reference block first. This combined with the reflections etc. make this pretty pricey, all things considered. If you weren't splitting by subbands, I believe you could finesse your way around this: (normalized) DCT and Householder reflections are both unitary, so they preserve both the L2 norm and dot products. Which means you could calculate both the overall gain and the correlation coeffs for your prediction *before* you do the DCT (and hence in the decoder, add that stuff back in post-IDCT, without having to DCT your reference). But with the subband splitting, that no longer works, at least not directly. You could still do it with a custom filter bank that just passes through precisely the DCT coeffs we're interested in for each subband, but eh, somehow I have my doubts that this is gonna be much more efficient than just eating the DCT. It would certainly add yet another complicated mess to the pile. -Fabian
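The "unitary transforms preserve norms and dot products" point can be checked numerically. The sketch below uses a plain 1-D orthonormal DCT-II on made-up 8-sample rows, which is a stand-in for illustration and not Daala's lapped transform; `dct_ortho` is a name chosen for this note.

```python
from math import cos, pi, sqrt

def dct_ortho(x):
    """Orthonormal DCT-II of a 1-D list."""
    n = len(x)
    out = []
    for u in range(n):
        scale = sqrt(1.0 / n) if u == 0 else sqrt(2.0 / n)
        out.append(scale * sum(x[i] * cos(pi * (2 * i + 1) * u / (2 * n))
                               for i in range(n)))
    return out

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

cur = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]    # made-up pixels
pred = [50.0, 56.0, 60.0, 68.0, 69.0, 60.0, 66.0, 70.0]   # made-up prediction
C, P = dct_ortho(cur), dct_ortho(pred)
print(dot(cur, pred), dot(C, P))    # equal up to round-off
print(dot(C[1:4], P[1:4]))          # one band's share: needs the coefficients
```

The whole-block dot product and gain can be computed in the pixel domain, but a per-band dot product is a partial sum over coefficients, which is why the subband split forces the reference through the transform.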

cbloom: At 10:23 PM 12/2/2014, Fabian Giesen wrote:

> For this to work as I described it, your block and the prediction need to be in the same space, which in this context has to be frequency (DCT) space (since that's what you eventually want to code with PVQ), so you need to DCT your reference block first. This combined with the reflections etc. make this pretty pricey, all things considered.

Yeah, I asked Valin about this. They form an entire predicted *image* rather than block-by-block because of lapping. They transform the predicted image the same way as the current frame. Each subband gain is sent as a delta from the predicted image subband gain.

Crazy!

His words :

> You form the prediction in transformed space. (perhaps by having a motion vector, taking the pixels it points to and transforming them, dealing with lapping, yuck!)

We have the input image and we have a predicted image. We just transform both. Lapping doesn't actually cause any issues there (unlike many other places). As far as I can tell, this part is similar to what a wavelet coder would do.

ryg:

> Yeah, I asked Valin about this. They form an entire predicted *image* rather than block-by-block because of lapping.

That doesn't have anything to do with the lapping, I think - that's because they don't use regular block-based mocomp. At least their proposal was to mix overlapping-block MC and Control Grid Interpolation (CGI, essentially you specify a small mesh with texture coordinates and do per-pixel tex coord interpolation).

There's no nice way to do this block-per-block in the first place, not with OBMC in the mix anyway; if you chop it up into tiles you end up doing a lot of work twice.

cbloom: The other thing I note is that it doesn't seem very awesome at low bit rate. Their subband chunks are very large. Even at K=1 the N slots that could have that one value is very large, so sending the index of that one slot is a lot of bits. At that point, the way you model the zeros and the location of the 1 is the most important thing. What I'm getting at is a lossy way of sending that.
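To put rough numbers on the index cost: assuming the standard PVQ codebook (all integer vectors of dimension N whose absolute values sum to K), the codebook size follows a well-known three-term recurrence, and a flat index costs log2 of that size. The sketch below uses hypothetical function names and assumes a flat (unmodeled) index, which is the worst case the email is complaining about.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def pvq_codebook_size(n, k):
    """Number of integer vectors of dimension n with sum(|y_i|) == k."""
    if k == 0:
        return 1                 # only the all-zero vector
    if n == 0:
        return 0                 # no slots left to hold the pulses
    return (pvq_codebook_size(n - 1, k)
            + pvq_codebook_size(n, k - 1)
            + pvq_codebook_size(n - 1, k - 1))

def flat_index_bits(n, k):
    """Bits for a flat index into the codebook, with no entropy modeling."""
    return math.log2(pvq_codebook_size(n, k))

for n in (16, 32, 64):
    print(n, pvq_codebook_size(n, 1), flat_index_bits(n, 1))
```

At K=1 the codebook is just "which slot, which sign", so the size is 2N and the cost grows by one bit every time the subband doubles in size.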

ryg: On 12/3/2014 1:04 PM, Charles Bloom {RAD} wrote:

> The other thing I note is that it doesn't seem very awesome at low bit rate. Their subband chunks are very large. Even at K=1 the N slots that could have that one value is very large, so sending the index of that one slot is a lot of bits.

Yeah, the decision to send a subband *at all* means you have to code gain, theta and your AC index. For N=16 that's gonna be hard to get below 8 bits even for trivial signals. At which point you get a big jump in the RD curve, which is bad.

Terriberry has a few slides that explain how they're doing inter-band activity masking currently: https://people.xiph.org/~tterribe/daala/pvq201404.pdf

The example image is kind of terrible though. The "rose" dress (you'll see what I mean) is definitely better in the AM variant, but the rest is hard to tell for me unless I zoom in, which is cheating.

> At that point, the way you model the zeros and the location of the 1 is the most important thing. What I'm getting at is a lossy way of sending that.

This is only really interesting at low K, where the PVQ codebook is relatively small. So, er, let's just throw this one in: suppose you're actually sending codebook indices. You just have a rate allocation function that tells you how many bits to send, independent of how big the codebook actually is. If you truly believe that preserving narrowband energy is more important than getting the direction right, then getting a random vector with the right energy envelope is better than nothing.

Say K=1, Ndim=16. You have N=32 codewords, so a codebook index stored directly is 5 bits. Rate function says "you get 0 bits". So you don't send an index at all, and the decoder just takes codeword 0. Or rate function says "you get 2 bits" so you send two bits of the codebook index, and take the rest as zero.

This is obviously biased. So the values you send aren't raw codebook indices. You have some random permutation function family p_x(i) : { 0, ..., N-1 } -> { 0, ..., N-1 } where x is a per-block value that both the encoder and decoder know (position or something), and what you send is not the codebook id but p_x(id).

For any given block (subband, whatever), this doesn't help you at all. You either guess right or you guess wrong. But statistically, suppose you shaved 2 bits off the codebook IDs for 1000 blocks. Then you'd expect about 250 of these blocks to reconstruct the right ACs. For the rest, you reconstructed garbage ACs, but it's garbage with the right energy levels at least! :)

No clue if this is actually a good idea at all. It definitely allows you to remove a lot of potholes from the RD curve.

-Fabian
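The permuted, truncated index idea can be simulated in a few lines. This is a minimal sketch under loud assumptions: the per-block value x is a plain integer key, the permutation family is "shuffle with a PRNG seeded by x", the encoder sends the top bits of the permuted index and the decoder takes the missing low bits as zero. None of these choices come from the thread beyond the general scheme, and the function names are hypothetical.

```python
import random

def block_permutation(block_key, n):
    """p_x: a permutation of 0..n-1 that encoder and decoder can both
    rebuild from the shared per-block integer block_key."""
    rng = random.Random(block_key)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

def encode_index(codeword_id, block_key, index_bits, sent_bits):
    """Send only the top sent_bits of the permuted codebook index."""
    perm = block_permutation(block_key, 1 << index_bits)
    return perm[codeword_id] >> (index_bits - sent_bits)

def decode_index(sent, block_key, index_bits, sent_bits):
    """Fill the missing low bits with zero, then undo the permutation."""
    perm = block_permutation(block_key, 1 << index_bits)
    code = sent << (index_bits - sent_bits)
    return perm.index(code)

# K=1, Ndim=16 -> 32 codewords -> 5-bit index; shave 2 bits on 1000 blocks.
picker = random.Random(2014)
hits = 0
for block_key in range(1000):
    true_id = picker.randrange(32)
    sent = encode_index(true_id, block_key, 5, 3)
    hits += decode_index(sent, block_key, 5, 3) == true_id
print(hits, "of 1000 blocks decoded to the exact codeword")
```

The count should land near a quarter of the blocks, since a shaved 2-bit tail is zero for 8 of the 32 permuted indices; every other block decodes to some unrelated codeword that still has the right gain.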

ryg: On 12/3/2014 1:48 PM, Fabian Giesen wrote:

> [..] This is obviously biased. So the values you send aren't raw codebook indices. You have some random permutation function family p_x(i) : { 0, ..., N-1 } -> { 0, ..., N-1 } where x is a per-block value that both the encoder and decoder know (position or something), and what you send is not the codebook id but p_x(id).

Now this is all assuming you either get the right code or you get garbage, and living with whichever one it is.

You can also go in the other direction and try to get the direction at least mostly right. You can try to determine an ordering of the code book so that distortion more or less smoothly goes down as you add extra bits. (First bit tells you which hemisphere, that kind of thing.) That way, if you get 4 bits out of 5, it's not a 50:50 chance between right vector and some random other vector, it's either the right vector or another vector that's "close". (Really with K=1 and high dim it's always gonna be garbage, though, because you just don't have any other vector in the code book that's even close; this is more interesting at K=2 or up).

This makes the per-block randomization (you want that to avoid systematic bias) harder, though. One approach that would work is to do a Householder reflection with a random vector (again hashed from position or similar).

All that said, I don't believe in this at all. It's "solving" a problem by "reducing" it to a more difficult unsolved problem (in this case, "I want a VQ codebook that's close to optimal for embedded coding"). Of course, even if you do a bad job here, it's still not gonna be worse than the direct "random permutation" stuff. But I doubt it's gonna be appreciably better either, and it's definitely more complex.

-Fabian

cbloom: At 02:17 PM 12/3/2014, Fabian Giesen wrote:

> That way, if you get 4 bits out of 5, it's not a 50:50 chance between right vector and some random other vector, it's either the right vector or another vector that's "close".

Yes, this is the type of scheme I imagine. Sort of like a wavelet significant bit thing. As you send fewer bits the location gets coarser.

The codebook for K=1 is pretty obvious. You're just sending a location; you want the top bits to grossly classify the AC and the bottom bits to distinguish neighbors (H neighbors for H-type AC's, and V neighbors for V-type AC's)

For K=2 and up it's more complex. You could just train them and store them (major over-train risk) up to maybe K=3 but then you have to switch to an algorithmic method.

Really the only missing piece for me is how you get the # of bits used to specify the locations. It takes too many bits to actually send it, so it has to be implicit from some other factors like the block Q and K and I'm not sure how to get that.

## 12/08/2014

### 12-08-14 - BPG

BPG is a nice packaging of HEVC (H265) I-frame compression for still images. The author (Fabrice Bellard) provides Windows command line tools with reasonable options (yay!), so I'm quite happy to test it.

It's pretty dang slow. Really slow. It's all covered by many patents. So I'm not sure how realistic it is as a useable format. Nonetheless, it's very useful as something to compare against.

I ran with default options (YCbCr in 420). I compared against JPEG_pdec, which, as I previously noted, is very comparable to DLI. (JPEG_pdec = JPEG + packjpg + my jpegdec (deblocker, etc)).

Conclusion :

BPG is really good. The best I've seen. It kills JPEG_pdec in RMSE; in fact I think it's the best RMSE performance I've seen, despite being at a disadvantage (YCbCr and 420). Under the perceptual metrics (MS-SSIM-Y and "Combo") it doesn't win so strongly. That tells me there is probably room for better perceptual tuning of bit allocation and quantizers. But it's definitely strong.

Quick visual evaluation by me :

Porsche640 : BPG wins pretty hard here. Perhaps the most noticeable thing is much better detail preservation in the texture regions (the gravel and bushes). It also does a better job on the edges of the car, it doesn't smear them into nasty DCT block artifacts. You may download Porsche640 comparison images here (1 MB, RAR)

Moses : actually not a very big win here. It does much better at preserving the smooth gradient background (my current JPEGdec doesn't have any special modes for big smooth areas). Visually the main thing you'll notice is that the smooth gradients are nasty chunky steps with JPEG and are nice and smooth with BPG. Other than that, I actually think JPEG is better on Moses himself. Both make a big perceptual rate allocation mistake and put too many bits on the jacket texture and not enough on the human skin texture. But JPEG preserves more of the face texture; when you A/B compare it's clear that BPG is way over-smoothing his face. Particularly on the forehead and the neck fat. But all over really. Both BPG and JPEG make a classic mistake on Moses : they kill too much of the red and blue detail in the tie because it's in chroma.

The raw reports :

porsche640 :

imdiff RMSE_RGB
Built Aug 30 2014 11:30:27
r:\porsche640.bmp
r:\porsche640.bmp_bpg
r:\porsche640.bmp_jpg_pdec

raw imdiff data : -2.31,-1.29,-0.42,-0.13,0.02,0.31,0.69,0.92,1.56,1.93|18.55,13.46,9.59,8.46,7.87,6.79,5.45,4.72,3.14,2.56|-2.26,-1.96,-1.44,-1.13,-0.89,-0.71,-0.44,-0.24,-0.04,0.08,0.21,0.35,0.54,0.76,1.06,1.55,2.54|22.22,19.98,16.74,14.86,13.60,12.73,11.49,10.62,9.82,9.35,8.81,8.30,7.63,6.87,5.91,4.57,3.07| fit imdiff data : -2.31,-1.29,-0.42,-0.13,0.02,0.31,0.69,0.92,1.56,1.93|4.64,5.17,5.69,5.87,5.96,6.16,6.43,6.60,7.02,7.20|-2.26,-1.96,-1.44,-1.13,-0.89,-0.71,-0.44,-0.24,-0.04,0.08,0.21,0.35,0.54,0.76,1.06,1.55,2.54|4.33,4.52,4.82,5.01,5.16,5.26,5.42,5.54,5.65,5.72,5.81,5.89,6.01,6.14,6.33,6.63,7.04|

| r:\porsche640.bmp_bpg | r:\porsche640.bmp_jpg_pdec |
|---|---|
| -2.310328 bpg_test_000007742.bmp | -2.256377 jpeg_pdec_000008037.bmp |
| -1.290066 bpg_test_000015703.bmp | -1.956043 jpeg_pdec_000009897.bmp |
| -0.422974 bpg_test_000028642.bmp | -1.443264 jpeg_pdec_000014121.bmp |
| -0.129923 bpg_test_000035093.bmp | -1.125039 jpeg_pdec_000017606.bmp |
| 0.023995 bpg_test_000039044.bmp | -0.893289 jpeg_pdec_000020674.bmp |
| 0.310310 bpg_test_000047615.bmp | -0.710473 jpeg_pdec_000023467.bmp |
| 0.686454 bpg_test_000061798.bmp | -0.442294 jpeg_pdec_000028261.bmp |
| 0.920750 bpg_test_000072695.bmp | -0.235261 jpeg_pdec_000032622.bmp |
| 1.557974 bpg_test_000113065.bmp | -0.039573 jpeg_pdec_000037361.bmp |
| 1.930195 bpg_test_000146345.bmp | 0.079876 jpeg_pdec_000040586.bmp |
| | 0.213736 jpeg_pdec_000044532.bmp |
| | 0.348876 jpeg_pdec_000048905.bmp |
| | 0.535742 jpeg_pdec_000055668.bmp |
| | 0.758401 jpeg_pdec_000064958.bmp |
| | 1.063934 jpeg_pdec_000080280.bmp |
| | 1.546419 jpeg_pdec_000112163.bmp |
| | 2.541619 jpeg_pdec_000223581.bmp |

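A note on reading the file listing: the number in each file name appears to be the compressed size in bytes, and the value in front of it appears to be log2 of bits per pixel. That reading is an inference, not something the report states; it checks out for porsche640 if the image is assumed to be 640x480 (also an assumption, suggested only by the name). A minimal sketch of the conversion, with a hypothetical helper name:

```python
import math

def log2_bpp(compressed_bytes, width, height):
    """log2 of bits per pixel for a compressed file of the given byte size."""
    bits_per_pixel = compressed_bytes * 8 / (width * height)
    return math.log2(bits_per_pixel)

# Assumed dimensions for porsche640: 640x480 (not stated in the report).
print(log2_bpp(7742, 640, 480))    # compare with -2.310328 for bpg_test_000007742.bmp
print(log2_bpp(146345, 640, 480))  # compare with 1.930195 for bpg_test_000146345.bmp
```

On that reading, the first list in each "raw imdiff data" line is the same rate axis rounded to two decimals.
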
imdiff MS_SSIM_IW_Y
Built Aug 30 2014 11:30:27
r:\porsche640.bmp
r:\porsche640.bmp_bpg
r:\porsche640.bmp_jpg_pdec

raw imdiff data : -2.31,-1.29,-0.42,-0.13,0.02,0.31,0.69,0.92,1.56,1.93|83.03,89.78,93.93,95.07,95.57,96.43,97.39,97.88,98.87,99.25|-2.26,-1.96,-1.44,-1.13,-0.89,-0.71,-0.44,-0.24,-0.04,0.08,0.21,0.35,0.54,0.76,1.06,1.55,2.54|79.88,82.47,86.76,89.12,90.67,91.72,93.14,94.07,94.84,95.30,95.74,96.18,96.72,97.27,97.95,98.70,99.40| fit imdiff data : -2.31,-1.29,-0.42,-0.13,0.02,0.31,0.69,0.92,1.56,1.93|4.23,5.06,5.73,5.96,6.07,6.29,6.57,6.73,7.18,7.41|-2.26,-1.96,-1.44,-1.13,-0.89,-0.71,-0.44,-0.24,-0.04,0.08,0.21,0.35,0.54,0.76,1.06,1.55,2.54|3.90,4.17,4.66,4.97,5.19,5.35,5.59,5.76,5.91,6.01,6.11,6.22,6.36,6.53,6.76,7.08,7.53|

imdiff Combo
Built Aug 30 2014 11:30:27
r:\porsche640.bmp
r:\porsche640.bmp_bpg
r:\porsche640.bmp_jpg_pdec

raw imdiff data : -2.31,-1.29,-0.42,-0.13,0.02,0.31,0.69,0.92,1.56,1.93|4.83,4.05,3.41,3.21,3.10,2.90,2.63,2.46,2.03,1.80|-2.26,-1.96,-1.44,-1.13,-0.89,-0.71,-0.44,-0.24,-0.04,0.08,0.21,0.35,0.54,0.76,1.06,1.55,2.54|5.30,4.97,4.44,4.11,3.88,3.72,3.48,3.31,3.16,3.06,2.96,2.86,2.72,2.56,2.35,2.05,1.74| fit imdiff data : -2.31,-1.29,-0.42,-0.13,0.02,0.31,0.69,0.92,1.56,1.93|4.19,4.99,5.65,5.86,5.98,6.18,6.45,6.62,7.05,7.28|-2.26,-1.96,-1.44,-1.13,-0.89,-0.71,-0.44,-0.24,-0.04,0.08,0.21,0.35,0.54,0.76,1.06,1.55,2.54|3.69,4.04,4.59,4.93,5.17,5.34,5.58,5.75,5.91,6.01,6.11,6.22,6.36,6.52,6.74,7.03,7.34|


PDI_1200 :

imdiff RMSE_RGB
Built Aug 30 2014 11:30:27
r:\PDI_1200.bmp
r:\PDI_1200.bmp_bpg
r:\PDI_1200.bmp_jpg_pdec

raw imdiff data : -2.50,-1.64,-0.85,-0.58,-0.43,-0.16,0.22,0.46,1.11,1.51|16.50,12.13,8.70,7.79,7.27,6.38,5.31,4.79,3.76,3.43|-1.77,-1.47,-1.24,-1.06,-0.79,-0.58,-0.38,-0.26,-0.12,0.02,0.21,0.78|17.05,15.36,14.18,13.28,12.07,11.20,10.38,9.91,9.36,8.80,8.09,6.38| fit imdiff data : -2.50,-1.64,-0.85,-0.58,-0.43,-0.16,0.22,0.46,1.11,1.51|4.84,5.34,5.83,5.98,6.07,6.24,6.46,6.58,6.84,6.93|-1.77,-1.47,-1.24,-1.06,-0.79,-0.58,-0.38,-0.26,-0.12,0.02,0.21,0.78|4.79,4.96,5.09,5.19,5.34,5.46,5.57,5.64,5.72,5.81,5.93,6.24|

| r:\PDI_1200.bmp_bpg | r:\PDI_1200.bmp_jpg_pdec |
|---|---|
| -2.502747 bpg_test_000020273.bmp | -1.771229 jpeg_pdec_000033661.bmp |
| -1.643425 bpg_test_000036779.bmp | -1.468118 jpeg_pdec_000041531.bmp |
| -0.845949 bpg_test_000063924.bmp | -1.240727 jpeg_pdec_000048621.bmp |
| -0.575801 bpg_test_000077088.bmp | -1.059914 jpeg_pdec_000055113.bmp |
| -0.434878 bpg_test_000084998.bmp | -0.793755 jpeg_pdec_000066279.bmp |
| -0.158926 bpg_test_000102915.bmp | -0.579211 jpeg_pdec_000076906.bmp |
| 0.219408 bpg_test_000133773.bmp | -0.382314 jpeg_pdec_000088152.bmp |
| 0.455907 bpg_test_000157602.bmp | -0.261965 jpeg_pdec_000095821.bmp |
| 1.113320 bpg_test_000248578.bmp | -0.123163 jpeg_pdec_000105498.bmp |
| 1.510508 bpg_test_000327362.bmp | 0.019419 jpeg_pdec_000116457.bmp |
| | 0.212218 jpeg_pdec_000133108.bmp |
| | 0.777766 jpeg_pdec_000196993.bmp |

imdiff MS_SSIM_IW_Y
Built Aug 30 2014 11:30:27
r:\PDI_1200.bmp
r:\PDI_1200.bmp_bpg
r:\PDI_1200.bmp_jpg_pdec

raw imdiff data : -2.50,-1.64,-0.85,-0.58,-0.43,-0.16,0.22,0.46,1.11,1.51|87.81,92.21,95.12,95.98,96.34,96.99,97.77,98.16,98.98,99.30|-1.77,-1.47,-1.24,-1.06,-0.79,-0.58,-0.38,-0.26,-0.12,0.02,0.21,0.78|89.84,91.44,92.46,93.22,94.23,94.94,95.53,95.87,96.23,96.60,97.04,98.08| fit imdiff data : -2.50,-1.64,-0.85,-0.58,-0.43,-0.16,0.22,0.46,1.11,1.51|4.80,5.43,5.97,6.17,6.26,6.44,6.69,6.84,7.24,7.45|-1.77,-1.47,-1.24,-1.06,-0.79,-0.58,-0.38,-0.26,-0.12,0.02,0.21,0.78|5.07,5.31,5.47,5.60,5.79,5.93,6.06,6.14,6.23,6.33,6.45,6.81|

imdiff Combo
Built Aug 30 2014 11:30:27
r:\PDI_1200.bmp
r:\PDI_1200.bmp_bpg
r:\PDI_1200.bmp_jpg_pdec

raw imdiff data : -2.50,-1.64,-0.85,-0.58,-0.43,-0.16,0.22,0.46,1.11,1.51|4.51,3.84,3.27,3.08,2.98,2.80,2.55,2.40,2.02,1.81|-1.77,-1.47,-1.24,-1.06,-0.79,-0.58,-0.38,-0.26,-0.12,0.02,0.21,0.78|4.25,3.96,3.76,3.60,3.39,3.23,3.09,3.00,2.91,2.80,2.67,2.32| fit imdiff data : -2.50,-1.64,-0.85,-0.58,-0.43,-0.16,0.22,0.46,1.11,1.51|4.52,5.21,5.80,5.99,6.09,6.28,6.53,6.68,7.07,7.27|-1.77,-1.47,-1.24,-1.06,-0.79,-0.58,-0.38,-0.26,-0.12,0.02,0.21,0.78|4.79,5.08,5.29,5.46,5.67,5.84,5.98,6.07,6.17,6.28,6.41,6.76|


moses :

imdiff RMSE_RGB
Built Aug 30 2014 11:30:27
r:\moses.bmp
r:\moses.bmp_bpg
r:\moses.bmp_jpg_pdec

raw imdiff data : -2.89,-1.97,-1.18,-0.90,-0.76,-0.48,-0.08,0.17,0.92,1.36|13.96,9.96,7.15,6.37,5.99,5.30,4.38,3.86,2.65,2.18|-2.10,-1.78,-1.56,-1.38,-1.12,-0.91,-0.72,-0.60,-0.46,-0.32,-0.11,0.49|13.20,11.59,10.38,9.65,8.60,7.86,7.18,6.83,6.44,6.06,5.58,4.40| fit imdiff data : -2.89,-1.97,-1.18,-0.90,-0.76,-0.48,-0.08,0.17,0.92,1.36|5.11,5.63,6.09,6.24,6.32,6.46,6.68,6.81,7.18,7.34|-2.10,-1.78,-1.56,-1.38,-1.12,-0.91,-0.72,-0.60,-0.46,-0.32,-0.11,0.49|5.20,5.41,5.57,5.68,5.84,5.97,6.09,6.15,6.23,6.30,6.40,6.68|

| r:\moses.bmp_bpg | r:\moses.bmp_jpg_pdec |
|---|---|
| -2.885454 bpg_test_000037087.bmp | -2.096290 jpeg_pdec_000064089.bmp |
| -1.966356 bpg_test_000070129.bmp | -1.780268 jpeg_pdec_000079784.bmp |
| -1.181866 bpg_test_000120796.bmp | -1.558392 jpeg_pdec_000093048.bmp |
| -0.902603 bpg_test_000146595.bmp | -1.378797 jpeg_pdec_000105383.bmp |
| -0.762217 bpg_test_000161577.bmp | -1.122379 jpeg_pdec_000125881.bmp |
| -0.481048 bpg_test_000196345.bmp | -0.914342 jpeg_pdec_000145407.bmp |
| -0.080435 bpg_test_000259189.bmp | -0.719298 jpeg_pdec_000166456.bmp |
| 0.174334 bpg_test_000309250.bmp | -0.598617 jpeg_pdec_000180979.bmp |
| 0.923476 bpg_test_000519785.bmp | -0.461917 jpeg_pdec_000198966.bmp |
| 1.358995 bpg_test_000702956.bmp | -0.315356 jpeg_pdec_000220241.bmp |
| | -0.108696 jpeg_pdec_000254161.bmp |
| | 0.489100 jpeg_pdec_000384648.bmp |

imdiff MS_SSIM_IW_Y
Built Aug 30 2014 11:30:27
r:\moses.bmp
r:\moses.bmp_bpg
r:\moses.bmp_jpg_pdec

raw imdiff data : -2.89,-1.97,-1.18,-0.90,-0.76,-0.48,-0.08,0.17,0.92,1.36|84.91,90.55,94.24,95.26,95.72,96.50,97.41,97.88,98.88,99.25|-2.10,-1.78,-1.56,-1.38,-1.12,-0.91,-0.72,-0.60,-0.46,-0.32,-0.11,0.49|87.54,89.76,91.23,92.22,93.54,94.44,95.20,95.62,96.02,96.43,96.93,98.06| fit imdiff data : -2.89,-1.97,-1.18,-0.90,-0.76,-0.48,-0.08,0.17,0.92,1.36|4.45,5.17,5.79,6.00,6.11,6.30,6.57,6.73,7.18,7.42|-2.10,-1.78,-1.56,-1.38,-1.12,-0.91,-0.72,-0.60,-0.46,-0.32,-0.11,0.49|4.76,5.06,5.27,5.43,5.66,5.83,5.99,6.08,6.18,6.29,6.42,6.80|

imdiff Combo
Built Aug 30 2014 11:30:27
r:\moses.bmp
r:\moses.bmp_bpg
r:\moses.bmp_jpg_pdec

raw imdiff data : -2.89,-1.97,-1.18,-0.90,-0.76,-0.48,-0.08,0.17,0.92,1.36|4.43,3.78,3.22,3.04,2.95,2.79,2.56,2.44,2.03,1.81|-2.10,-1.78,-1.56,-1.38,-1.12,-0.91,-0.72,-0.60,-0.46,-0.32,-0.11,0.49|4.19,3.89,3.65,3.51,3.29,3.13,2.98,2.90,2.81,2.72,2.59,2.26| fit imdiff data : -2.89,-1.97,-1.18,-0.90,-0.76,-0.48,-0.08,0.17,0.92,1.36|4.60,5.28,5.84,6.03,6.12,6.29,6.52,6.65,7.05,7.28|-2.10,-1.78,-1.56,-1.38,-1.12,-0.91,-0.72,-0.60,-0.46,-0.32,-0.11,0.49|4.85,5.16,5.41,5.56,5.78,5.94,6.09,6.18,6.27,6.36,6.49,6.83|
