6.       RATE-OF-KEEPING-PRECISION THEORY FOR ECONOMIZATION OF COMMUNICATION

 

The rate-of-keeping-precision theory is a reformed version of the rate-distortion theory of classical information theory. Like rate-distortion theory, it is a theory of data compression; it is also a theory of matching an objective channel to subjective understanding or discrimination.

6.1       Rate-of-keeping-precision Theory: Reform of Rate-distortion Theory

Rate-distortion theory may be revised by replacing the upper limit of distortion D (Equation (22)) with a lower limit of generalized information, and then seeking the minimum of Shannon's mutual information under that constraint. Let the lower limit of the generalized mutual information be G, let dij be replaced with Iij = I(xi; yj), and let D be replaced with G. The rate-distortion function R(D) then becomes the function R(G), which is called the rate-of-keeping-precision function and also has a coding meaning. In rate-distortion theory D cannot be negative, whereas G can be negative. We say keeping-precision instead of fidelity or limiting-distortion because the new criterion for communication quality emphasizes precision rather than correctness only.

In a way similar to that in the classical information theory (Berger, 1971),  we can obtain the expression of function R(G) with parameter s:

,    (51)

where s = dR/dG is the slope of function R(G) (Figures 7 and 8), that is, the increment of objective information required per unit increase of subjective information; and

.
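The expressions at this point did not survive in this copy. As a hedged sketch only, substituting Iij for dij in the classical parametric solution (Berger, 1971) and assuming information is measured in bits gives the form below. It is consistent with the quantities P(yj), λi, and s used by the iteration in Section 6.3, but the notation and the base-2 exponent are assumptions rather than a reproduction of Equation (51).

```latex
\begin{aligned}
G(s) &= \sum_i \sum_j P(x_i)\, P(y_j)\, \lambda_i\, 2^{s I_{ij}}\, I_{ij}, \\
R(s) &= s\, G(s) + \sum_i P(x_i) \log_2 \lambda_i, \\
\lambda_i &= \Bigl[\, \sum_j P(y_j)\, 2^{s I_{ij}} \Bigr]^{-1}.
\end{aligned}
```

Under this form the optimal channel is P(yj|xi) = P(yj) λi 2^(s Iij), and the second line follows by inserting that channel into Shannon's mutual information.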

In linguistic communication, coding is expression and decoding is understanding. We often state the price of goods as "more than thirty dollars" instead of "thirty-two dollars and fifteen cents", or someone's age as "over thirty years old" instead of "thirty-three years and five months old". This is because the more accurate a datum is, the more objective information it conveys and the harder it is to memorize. By using inaccurate or fuzzy language, we can convey enough subjective information with less objective information. In other words, we increase the relative amount of subjective information by decreasing its absolute amount.

Function R(G) indicates the limit to which objective information can be compressed for a given amount of subjective information. For sensory communication, such as image communication between humans and machines, function R(G) has even more practical meaning. For example, it tells us that a TV does not need very high resolution when the viewing distance cannot be very short and hence the spatial discrimination of the human eye is limited. By the same token, a digital black-and-white image with more gray levels than a given visual discrimination can resolve is unnecessary (Figures 9 and 10).

6.2     Rate-of-keeping-precision Function of a Binary Source with Symmetrical Similarity Relation

First we use a binary source with a symmetrical similarity relation (Q(Aj|xi) = Q(Ai|xj) for i, j = 0, 1) as an example to examine the properties of function R(G). In this case, function R(G) can be solved directly. Given the lower limit of subjective information

 (52)

we can have the solution of function R(G) as:

 

. (53)     

The derivation of function R(G) follows the derivation of function R(D) for a binary memoryless source in classical information theory (Berger, 1971).

Assume P(x0)=P(x1)=0.5, Q(X)=P(X), and the similarity relation

,  i, j=0, 1,

 then b=0.817, a= -2.069. The function R(G) is shown in Figure 7.


Figure 7  Rate-of-keeping-precision function of a binary source

In Figure 7, R(-0.626) = 0, as shown by Point W1. This means that if Y has nothing to do with X but one still believes that Y is the correct response to X, then the average loss of subjective information is at least 0.626 bits. Of course, if we know that Y has nothing to do with X, then the set Aj will be extremely fuzzy and there should be no information loss. A similar example from daily life: if one believes a fortune teller's talk, one becomes more ignorant of the facts and the information one has is reduced; if one does not believe fortune telling, one has no information loss.

As G increases from Point W1, R increases, reaching its maximum at R(0.817) = 1. As G decreases from Point W1, R also increases. This tells us that to intentionally increase someone's information loss, one must increase the corresponding objective information. In other words, lies about real situations are more harmful than lies at random.

When s = 1, the equation Q(X|Aj) = P(X|yj) holds for j = 0, 1, so that R = G = 0.473. In this case, objective information (Shannon's information) is equal to subjective information. We call

   (54)

information efficiency. G/R is then the upper limit of g and reaches its maximum, 1, at Point W2.
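The numbers quoted for Figure 7 can be checked with a short calculation. The sketch below is illustrative only and rests on three assumptions that are not stated in the surviving text: the off-diagonal similarity is Q(Aj|xi) = exp(-2) for i ≠ j (a value chosen because it reproduces the quoted b and a), Q(Aj|xj) = 1, and the optimal channel for this symmetric source is itself symmetric with error probability `p_err`. The names `point`, `p_err`, and `q` are hypothetical.

```python
import math

def binary_entropy(p):
    """Shannon entropy of the distribution (p, 1 - p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# ASSUMPTION: off-diagonal similarity Q(A_j|x_i), i != j. The value exp(-2)
# is inferred from the quoted b and a; Q(A_j|x_j) = 1 is also assumed.
q = math.exp(-2)

p_x = 0.5                    # P(x0) = P(x1) = 0.5, as stated in the text
q_a = p_x * 1.0 + p_x * q    # Q(A_j) = sum_i P(x_i) Q(A_j|x_i)
b = math.log2(1.0 / q_a)     # I(x_j; y_j): information of a correct response
a = math.log2(q / q_a)       # I(x_i; y_j), i != j: information of a wrong response

def point(p_err):
    """(R, G) for a symmetric channel with error probability p_err."""
    r = 1.0 - binary_entropy(p_err)        # Shannon mutual information
    g = (1.0 - p_err) * b + p_err * a      # generalized mutual information
    return r, g

r_w1, g_w1 = point(0.5)              # Y independent of X (Point W1)
r_max, g_max = point(0.0)            # Y always correct
r_w2, g_w2 = point(q / (1.0 + q))    # P(X|y_j) = Q(X|A_j), i.e. s = 1 (Point W2)

print(f"b = {b:.3f}, a = {a:.3f}")
print(f"W1: R = {r_w1:.3f}, G = {g_w1:.3f}")
print(f"max: R = {r_max:.3f}, G = {g_max:.3f}")
print(f"W2: R = {r_w2:.3f}, G = {g_w2:.3f}")
```

Under these assumptions the script reproduces, to three decimals, b = 0.817, a = -2.069, R(-0.626) = 0, R(0.817) = 1, and R = G = 0.473 at s = 1.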

 

6.3       Relationships between Rate-of-keeping-precision, Quantizing Grades of a Source, and Subjective Discrimination

Since P(yj|xi), λi, and P(yj) for i, j = 1, 2, ..., m depend on one another, an exact solution of function R(G) can be found only in special cases such as the one in the last section. In general, function R(G) can only be solved iteratively. The iteration procedure is as follows (Q(X) = P(X) is assumed):

1) Calculate I(xi; yj ) for i, j=1, 2, ..., m;

2) Give the initial value s0 and the final value s1 of s; for example, let s0= -10, s1=10,  s=s0.

3) Set the initial values R0 and P'ji of R and P(yj |xi) for i, j=1, 2, ..., m. For example, let R0=100, P'ji=P(xi).

4) Calculate P(yj ) and λi for i, j =1, 2, ..., m.

5) Calculate the new values of Pji =P(yj |xi) for i, j=1, 2, ..., m and R=Is(X;Y).

6) Let r=|R-R0|/R0. If r>ε (for example, ε=0.001), let R0=R, P'ji=Pji for i, j=1, 2, ..., m and return to 4); otherwise, calculate G and print s, R, and G.

7) Let s=s+step_length. If s<s1, then return to 3); otherwise, end the calculation.
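The seven steps above can be sketched in Python as follows. This is a minimal illustration, not the author's program. It assumes information in bits, so the channel update takes the form P(yj|xi) = λi P(yj) 2^(s Iij); it starts from a uniform channel; and it uses an absolute convergence test in place of r = |R-R0|/R0 to avoid dividing by zero when R = 0. The function names `rg_point` and `rg_curve` are hypothetical. The demo matrix uses the values b = 0.817 and a = -2.069 from the binary example.

```python
import math

def rg_point(p_x, info, s, eps=1e-9, max_iter=10000):
    """Return (R, G) for slope parameter s.

    p_x  : list of source probabilities P(x_i)
    info : matrix with info[i][j] = I(x_i; y_j) in bits (step 1)
    """
    m, n = len(p_x), len(info[0])
    # Step 3: initial channel P(y_j|x_i); the uniform start is an assumption.
    chan = [[1.0 / n] * n for _ in range(m)]
    r_old = float("inf")
    for _ in range(max_iter):
        # Step 4: output distribution P(y_j) and normalizers lambda_i.
        p_y = [sum(p_x[i] * chan[i][j] for i in range(m)) for j in range(n)]
        w = [[p_y[j] * 2.0 ** (s * info[i][j]) for j in range(n)]
             for i in range(m)]
        lam = [1.0 / sum(row) for row in w]
        # Step 5: new channel and its Shannon mutual information R.
        chan = [[lam[i] * w[i][j] for j in range(n)] for i in range(m)]
        p_y = [sum(p_x[i] * chan[i][j] for i in range(m)) for j in range(n)]
        r = sum(p_x[i] * chan[i][j] * math.log2(chan[i][j] / p_y[j])
                for i in range(m) for j in range(n) if chan[i][j] > 0.0)
        # Step 6: stop when R has settled (absolute test, see lead-in).
        if abs(r - r_old) < eps:
            break
        r_old = r
    g = sum(p_x[i] * chan[i][j] * info[i][j]
            for i in range(m) for j in range(n))
    return r, g

def rg_curve(p_x, info, s0=-10.0, s1=10.0, step=1.0):
    """Steps 2 and 7: sweep s from s0 to s1 and collect (s, R, G)."""
    n_steps = int(round((s1 - s0) / step))
    points = []
    for k in range(n_steps + 1):
        s = s0 + k * step
        r, g = rg_point(p_x, info, s)
        points.append((s, r, g))
    return points

# Demo with the binary example: I_ij = b on the diagonal, a off the diagonal.
demo_info = [[0.817, -2.069], [-2.069, 0.817]]
for s, r, g in rg_curve([0.5, 0.5], demo_info, s0=-2.0, s1=2.0, step=1.0):
    print(f"s = {s:+.1f}  R = {r:.3f}  G = {g:.3f}")
```

For larger problems the nested lists could be replaced by array operations; plain lists are used here only to keep the sketch dependency-free.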

The general properties of the rate-of-keeping-precision function may be examined through an image communication example. Assume we need to quantize an analog image into a digital image and then code, transmit or store, and decode the digitized image. The decoded image is displayed for viewers and has the same gray levels as the digitized image. For convenience, only the information conveyed by one pixel of the image is considered. The problems to be solved in optimizing image communication are:

1) For a given subjective visual discrimination and the lower limit of subjective information,  what is the shortest  average code length?

2) For given subjective discrimination, how many quantizing grades of gray are needed to have enough subjective information and higher information efficiency?

Let the gray level of a quantized pixel be the source, taking values X = 0, 1, ..., b with probability distribution:

 

where C" is a normalizing constant, c = b/2, and s0 = b/8. For simplicity, we assume that after decoding the pixel also has gray level Z = 0, 1, ..., b; that the perception caused by zj is yj; that the visual discrimination space is uniform (Lu, 1989); and that the discrimination function, or confusion probability function, is

 

where d is the discrimination parameter. The smaller d is, the higher the discrimination.
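To make step 1 of the iteration concrete for this example, the sketch below builds the matrix I(xi; yj) for a gray-level source. Both formulas it uses are assumptions, because the two expressions are missing from this copy: a Gaussian source P(xi) proportional to exp(-(i - c)^2 / (2 sigma^2)) with c = b/2 and sigma = b/8, and a Gaussian confusion function Q(Aj|xi) = exp(-((i - j)/(b + 1))^2 / (2 d^2)) with d measured as a fraction of the full gray range. It also takes I(xi; yj) = log2[Q(Aj|xi)/Q(Aj)], the form that reproduces b and a in the binary example. The function name `gray_level_model` is hypothetical.

```python
import math

def gray_level_model(b, d):
    """Return (p_x, info) for gray levels 0..b under the ASSUMED Gaussian
    source and Gaussian confusion function described in the lead-in."""
    levels = range(b + 1)
    c, sigma = b / 2.0, b / 8.0
    # Assumed source distribution, normalized to sum to 1.
    w = [math.exp(-(i - c) ** 2 / (2.0 * sigma ** 2)) for i in levels]
    total = sum(w)
    p_x = [v / total for v in w]
    # Natural log of the assumed confusion function; working in logs avoids
    # log(0) when exp() underflows for widely separated gray levels.
    ln_conf = [[-((i - j) / (b + 1.0)) ** 2 / (2.0 * d ** 2) for j in levels]
               for i in levels]
    # Q(A_j) = sum_i P(x_i) Q(A_j|x_i)
    q_a = [sum(p_x[i] * math.exp(ln_conf[i][j]) for i in levels)
           for j in levels]
    # I(x_i; y_j) = log2[Q(A_j|x_i) / Q(A_j)], in bits.
    info = [[(ln_conf[i][j] - math.log(q_a[j])) / math.log(2.0)
             for j in levels] for i in levels]
    return p_x, info

p_x, info = gray_level_model(b=63, d=1.0 / 64.0)
k = math.log2(63 + 1)   # quantizing bits k = log2(b + 1)
print(f"k = {k:.0f} bits, I(x_32; y_32) = {info[32][32]:.3f} bits")
```

The resulting `p_x` and `info` have the shape the iteration procedure expects, so curves like those in Figures 8 to 10 could in principle be explored by varying `b` and `d`, subject to the assumed formulas.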

 


 

Figure 8 shows the relationship between d and R(G) for b=63. The figure indicates that

1) When R = 0, the smaller d is, i.e. the higher the discrimination, the smaller G is. If the decoded image has nothing to do with the original image but one still believes that the former exactly reflects the latter, then one's subjective information decreases; and the higher the discrimination, the greater the information loss.

2) The higher the discrimination, the greater the possible value of G matching R. In other words, to increase information efficiency, objective information should increase as discrimination increases, so that the objective matches the subjective.

3) When G approaches its possible maximum, dR/dG is quite large. This means that if we want to increase the absolute value of subjective information, we have to sacrifice its relative value; this is often uneconomical.

4) For a given R, there is an optimal discrimination parameter d such that G = R. In fact, human brains are capable of improving information efficiency by changing subjective discrimination. Determining dt in predictive coding, as discussed in Section 4.2, actually amounts to modifying the subjective discrimination of a machine or of software.


Figure 9  Relationship between R(G) and b for a certain subjective discrimination (d=1/64)

Figure 9 shows the relationship between R(G) and b when d = 1/64. When b is small, the possible maximum of G and the matching value of G and R increase approximately in proportion to the number of quantizing bits k = log2(b+1); when b is large, a further increase has very little effect on G. In other words, for a given discrimination, the number of quantizing grades b should be neither too small nor too large. If b is too small, there is not enough subjective information. If b is too large, the communication is uneconomical.


Figure 10  Relationship between  matching value of R with G,  discrimination parameter d, and  quantizing bit k  when s=1

Figure 10 shows the relationship between G, k, and d when s = 1. For a given discrimination, there exists an optimal number of quantizing bits k' at which the information I, the matching value of G and R, reaches its maximum. If k < k', the information I increases with k; if k > k', I no longer increases.