Digital Image Processing
Second Edition
Instructor's Manual

Rafael C. Gonzalez
Richard E. Woods

Prentice Hall, Upper Saddle River, NJ 07458
www.prenhall.com/gonzalezwoods or www.imageprocessingbook.com

Revision history: 10 9 8 7 6 5 4 3 2 1
Copyright © 1992-2002 by Rafael C. Gonzalez and Richard E. Woods

Preface

This manual contains detailed solutions to all problems in Digital Image Processing, 2nd Edition. We also include a suggested set of guidelines for using the book, and discuss the use of computer projects designed to promote a deeper understanding of the subject matter.
The notation used throughout this manual corresponds to the notation used in the text. The decision of what material to cover in a course rests with the instructor, and it depends on the purpose of the course and the background of the students. We have found that the course outlines suggested here can be covered comfortably in the time frames indicated when the course is being taught in an electrical engineering or computer science curriculum. In each case, no prior exposure to image processing is assumed. We give suggested guidelines for one-semester courses at the senior and first-year graduate levels.
It is possible to cover most of the book in a two-semester graduate sequence. The book was completely revised in this edition, with the purpose not only of updating the material but, just as important, of making the book a better teaching aid. To this end, the instructor will find the new organization to be much more flexible and better illustrated. Although the book is self-contained, we recommend use of the companion web site, where the student will find detailed solutions to the problems marked with a star in the text, review material, suggested projects, and images from the book.
One of the principal reasons for creating the web site was to free the instructor from having to prepare materials and handouts beyond what is required to teach from the book. Computer projects such as those described in the web site are an important part of a course on image processing. These projects give the student hands-on experience with algorithm implementation and reinforce the material covered in the classroom. The projects suggested at the web site can be implemented on almost any reasonably equipped multi-user or personal computer having a hard copy output device.

Introduction

The purpose of this chapter is to present suggested guidelines for teaching material from this book at the senior and first-year graduate level. We also discuss use of the book web site. Although the book is totally self-contained, the web site offers, among other things, complementary review material and computer projects that can be assigned in conjunction with classroom work. Detailed solutions to all problems in the book also are included in the remaining chapters of this manual.

Teaching Features of the Book
Undergraduate programs that offer digital image processing typically limit coverage to one semester. Graduate programs vary, and can include one or two semesters of the material. In the following discussion we give general guidelines for a one-semester senior course, a one-semester graduate course, and a full-year course of study covering two semesters. We assume a 15-week program per semester with three lectures per week. In order to provide flexibility for exams and review sessions, the guidelines discussed in the following sections are based on forty 50-minute lectures per semester.
The background assumed on the part of the student is senior-level preparation in mathematical analysis, matrix theory, probability, and computer programming. The suggested teaching guidelines are presented in terms of general objectives, and not as time schedules. There is so much variety in the way image processing material is taught that it makes little sense to attempt a breakdown of the material by class period. In particular, the organization of the present edition of the book is such that it makes it much easier than before to adopt significantly different teaching strategies, depending on course objectives and student background.
For example, it is possible with the new organization to offer a course that emphasizes spatial techniques and covers little or no transform material. This is not something we recommend, but it is an option that often is attractive in programs that place little emphasis on the signal processing aspects of the field and prefer to focus more on the implementation of spatial techniques. The companion web site (www.prenhall.com/gonzalezwoods or www.imageprocessingbook.com) is a valuable teaching aid, in the sense that it includes material that previously was covered in class.
In particular, the review material on probability, matrices, vectors, and linear systems was prepared using the same notation as in the book, and is focused on areas that are directly relevant to discussions in the text. This allows the instructor to assign the material as independent reading, and spend no more than one total lecture period reviewing those subjects. Another major feature is the set of solutions to problems marked with a star in the book. These solutions are quite detailed, and were prepared with the idea of using them as teaching support.
The on-line availability of projects and digital images frees the instructor from having to prepare experiments, data, and handouts for students. The fact that most of the images in the book are available for downloading further enhances the value of the web site as a teaching resource.

One Semester Senior Course

A basic strategy in teaching a senior course is to focus on aspects of image processing in which both the inputs and outputs of those processes are images. In the scope of a senior course, this usually means the material contained in Chapters 1 through 6.
Depending on instructor preferences, wavelets (Chapter 7) usually are beyond the scope of coverage in a typical senior curriculum. However, we recommend covering at least some material on image compression (Chapter 8) as outlined below. We have found in more than two decades of teaching this material to seniors in electrical engineering, computer science, and other technical disciplines, that one of the keys to success is to spend at least one lecture on motivation and the equivalent of one lecture on review of background material, as the need arises.
The motivational material is provided in the numerous application areas discussed in Chapter 1. This chapter was totally rewritten with this objective in mind. Some of this material can be covered in class and the rest assigned as independent reading. Background review should cover probability theory (of one random variable) before histogram processing (Section 3.3). A brief review of vectors and matrices may be required later, depending on the material covered. The review material included in the book web site was designed for just this purpose.
Chapter 2 should be covered in its entirety. Some of the material (such as parts of Sections 2.1 and 2.3) can be assigned as independent reading, but a detailed explanation of Sections 2.4 through 2.6 is time well spent. Chapter 3 serves two principal purposes. It covers image enhancement (a topic of significant appeal to the beginning student) and it introduces a host of basic spatial processing tools used throughout the book. For a senior course, we recommend coverage of Sections 3.2.1 through 3.2.2; Section 3.3.1; Section 3.4; Section 3.5; Section 3.6; and Sections 3.7.1, 3.7.2 (through Example 3.11), and 3.7.3. Section 3.8 can be assigned as independent reading, depending on time. Chapter 4 also discusses enhancement, but from a frequency-domain point of view. The instructor has significant flexibility here. As mentioned earlier, it is possible to skip the chapter altogether, but this will typically preclude meaningful coverage of other areas based on the Fourier transform (such as filtering and restoration). The key in covering the frequency domain is to get to the convolution theorem and thus develop a tie between the frequency and spatial domains.
All this material is presented in very readable form in Section 4.2. "Light" coverage of frequency-domain concepts can be based on discussing all the material through this section and then selecting a few simple filtering examples (say, low- and highpass filtering using Butterworth filters, as discussed in Sections 4.3.2 and 4.4.2). At the discretion of the instructor, additional material can include full coverage of Sections 4.3 and 4.4. It is seldom possible to go beyond this point in a senior course. Chapter 5 can be covered as a continuation of Chapter 4. Section 5. makes this an easy approach. Then, it is possible to give the student a "flavor" of what restoration is (and still keep the discussion brief) by covering only Gaussian and impulse noise in Section 5.2.1, and a couple of spatial filters in Section 5.3. This latter section is a frequent source of confusion to the student who, based on discussions earlier in the chapter, is expecting to see a more objective approach. It is worthwhile to emphasize at this point that spatial enhancement and restoration are the same thing when it comes to noise reduction by spatial filtering.
A good way to keep it brief and conclude coverage of restoration is to jump at this point to inverse filtering (which follows directly from the model in Section 5.1) and show the problems with this approach. Then, with a brief explanation regarding the fact that much of restoration centers around the instabilities inherent in inverse filtering, it is possible to introduce the "interactive" form of the Wiener filter in Eq. (5.8-3) and conclude the chapter with Examples 5.12 and 5.13. Chapter 6 on color image processing is a new feature of the book. Coverage of this
chapter also can be brief at the senior level by focusing on enough material to give the student a foundation on the physics of color (Section 6.1), two basic color models (RGB and CMY/CMYK), and then concluding with a brief coverage of pseudocolor processing (Section 6.3). We typically conclude a senior course by covering some of the basic aspects of image compression (Chapter 8). Interest in this topic has increased significantly as a result of the heavy use of images and graphics over the Internet, and students usually are easily motivated by the topic.
Minimum coverage of this material includes Sections 8.1.1 and 8.1.2, Section 8.2, and Section 8.4.1. In this limited scope, it is worthwhile spending one-half of a lecture period filling in any gaps that may arise by skipping earlier parts of the chapter.

One Semester Graduate Course (No Background in DIP)

The main difference between a senior and a first-year graduate course in which neither group has formal background in image processing is mostly in the scope of material covered, in the sense that we simply go faster in a graduate course, and feel much freer in assigning independent reading.
In addition to the material discussed in the previous section, we add the following material in a graduate course. Coverage of histogram matching (Section 3.3.2) is added. Sections 4.3, 4.4, and 4.5 are covered in full. Section 4.6 is touched upon briefly regarding the fact that implementation of discrete Fourier transform techniques requires non-intuitive concepts such as function padding. The separability of the Fourier transform should be covered, and mention of the advantages of the FFT should be made. In Chapter 5 we add Sections 5.5 through 5.8. In Chapter 6 we add the HSI model (Section 6. . 2), Section 6.4, and Section 6.6. A nice introduction to wavelets (Chapter 7) can be achieved by a combination of classroom discussions and independent reading. The minimum set of sections in that chapter is 7.1, 7.2, 7.3, and 7.5, with appropriate (but brief) mention of the existence of fast wavelet transforms. Finally, in Chapter 8 we add coverage of Sections 8.3, 8.4.2, 8.5.1 (through Example 8.16), Section 8.5.2 (through Example 8.20), and Section 8.5.3. If additional time is available, a natural topic to cover next is morphological image processing (Chapter 9).
The material in this chapter begins a transition from methods whose inputs and outputs are images to methods in which the inputs are images, but the outputs are attributes about those images, in the sense defined in Section 1.1. We recommend coverage of Sections 9.1 through 9.4, and some of the algorithms in Section 9.5.

One Semester Graduate Course (with Background in DIP)

Some programs have an undergraduate course in image processing as a prerequisite to a graduate course on the subject.
In this case, it is possible to cover material from the first eleven chapters of the book. Using the undergraduate guidelines described above, we add the following material to form a teaching outline for a one semester graduate course that has that undergraduate material as prerequisite. Given that students have the appropriate background on the subject, independent reading assignments can be used to control the schedule. Coverage of histogram matching (Section 3.3.2) is added. Sections 4.3, 4.4, 4.5, and 4.6 are added. This strengthens the student's background in frequency-domain concepts.
A more extensive coverage of Chapter 5 is possible by adding Sections 5.2.3, 5.3.3, 5.4.3, 5.5, 5.6, and 5.8. In Chapter 6 we add full-color image processing (Sections 6.4 through 6.7). Chapters 7 and 8 are covered as in the previous section. As noted in the previous section, Chapter 9 begins a transition from methods whose inputs and outputs are images to methods in which the inputs are images, but the outputs are attributes about those images. As a minimum, we recommend coverage of binary morphology: Sections 9.1 through 9.4, and some of the algorithms in Section 9.5. Mention should be made about possible extensions to gray-scale images, but coverage of this material may not be possible, depending on the schedule. In Chapter 10, we recommend Sections 10.1, 10.2.1 and 10.2.2, 10.3.1 through 10.3.4, 10.4, and 10.5. In Chapter 11 we typically cover Sections 11.1 through 11.4.

Two Semester Graduate Course (No Background in DIP)

A full-year graduate course consists of the material covered in the one semester undergraduate course, the material outlined in the previous section, and Sections 12.1, 12.2, 12.3.1, and 12.3.2.

Projects
One of the most interesting aspects of a course in digital image processing is the pictorial nature of the subject. It has been our experience that students truly enjoy and benefit from judicious use of computer projects to complement the material covered in class. Since computer projects are in addition to course work and homework assignments, we try to keep the formal project reporting as brief as possible. In order to facilitate grading, we try to achieve uniformity in the way project reports are prepared. A useful report format is as follows:

Page 1: Cover page.
• Project title
• Project number
• Course number
• Student's name
• Date due
• Date handed in
• Abstract (not to exceed 1/2 page)
Page 2: One to two pages (max) of technical discussion.
Page 3 (or 4): Discussion of results. One to two pages (max).
Results: Image results (printed typically on a laser or inkjet printer). All images must contain a number and title referred to in the discussion of results.
Appendix: Program listings, focused on any original code prepared by the student. For brevity, functions and routines provided to the student are referred to by name, but the code is not included.
Layout: The entire report must be on a standard sheet size (e.g., 8.5 × 11 inches), stapled with three or more staples on the left margin to form a booklet, or bound using clear plastic standard binding products.

Project resources available in the book web site include a sample project, a list of suggested projects from which the instructor can select, book and other images, and MATLAB functions. Instructors who do not wish to use MATLAB will find additional software suggestions in the Support/Software section of the web site.

2 Problem Solutions

Problem 2.1
The diameter, $x$, of the retinal image corresponding to the dot is obtained from similar triangles, as shown in Fig. P2.1. That is,
$$\frac{d/2}{0.2} = \frac{x/2}{0.014}$$
which gives $x = 0.07d$. From the discussion in Section 2.1.1, and taking some liberties of interpretation, we can think of the fovea as a square sensor array having on the order of 337,000 elements, which translates into an array of size $580 \times 580$ elements. Assuming equal spacing between elements, this gives 580 elements and 579 spaces on a line 1.5 mm long. The size of each element and each space is then $s = (1.5\ \text{mm})/1{,}159 = 1.3 \times 10^{-6}$ m. If the size (on the fovea) of the imaged dot is less than the size of a single resolution element, we assume that the dot will be invisible to the eye. In other words, the eye will not detect a dot if its diameter, $d$, is such that $0.07d < 1.3 \times 10^{-6}$ m, or $d < 18.6 \times 10^{-6}$ m.

Figure P2.1

Problem 2.2
Brightness adaptation.

Problem 2.3
$\lambda = c/v = 2.998 \times 10^{8}\ \text{(m/s)} / 60\ \text{(1/s)} = 4.99 \times 10^{6}$ m $\approx$ 5000 km.

Problem 2.4
(a) From the discussion on the electromagnetic spectrum in Section 2. , the source of the illumination required to see an object must have wavelength the same size or smaller than the object. Because interest lies only on the boundary shape and not on other spectral characteristics of the specimens, a single illumination source in the far ultraviolet (wavelength of 0.001 microns or less) will be able to detect all objects. A far-ultraviolet camera sensor would be needed to image the specimens.
(b) No answer required since the answer to (a) is affirmative.

Problem 2.5
From the geometry of Fig. 2.3, $7\ \text{mm}/35\ \text{mm} = z/500\ \text{mm}$, or $z = 100$ mm.
So the target size is 100 mm on the side. We have a total of 1024 elements per line, so the resolution of one line is $1024/100 \approx 10$ elements/mm. For line pairs we divide by 2, giving an answer of 5 lp/mm.

Problem 2.6
One possible solution is to equip a monochrome camera with a mechanical device that sequentially places a red, a green, and a blue pass filter in front of the lens. The strongest camera response determines the color. If all three responses are approximately equal, the object is white. A faster system would utilize three different cameras, each equipped with an individual filter.
The analysis would then be based on polling the response of each camera. This system would be a little more expensive, but it would be faster and more reliable. Note that both solutions assume that the field of view of the camera(s) is such that it is completely filled by a uniform color [i.e., the camera(s) is (are) focused on a part of the vehicle where only its color is seen. Otherwise further analysis would be required to isolate the region of uniform color, which is all that is of interest in solving this problem].

Problem 2.7
The image in question is given by
$$f(x, y) = i(x, y)\,r(x, y) = 255e^{-[(x - x_0)^2 + (y - y_0)^2]}\,(1.0) = 255e^{-[(x - x_0)^2 + (y - y_0)^2]}.$$
A cross section of the image is shown in Fig. P2.7(a). If the intensity is quantized using $m$ bits, then we have the situation shown in Fig. P2.7(b), where $\Delta G = (255 + 1)/2^m$. Since an abrupt change of 8 gray levels is assumed to be detectable by the eye, it follows that $\Delta G = 8 = 256/2^m$, or $m = 5$. In other words, 32, or fewer, gray levels will produce visible false contouring.

Figure P2.7

Problem 2.8
The use of two bits ($m = 2$) of intensity resolution produces four gray levels in the range 0 to 255. One way to subdivide this range is to let all levels between 0 and 63 be coded as 63, all levels between 64 and 127 be coded as 127, and so on. The image resulting from this type of subdivision is shown in Fig. P2.8. Of course, there are other ways to subdivide the range $[0, 255]$ into four bands.

Figure P2.8

Problem 2.9
(a) The total amount of data (including the start and stop bit) in an 8-bit, $1024 \times 1024$ image is $(1024)^2 \times [8 + 2]$ bits. The total time required to transmit this image over a 56K baud link is $(1024)^2 \times [8 + 2]/56000 = 187.25$ sec, or about 3.1 min.
(b) At 750K this time goes down to about 14 sec.

Problem 2.10
The width-to-height ratio is 16/9 and the resolution in the vertical direction is 1125 lines (or, what is the same thing, 1125 pixels in the vertical direction). It is given that the resolution in the horizontal direction is in the 16/9 proportion, so the resolution in the horizontal direction is $(1125) \times (16/9) = 2000$ pixels per line. The system "paints" a full $1125 \times 2000$, 8-bit image every 1/30 sec for each of the red, green, and blue component images.
There are 7200 sec in two hours, so the total digital data generated in this time interval is $(1125)(2000)(8)(30)(3)(7200) = 1.166 \times 10^{13}$ bits, or $1.458 \times 10^{12}$ bytes (i.e., about 1.5 terabytes). These figures show why image data compression (Chapter 8) is so important.

Problem 2.11
Let $p$ and $q$ be as shown in Fig. P2.11. Then,
(a) $S_1$ and $S_2$ are not 4-connected because $q$ is not in the set $N_4(p)$;
(b) $S_1$ and $S_2$ are 8-connected because $q$ is in the set $N_8(p)$;
(c) $S_1$ and $S_2$ are m-connected because (i) $q$ is in $N_D(p)$, and (ii) the set $N_4(p) \cap N_4(q)$ is empty.

Figure P2.11
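The three adjacency tests used in Problem 2.11 can be sketched in a few lines. This is a minimal illustration, not code from the book: the function names, the coordinates of p and q, and the `foreground` set (the pixels whose values are in V) are all hypothetical, since Fig. P2.11 is not reproduced here. Condition (ii) is interpreted in the usual sense that $N_4(p) \cap N_4(q)$ contains no pixels with values from V.

```python
def n4(p):
    """4-neighbors of pixel p = (x, y)."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}


def nd(p):
    """Diagonal neighbors of pixel p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}


def n8(p):
    """8-neighbors: the union of the 4-neighbors and the diagonal neighbors."""
    return n4(p) | nd(p)


def m_adjacent(p, q, foreground):
    """m-adjacency of p and q, where `foreground` holds the pixels with values in V."""
    if p not in foreground or q not in foreground:
        return False
    if q in n4(p):
        return True
    # Diagonal neighbors are m-adjacent only if they share no foreground 4-neighbor.
    return q in nd(p) and not (n4(p) & n4(q) & foreground)


# Hypothetical layout: p and q touch only diagonally.
p, q = (2, 2), (3, 3)
print(q in n4(p))                            # False: not 4-adjacent
print(q in n8(p))                            # True: 8-adjacent
print(m_adjacent(p, q, {p, q}))              # True: no shared foreground 4-neighbor
print(m_adjacent(p, q, {p, q, (2, 3)}))      # False: (2, 3) is a shared 4-neighbor
```

The last call shows why condition (ii) matters: adding a foreground pixel that is a 4-neighbor of both p and q removes the diagonal m-adjacency.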
Problem 2.12
The solution to this problem consists of defining all possible neighborhood shapes to go from a diagonal segment to a corresponding 4-connected segment, as shown in Fig. P2.12. The algorithm then simply looks for the appropriate match every time a diagonal segment is encountered in the boundary.

Problem 2.13
The solution to this problem is the same as for Problem 2.12 because converting from an m-connected path to a 4-connected path simply involves detecting diagonal segments and converting them to the appropriate 4-connected segment.
Figure P2.12

Problem 2.14
A region $R$ of an image is composed of a set of connected points in the image. The boundary of a region is the set of points that have one or more neighbors that are not in $R$. Because boundary points also are part of $R$, it follows that a point on the boundary has at least one neighbor in $R$ and at least one neighbor not in $R$. (If the point in the boundary did not have a neighbor in $R$, the point would be disconnected from $R$, which violates the definition of points in a region.) Since all points in $R$ are part of a connected component (see Section 2. 5. ), all points in the boundary are also connected and a path (entirely in $R$) exists between any two points on the boundary. Thus the boundary forms a closed path.

Problem 2.15
(a) When $V = \{0, 1\}$, a 4-path does not exist between $p$ and $q$ because it is impossible to get from $p$ to $q$ by traveling along points that are both 4-adjacent and also have values from $V$. Figure P2.15(a) shows this condition; it is not possible to get to $q$. The shortest 8-path is shown in Fig. P2.15(b); its length is 4. The length of the shortest m-path (shown dashed) is 5. Both of these shortest paths are unique in this case.
(b) One possibility for the shortest 4-path when $V = \{1, 2\}$ is shown in Fig. P2.15(c); its length is 6. It is easily verified that another 4-path of the same length exists between $p$ and $q$. One possibility for the shortest 8-path (it is not unique) is shown in Fig. P2.15(d); its length is 4. The length of a shortest m-path (shown dashed) is 6. This path is not unique.

Figure P2.15

Problem 2.16
(a) A shortest 4-path between a point $p$ with coordinates $(x, y)$ and a point $q$ with coordinates $(s, t)$ is shown in Fig. P2.16, where the assumption is that all points along the path are from $V$.
The lengths of the segments of the path are $|x - s|$ and $|y - t|$, respectively. The total path length is $|x - s| + |y - t|$, which we recognize as the definition of the $D_4$ distance, as given in Eq. (2.5-16). (Recall that this distance is independent of any paths that may exist between the points.) The $D_4$ distance obviously is equal to the length of the shortest 4-path when the length of the path is $|x - s| + |y - t|$. This occurs whenever we can get from $p$ to $q$ by following a path whose elements (1) are from $V$, and (2) are arranged in such a way that we can traverse the path from $p$ to $q$ by making turns in at most two directions (e.g., right and up).
(b) The path may or may not be unique, depending on $V$ and the values of the points along the way.

Figure P2.16

Problem 2.17
(a) The $D_8$ distance between $p$ and $q$ (see Fig. P2.16) is defined as $\max(|x - s|, |y - t|)$. Recall that the $D_8$ distance (unlike the Euclidean distance) counts diagonal segments the same as horizontal and vertical segments, and, as in the case of the $D_4$ distance, is independent of whether or not a path exists between $p$ and $q$. As in the previous problem, the length of the shortest 8-path is equal to the $D_8$ distance when the path length is $\max(|x - s|, |y - t|)$. This occurs when we can get from $p$ to $q$ by following a path whose elements (1) are from $V$, and (2) are arranged in such a way that we can traverse the path from $p$ to $q$ by traveling diagonally in only one direction and, whenever diagonal travel is not possible, by making turns in the horizontal or vertical (but not both) direction.
(b) The path may or may not be unique, depending on $V$ and the values of the points along the way.
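The relationship between the $D_4$ and $D_8$ distances and the actual shortest path lengths in Problems 2.16 and 2.17 can be checked numerically. The sketch below is an illustration under stated assumptions: the grids, the set V = {1}, and all function names are hypothetical, and the shortest paths are found with an ordinary breadth-first search (the 8-path search uses plain 8-adjacency, not m-adjacency).

```python
from collections import deque


def d4(p, q):
    """City-block distance |x - s| + |y - t|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def d8(p, q):
    """Chessboard distance max(|x - s|, |y - t|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


STEPS_4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
STEPS_8 = STEPS_4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def shortest_path_length(grid, p, q, V, steps):
    """Breadth-first search over pixels whose values are in V.

    Returns the number of steps in a shortest path from p to q,
    or None if no such path exists.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[p[0]][p[1]] not in V or grid[q[0]][q[1]] not in V:
        return None
    queue = deque([(p, 0)])
    seen = {p}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == q:
            return dist
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and grid[nr][nc] in V:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


# Hypothetical test images; V = {1}.
open_grid = [[1, 1, 1],
             [1, 1, 1],
             [1, 1, 1]]
blocked = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]

# Unobstructed case: path lengths equal the distances.
print(shortest_path_length(open_grid, (2, 0), (0, 2), {1}, STEPS_4), d4((2, 0), (0, 2)))
print(shortest_path_length(open_grid, (2, 0), (0, 2), {1}, STEPS_8), d8((2, 0), (0, 2)))

# Obstructed case: paths are longer than the distances.
print(shortest_path_length(blocked, (0, 0), (0, 2), {1}, STEPS_4), d4((0, 0), (0, 2)))
print(shortest_path_length(blocked, (0, 0), (0, 2), {1}, STEPS_8), d8((0, 0), (0, 2)))
```

In the obstructed grid the distances are unchanged, which illustrates the point made in both solutions: $D_4$ and $D_8$ depend only on coordinates, whereas path length depends on V and on the pixel values along the way.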
Problem 2.18
With reference to Eq. (2.6-1), let $H$ denote the neighborhood sum operator, let $S_1$ and $S_2$ denote two different small subimage areas of the same size, and let $S_1 + S_2$ denote the corresponding pixel-by-pixel sum of the elements in $S_1$ and $S_2$, as explained in Section 2.5.4. Note that the size of the neighborhood (i.e., number of pixels) is not changed by this pixel-by-pixel sum. The operator $H$ computes the sum of pixel values in a given neighborhood.
Then, $H(aS_1 + bS_2)$ means: (1) multiplying the pixels in each of the subimage areas by the constants shown, (2) adding the pixel-by-pixel values from $S_1$ and $S_2$ (which produces a single subimage area), and (3) computing the sum of the values of all the pixels in that single subimage area. Let $ap_1$ and $bp_2$ denote two arbitrary (but corresponding) pixels from $aS_1 + bS_2$. Then we can write
$$\begin{aligned}
H(aS_1 + bS_2) &= \sum_{p_1 \in S_1 \text{ and } p_2 \in S_2} (ap_1 + bp_2) \\
&= \sum_{p_1 \in S_1} ap_1 + \sum_{p_2 \in S_2} bp_2 \\
&= a\sum_{p_1 \in S_1} p_1 + b\sum_{p_2 \in S_2} p_2 \\
&= aH(S_1) + bH(S_2)
\end{aligned}$$
which, according to Eq. (2.6-1), indicates that $H$ is a linear operator.

Problem 2.19
The median of a set of numbers is such that half the values in the set are below it and the other half are above it. A simple example will suffice to show that Eq. (2.6-1) is violated by the median operator. Let $S_1 = \{1, -2, 3\}$, $S_2 = \{4, 5, 6\}$, and $a = b = 1$. In this case $H$ is the median operator. We then have $H(S_1 + S_2) = \text{median}\{5, 3, 9\} = 5$, where it is understood that $S_1 + S_2$ is the element-by-corresponding-element sum of $S_1$ and $S_2$. Next, we compute $H(S_1) = \text{median}\{1, -2, 3\} = 1$ and $H(S_2) = \text{median}\{4, 5, 6\} = 5$.
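Both sides of Eq. (2.6-1) can be evaluated directly for the sets used in this example, and for the sum operator of Problem 2.18. The sketch below is only a numerical check on these particular values; the helper name `satisfies_linearity` is hypothetical, and passing the check for one pair of sets does not by itself prove that an operator is linear (one failure, however, is enough to show nonlinearity).

```python
from statistics import median


def satisfies_linearity(H, s1, s2, a, b):
    """Compare H(a*S1 + b*S2) with a*H(S1) + b*H(S2) for one pair of sets."""
    combined = [a * p1 + b * p2 for p1, p2 in zip(s1, s2)]
    return H(combined) == a * H(s1) + b * H(s2)


s1 = [1, -2, 3]
s2 = [4, 5, 6]

# Sum operator of Problem 2.18: 17 on both sides.
print(satisfies_linearity(sum, s1, s2, 1, 1))      # True

# Median operator of Problem 2.19: median{5, 3, 9} = 5, but 1 + 5 = 6.
print(satisfies_linearity(median, s1, s2, 1, 1))   # False
```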
Then, since $H(aS_1 + bS_2) \neq aH(S_1) + bH(S_2)$, it follows that Eq. (2.6-1) is violated and the median is a nonlinear operator.

Problem 2.20
The geometry of the chips is shown in Fig. P2.20(a). From Fig. P2.20(b) and the geometry in Fig. 2.3, we know that
$$\Delta x = \frac{\lambda \times 80}{\lambda - z}$$
where $\Delta x$ is the side dimension of the image (assumed square since the viewing screen is square) impinging on the image plane, and the 80 mm refers to the size of the viewing screen, as described in the problem statement. The least expensive solution will result from using a camera of resolution $512 \times 512$. Based on the information in Fig.
P2.20(a), a CCD chip with this resolution will be of size $(16\ \mu\text{m}) \times (512) = 8$ mm on each side. Substituting $\Delta x = 8$ mm in the above equation gives $z = 9\lambda$ as the relationship between the distance $z$ and the focal length of the lens, where a minus sign was ignored because it is just a coordinate inversion. If a 25 mm lens is used, the front of the lens will have to be located at approximately 225 mm from the viewing screen so that the size of the image of the screen projected onto the CCD image plane does not exceed the 8 mm size of the CCD chip for the $512 \times 512$ camera. This value for $z$ is reasonable, but it is obvious that any of the other given lens sizes would work also; the camera would just have to be positioned further away.

Figure P2.20

Assuming a 25 mm lens, the next issue is to determine if the smallest defect will be imaged on, at least, a $2 \times 2$ pixel area, as required by the specification. It is given that the defects are circular, with the smallest defect having a diameter of 0.8 mm. So, all that needs to be done is to determine if the image of a circle of diameter 0.8 mm or greater will, at least, be of size $2 \times 2$ pixels on the CCD imaging plane. This can be determined by using the same model as in Fig. P2.20(b) with the 80 mm replaced by 0.8 mm. Using $\lambda = 25$ mm and $z = 225$ mm in the above equation yields $\Delta x = 100\ \mu\text{m}$. In other words, a circular defect of diameter 0.8 mm will be imaged as a circle with a diameter of $100\ \mu\text{m}$ on the CCD chip of a $512 \times 512$ camera equipped with a 25 mm lens and which views the defect at a distance of 225 mm. If, in order for a CCD receptor to be activated, its area has to be excited in its entirety, then it can be seen from Fig. P2.20(a) that to guarantee that a $2 \times 2$ array of such receptors will be activated, a circular area of diameter no less than $(6)(8) = 48\ \mu\text{m}$ has to be imaged onto the CCD chip. The smallest defect is imaged as a circle with diameter of $100\ \mu\text{m}$, which is well above the $48\ \mu\text{m}$
minimum requirement. Thus, it is concluded that a CCD camera of resolution $512 \times 512$ pixels, using a 25 mm lens and imaging the viewing screen at a distance of 225 mm, is sufficient to solve the problem posed by the plant manager.

3 Problem Solutions

Problem 3.1
(a) General form: $s = T(r) = Ae^{-Kr^2}$. For the condition shown in the problem figure, $Ae^{-KL_0^2} = A/2$. Solving for $K$ yields
$$-KL_0^2 = \ln(0.5), \qquad K = 0.693/L_0^2.$$
Then,
$$s = T(r) = Ae^{-0.693\,r^2/L_0^2}.$$
(b) General form: $s = T(r) = B(1 - e^{-Kr^2})$. For the condition shown in the problem figure, $B(1 - e^{-KL_0^2}) = B/2$. The solution for $K$ is the same as in (a), so
$$s = T(r) = B\left(1 - e^{-0.693\,r^2/L_0^2}\right).$$
(c) General form: $s = T(r) = (D - C)(1 - e^{-Kr^2}) + C$.

Problem 3.2
(a) $s = T(r) = \dfrac{1}{1 + (m/r)^E}$.
(b) See Fig. P3.2.
(c) We want the value of $s$ to be 0 for $r < m$, and $s$ to be 1 for values of $r > m$. When $r = m$, $s = 1/2$. But, because the values of $r$ are integers, the behavior we want is
$$s = T(r) = \begin{cases} 0.0 & \text{when } r \le m - 1 \\ 0.5 & \text{when } r = m \\ 1.0 & \text{when } r \ge m + 1. \end{cases}$$
The question in the problem statement is to find the smallest value of $E$ that will make the threshold behave as in the equation above. When $r = m$, we see from (a) that $s = 0.5$, regardless of the value of $E$. If $C$ is the smallest positive number representable in the computer, and keeping in mind that $s$ is positive, then any value of $s$ less than $C/2$ will be called 0 by the computer. To find out the smallest value of $E$ for which this happens, simply solve the following equation for $E$, using the given value $m = 128$:
$$\frac{1}{1 + [m/(m - 1)]^E} < C/2.$$
Because the function is symmetric about $m$, the resulting value of $E$ will yield $s = 1$ for $r \ge m + 1$.

Figure P3.2

Problem 3.3
The transformations required to produce the individual bit planes are nothing more than mappings of the truth table for eight binary variables.
In this truth table, the values of the 7th bit are 0 for byte values 0 to 127, and 1 for byte values 128 to 255, thus giving the transformation mentioned in the problem statement. Note that the given transformed values of either 0 or 255 simply indicate a binary image for the 7th bit plane.
Any other two values would have been equally valid, though less conventional. Continuing with the truth table concept, the transformation required to produce an image of the 6th bit plane outputs a 0 for byte values in the range [0, 63], a 1 for byte values in the range [64, 127], a 0 for byte values in the range [128, 191], and a 1 for byte values in the range [192, 255]. Similarly, the transformation for the 5th bit plane alternates between eight ranges of byte values, the transformation for the 4th bit plane alternates between 16 ranges, and so on.
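The bit-plane transformations described so far can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes 8-bit gray levels with bit 7 as the most significant bit and 0/255 as the two output values, as in the discussion above.

```python
def bit_plane_transform(r, bit):
    """Map an 8-bit gray level r to 0 or 255 according to the chosen bit (0..7)."""
    return 255 if (r >> bit) & 1 else 0


def count_ranges(bit):
    """Count the ranges of constant output as r sweeps from 0 to 255."""
    outputs = [bit_plane_transform(r, bit) for r in range(256)]
    changes = sum(1 for a, b in zip(outputs, outputs[1:]) if a != b)
    return changes + 1


if __name__ == "__main__":
    for bit in (7, 6, 5, 4):
        print(bit, count_ranges(bit))
```

For bits 7, 6, 5, and 4 the counts are 2, 4, 8, and 16, which agrees with the ranges listed above.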
Finally, the output of the transformation for the 0th bit plane alternates between 0 and 255 depending on whether the byte values are even or odd. Thus, this transformation alternates between 256 byte value ranges (each a single value), which explains why an image of the 0th bit plane is usually the busiest looking of all the bit plane images.

Problem 3.4

(a) The number of pixels having different gray level values would decrease, thus causing the number of components in the histogram to decrease. Since the number of pixels would not change, this would cause the height of some of the remaining histogram peaks to increase in general.
Typically, less variability in gray level values will reduce contrast.

(b) The most visible effect would be significant darkening of the image. For example, dropping the highest bit would limit to 127 the brightest level in an 8-bit image. Since the number of pixels would remain constant, the height of some of the histogram peaks would increase. The general shape of the histogram would now be taller and narrower, with no histogram components being located past 127.

Problem 3.5

All that histogram equalization does is remap histogram components on the intensity scale.
To obtain a uniform (flat) histogram would require in general that pixel intensities be actually redistributed so that there are $L$ groups of $n/L$ pixels with the same intensity, where $L$ is the number of allowed discrete intensity levels and $n$ is the total number of pixels in the input image. The histogram equalization method has no provisions for this type of (artificial) redistribution process.

Problem 3.6

Let $n$ be the total number of pixels and let $n_{r_j}$ be the number of pixels in the input image with intensity value $r_j$. Then, the histogram equalization transformation is
$$s_k = T(r_k) = \sum_{j=0}^{k} n_{r_j}/n = \frac{1}{n}\sum_{j=0}^{k} n_{r_j}.$$
Since every pixel (and no others) with value $r_k$ is mapped to value $s_k$, it follows that $n_{s_k} = n_{r_k}$. A second pass of histogram equalization would produce values $v_k$ according to the transformation
$$v_k = T(s_k) = \frac{1}{n}\sum_{j=0}^{k} n_{s_j}.$$
But, $n_{s_j} = n_{r_j}$, so
$$v_k = T(s_k) = \frac{1}{n}\sum_{j=0}^{k} n_{r_j} = s_k$$
which shows that a second pass of histogram equalization would yield the same result as the first pass. We have assumed negligible round-off errors.

Problem 3.7

The general histogram equalization transformation function is
$$s = T(r) = \int_0^r p_r(w)\,dw.$$
There are two important points of which the student must show awareness in answering this problem. First, this equation assumes only positive values for $r$. However, the Gaussian density extends in general from $-\infty$ to $\infty$. Recognition of this fact is important. Once recognized, the student can approach this difficulty in several ways. One good answer is to make some assumption, such as the standard deviation being small enough so that the area of the curve under $p_r(r)$ for negative values of $r$ is negligible. Another is to scale up the values until the area under the negative tail is negligible.
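The "negligible negative tail" assumption can be checked numerically. The sketch below is illustrative only: the function names, and the sample mean 0.5 and standard deviation 0.1, are assumptions chosen for the example and are not given in the problem. It evaluates the Gaussian cumulative distribution with the standard-library error function and uses it to compute both the area below $r = 0$ and the transformation $T(r)$ integrated from 0.

```python
import math


def gaussian_cdf(x, m, sigma):
    """Cumulative distribution of a Gaussian with mean m and std. deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - m) / (sigma * math.sqrt(2.0))))


def negative_tail_area(m, sigma):
    """Area under the density for r < 0, which the integral from 0 leaves out."""
    return gaussian_cdf(0.0, m, sigma)


def equalization_transform(r, m, sigma):
    """T(r): integral of the Gaussian density from 0 to r."""
    return gaussian_cdf(r, m, sigma) - gaussian_cdf(0.0, m, sigma)


if __name__ == "__main__":
    m, sigma = 0.5, 0.1  # hypothetical values, mean five standard deviations above 0
    print(negative_tail_area(m, sigma))
    print(equalization_transform(m, m, sigma))
```

With the mean five standard deviations above zero, the neglected area is well below one part in a million, so $T(m)$ is 0.5 for all practical purposes.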
The second major point is to recognize that the transformation function itself,
$$s = T(r) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_0^r e^{-\frac{(w - m)^2}{2\sigma^2}}\,dw$$
has no closed-form solution. This is the cumulative distribution function of the Gaussian density, which is either integrated numerically, or its values are looked up in a table. A third, less important point that the student should address is the high-end values of $r$. Again, the Gaussian PDF extends to $+\infty$. One possibility here is to make the same assumption as above regarding the standard deviation.
Another is to divide by a large enough value so that the area under the positive tail past that point is negligible (this scaling reduces the standard deviation). Another principal approach the student can take is to work with histograms, in which case the transformation function would be in the form of a summation. The issue of negative and high positive values must still be addressed, and the possible answers suggested above regarding these issues still apply. The student needs to indicate that the histogram is obtained by sampling the continuous function, so some mention should be made regarding the number of samples (bits) used.
The most likely answer is 8 bits, in which case the student needs to address the scaling of the function so that the range is [0, 255].

Problem 3.8

We are interested in just one example in order to satisfy the statement of the problem. Consider the probability density function shown in Fig. P3.8(a). A plot of the transformation $T(r)$ in Eq. (3.3-4) using this particular density function is shown in Fig. P3.8(b). Because $p_r(r)$ is a probability density function we know from the discussion in Section 3.3.1 that the transformation $T(r)$ satisfies conditions (a) and (b) stated in that section. However, we see from Fig. P3.8(b) that the inverse transformation from $s$ back to $r$ is not single valued, as there are an infinite number of possible mappings from $s = 1/2$ back to $r$. It is important to note that the reason the inverse transformation function turned out not to be single valued is the gap in $p_r(r)$ in the interval $[1/4, 3/4]$.

Problem 3.9

(a) We need to show that the transformation function in Eq. (3.3-8) is monotonic, single-valued, and that its values are in the range [0, 1]. From Eq. (3.3-8),
$$s_k = T(r_k) = \sum_{j=0}^{k} p_r(r_j) = \sum_{j=0}^{k} \frac{n_j}{n}, \qquad k = 0, 1, \ldots, L - 1.$$
Because all the $p_r(r_j)$ are positive, it follows that $T(r_k)$ is monotonic.
Because all the $p_r(r_j)$ are finite, and the limit of summation is finite, it follows that $T(r_k)$ is of finite slope and thus is a single-valued function. Finally, since the sum of all the $p_r(r_j)$ is 1, it follows that $0 \le s_k \le 1$.

Figure P3.8.

(b) From the discussion in Problem 3.8, it follows that if an image has missing gray levels the histogram equalization transformation function given above will be constant in the interval of the missing gray levels. Thus, in theory, the inverse mapping will not be single-valued in the discrete case either.
In practice, assuming that we wanted to perform the inverse transformation, this is not important for the following reason: Assume that no gray-level values exist in the open interval $(a, b)$, so that $r_a$ is the last gray level before the empty gray-level band begins and $r_b$ is the first gray level right after the empty band ends. The corresponding mapped gray levels are $s_a$ and $s_b$. The fact that no gray levels $r$ exist in interval $(a, b)$ means that no gray levels will exist between $s_a$ and $s_b$ either, and, therefore, there will be no levels $s$ to map back to $r$ in the bands where the multi-valued inverse function would present problems.
Thus, in practice, the inverse not being single-valued is not an issue, since it would not be needed. Note that mapping back from $s_a$ and $s_b$ presents no problems, since $T(r_a)$ and $T(r_b)$ (and thus their inverses) are different. A similar discussion applies if there is more than one band empty of gray levels.

(c) If none of the gray levels $r_k$, $k = 1, 2, \ldots, L - 1$, are 0, then $T(r_k)$ will be strictly monotonic. This implies that the inverse transformation will be of finite slope and thus will be single-valued.

Problem 3.10

First, we obtain the histogram equalization transformation:
$$s = T(r) = \int_0^r p_r(w)\,dw = \int_0^r (-2w + 2)\,dw = -r^2 + 2r.$$
Next we find
$$v = G(z) = \int_0^z p_z(w)\,dw = \int_0^z 2w\,dw = z^2.$$
Finally,
$$z = G^{-1}(v) = \pm\sqrt{v}.$$
But only positive gray levels are allowed, so $z = \sqrt{v}$. Then, we replace $v$ with $s$, which in turn is $-r^2 + 2r$, and we have
$$z = \sqrt{-r^2 + 2r}.$$

Problem 3.11

The value of the histogram component corresponding to the $k$th intensity level in a neighborhood is
$$p_r(r_k) = \frac{n_k}{n}$$
for $k = 0, 1, \ldots, K - 1$, where $n_k$ is the number of pixels having gray level value $r_k$, $n$ is the total number of pixels in the neighborhood, and $K$ is the total number of possible gray levels.
Suppose that the neighborhood is moved one pixel to the right. This deletes the leftmost column and introduces a new column on the right. The updated histogram then becomes
$$p'_r(r_k) = \frac{1}{n}\left[n_k - n_{L_k} + n_{R_k}\right]$$
for $k = 0, 1, \ldots, K - 1$, where $n_{L_k}$ is the number of occurrences of level $r_k$ on the left column and $n_{R_k}$ is the similar quantity on the right column. The preceding equation can be written also as
$$p'_r(r_k) = p_r(r_k) + \frac{1}{n}\left[n_{R_k} - n_{L_k}\right]$$
for $k = 0, 1, \ldots, K - 1$. The same concept applies to other modes of neighborhood motion:
$$p'_r(r_k) = p_r(r_k) + \frac{1}{n}\left[b_k - a_k\right]$$
for $k = 0, 1, \ldots, K - 1$, where $a_k$ is the number of pixels with value $r_k$ in the neighborhood area deleted by the move, and $b_k$ is the corresponding number introduced by the move.

Problem 3.12

The purpose of this simple problem is to make the student think of the meaning of histograms and arrive at the conclusion that histograms carry no information about spatial properties of images. Thus, the only time that the histogram of the images formed by the operations shown in the problem statement can be determined in terms of the original histograms is when one or both of the images is (are) constant.
In (d) we have the additional requirement that none of the pixels of $g(x, y)$ can be 0. Assume for convenience that the histograms are not normalized, so that, for example, $h_f(r_k)$ is the number of pixels in $f(x, y)$ having gray level $r_k$. Assume also that all the pixels in $g(x, y)$ have constant value $c$. The pixels of both images are assumed to be positive. Finally, let $u_k$ denote the gray levels of the pixels of the images formed by any of the arithmetic operations given in the problem statement. Under the preceding set of conditions, the histograms are determined as follows:

(a) The histogram $h_{sum}(u_k)$ of the sum is obtained by letting $u_k = r_k + c$, and $h_{sum}(u_k) = h_f(r_k)$ for all $k$. In other words, the values (heights) of the components of $h_{sum}$ are the same as the components of $h_f$, but their locations on the gray axis are shifted right by an amount $c$.

(b) Similarly, the histogram $h_{diff}(u_k)$ of the difference has the same components as $h_f$ but their locations are moved left by an amount $c$ as a result of the subtraction operation.

(c) Following the same reasoning, the values (heights) of the components of histogram $h_{prod}(u_k)$ of the product are the same as $h_f$, but their locations are at $u_k = c \times r_k$. Note that while the spacing between components of the resulting histograms in (a) and (b) was not affected, the spacing between components of $h_{prod}(u_k)$ will be spread out by an amount $c$.

(d) Finally, assuming that $c \neq 0$, the components of $h_{div}(u_k)$ are the same as those of $h_f$, but their locations will be at $u_k = r_k/c$. Thus, the spacing between components of $h_{div}(u_k)$ will be compressed by an amount equal to $1/c$.

The preceding solutions are applicable if image $f(x, y)$ also is constant.
In this case the four histograms just discussed would each have only one component. Their location would be affected as described in (a) through (c).

Problem 3.13

Using 10 bits (with one bit being the sign bit) allows numbers in the range $-511$ to 511. The process of repeated subtractions can be expressed as
$$d_K(x, y) = a(x, y) - \sum_{k=1}^{K} b(x, y) = a(x, y) - K \times b(x, y)$$
where $K$ is the largest value such that $d_K(x, y)$ does not exceed $-511$ at any coordinates $(x, y)$, at which time the subtraction process stops. We know nothing about the images, only that both have values ranging from 0 to 255.
Therefore, all we can determine are the maximum and minimum number of times that the subtraction can be carried out and the possible range of gray-level values in each of these two situations. Because it is given that $g(x, y)$ has at least one pixel valued 255, the maximum value that $K$ can have before the subtraction exceeds $-511$ is 3. This condition occurs when, at some pair of coordinates $(s, t)$, $a(s, t) = b(s, t) = 255$. In this case, the possible range of values in the difference image is $-510$ to 255. The latter condition can occur if, at some pair of coordinates $(i, j)$, $a(i, j) = 255$ and $b(i, j) = 0$.
The minimum value that $K$ will have is 2, which occurs when, at some pair of coordinates, $a(s, t) = 0$ and $b(s, t) = 255$. In this case, the possible range of values in the difference image again is $-510$ to 255. The latter condition can occur if, at some pair of coordinates $(i, j)$, $a(i, j) = 255$ and $b(i, j) = 0$.

Problem 3.14

Let $g(x, y)$ denote the golden image, and let $f(x, y)$ denote any input image acquired during routine operation of the system. Change detection via subtraction is based on computing the simple difference $d(x, y) = g(x, y) - f(x, y)$. The resulting image $d(x, y)$ can be used in two fundamental ways for change detection. One way is to use a pixel-by-pixel analysis. In this case we say that $f(x, y)$ is "close enough" to the golden image if all the pixels in $d(x, y)$ fall within a specified threshold band $[T_{min}, T_{max}]$ where $T_{min}$ is negative and $T_{max}$ is positive. Usually, the same value of threshold is used for both negative and positive differences, in which case we have a band $[-T, T]$ in which all pixels of $d(x, y)$ must fall in order for $f(x, y)$ to be declared acceptable. The second major approach is simply to sum all the pixels in $|d(x, y)|$ and compare the sum against a threshold $S$.
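The two tests just described can be sketched as follows. This is a minimal illustration, not a prescribed implementation: images are assumed to be equal-sized lists of rows of integers, and the function names and the small sample images are hypothetical.

```python
def within_band(golden, image, t):
    """Pixel-by-pixel test: every difference g - f must fall in the band [-t, t]."""
    return all(-t <= g - f <= t
               for g_row, f_row in zip(golden, image)
               for g, f in zip(g_row, f_row))


def total_abs_difference(golden, image):
    """Cruder test statistic: sum of |g - f| over all pixels, to compare against S."""
    return sum(abs(g - f)
               for g_row, f_row in zip(golden, image)
               for g, f in zip(g_row, f_row))


if __name__ == "__main__":
    golden = [[100, 100], [100, 100]]   # hypothetical golden image
    image = [[103, 97], [100, 100]]     # hypothetical input image
    print(within_band(golden, image, 5))        # differences of -3 and +3 fit in [-5, 5]
    print(within_band(golden, image, 2))        # but not in [-2, 2]
    print(total_abs_difference(golden, image))  # 3 + 3
```

In this small example the signed differences are $-3$ and $+3$, so the image passes a band of $[-5, 5]$, fails a band of $[-2, 2]$, and has a summed absolute difference of 6.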
Note that the absolute value needs to be used to avoid errors cancelling out. This is a much cruder test, so we will concentrate on the first approach. There are three fundamental factors that need tight control for difference-based inspection to work: (1) proper registration, (2) controlled illumination, and (3) noise levels that are low enough so that difference values are not affected appreciably by variations due to noise. The first condition basically addresses the requirement that comparisons be made between corresponding pixels.
Two images can be identical, but if they are displaced with respect to each other, comparing the differences between them makes no sense. Often, special markings are manufactured into the product for mechanical or image-based alignment. Controlled illumination (note that "illumination" is not limited to visible light) obviously is important because changes in illumination can dramatically affect the values in a difference image. One approach often used in conjunction with illumination control is intensity scaling based on actual conditions.
For example, the products could have one or more small patches of a tightly controlled color, and the intensity (and perhaps even color) of each pixel in the entire image would be modified based on the actual versus expected intensity and/or color of the patches in the image being processed. Finally, the noise content of a difference image needs to be low enough so that it does not materially affect comparisons between the golden and input images. Good signal strength goes a long way toward reducing the effects of noise. Another (sometimes complementary) approach is to implement image processing techniques (e.g., image averaging) to reduce noise. Obviously there are a number of variations of the basic theme just described. For example, additional intelligence in the form of tests that are more sophisticated than pixel-by-pixel threshold comparisons can be implemented. A technique often used in this regard is to subdivide the golden image into different regions and perform different (usually more than one) tests in each of the regions, based on expected region content.

Problem 3.15

(a) From Eq. (3.4-3), at any point $(x, y)$,
$$\bar{g} = \frac{1}{K}\sum_{i=1}^{K} g_i = \frac{1}{K}\sum_{i=1}^{K} f_i + \frac{1}{K}\sum_{i=1}^{K} \eta_i.$$
Then
$$E\{\bar{g}\} = \frac{1}{K}\sum_{i=1}^{K} E\{f_i\} + \frac{1}{K}\sum_{i=1}^{K} E\{\eta_i\}.$$
But all the $f_i$ are the same image, so $E\{f_i\} = f$. Also, it is given that the noise has zero mean, so $E\{\eta_i\} = 0$. Thus, it follows that $E\{\bar{g}\} = f$, which proves the validity of Eq. (3.4-4).

(b) From (a),
$$\bar{g} = \frac{1}{K}\sum_{i=1}^{K} g_i = \frac{1}{K}\sum_{i=1}^{K} f_i + \frac{1}{K}\sum_{i=1}^{K} \eta_i.$$
It is known from random-variable theory that the variance of the sum of uncorrelated random variables is the sum of the variances of those variables (Papoulis). Since the elements of $f$ are constant and the $\eta_i$ are uncorrelated, then
$$\sigma^2_{\bar{g}} = \sigma^2_f + \frac{1}{K^2}\left[\sigma^2_{\eta_1} + \sigma^2_{\eta_2} + \cdots + \sigma^2_{\eta_K}\right].$$
The first term on the right side is 0 because the elements of $f$ are constants. The various $\sigma^2_{\eta_i}$ are simply samples of the noise, which has variance $\sigma^2_\eta$. Thus, $\sigma^2_{\eta_i} = \sigma^2_\eta$ and we have
$$\sigma^2_{\bar{g}} = \frac{K}{K^2}\sigma^2_\eta = \frac{1}{K}\sigma^2_\eta$$
which proves the validity of Eq. (3.4-5).

Problem 3.16

With reference to Section 3.4.2, when $i = 1$ (no averaging), we have
$$\bar{g}(1) = g_1 \quad \text{and} \quad \sigma^2_{\bar{g}(1)} = \sigma^2_\eta.$$
When $i = K$,
$$\bar{g}(K) = \frac{1}{K}\sum_{i=1}^{K} g_i \quad \text{and} \quad \sigma^2_{\bar{g}(K)} = \frac{1}{K}\sigma^2_\eta.$$
We want the ratio of $\sigma^2_{\bar{g}(K)}$ to $\sigma^2_{\bar{g}(1)}$ to be 1/10, so
$$\frac{\sigma^2_{\bar{g}(K)}}{\sigma^2_{\bar{g}(1)}} = \frac{\frac{1}{K}\sigma^2_\eta}{\sigma^2_\eta} = \frac{1}{10}$$
from which we get $K = 10$.
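The $1/K$ variance reduction can also be checked empirically. The following is a rough simulation sketch under stated assumptions: a single pixel of constant (hypothetical) value 100, zero-mean Gaussian noise of standard deviation 5, and a fixed random seed. None of these values come from the problem statement, and the function names are illustrative.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def averaged_noise_variance(k, sigma, trials, seed=0):
    """Empirical variance at one pixel after averaging k noisy frames."""
    rng = random.Random(seed)
    f = 100.0  # hypothetical constant pixel value of the noiseless image
    averages = []
    for _ in range(trials):
        frames = [f + rng.gauss(0.0, sigma) for _ in range(k)]
        averages.append(sum(frames) / k)
    return variance(averages)


if __name__ == "__main__":
    single = averaged_noise_variance(1, 5.0, 20000, seed=1)
    averaged = averaged_noise_variance(10, 5.0, 20000, seed=2)
    print(averaged / single)  # expected to be close to 1/10
```

The printed ratio is a statistical estimate, so it should land near 0.1 rather than exactly on it.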
Since the images are generated at 30 frames/s, the stationary time required is 1/3 s.

Problem 3.17

(a) Consider a $3 \times 3$ mask first. Since all the coefficients are 1 (we are ignoring the 1/9 scale factor), the net effect of the lowpass filter operation is to add all the gray levels of pixels under the mask. Initially, it takes 8 additions to produce the response of the mask. However, when the mask moves one pixel location to the right, it picks up only one new column. The new response can be computed as
$$R_{new} = R_{old} - C_1 + C_3$$
where $C_1$ is the sum of pixels under the first column of the mask before it was moved, and $C_3$ is the similar sum in the column it picked up after it moved. This is the basic box-filter or moving-average equation. For a $3 \times 3$ mask it takes 2 additions to get $C_3$ ($C_1$ was already computed). To this we add one subtraction and one addition to get $R_{new}$. Thus, a total of 4 arithmetic operations are needed to update the response after one move. This is a recursive procedure for moving from left to right along one row of the image. When we get to the end of a row, we move down one pixel (the nature of the computation is the same) and continue the scan in the opposite direction.
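The recursive update can be sketched for a single left-to-right pass as follows. This is an illustrative sketch only: the function names are hypothetical, the 1/9 scale factor is ignored as in the text, and for brevity the column sums are precomputed for the whole row band rather than formed one at a time as the mask advances.

```python
def row_box_sums(image, n):
    """Sums under an n x n mask of 1s sliding left to right over the first n rows."""
    width = len(image[0])
    # Column sums over the n rows covered by the mask.
    col = [sum(image[y][x] for y in range(n)) for x in range(width)]
    response = sum(col[:n])  # initial response, computed directly
    sums = [response]
    for x in range(n, width):
        # R_new = R_old - C_1 + C_3: drop the column left behind, add the new one.
        response = response - col[x - n] + col[x]
        sums.append(response)
    return sums


def brute_force_sums(image, n):
    """Reference: add all n*n pixels under the mask at every position."""
    width = len(image[0])
    return [sum(image[y][x] for y in range(n) for x in range(x0, x0 + n))
            for x0 in range(width - n + 1)]


if __name__ == "__main__":
    image = [[1, 2, 3, 4, 5],
             [6, 7, 8, 9, 10],
             [11, 12, 13, 14, 15]]
    print(row_box_sums(image, 3))      # [63, 72, 81]
    print(brute_force_sums(image, 3))  # same values, with more additions per move
```

Both functions give the same sums; only the amount of arithmetic per move differs.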
For a mask of size $n \times n$, $(n - 1)$ additions are needed to obtain $C_3$, plus the single subtraction and addition needed to obtain $R_{new}$, which gives a total of $(n + 1)$ arithmetic operations after each move. A brute-force implementation would require $n^2 - 1$ additions after each move.

(b) The computational advantage is
$$A = \frac{n^2 - 1}{n + 1} = \frac{(n + 1)(n - 1)}{(n + 1)} = n - 1.$$
The plot of $A$ as a function of $n$ is a simple linear function starting at $A = 1$ for $n = 2$.

Problem 3.18

One of the easiest ways to look at repeated applications of a spatial filter is to use superposition.
Let $f(x, y)$ and $h(x, y)$ denote the image and the filter function, respectively. Assuming square images of size $N \times N$ for convenience, we can express $f(x, y)$ as the sum of at most $N^2$ images, each of which has only one nonzero pixel (initially, we assume that $N$ can be infinite). Then, the process of running $h(x, y)$ over $f(x, y)$ can be expressed as the following convolution:
$$h(x, y) * f(x, y) = h(x, y) * \left[f_1(x, y) + f_2(x, y) + \cdots + f_{N^2}(x, y)\right].$$
Suppose for illustrative purposes that $f_i(x, y)$ has value 1 at its center, while the other pixels are valued 0, as discussed above (see Fig. P3.18(a)).
If $h(x, y)$ is a $3 \times 3$ mask of 1/9's (Fig. P3.18(b)), then convolving $h(x, y)$ with $f_i(x, y)$ will produce an image with a $3 \times 3$ array of 1/9's at its center and 0's elsewhere, as shown in Fig. P3.18(c). If $h(x, y)$ is now applied to this image, the resulting image will be as shown in Fig. P3.18(d). Note that the sum of the nonzero pixels in both Figs. P3.18(c) and (d) is the same, and equal to the value of the original pixel. Thus, it is intuitively evident that successive applications of $h(x, y)$ will "diffuse" the nonzero value of $f_i(x, y)$ (not an unexpected result, because $h(x, y)$ is a blurring filter).
Since the sum remains constant, the values of the nonzero elements will become smaller and smaller as the number of applications of the filter increases. The overall result is given by adding all the convolved $f_k(x, y)$, for $k = 1, 2, \ldots, N^2$. The net effect of successive applications of the lowpass spatial filter $h(x, y)$ is thus seen to be more and more blurring, with the value of each pixel "redistributed" among the others. The average value of the blurred image will thus be the same as the average value of $f(x, y)$. It is noted that every iteration of blurring further diffuses the values outwardly from the starting point.
In the limit, the values would get infinitely small, but, because the average value remains constant, this would require an image of infinite spatial proportions. It is at this juncture that border conditions become important. Although it is not required in the problem statement, it is instructive to discuss in class the effect of successive applications of $h(x, y)$ to an image of finite proportions. The net effect is that, since the values cannot diffuse outward past the boundary of the image, the denominator in the successive applications of averaging eventually overpowers the pixel values, driving the image to zero in the limit.
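The finite-image behavior can be illustrated with a one-dimensional sketch. The setup here is an assumption made for illustration: a $1 \times 7$ array holding a single 1 at its center, a three-tap averaging mask, and pixels just past either end treated as 0. The function names are hypothetical, and exact fractions are used so that the sums are not blurred by round-off.

```python
from fractions import Fraction


def blur_once(row):
    """Apply h = (1/3)[1, 1, 1], treating pixels just past either end as 0."""
    padded = [Fraction(0)] + [Fraction(v) for v in row] + [Fraction(0)]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(row))]


def sums_after_each_pass(row, passes):
    """Sum of all pixel values after each successive application of the mask."""
    sums = []
    for _ in range(passes):
        row = blur_once(row)
        sums.append(sum(row))
    return sums


if __name__ == "__main__":
    start = [0, 0, 0, 1, 0, 0, 0]
    for s in sums_after_each_pass(start, 4):
        print(s)  # 1, 1, 1, then 79/81 once the values reach the boundary
```

The sum stays at 1 for the first three passes, while the nonzero values are still spreading toward the ends of the array, and drops to 79/81 on the fourth pass.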
A simple example of this is given in Fig. P3.18(e), which shows an array of size $1 \times 7$ that is blurred by successive applications of the $1 \times 3$ mask $h(y) = \frac{1}{3}[1, 1, 1]$. We see that, as long as the values of the blurred 1 can diffuse out, the sum, $S$, of the resulting pixels is 1. However, when the boundary is met, an assumption must be made regarding how mask operations on the border are treated. Here, we used the commonly made assumption that pixel values immediately past the boundary are 0. The mask operation does not go beyond the boundary, however. In this example, we see that the sum of the pixel values begins to decrease with successive applications of the mask. In the limit, the term $1/3^n$ would overpower the sum of the pixel values, yielding an array of 0's.

Figure P3.18

Problem 3.19

(a) There are $n^2$ points in an $n \times n$ median filter mask. Since $n$ is odd, the median value, $\zeta$, is such that there are $(n^2 - 1)/2$ points with values less than or equal to $\zeta$ and the same number with values greater than or equal to $\zeta$. However, since the area $A$ (number of points) in the cluster is less than one half $n^2$, and $A$ and $n$ are integers, it follows that $A$ is always less than or equal to $(n^2 - 1)/2$.
Thus, even in the extreme case when all cluster points are encompassed by the filter mask, there are not enough points in the cluster for any of them to be equal to the value of the median (remember, we are assuming that all cluster points are lighter or darker than the background points). Therefore, if the center point in the mask is a cluster point, it will be set to the median value, which is a background shade, and thus it will be "eliminated" from the cluster. This conclusion obviously applies to the less extreme case when the number of cluster points encompassed by the mask is less than the maximum size of the cluster.

(b) For the conclusion reached in (a) to hold, the number of points that we consider cluster (object) points can never exceed $(n^2 - 1)/2$. Thus, two or more different clusters cannot be in close enough proximity for the filter mask to encompass points from more than one cluster at any mask position. It then follows that no two points from different clusters can be closer than the diagonal dimension of the mask minus one cell (which can be occupied by a point from one of the clusters). Assuming a grid spacing of 1 unit, the minimum distance between any two points of different clusters then must be greater than $\sqrt{2}(n - 1)$. In other words, these points must be separated by at least the distance spanned by $n - 1$ cells along the mask diagonal.

Problem 3.20

(a) Numerically sort the $n^2$ values. The median is $\zeta = [(n^2 + 1)/2]$-th largest value.

(b) Once the values have been sorted one time, we simply delete the values in the trailing edge of the neighborhood and insert the values in the leading edge in the appropriate locations in the sorted array.

Problem 3.21

(a) The most extreme case is when the mask is positioned on the center pixel of a 3-pixel gap, along a thin segment, in which case a $3 \times 3$ mask would encompass a completely blank field.
Since this is known to be the largest gap, the next (odd) mask size up is guaranteed to encompass some of the pixels in the segment. Thus, the smallest mask that will do the job is a $5 \times 5$ averaging mask.

(b) The smallest average value produced by the mask is when it encompasses only two pixels of the segment. This average value is a gray-scale value, not binary, like the rest of the segment pixels. Denote the smallest average value by $A_{min}$, and the binary values of pixels in the thin segment by $B$. Clearly, $A_{min}$ is less than $B$.
Then, setting the binarizing threshold slightly smaller than $A_{min}$ will create one binary pixel of value $B$ in the center of the mask.

Problem 3.22

From Fig. 3.35, the vertical bars are 5 pixels wide, 100 pixels high, and their separation is 20 pixels. The phenomenon in question is related to the horizontal separation between bars, so we can simplify the problem by considering a single scan line through the bars in the image. The key to answering this question lies in the fact that the distance (in pixels) between the onset of one bar and the onset of the next one (say, to its right) is 25 pixels.
Consider the scan line shown in Fig. P3.22. Also shown is a cross section of a $25 \times 25$ mask. The response of the mask is the average of the pixels that it encompasses. We note that when the mask moves one pixel to the right, it loses one value of the vertical bar on the left, but it picks up an identical one on the right, so the response doesn't change. In fact, the number of pixels belonging to the vertical bars and contained within the mask does not change, regardless of where the mask is located (as long as it is contained within the bars, and not near the edges of the set of bars).
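The constant count of bar pixels under the mask is easy to confirm on a synthetic scan line. The sketch below is illustrative and rests on assumptions: bar pixels are represented by 1 and background by 0, the scan line holds six bars, and only mask positions fully inside the pattern are considered. The function names are hypothetical.

```python
def bar_scan_line(num_bars, bar_width=5, gap=20):
    """One scan line: bars of 1s, bar_width wide, separated by gap 0s."""
    line = []
    for _ in range(num_bars):
        line += [1] * bar_width + [0] * gap
    return line


def bar_pixel_counts(line, width):
    """Number of bar pixels under a 1-D mask cross section at every position."""
    return [sum(line[i:i + width]) for i in range(len(line) - width + 1)]


if __name__ == "__main__":
    line = bar_scan_line(6)
    for width in (23, 25, 45):
        print(width, sorted(set(bar_pixel_counts(line, width))))
```

For a width of 25 every position covers exactly 5 bar pixels, whereas widths of 23 and 45 give counts that vary with position.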
The fact that the number of bar pixels under the mask does not change is due to the peculiar separation between bars and the width of the lines in relation to the 25-pixel width of the mask. This constant response is the reason no white gaps are seen in the image shown in the problem statement. Note that this constant response does not happen with the $23 \times 23$ or the $45 \times 45$ masks because they are not "synchronized" with the width of the bars and their separation.

Figure P3.22

Problem 3.23

There are at most $q^2$ points in the area for which we want to reduce the gray level of each pixel to one-tenth its original value.
Consider an averaging mask of size $n \times n$ encompassing the $q \times q$ neighborhood. The averaging mask has $n^2$ points of which we are assuming that $q^2$ points are from the object and the rest from the background. Note that this assumption implies separation between objects at least the area of the mask all around each object. The problem becomes intractable unless this assumption is made. This condition was not given in the problem statement on purpose in order to force the student to arrive at that conclusion. If the instructor wishes to simplify the problem, this should then be mentioned when the problem is assigned.
A further simplification is to tell the students that the gray level of the background is 0. Let $B$ represent the gray level of background pixels, let $a_i$ denote the gray levels of points inside the mask and $o_i$ the levels of the objects. In addition, let $S_a$ denote the set of points in the averaging mask, $S_o$ the set of points in the object, and $S_b$ the set of points in the mask that are not object points. Then, the response of the averaging mask at any point on the image can be written as
$$R = \frac{1}{n^2}\sum_{a_i \in S_a} a_i = \frac{1}{n^2}\left[\sum_{o_j \in S_o} o_j + \sum_{a_k \in S_b} a_k\right] = \frac{1}{n^2}\left[\frac{q^2}{q^2}\sum_{o_j \in S_o} o_j\right] + \frac{1}{n^2}\left[\sum_{a_k \in S_b} a_k\right] = \frac{q^2}{n^2}Q + \frac{1}{n^2}\left[(n^2 - q^2)B\right]$$
where $Q$ denotes the average value of object points. Let the maximum expected average value of object points be denoted by $Q_{max}$. Then we want the response of the mask at any point on the object under this maximum condition to be less than one-tenth $Q_{max}$, or
$$\frac{q^2}{n^2}Q_{max} + \frac{1}{n^2}\left[(n^2 - q^2)B\right] < \frac{1}{10}Q_{max}$$
from which we get the requirement
$$n > q\left[\frac{10(Q_{max} - B)}{(Q_{max} - 10B)}\right]^{1/2}$$
for the minimum size of the averaging mask. Note that if the background gray level is 0, the minimum mask size is $n > \sqrt{10}\,q$. If this was a fact specified by the instructor, or the student made this assumption from the beginning, then this answer follows almost by inspection.

Problem 3.24

The student should realize that both the Laplacian and the averaging process are linear operations, so it makes no difference which one is applied first.

Problem 3.25

The Laplacian operator is defined as
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$$
for the unrotated coordinates and as
$$\nabla^2 f = \frac{\partial^2 f}{\partial x'^2} + \frac{\partial^2 f}{\partial y'^2}$$
for rotated coordinates. It is given that
$$x = x'\cos\theta - y'\sin\theta \quad \text{and} \quad y = x'\sin\theta + y'\cos\theta$$
where $\theta$ is the angle of rotation. We want to show that the right sides of the first two equations are equal.
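We start with
$$\frac{\partial f}{\partial x'} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial x'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial x'} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta.$$
Taking the partial derivative of this expression again with respect to $x'$ yields
$$\frac{\partial^2 f}{\partial x'^2} = \frac{\partial^2 f}{\partial x^2}\cos^2\theta + \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)\sin\theta\cos\theta + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)\cos\theta\sin\theta + \frac{\partial^2 f}{\partial y^2}\sin^2\theta.$$
Next, we compute
$$\frac{\partial f}{\partial y'} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial y'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial y'} = -\frac{\partial f}{\partial x}\sin\theta + \frac{\partial f}{\partial y}\cos\theta.$$
Taking the derivative of this expression again with respect to $y'$ gives
$$\frac{\partial^2 f}{\partial y'^2} = \frac{\partial^2 f}{\partial x^2}\sin^2\theta - \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)\cos\theta\sin\theta - \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)\sin\theta\cos\theta + \frac{\partial^2 f}{\partial y^2}\cos^2\theta.$$
Adding the two expressions for the second derivatives yields
$$\frac{\partial^2 f}{\partial x'^2} + \frac{\partial^2 f}{\partial y'^2} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$$
which proves that the Laplacian operator is independent of rotation.

Problem 3.26

Unsharp masking is high-boost filtering [Eq. (3.7-11)] with $A = 1$. Figure P3.26 shows the two possible solutions based on that equation. The left and right masks correspond to the first and second line in the equation, respectively.

Figure P3.26.

Problem 3.27

Consider the following equation:
$$\begin{aligned}
f(x, y) - \nabla^2 f(x, y) &= f(x, y) - \left[f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4f(x, y)\right] \\
&= 6f(x, y) - \left[f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) + f(x, y)\right] \\
&= 5\left\{1.2f(x, y) - \tfrac{1}{5}\left[f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) + f(x, y)\right]\right\} \\
&= 5\left[1.2f(x, y) - \bar{f}(x, y)\right]
\end{aligned}$$
where $\bar{f}(x, y)$ denotes the average of $f(x, y)$ in a predefined neighborhood that is centered at $(x, y)$ and includes the center pixel and its four immediate neighbors. Treating the constants in the last line of the above equation as proportionality factors, we may write
$$f(x, y) - \nabla^2 f(x, y) \sim f(x, y) - \bar{f}(x, y).$$
The right side of this equation is recognized as the definition of unsharp masking given in Eq. (3.7-7). Thus, it has been demonstrated that subtracting the Laplacian from an image is proportional to unsharp masking.

Problem 3.28

(a) From Problem 3.25,
$$\frac{\partial f}{\partial x'} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$$
and
$$\frac{\partial f}{\partial y'} = -\frac{\partial f}{\partial x}\sin\theta + \frac{\partial f}{\partial y}\cos\theta$$
from which it follows that
$$\left(\frac{\partial f}{\partial x'}\right)^2 + \left(\frac{\partial f}{\partial y'}\right)^2 = \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2$$
or
$$\left[\left(\frac{\partial f}{\partial x'}\right)^2 + \left(\frac{\partial f}{\partial y'}\right)^2\right]^{1/2} = \left[\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right]^{1/2}.$$
Thus, we see that the magnitude of the gradient is an isotropic operator.

(b) From Eq. (3.7-12), (3.7-14) and the preceding results,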