Chapter III Image Processing

With a point operator, the value of each output pixel is determined only by the value of the corresponding input pixel.

1.1 pixel transformation

Each output pixel is generated from the corresponding input pixel. Note that the input pixels may come from more than one image.

1.2 color transformation

There is a strong correlation between the channels of a color image.

1.3 compositing and matting

Extracting a foreground object from the image background is called matting. Inserting an object into another image is called compositing.

1.4 histogram equalization

Adjusting the contrast and brightness parameters can improve the appearance of an image. There are two simple ways to set these two parameters automatically. One is to find the darkest and brightest values in the image and map them to pure black and pure white. The other is to find the average pixel value of the image, move it toward middle gray, and then expand the range so that it more closely fills the displayable values.
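The first automatic method can be sketched in a few lines. This is a minimal illustration rather than a reference implementation; the function name stretch_contrast and the 0 to 255 output range are assumptions made for the example.

```python
def stretch_contrast(pixels, out_min=0, out_max=255):
    """Linearly map the darkest input value to out_min and the brightest to out_max."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch; return a constant image.
        return [out_min for _ in pixels]
    return [round(out_min + (p - lo) * (out_max - out_min) / (hi - lo))
            for p in pixels]


row = [50, 100, 150, 200]      # a low-contrast row of gray values
print(stretch_contrast(row))   # [0, 85, 170, 255]
```

Because the mapping depends only on each pixel's own value (plus two global constants), this is still a point operator.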

Locally adaptive histogram equalization applies a different equalization function to each region of the image. Its disadvantage is that it produces blocking artifacts, that is, the brightness is discontinuous at block boundaries. To eliminate this effect, either a moving window is used or the transfer functions of neighboring blocks are smoothly interpolated.

1.5 application: tone adjustment

A common application of point operators is manipulating the contrast and tone of photographs.

In contrast to a point operator, a neighborhood operator determines each output pixel from the selected pixel and its surrounding pixels. Neighborhood operators are used not only for local tone adjustment but also for image smoothing, sharpening, and denoising.

The central concepts for linear neighborhood operators are convolution and correlation. Both are linear shift-invariant (LSI) operators: they satisfy the superposition principle and the shift-invariance principle.

Padding: when the convolution kernel extends beyond the image boundary, it produces boundary effects. There are several padding modes: zero, constant, clamp, wrap (cyclic), mirror, and extend.
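As a rough one-dimensional sketch of how these border modes differ, the hypothetical helper below returns the value a filter would see at index i, including indices outside the signal. The mode names are chosen for this example, and the extend mode is left out for brevity.

```python
def padded_value(signal, i, mode="zero", constant=0):
    """Value seen at index i of a 1D signal, for i inside or outside its bounds."""
    n = len(signal)
    if 0 <= i < n:
        return signal[i]
    if mode == "zero":
        return 0
    if mode == "constant":
        return constant
    if mode == "clamp":
        # Repeat the nearest edge value indefinitely.
        return signal[min(max(i, 0), n - 1)]
    if mode == "wrap":
        # Treat the signal as periodic.
        return signal[i % n]
    if mode == "mirror":
        # Reflect across the edge, so indices -1, -2 map to 0, 1.
        k = i % (2 * n)
        return signal[k if k < n else 2 * n - 1 - k]
    raise ValueError("unknown padding mode: " + mode)


s = [1, 2, 3]
print([padded_value(s, i, "clamp") for i in range(-2, 5)])
print([padded_value(s, i, "wrap") for i in range(-2, 5)])
print([padded_value(s, i, "mirror") for i in range(-2, 5)])
```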

2.1 separable filter

If a convolution can be decomposed into a one-dimensional convolution with a row vector followed by a one-dimensional convolution with a column vector, the convolution kernel is said to be separable. The 2D kernel can be regarded as a matrix k, which is separable if and only if only its first singular value is nonzero, that is, if k has rank 1.
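A separable kernel is a rank-1 matrix, that is, the outer product of a column vector and a row vector. The sketch below tests this directly by trying to factor the kernel around its largest entry instead of computing a singular value decomposition; the function name separate_kernel and the tolerance are assumptions for illustration.

```python
def separate_kernel(kernel, tol=1e-9):
    """Return (column, row) with kernel[i][j] == column[i] * row[j],
    or None if the kernel is not rank 1 (not separable)."""
    rows, cols = len(kernel), len(kernel[0])
    # Use the largest-magnitude entry as the pivot for numerical safety.
    pi, pj = max(((i, j) for i in range(rows) for j in range(cols)),
                 key=lambda ij: abs(kernel[ij[0]][ij[1]]))
    pivot = kernel[pi][pj]
    if abs(pivot) <= tol:
        return None  # all-zero kernel has rank 0
    column = [kernel[i][pj] for i in range(rows)]
    row = [kernel[pi][j] / pivot for j in range(cols)]
    for i in range(rows):
        for j in range(cols):
            if abs(kernel[i][j] - column[i] * row[j]) > tol:
                return None
    return column, row


binomial = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
laplacian = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
print(separate_kernel(binomial))   # ([2, 4, 2], [0.5, 1.0, 0.5])
print(separate_kernel(laplacian))  # None
```

For a separable K x K kernel, filtering with the two 1D vectors costs on the order of 2K operations per pixel instead of K squared.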

2.2 example of linear filter

The simplest filter is the moving average or box filter, followed by the bilinear filter (bilinear kernel) and the Gaussian filter (Gaussian kernel). All of these are low-pass kernels, also called blurring or smoothing kernels. Fourier analysis is used to measure the effect of these kernels. Other examples of linear filters are the Sobel operator and a simple corner detector.

2.3 band-pass filter and steerable filter

The Sobel operator is a simple approximation of a directional (oriented) filter. Such a filter is obtained by first smoothing the image with a Gaussian kernel and then taking a directional derivative; taking the undirected second derivative (the Laplacian operator) of the smoothed image instead gives a band-pass filter. Directional derivatives of a Gaussian are steerable filters, which have good locality and scale-space properties. Directional filters are usually used to construct feature descriptors and edge detectors, and linear structures are usually treated as edge-like.

A summed area table, also known as an integral image, stores at each position the sum of all pixel values in the rectangle between the origin and that position. It can be computed efficiently with a recursive (raster-scan) algorithm. Summed area tables are used to approximate other convolution kernels, to compute multi-scale features in face detection, and to compute sums of squared differences in stereo vision.
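The raster-scan recursion and the four-lookup box sum can be sketched as follows. The table here carries an extra leading row and column of zeros, which is a convenience chosen for this example so that no boundary checks are needed.

```python
def summed_area_table(img):
    """s[y][x] holds the sum of img[0:y][0:x]; row 0 and column 0 are zeros."""
    h, w = len(img), len(img[0])
    s = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):          # single raster scan
        for x in range(w):
            s[y + 1][x + 1] = img[y][x] + s[y][x + 1] + s[y + 1][x] - s[y][x]
    return s


def box_sum(s, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] (bottom and right exclusive)."""
    return s[bottom][right] - s[top][right] - s[bottom][left] + s[top][left]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
table = summed_area_table(img)
print(box_sum(table, 1, 1, 3, 3))  # 5 + 6 + 8 + 9 = 28
```

Once the table is built, the sum over any rectangle costs four lookups regardless of the rectangle's size, which is what makes box-filter approximations of larger kernels cheap.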

A recursive filter is also called an infinite impulse response (IIR) filter. Recursive filters are sometimes used to compute two-dimensional distance functions and connected components, and they can also be used to compute smoothing over large areas.

3.1 nonlinear filter

Median filtering can remove shot noise. Its other advantage is that it preserves edges: edges are not easily softened while high-frequency noise is filtered out.
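A one-dimensional sketch makes both properties visible. The function name median_filter_1d and the choice to clamp indices at the borders are assumptions for this example.

```python
def median_filter_1d(signal, radius=1):
    """Replace each sample by the median of its (2 * radius + 1)-sample window."""
    n = len(signal)
    out = []
    for i in range(n):
        # Clamp indices so that border windows keep an odd number of samples.
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sorted(window)[radius])
    return out


# A step edge from 10 to 50 with one shot-noise spike of 200.
noisy = [10, 10, 200, 10, 10, 50, 50, 50]
print(median_filter_1d(noisy))  # [10, 10, 10, 10, 10, 50, 50, 50]
```

The spike disappears while the step stays sharp, whereas a moving average of the same width would smear both.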

The essential idea of bilateral filtering is to suppress pixels whose values differ greatly from the central pixel value, rather than rejecting a fixed percentage of pixels. It is a weighted filter in which each weight coefficient depends on two factors: the domain kernel (a Gaussian of the spatial distance) and the range kernel (the similarity to the central pixel value). Multiplying the two gives the bilateral filter kernel.
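The product of the two kernels can be sketched in one dimension. Using a Gaussian for the range kernel as well, and the parameter names sigma_d and sigma_r, are assumptions made for this illustration.

```python
import math


def bilateral_filter_1d(signal, radius=2, sigma_d=1.0, sigma_r=10.0):
    """Weighted average whose weights fall off with distance AND value difference."""
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        weight_sum = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            domain = math.exp(-((i - j) ** 2) / (2 * sigma_d ** 2))
            rng = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_r ** 2))
            w = domain * rng           # bilateral weight
            total += w * signal[j]
            weight_sum += w            # always >= 1 because j == i has weight 1
        out.append(total / weight_sum)
    return out


step = [0, 0, 0, 0, 100, 100, 100, 100]
print(bilateral_filter_1d(step))           # the edge is left essentially intact
print(bilateral_filter_1d([10, 12, 10, 12]))  # small fluctuations are averaged
```

Samples across the step differ by far more than sigma_r, so their range weight is negligible and the edge is not blurred.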

Applying such an adaptive smoothing kernel iteratively is closely related to anisotropic diffusion.

3.2 morphology

Nonlinear filtering is often applied to binary images, and morphological operators are the most common binary image operations. The binary image is convolved with a binary structuring element, and the binary output is selected by thresholding the convolution result. The structuring element can have any shape.

Common morphological operations include dilation, erosion, majority, opening, and closing. Majority smooths sharp corners, while opening and closing remove small dots and holes and so smooth the image.
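The convolve-then-threshold view leads to a compact sketch: count the set pixels under the structuring element and compare the count with a threshold. The 3 x 3 all-ones structuring element and the treatment of pixels outside the image as 0 are assumptions for this example.

```python
def count_neighbors(img, y, x):
    """Number of 1-pixels under a 3x3 structuring element centred at (y, x)."""
    h, w = len(img), len(img[0])
    return sum(img[j][i]
               for j in range(max(0, y - 1), min(h, y + 2))
               for i in range(max(0, x - 1), min(w, x + 2)))


def threshold_counts(img, thresh):
    return [[1 if count_neighbors(img, y, x) >= thresh else 0
             for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    return threshold_counts(img, 1)   # at least one pixel set


def erode(img):
    return threshold_counts(img, 9)   # all nine pixels set


def majority(img):
    return threshold_counts(img, 5)   # more than half of the pixels set


def opening(img):
    return dilate(erode(img))


def closing(img):
    return erode(dilate(img))


# A 5x5 block with a one-pixel hole, inside a 7x7 image.
block = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)]
         for y in range(7)]
block[3][3] = 0
print(closing(block)[3][3])  # 1: the hole has been filled
```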

3.3 Distance transform

A distance transform quickly precomputes the distance to a curve or a set of points using a two-pass raster-scan algorithm. Variants include the city block (Manhattan) distance transform and the Euclidean distance transform. The signed distance transform is an extension of the basic distance transform that computes the distance to the boundary pixels for all pixels.
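The two-pass raster scan for the city block distance can be sketched as below. In this example the point set is given as the nonzero pixels of a binary grid, which is a convention chosen for illustration.

```python
def city_block_distance(features):
    """City block distance from every pixel to the nearest nonzero pixel."""
    h, w = len(features), len(features[0])
    inf = h + w  # larger than any city block distance inside the grid
    d = [[0 if features[y][x] else inf for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the north and west neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                d[y][x] = min(d[y][x], d[y - 1][x] + 1)
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x - 1] + 1)
    # Backward pass: propagate distances from the south and east neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y + 1][x] + 1)
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x + 1] + 1)
    return d


grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
for row in city_block_distance(grid):
    print(row)
# [2, 1, 2]
# [1, 0, 1]
# [2, 1, 2]
```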

3.4 connected components

Detecting connected components is a semi-global image operation. A connected component is a region of adjacent pixels that have the same input value. After a binary or multi-valued image has been split into connected components, statistics such as the area, perimeter, centroid, and second moments of each region are computed; these can be used for region sorting and region matching.
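One simple way to label regions and gather a few of these statistics is a breadth-first flood fill, sketched below. Flood fill is a substitute chosen for brevity rather than a scan-line labeling algorithm, and 4-connectivity and the dictionary layout of the statistics are assumptions for this example.

```python
from collections import deque


def label_components(img):
    """Label 4-connected regions of equal value; return (labels, stats)."""
    h, w = len(img), len(img[0])
    labels = [[-1] * w for _ in range(h)]
    stats = []
    for y0 in range(h):
        for x0 in range(w):
            if labels[y0][x0] != -1:
                continue  # already part of an earlier region
            label = len(stats)
            value = img[y0][x0]
            labels[y0][x0] = label
            queue = deque([(y0, x0)])
            area = sum_y = sum_x = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                sum_y += y
                sum_x += x
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1 and img[ny][nx] == value):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            stats.append({"value": value, "area": area,
                          "centroid": (sum_y / area, sum_x / area)})
    return labels, stats


img = [[1, 1, 0],
       [0, 0, 0],
       [0, 1, 1]]
labels, stats = label_components(img)
print([s["area"] for s in stats])  # [2, 5, 2]
```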

The Fourier transform is used to analyze the frequency-domain characteristics of a filter, and the FFT makes it possible to convolve with large kernels quickly.

Idea: to analyze the frequency characteristics of a filter, pass a sine wave of known frequency through the filter and observe how much it is attenuated. If the input signal is a sinusoid s(x) and the filter has impulse response h(x), the output response o(x) = s(x) * h(x), the convolution of the two, is a sinusoid of the same frequency with a different amplitude and phase. The Fourier transform of the filter is simply a tabulation of the amplitude and phase response at each frequency. The Fourier transform can be applied not only to filters but also to signals and images.

The properties of the Fourier transform: superposition, shift, reversal, convolution, correlation, multiplication, differentiation, domain scaling, real images, and Parseval's theorem.

4.1 Fourier transform pair

Tables of common Fourier transform pairs, both continuous and discrete, make it convenient to look up the transform of a signal.

High frequency components will lead to aliasing in downsampling.

4.2 2D Fourier Transform

To process two-dimensional images and filters, the two-dimensional Fourier transform is used. It is similar to the one-dimensional Fourier transform, except that the scalar position and frequency are replaced by vectors and their product is replaced by a vector inner product.

4.3 Wiener filter

The Fourier transform can also be used to analyze the frequency spectrum of a whole class of images, which leads to the Wiener filter. Assuming that the images in this class are samples from a random noise field, the expected amplitude at each frequency is given by the power spectrum, and the signal power spectrum captures a first-order description of the spatial statistics. The Wiener filter is suited to removing noise from images with power spectrum P.

A characteristic of the Wiener filter is that it has unity gain at low frequencies and attenuates high frequencies.

The discrete cosine transform (DCT) is often used for block-based image compression. It is computed by taking the dot product of the pixels in a block of width n with a set of cosines of different frequencies.
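The dot-product description corresponds to the following one-dimensional sketch. It uses the unnormalized DCT-II form with cosines of frequency k evaluated at the sample centers i + 1/2; the exact scaling is an assumption, since compression standards add their own normalization factors.

```python
import math


def dct_1d(block):
    """Unnormalized DCT-II: dot products of the block with cosines of frequency k."""
    n = len(block)
    return [sum(block[i] * math.cos(math.pi / n * (i + 0.5) * k)
                for i in range(n))
            for k in range(n)]


flat = [5] * 8
coeffs = dct_1d(flat)
print(round(coeffs[0], 6))                 # 40.0: all energy is in the DC term
print([round(c, 6) for c in coeffs[1:]])   # remaining coefficients are ~0
```

A smooth block produces only a few significant coefficients, which is what makes the representation useful for compression.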

The DCT is a good approximation to the optimal Karhunen-Loève (KL) decomposition, that is, principal component analysis (PCA), of small patches of natural images. The KL decomposition effectively decorrelates the signal.

Wavelets and overlapped variants of the DCT can effectively reduce blocking artifacts.

4.4 Application: Sharpening, Blurring and Denoising

Sharpening and denoising can effectively enhance an image. The traditional approach is to use linear filters, but nonlinear filters are now widely used, such as weighted median and bilateral filtering, anisotropic diffusion, non-local means, and variational methods.

Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are commonly used to measure the performance of image denoising algorithms.
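As a small worked example of the first measure, the sketch below computes the peak signal-to-noise ratio from the mean squared error as 10 log10(MAX^2 / MSE). Passing images as flat lists and using 255 as the peak value are assumptions for this illustration.

```python
import math


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


clean = [100, 100, 100, 100]
noisy = [110, 90, 110, 90]       # every pixel is off by 10, so MSE = 100
print(round(psnr(clean, noisy), 2))  # 28.13
```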

Up to this point, every image transformation has produced an output image of the same size as the input. Often, however, the resolution needs to change: for example, a small image is interpolated to match the resolution of a display, or an image is reduced in size to speed up an algorithm or to save storage space and transmission time.

Because the resolution best suited to processing an image is often not known in advance, an image pyramid is built from versions of the image at several different sizes, so that recognition and editing operations can be carried out at multiple scales. The filters used to change image resolution are interpolation filters and downsampling (decimation) filters.

5.1 interpolation

To enlarge an image to a higher resolution, the image is convolved with an interpolation kernel. Commonly used two-dimensional interpolators are bilinear interpolation, bicubic interpolation, and the windowed sinc function. The windowed sinc is often considered the highest-quality interpolator, because it both preserves the detail in the low-resolution image and avoids aliasing.
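Bilinear interpolation, the simplest of these, can be sketched as follows. The helper names, the clamping of coordinates at the border, and the plain y / factor coordinate mapping are assumptions chosen to keep the example short.

```python
def bilinear_sample(img, y, x):
    """Sample img at a fractional position by blending the four nearest pixels."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1.0)      # clamp to the valid range
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def upsample(img, factor):
    """Enlarge img by an integer factor using bilinear interpolation."""
    h, w = len(img), len(img[0])
    return [[bilinear_sample(img, y / factor, x / factor)
             for x in range(w * factor)]
            for y in range(h * factor)]


small = [[0, 10],
         [20, 30]]
print(bilinear_sample(small, 0.5, 0.5))  # 15.0
for row in upsample(small, 2):
    print(row)
```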

5.2 Downsampling

Downsampling reduces the resolution of an image. The image is first convolved with a low-pass filter to avoid aliasing, and then every r-th sample is kept. Commonly used downsampling filters include the linear, binomial, cubic, windowed sinc, QMF-9, and JPEG 2000 filters.
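The filter-then-decimate order matters, as the one-dimensional sketch below illustrates for a rate of 2. The [1, 2, 1] / 4 kernel and the clamped borders are assumptions for this example.

```python
def downsample_by_2(signal):
    """Low-pass filter with [1, 2, 1] / 4, then keep every second sample."""
    n = len(signal)

    def at(i):
        return signal[min(max(i, 0), n - 1)]  # clamp at the borders

    smoothed = [(at(i - 1) + 2 * at(i) + at(i + 1)) / 4 for i in range(n)]
    return smoothed[::2]


# The highest representable frequency: alternating dark and bright samples.
stripes = [0, 100] * 4
print(stripes[::2])              # [0, 0, 0, 0]: naive decimation aliases to black
print(downsample_by_2(stripes))  # [25.0, 50.0, 50.0, 50.0]: close to the true mean
```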

5.3 Multi-resolution representation

Through downsampling and interpolation, a complete image pyramid can be built. A pyramid can speed up coarse-to-fine search algorithms, help find objects and patterns at different scales, and support multi-resolution blending operations.

The best-known pyramid in computer vision is the Laplacian pyramid. To build it, the original image is blurred and subsampled by a factor of 2, and the result is stored in the next level of the pyramid. Each band-pass (Laplacian) level is then the difference between an image and the interpolated version of the next, lower-resolution level.

5.4 wavelet transform

A wavelet is a filter that localizes a signal in both the spatial domain and the frequency domain, and wavelets are defined over a hierarchy of scales. Wavelets can be used for multi-scale oriented filtering and denoising. Compared with a traditional pyramid, which is overcomplete, wavelets have better orientation selectivity and provide a more compact representation (a tight frame).

The lifted wavelet is called the second-generation wavelet and adapts easily to non-regular sampling topologies. There are also steerable (shiftable) multi-scale transforms, whose representation is not only overcomplete but also orientation-selective.

5.5 Application: Image Blending

An application of the Laplacian pyramid is blending images into a composite. To produce the blended image, each source image is first decomposed into its own Laplacian pyramid, and each band is then multiplied by a smooth weighting function whose extent is proportional to the size of the pyramid level. The simplest way to create these weights is to take a binary mask image and build a Gaussian pyramid from it; the Laplacian pyramids are then combined using the Gaussian mask pyramid as weights, and the final image is reconstructed from the blended pyramid.

Whereas point operations change the range of an image, geometric transformations change its domain. The starting point is global parametric 2D transformations; attention then turns to more general deformations, such as local mesh-based warps.

6.1 parametric transformation

A parametric transformation applies a global deformation to the whole image, where the behavior of the transformation is controlled by a small number of parameters. Inverse warping (inverse mapping) performs better than forward warping, mainly because it avoids the problems of holes and of resampling at non-integer positions. In addition, high-quality filters can be used to control aliasing.
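Mapping from destination to source can be sketched as a loop over destination pixels, each of which looks up its source location. The function name inverse_warp, the callable inverse_map, and the use of nearest-neighbor sampling (instead of a higher-quality interpolation filter) are assumptions made to keep the example short.

```python
import math


def inverse_warp(src, inverse_map, out_h, out_w, fill=0):
    """For every destination pixel, look up the source location it comes from."""
    h, w = len(src), len(src[0])
    out = [[fill] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            sy, sx = inverse_map(y, x)          # destination -> source
            iy = math.floor(sy + 0.5)           # nearest-neighbour sampling
            ix = math.floor(sx + 0.5)
            if 0 <= iy < h and 0 <= ix < w:
                out[y][x] = src[iy][ix]
    return out


src = [[1, 2, 3],
       [4, 5, 6]]
# Shift the image one pixel to the right: destination (y, x) reads source (y, x - 1).
shifted = inverse_warp(src, lambda y, x: (y, x - 1), 2, 3)
print(shifted)  # [[0, 1, 2], [0, 4, 5]]
```

Because the loop visits every destination pixel exactly once, no holes can appear; pixels whose source falls outside the image simply receive the fill value.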

Given the mapping from a destination pixel x' to the source pixel x, the image warping problem can be formulated as resampling the source image. The inverse approach is used similarly when warping an image with a predicted optical flow field and when correcting radial lens distortion.

The interpolation filters used in resampling include bilinear, bicubic, and windowed sinc interpolation. Bilinear interpolation is usually chosen for speed, while bicubic and windowed sinc interpolation are chosen when visual quality matters.

MIP-mapping is a fast way to prefilter images that are used for texture mapping.

A MIP-map is a standard image pyramid in which each level is prefiltered with a high-quality filter rather than a lower-quality approximation. To resample from it, the resampling rate r must first be estimated.

Other approaches include the elliptical weighted average (EWA) filter, anisotropic filtering, and multi-pass transforms.

Oriented two-dimensional filtering and resampling can be approximated by a series of one-dimensional resampling and shearing transforms. The advantage of a series of one-dimensional transforms is that they are more efficient than large, non-separable two-dimensional filter kernels.

6.2 mesh-based deformation

Mesh-based warping was developed to allow freer, local deformations. The displacement field can be determined from a sparse set of control points that is interpolated to a dense displacement field, or from a set of oriented line segments.

6.3 Application: Feature-based morphing

Warping is usually used to change the appearance of a single image or to create an animation, and it can also be used to blend several images to produce a strong morphing effect. Simply cross-dissolving between two images leads to ghosting, but if image warping is first used to establish a good correspondence, the corresponding features are aligned.

Some image processing problems are solved by stating the goal of the desired transformation explicitly as an optimization criterion and then finding or inferring the solution to this criterion. Regularization and variational methods construct a continuous global energy function that describes the desired characteristics of the solution, and the minimum-energy solution is then found with sparse linear systems or related iterative methods. Bayesian statistics models both the noisy measurement process and the prior assumptions about the solution space, which are usually encoded with a Markov random field. Common examples include surface interpolation from scattered data, image denoising and restoration of missing regions, and segmentation of images into foreground and background regions.

7.1 regularization

Regularization theory tries to fit a model to data in a severely under-constrained solution space, for example finding a smooth surface that passes through or near a set of measured data points. Such problems are ill-posed and ill-conditioned. Recovering the complete image f(x, y) from the sampled data points d(x_i, y_i) in this way is called an inverse problem.

To define what a smooth solution is, a norm is usually defined on the solution space. For one-dimensional functions, one can integrate the squared first derivative of the function or the squared second derivative. Such energy measures are examples of functionals, which are operators that map functions to scalar values. These methods are called variational methods, because they measure the variation (non-smoothness) of a function.
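Written out, the two one-dimensional smoothness measures take the following form. The symbols E_1 and E_2 are labels introduced here for the first-derivative and second-derivative energies.

```latex
\mathcal{E}_1 = \int f_x^2(x)\,dx ,
\qquad
\mathcal{E}_2 = \int f_{xx}^2(x)\,dx
```

Here f_x and f_xx denote the first and second derivatives of f. A constant function has zero energy under both measures, and a linear ramp has zero energy under the second.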

7.2 Markov random field

7.3 Application: Image Restoration