## Panoramic photograph stitching

I found the article “Fast image blending using watersheds and graph cuts,” by N. Gracias, M. Mahoor, S. Negahdaripour and A. Gleason (Image and Vision Computing 27(5):597-607, 2009) quite clever, and decided to try it out myself. Here’s a little demo and the code I wrote.

I had these two images lying around, that I took some years ago at Big Sur:

I’m not bothering with warping the images to undo the projection, so only one point in the two images can be correctly aligned. This actually makes it easier to see the effect of this method. I read them into the variables `a`

and `b`

.

The first job is to find that common point. I used the DIPimage command `dipgetcoords`

to get the coordinates of a mouse click in each image. I clicked on one small rock towards the middle of the area of overlap of the two images. Next I expanded and shifted the two images:

sz = imsize(a); dipshow(1,a); ca = dipgetcoords(1,1); dipshow(1,b); cb = dipgetcoords(1,1); s = ca - cb - [sz(1),0]; z = newim(sz); a = iterate('cat',1,a,z); b = iterate('cat',1,z,b); b = iterate('dip_wrap',b,s);

Note how in DIPimage you sometimes need to use `iterate`

to work with color images.

The next step is to look at the inverse of the absolute difference between the two images. I set the inverse difference to 0 in the areas where only one image is available.

d = 255 - max(abs(a-b)); d(0:sz(1)-1+s(1),:) = 0; d(sz(1):imsize(d,1)-1,:) = 0;

In the paper they apply a watershed to the blurred difference image. They use the blur to avoid getting too many regions: every single local minimum produces a region, and even small amounts of noise create an inordinate amount of local minima. The watershed function in DIPimage has built-in region merging, so this blur is not necessary. I’m using the low-level interface to the watershed algorithm because it has the option of returning a labeled image of the watershed regions instead of a binary image of the watershed lines:

w = dip_watershed(d,[],1,50,1e9,0);

The image above shows the watershed lines (right-click the image and select “view image” to get a closer look), but the image `w`

computed above is a labeled image.

The next step is to generate a graph in which each node represents one of the watershed regions, and the edges are the gray-value of the image `d`

in between the regions. The paper uses the maximum difference between the two regions (lowest grey-value). However I got better results using the integrated grey-value over the boundary between the two regions. This way it takes the difference in boundary lengths into account. The code to compute this is rather slow and inelegant, but it does the job:

labs = unique(double(w)); labs(1) = []; % labs == 0 doesn't count N = length(labs); V = zeros(N,N); % vertices for ii=1:length(labs) m = w==labs(ii); l = unique(double(w(bdilation(m,2,1,0)))); l(1) = []; % l == 0 doesn't count l(l==labs(ii)) = []; for jj=1:length(l) kk = find(l(jj)==labs); if V(ii,kk) == 0 n = w==l(jj); n = closing(m|n,2,'rectangular') - m - n; if ~any(n) V(ii,kk) = 0.01; else V(ii,kk) = sum(d(n)); end V(kk,ii) = V(ii,kk); end end end

The code above generates a matrix `V`

, in which `V(i,j)`

is the weight of the edge between vertices `i`

and `j`

. `labs`

is the list of labels used in the image `w`

, such that vertex `i`

has label ID `labs(i)`

. The code loops over all labels, extracts the area corresponding to one label, and applies a small dilation to find label IDs of the neighboring regions. Then, looping over each neighbor, it finds the area in between the two regions and integrates the gray-values of `d`

.

According to the paper, the max-flow/min-cut of this graph is the ideal line over which to stitch the two images together. To compute the max-flow/min-cut line I downloaded the package maxflow by Miki Rubinstein from the MATLAB File Exchange. The big regions on the left and right side of the image will be the source and sink for computing the flow. The following bit of code finds the label IDs for these two regions and removes them from the matrix `V`

. An array `T`

is created with the two deleted columns, it shows which vertices are connected to the source and sink:

N1 = double(w(1,1)); N2 = double(w(end-1,1)); kk1 = find(labs==N1); kk2 = find(labs==N2); T = [V(:,kk1),V(:,kk2)]; V([kk1,kk2],:) = []; V(:,[kk1,kk2]) = []; T([kk1,kk2],:) = []; labs([kk1,kk2]) = []; [flow,L] = maxflow(sparse(V),sparse(T));

The matrix `L`

contains 0 or 1 for each vertex, indicating whether it is connected to the source or the sink. I now color each region in the image `w`

with the correct label ID, then propagate labels to remove all the unassigned pixels (the watershed lines):

w = setlabels(w,labs(L==0),N1); w = setlabels(w,labs(L==1),N2); w = dip_growregions(w,[],[],1,0,'low_first'); w = w==N2;

The final composition is trivial using this mask:

out = a; out(w) = b(w);

For comparison, I’ve stitched the image in two other ways: One is a dumb method, pasting the two images together with a straight line. Another is the method you see everywhere, also using a straight line, but blending the two images together using a little bit of blurring:

m = newim(w,'bin'); m(round(imsize(m,1)/2):end,:) = 1; out_dumb = a; out_dumb(m) = b(m); m = gaussf(m,3); out_common = a*(1-m) + b*m;

As you can see, the blurring (middle) hides the joint quite well, but it can be really obvious if there’s larger differences between the two images across the joint. The new method (right) makes it really difficult to see where the images were stitched. Because of the lack of alignment of the two images, some rocks can be seen twice in the stitched image. Oh well.

Feel free to download the script I used to generate the images on this page.

**Edit (April 3 ^{rd}, 2011):** For a simpler, faster method to accomplish more or less the same thing, see this new post.

I love the amount of details you gave in your instructions through this article. Great example. Looking forward to more from you.

[...] If you got access to an Matlab installation, you can also try to stitch a panorama yourself. At Cris’s Image Analysis Blog I found a nice tutorial to do just that. For two steps he is using his DIPimage toolbox, but you [...]

I read your code, it’s very nice and clever, but when I tried it it took me a huge time to compute the boundary differences for all regions. please can you post the time it took for you ? and if there is any solution to speed the algorithm, please inform me because i really need it.

I seem to remember this whole think taking a minute or two to compute. As you can see, there are two nested loops. This type of construct is typically slow in MATLAB (as loop speedup does not work if you call a function within the loop, nor if you use any objects). Rewriting the code in C would speed it up considerably. Also there are much quicker methods to find neighboring regions than using the

dilation, and much quicker methods of discovering the boundary pixels between two regions than using theclosing. I used these because they made it easy to implement the algorithm. If you need speed, I suggest you start by replacing those!Also interesting to note is that the code is much faster if you have fewer regions to process. By increasing the merging parameters in the

watershedyou’ll reduce the number of regions, thereby reducing computation time.Thanks a lot for taking a time to reply, I really appreciate that. Excuse me to bother you a little bit more, that’s because i’m working on my graduation project titled “Image registration” especially image mosaicing, and I would like to use graph cuts to perform the blending. If you can suggest any quicker method to get the boundary between two regions, it’ll be very kind and will help me a lot. Thanks in advance and excuse me (another time).

I can imagine a good method would be to run through the image once, and for each watershed pixel, look at the label ID of its 4 neighbors. If there are two or more different IDs, you’d connect those IDs with a vertex in the graph, and accumulate the pixel’s value for the vertex’s weight. But I’ve never done this, there might be other and/or better methods. The slow method I used in the post is simply the first, simple thing that I thought of. I never had to build a graph like this before! :)

Thank’s a lot for your advice. I’ll try it.

Hi Cris,

I am working on image mosaicking myself. I recently came across your blog and found the information here interesting and useful. Currently I am looking for standard colour test images used in stitching algorithms. More specifically, I am looking for overlapping images that have already been registered, and have differences in light intensity and colour. Do you happen to know if a standard dataset of such images exists? If not, could I use the two images you are compositing in this post ?

Thanks a lot for taking the time!

Sincerely,

Sper

Hi Sper,

I don’t know any such standard dataset, though I wouldn’t be surprised if one existed somewhere. I recommend you look in the literature, you can start with the paper that I’m referencing in this post and then go on to the papers referenced therein. Sometimes they do use publicly available datasets.

Feel free to use the images in this post also.

Thank you very much for your answer, Cris.

I had done some searching before my original post and couldn’t find the source images different authors are displaying in their papers, this is why I thought I asked you.

It would be good for comparison purposes for researchers to work on the same test images, this is why I was wondering whether such a database does exists for image stitching applications. I know such databases do exist for face recognition applications: Yale face database, or in general for testing image processing algorithms: standard images widely used such as lena, peppers, etc. From my reading so far, I haven’t seen repeating images in the mosaicking/ stitching papers I have read, but I will go through everything again and see if I find some reference.

Again, thank you for your time and for allowing me to use the images in this post.

Sper

Cris,

From the filesize information (in pixels), your input images have a height of 584 pixels and the output has a height of 576 pixels. Could you please tell me if this is due to cropping the result (to avoid some border artifacts, perhaps?) or to the jpeg images posted here having been resized for displaying purposes?

Thanks a lot for taking the time!

Sper

Sper,

The two images are slightly shifted vertically. I cropped the output to get the largest rectangle completely covered with image data. It is also possible to take the smallest rectangle that contains all the image data, then you’d get an image that is slightly taller than the input images.

Hi Cris,

thanks for this code it is very interesting and helpful, i tested it with your tow images, it gave a good result, but when i use 2 other images it doesn’t give a result and i think that is because of the 2 parameters max-depth and max-size ( in your case max-size== 1e9 and max-depth==50) the tow others when i put (1e4 and 150) i gave a good result.

can you tell me if really the problem is caused by these 2 parameters and how can I find them.

thanks

Panda,

These two parameters control the region merging. The larger they are, the more regions will be merged, leading to fewer nodes in the graph for the next step. Fewer nodes is good, but there need to be enough nodes so that the algorithm can actually make a choice. The max-size parameter is the maximum number of pixels in a region that can be merged with a neighbour. I set this to 1e9, which is practically infinity, because I don’t want the size of the region to limit merging. The max-depth parameter is the maximum grey-value difference within a region that can be merged. This value you should choose here depends on the range of values in your photo. The 50 that I chose is about 1/5 of the total range in the images (0-255). If your images are 10 bit or 14 bit, as some cameras can produce today, you’ll need to use a much larger value. Maybe you can set this value to

max(a)/5or something like that?hello sir..

i want to learn MATLAB.. if u teach me MATLAB then it will be great pleasure to me..

i need ur support

Thanks,

vikas sharma

Hi Vikas,

I learned MATLAB by using it to try to solve my problems. Practice is the best teacher. Also, I learned a lot from reading the MATLAB Newsgroup and trying to figure out solutions to people’s problems.

Good luck!

[...] Panoramic photograph stitching – an alternative cute way to stitch images using the max flow algorithm. [...]

Hi Cris,

I’m trying to stitch colored images using Phase Correlation with MATLAB. I have written a code by studying various papers. The code runs but the output generated is completely different from what is expected. I’ve pasted the code. It would be really helpful if you could suggest me the mistake in my code or the technique I’m using.

The code accepts 2 images. It finds out the angle to be known for rotation of 2nd image w.r.t 1st image to align the 2nd image with the 1st image. It displays the angle theta and then it displays the translation required for overlapping the images.

I also want to know that is there any way by which we can turn gray image to rgb in matlab?

Thanks in advance

a1=imread(‘D:\i1.jpg’);

b1=imread(‘D:\i2.jpg’);

a=rgb2gray(a1);

b=rgb2gray(b1);

vect=zeros(1,360);

[x y]=size(a);

j=zeros(x,y);

f1=fft2(a);

f2=zeros(x,y);

q=zeros(x,y);

cor=zeros(x,y);

t=zeros(x,y);

for k=1:360

j=imrotate(b,k,’bilinear’,'crop’);

f2=fft2(j);

q=f1.*conj(f2);

cor=q./abs(f1.*f2);

t=ifft2(cor);

vect(k)=max(max(t));

end

[val1 theta]=max(vect);

display(theta)

brot=imrotate(b,theta,’bilinear’,'crop’);

imshow(brot);

f2=fft2(brot);

q=f1.*conj(f2);

cor=q./abs(q);

t=ifft2(cor);

[val2 txy]=max(t(:));

[tx ty]=ind2sub(size(t),txy) ;

display(tx)

display(ty)

Hi Gaurav,

Cross-correlation over the whole two images only works correctly if there is a significant amount of overlap. If the two images overlap only partly, as you’d get with panoramic photo stitching, the output will be simply wrong. Try cutting smaller regions from one image, and finding the best matching position in the other image. If you do that for many smaller regions, you should be able to discard bad matches and find the shift that matches the two images. Not trivial, but doable. A better solution is using a feature point detector such as SIFT, and then matching points using an algorithm such as RANSAC. I’m not an expect in either of these, but the MATLAB File Exchange has many submissions for these techniques.

Converting a grey-value image to colour is simply repeating the same grey value for R, G and B. Maybe

`repmat(a,[1,1,3])`

.[...] Panoramic photograph stitching: using max-flow [...]

Hi Cris, I am unable to run the bigsur.m file. I am getting the following error. Can you please help me how to remove this error?

??? Error using ==> dip_image.subsref at 196

Index exceeds image dimensions.

Error in ==> bigsur at 65

a = a(0:imsize(a,1)+s(1)-1,:);

Hi Ramzan,

Please read the comments in the M-file: the files ‘bigsur1.jpg’ and ‘bigsur2.jpg’ that you find at the top of this post are the downsampled versions of ‘BigSur1.JPG’ and ‘BigSur2.JPG’ that the script looks for. Just skip the first 6 lines of code, and you should be fine!

hi cris. i am engineering student and working on my BE project since last 6 months. i am trying for ’3D VIEW’ of an image. It accepts image and makes it 3D. but not getting at all. plz help mi out from this.

Hrushikesh,

I’m not at all clear with what you’re asking.

Hi Cris. I am a student working on seamless image stitching. i found that the watershed segmentation ‘dip_watershed’ used in this blog is quite different from the method adopted by MATLAB, I wonder which algorithm does it use, and it will be great if a corresponding paper can be provided.

Hyman,

I don’t have a reference for it. The algorithm produces a correct watershed if the watershed lines do not cross plateaus (regions in the image with constant value). You can avoid these by adding a little bit of noise to the image. The algorithm includes region merging, which is controlled by the last two input parameters. In short, a watershed line is not drawn if one of the two adjacent regions has at most

`max_size`

pixels and the grey-value difference between its minimum and the location of the potential line is at most`max_depth`

. If the condition is satisfied, the two regions are merged and continue growing as a single region. You can use these two parameters to include an implicit H-minima transform or do the equivalent of a size-based region merging, or some combination of the two.For correct plateau handling, use the a combination of the functions

`minima`

and`waterseed`

.Hi Cris,

first thanks for sharing this blog.

I download two images “bigsur1.jpg” and “bigsur2.jpg”, and my DIPimage version is 2.5.1. After I run “bigsur.m” in Matlab 2013a, there are some errors:

>> bigsur

Error using dip_image/subsref (line 196)

Index exceeds image dimensions.

Error in bigsur (line 65)

a = a(0:imsize(a,1)+s(1)-1,:);

May you have a look at it?

Thanks!

Panovr,

Please look at the comment by Ramzan Ullah (April 18th, 2013) a little above yours, and my answer to it.

Hi Cris,

thanks for the tip, and it solved the problem.

Hi Cris,

for this statement “d = 255 – max(abs(a-b));”: I know abs(a-b) will get a dip_image object of absolute value of a – b. But what the meaning of max(abs(a-b))?

Thanks!

Hi panovr,

`max(image)`

gets the maximum value of`image`

. So`max(abs(a-b))`

gets the maximum absolute difference between the two images.Hi Cris,

because a and b are dip_image object of color images, if max(image) gets the maximum value of image, then d = 255 – max(image) should be a scalar.

However, it seems that d is also a dip_image object, not a scalar.

panovr,

My bad!

`a`

and`b`

are colour images. In this case, the`max`

function takes the maximum over the three colour channels:`max(a)`

is equivalent to`max(max(a{1},a{2}),a{3})`

.Hi Cris,

what the meaning of this statement: “n = closing(m|n,2,’rectangular’) – m – n;”? In particular, m | n ?

Hi Cris,

as my understanding:

n = closing(m|n,2,’rectangular’) – m – n;

this statement aims to find the boundary between regions m and n represent (m and n are region mask indeed);

if ~any(n)

V(ii,kk) = 0.01;

else

V(ii,kk) = sum(d(n));

this if-else statement try to assign a weight between node ii and node kk. If the boundary mask n is not empty, the weight is the integrateed gray-values of d as you said.

I just wonder that when the condition ~any(n) will be satisfied, and how you select weight 0.01?

panovr,

Indeed,

`n = closing(m|n,2,’rectangular’) – m – n;`

is a quick-and-dirty way of finding which pixels form the boundary between the two regions. It’s expensive for what it does, but doesn’t require much code writing, which is good.Earlier code makes sure that

`m`

and`n`

are neighbours, but it is still possible that the closing finds no pixels in the boundary. In this case, I assign a small non-zero value to the edge in the graph. (`~any(n)`

is true if all of the pixels in`n`

are zero.)Hi Cris,

you said that “The big regions on the left and right side of the image will be the source and sink for computing the flow.”, and the code is:

N1 = double(w(1,1));

N2 = double(w(end-1,1));

kk1 = find(labs==N1);

kk2 = find(labs==N2);

As my understanding, N1 and N2 are the label values of the two big regions. kk1 and kk2 are the corresponding vertex id.

Why N1 and N2 are the big regions, I mean that why N1 = double(w(1,1)) not double(w(0,0)), and N2 = double(w(end-1, 1))?

panovr,

`w`

is the result of the Watershed. The algorithm I used here is a little lazy, and marks all pixels at the border of the image as watershed pixels (i.e. assigns no label to them). This just simplifies the algorithm and makes it a lot faster, as there is no need to check if a pixel is on the boundary before trying to access its neighbours. But this is why I need to test the pixel`w(1,1)`

,`w(0,0)==0`

.PS: There’s no need to ask the same question twice. Sometimes you’ll need to wait for an answer, I don’t have infinite time.

Hi Cris,

I thought my question was not submitted successfully, sorry for ask the question twice.

I’m very appreciate for you answer, thanks!

Hi Cris,

when crop both images, the code below:

a = a(:, 0:imsize(a, 2) – s(2) – 1);

b = b(:, 0:imsize(b, 2) – s(2) – 1);

maybe should be:

a = a(:, 0:imsize(a, 2) + s(2) – 1);

b = b(:, 0:imsize(b, 2) + s(2) – 1);

panovr,

You’re right! Fixed it in the script.

Hi Cris,

after calculated the maxflow, you color the labels image as:

w = setlabels(w, labs(L==0), N1);

w = setlabels(w, labs(L==1), N2);

1. I know labs(L==0) is to get all the labels which are connected to Source. It seems that “setlabels” is not a Matlab function, and it is a dipimage function. May you interpret it more?

2. After set labels of w, I want to display w (with watershed lines). I know Matlab has a nice “label2rgb” function, is it feasible in dipimage? (If I use dipshow, w will be showed as a black image)

3. the script uses “w = dip_growregions(w, [], [], 1, 0, ‘low_first’);” to eliminate watershed lines. From debug, I can see N1 = 83, N2 = 1. So after color the labels, all labels are divided into 83 and 1 (not included the watershed lines). How “dip_growregions” woks, and what the meaning of parameter “low_first”?

4. If we can show the labels image w, I think we can visualize the region growing process.

Thanks!

panovr,

1. You can type

`help setlabels`

to learn more about the function, and`edit setlabels`

to see exactly what it does.2. In DIPimage, you can type

`dipshow(w,'labels')`

to see each label with a different colour. Or, you can just type`w`

and select the ‘labels’ mapping from the menu.3. You can read the documentation for the

`dip_GrowRegions()`

function online: http://qiftp.tudelft.nl/dipref/GrowRegions.html. In short, because no grey-value image is given as input, the ‘low_first’ flag is ignored. But you need to give it because parameters are directly passed to a C function.I very strongly recommend you read the DIPimage User Manual. You could have figured out answers to all these questions there. :)

Hi Cris,

thanks for pointing out the direction!

Hi Cris,

if we blur the inverse difference image before watershed transform, the generated regions number is much less than the original code. I did a quick test, and found that the compositing result is comparable with the original one. May you experiment with it by add dip_gauss?

panovr,

Indeed, blurring the image removes local minima, which means that the watershed will produce fewer regions. However, the boundary of the regions will also shift, so it is not recommended to blur too much. Instead, try increasing the region merging that the watershed algorithm in DIPimage does. Region merging does not shift any boundaries. In the code above, I used 50 (any region with less depth than this will be merged with a larger region), and 1e9 (any region smaller than this number of pixels will be merged). 1e9 effectively means that we don’t look at the region size. But you can tweak these parameters to force more regions to be merged.