馃攳
Merge sort algorithm - YouTube
Channel: mycodeschool
[0]
So far in this series on sorting algorithms,
we have talked about 3 of the sorting
[5]
algorithms - selection sort, bubble sort and
insertion sort. And we have seen that these
[13]
algorithms are not so fast, they're all O
(n^2) in average case. Now in this lesson,
[21]
we're going to talk about one algorithm which
is O(nlogn) in worst case and this algorithm
[30]
is Merge Sort, O(nlogn) in worst case is definitely
a lot better, a lot faster than O(n^2) in
[39]
average case. So, in this lesson, we will
study, discuss and analyze Merge Sort Algorithm.
[46]
There's one pre-requisite for this lesson.
You should have at least heard about recursion
[53]
as a programming concept. Ok, so let's get
started. Once again, I will pick up a very
[58]
simple sorting scenario. Given a list of integers
in a form of an array, something like this.
[65]
Let's name this array A. We have 8 elements
in the array. So, we have indices from 0 to
[70]
7. We want to sort this list in increasing
order of value of integers, so the list should
[76]
be rearranged like this. Our approach in Merge
Sort Algorithm will be entirely different
[84]
from what we've done in previous sorting algorithms,
where we're rearranging the elements or changing
[90]
their positions only by swapping or shifting.
What we're going to do here is we're going
[95]
to break this problem into sub problems. We
will divide this array into two equal halves,
[102]
or rather two possibly equal halves. So, we
will find some middle position and we can
[108]
say that all the elements before this position
belong to the first half and all the elements
[113]
after, on or after this position belong to
the second half. If an array would've odd
[119]
number of elements, one of the halves will
have more elements than the other half. We've
[124]
8 elements in the original array here. So
we've two equal halves. Now think about this,
[130]
what if we are somehow able to sort these
two halves, and let's say these two halves
[135]
are entirely different arrays. They are created
separately in memory by copying values from
[141]
the original array A. If we're somehow able
to sort these two arrays, then we can merge
[147]
these two lists together in original list
in sorted order. Of course, there has to be
[154]
some algorithm to merge two sorted arrays
into a third array in sorted order. The algorithm
[161]
will be pretty straightforward, let's say
this particular sub array is named L and this
[167]
particular sub array is named R; L for Left
and R for Right. Because all the elements
[172]
in A are present either in L or R, we can
start overwriting A from left to right. We
[179]
can start at 0th position in A. At any point,
the smallest element will be either the smallest
[186]
'unpicked' in L, or the smallest 'unpicked'
in R, and let's say we colour code the smallest
[193]
'unpicked' in L and R by this yellow colour.
What we can do is we can pick the smaller
[200]
of the two smallest 'unpicked in L and R.
We've two candidates here - 1 and 3. 1 is
[207]
smaller, so we can write 1 here at 0th index.
And now we can look for the number to fill
[212]
at 1th index in A. Let's say, the cells of
the 'picked' elements will be colour coded
[218]
in green. If I have to write 'pseudo-code'
to merge the elements of the two sorted arrays
[224]
into a third array, let's say we want to write
a function named 'Merge' that will take three
[232]
arrays as arguments - Left(L), Right(R) and
the array(A) in which it should be merging
[237]
two sorted arrays- Left(L) and Right(R). Then,
I will do something like this; I will take
[242]
the variable that will store the number of
elements in L, and another variable that will
[249]
store the number of elements in R. In a real
program, we can also pass these two values
[254]
to the function. Now, I will take three variables
-- I, j and k; and initialize them all to
[262]
0. Let's say, I will mark the index of the
smallest 'unpicked' in L, 'j' will mark the
[271]
index of the smallest unpicked in R and k
will mark the index of the position that needs
[278]
to be filled in A for our example here at
the stage of i=1, j=0 and k=1; because we've
[288]
already filled one element at index 0 in A.
But when we will start, we will start with
[293]
all three i, j and k as 0. And now, my code
will go like, while i < nL, where nL is the
[303]
number of elements in L, for i to be a valid
index it should be less than nL, and similarly
[309]
for j to be valid, it should be less than
nR. So, while both these two indices are valid,
[315]
both these indices i and j are valid, we can
say something like if L[i] is less than or
[320]
equal to R[j], so we are comparing the smallest
in, or rather the smallest 'unpicked' in L
[327]
with the smallest 'unpicked' in R. In this
case, at kth position in A, we will write
[334]
L[i], remember we are overwriting A and that's
not a problem and now I need to increment
[340]
k and need to go to the next position, and
I also need to increment i to go to the next
[346]
unpicked in L. And if the condition is not
true and R[j] is less than L[i], then A[k]
[354]
will be R[j]. And once again, we need to increment
k and we need to increment j to go to the
[362]
next 'unpicked' in R. This k = k + 1 is in
both the conditions if as well as else, so
[372]
I've moved it out. Coming back to our example
here, i and j are both valid in this, now
[379]
we'll compare if L[i] is less than or equal
to R[j], well yes, L[i] is less than R[j].
[385]
So we need to pick 2 for 1th index, and we
need to increment both i as well as k, and
[393]
for the next position it's between 4 and 3.
3 will go. And we will increment j and k this
[401]
time and next it's between 4 and 5, next it's
between 5 and 6, next it's between 7 and 6,
[411]
and after 6 has gone, we're done with all
the elements in L. i is equal to 4 now, which
[417]
is not a valid index. So, in the while loop,
this condition i is less than no. of elements
[423]
in L will be false, and this is definitely
one probability, one of the arrays L or R
[429]
will exhaust first. In that case, we need
to pick all the elements of the other array
[435]
and fill rest of the positions in A. After
we come out of this while loop, we can write
[441]
statements like while i is less than no. of
elements in L, so we can check whether there
[448]
are leftovers in L, we can do the same thing,
we can say A[k] = L[i] and then we can increment
[457]
i as well as k. I'm short of space here, so
I'm writing multiple statements in the same
[463]
line. And similarly, we can write while loop
like while j is less than no. of elements
[468]
in R, we can fill in A[k] with R[j] and this
time, j and k will be incremented. Once, we're
[476]
out of this first while loop, only one of
these two while loops will execute, because
[481]
only one of the sub lists or sub arrays will
have leftovers. For this particular example,
[487]
this third while loop will execute, because
right sub array has leftovers, so we will
[493]
fill up all the remaining positions, so finally
we will have a sorted arrangement in A. This
[502]
is our Merge logic and there are couple of
ways in which we can clean up this code further,
[508]
but for now let's just understand the logic.
And now coming back to where we'd started.
[516]
In the beginning, we had imagined that if
somehow these two sub arrays or sub lists,
[523]
uh I'm redrawing the unsorted original array
and unsorted sub lists. So, we'd said that
[530]
if we're somehow able to sort these two lists,
then we can merge them back into the original
[536]
list. But, of course, we need to have a deterministic
logic to sort these two sub lists or sub arrays
[543]
also. And the logic is, we can break these
sub lists even further. So, this sub list
[552]
comprising of four elements - 2,4,1 and 6
can be further divided into these two halves,
[559]
and this list comprising of 8,5,3,7 can be
divided into these two sub lists 8,5 and 3,7.
[567]
The solution for 2,4,1,6 this particular sub
list can be constructed after we sort these
[577]
two sub lists 2,4 and 1,6 and merge them back.
And similarly, we can sort these two sub lists
[589]
and merge them back to sort this 8,5,3,7 sub
list. Once again, we've these four sub lists
[599]
of two elements each and they can also be
divided. What we're basically doing here is
[605]
that we're reducing a problem into sub problems
in a recursive or self-similar manner. And
[612]
at any step, once we get solution of sub problems,
we can construct the solution of the actual
[619]
problem. If we've two sorted sub lists, we
can sort the parent list also. We can go on
[626]
reducing a sub list only till we've more than
one element in the sub list. Once we reach
[631]
a stage where we've only one element in a
sub list, then we cannot reduce that sub list
[638]
any further. So, once we reach a state where
our sub list has only one element, our recursion
[646]
will end. And a list with only one element
is always sorted. We do not need to do anything
[651]
to sort it. Now, at this stage, you can start
combining back or merging the sub lists. So,
[659]
these two sub lists can be merged. Let's say
we will depict the cells in parted sub lists
[667]
in green. We've already discussed the merge
logic. This sub list 2, 4 will still be the
[674]
same after merge also. Sub lists 1 and 6,
these two sub lists with only one element
[681]
each will also merge. And now we can merge
2, 4 and 1, 6. I'm coming to this side. All
[689]
these sub lists with one element are already
sorted, so we'll start merging back. Finally,
[696]
these two sorted sub lists 1,2,4,6 and 3,5,7,8
can be merged back to original list A. And
[706]
now let's write pseudo-code for this algorithm.
I'll write a function named Merge Sort and
[712]
that will take an array A as argument. In
the function, first I'll take the variable
[717]
that will store the number of elements in
A. And now we can partition A into two halves.
[725]
We need to partition A only if n is greater
than 1. If n is less than 2, then we having
[733]
only one element in the array, so, the array
is already sorted, we can simply return. Else,
[738]
what we can do is, we can first find out a
middle position and then we can create two
[745]
arrays - one of size equal to mid and another
of size (n- mid). So, first array will have
[754]
all the elements starting index 0 till mid-1.
We can just fill the elements. We can run
[761]
a loop from 0 to mid-1, so there will be mid
elements in all and we can say left[i] = A[i].
[770]
And then we can run another loop from index
mid till n-1, so there will be n-mid elements
[778]
in all, and we need to fill in right[i-mid]
as A[i]. Now that we've created left and right
[786]
sub lists, we can first make a recursive call
to sort the left sub list, and once we're
[793]
done with sorting the left sub list, we can
make a recursive call to sort the right sub
[799]
list. And once both left and right sub lists
are sorted, we can make a call to the Merge
[805]
function that we'd written before to merge
left and right sub lists into A. It's really
[813]
important to visualize how this recursion
will actually execute. Once again, I'll start
[818]
over with an unsorted array and let's say
this array is passed to the Merge Sort function.
[824]
Now, let's run through this code and see what's
happening. I'll start with the first line.
[830]
We can calculate n - the number of elements
in the array, the no. of elements in this
[835]
array is 8. It's not less than 2, this condition
is not true. So, we will not return and exit
[843]
from this function, we will move forward.
We'll calculate the mid index. Now, n is 8,
[849]
so, mid will be 4 and now we will create two
arrays - left and right, one of size mid and
[856]
another of size (n-mid). We will fill up these
arrays, the first 4 elements will go to left
[864]
and the next four will go to right. And now,
we're making a recursive call. When a function
[869]
calls itself, then such a call is called recursive
call. A function calling itself is not much
[877]
different from a function A calling another
function B. At this stage, the execution of
[882]
this particular function call with this array,
with 8 elements, as argument is paused and
[890]
the machine says that hey, let me go and finish
this particular function call and then I'll
[894]
come back to you. The machine goes on to execute
Merge Sort on this particular array with 4
[901]
elements 2, 4, 1 and 6. Now, once again, we
start at the first line in a new call to Merge
[907]
Sort for this new array. N is not less than
2, so we will not return and exit. This particular
[913]
condition is the base condition or the exit
condition from the recursion. If this was
[919]
not there, we would have gone endlessly in
recursion. We needed to stop somewhere. Once
[924]
again, we will create lefts and rights, and
once again there will be a recursive call
[929]
passing this array 2, 4 as argument. So, once
again, the state of execution of this second
[934]
Merge Sort will also be paused. All of these
are paused. The state of execution of the
[940]
function called with these arrays as arguments
are paused, this one is executing. Now, once
[947]
again, here also we will have a recursive
call. Now for this particular array with just
[951]
one element, this base condition will be true.
So, this call will simply exit. And now with
[957]
this guy, this array with two elements - 2
and 4, this second recursive call will be
[962]
made with just this element 4, which once
again is the base condition. And now, once
[968]
both the Merge Sort, both the recursive calls
for this particular sub list with two elements
[975]
return back, Merge function will be called.
And then this guy will finish. And then, the
[981]
control will return back for execution of
this particular sub list 2,4,1,6 and this
[987]
guy will make the second Merge Sort call.
This guy will first make a call for this sub
[993]
list with just one element and once this is
done, it will make another recursive call.
[998]
Now we will have a Merge for this guy 1, 6
and then control will return back here to
[1005]
this array 2,4,1,6 and Merge will be called
for this guy. And now once this guy 1,2,4,6
[1014]
will finish the control will return to the
function call corresponding to the original
[1020]
array and this guy will make another recursive
call, the second Merge Sort call. In actual
[1025]
implementation, we must make sure that all
these extra spaces and extra sub lists that
[1031]
we're creating should be deleted from the
memory once we're done using them. Like at
[1037]
this stage, we do not need all these arrays
with 1 and 2 elements in the memory. Now,
[1041]
for this guy 8,5,3,7 we will have a recursive
call passing 8,5 that will again make a recursive
[1049]
call with just one element 8 and then we will
also have a call to 5, that will just simply
[1056]
return. Once, 5, 8 returns, we will have call
for 3, 7 - 3 and 7 will simply Merge. And
[1066]
now, 5, 8 and 3, 7 will Merge and finally
when execution for 3,5,7,8 will finish we
[1073]
will have a call to Merge for the original
array or the initial array. So, this is Merge
[1080]
Sort Algorithm for you. At the start of the
lesson, we'd said that this is O (nlogn) in
[1086]
terms of time complexity. In our next lesson,
we will implement this algorithm, we will
[1090]
run some real code and we will also analyze
the time and space complexity of this algorithm.
[1097]
This is it for this lesson. Thanks for watching!
Most Recent Videos:
You can go back to the homepage right here: Homepage





