There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).
Java Solution
This problem can be converted to the problem of finding the kth element, where k is (A’s length + B’s length)/2.
If either of the two arrays is empty, then the kth element is the non-empty array’s kth element. If k == 0, the kth element is the smaller of the first elements of A and B.
In all other cases, we move the pointers by roughly half of the remaining range at each step to get O(log(m+n)) time.
public double findMedianSortedArrays(int[] nums1, int[] nums2) {
    int total = nums1.length+nums2.length;
    if(total%2==0){
        return (getKth(nums1, 0, nums1.length-1, nums2, 0, nums2.length-1, total/2)
              + getKth(nums1, 0, nums1.length-1, nums2, 0, nums2.length-1, total/2-1))/2.0;
    }else{
        return getKth(nums1, 0, nums1.length-1, nums2, 0, nums2.length-1, total/2);
    }
}

//k is the index starting from 0
private int getKth(int[] nums1, int i1, int j1, int[] nums2, int i2, int j2, int k){
    if(j1<i1){
        return nums2[i2+k];
    }
    if(j2<i2){
        return nums1[i1+k];
    }
    if(k==0){
        return Math.min(nums1[i1], nums2[i2]);
    }
    int len1 = j1 - i1 + 1;
    int len2 = j2 - i2 + 1;
    int m1 = k*len1/(len1+len2);
    int m2 = k - m1 - 1;
    m1 += i1;
    m2 += i2;
    if(nums1[m1]<nums2[m2]){
        k = k-(m1-i1+1);
        j2 = m2;
        i1 = m1+1;
    }else{
        k = k-(m2-i2+1);
        j1 = m1;
        i2 = m2+1;
    }
    return getKth(nums1, i1, j1, nums2, i2, j2, k);
}
The main challenge is calculating the middle elements. We cannot compute them as in a regular binary search:
int m1 = i1+(j1-i1)/2;
int m2 = i2+(j2-i2)/2;
That results in either an infinite loop or missing the element at the beginning. The key is that each step always drops at most half of the remaining elements.
Nice version, however my head hurts when I try to understand it.
So, this is what my brain produced:
public static double findMedianSortedArrays(int[] a, int[] b) {
int len = a.length + b.length;
if (len % 2 != 0)
return findKth(a, 0, b, 0, len/2);
else {
double median1 = findKth(a, 0, b, 0, (len-1)/2);
double median2 = findKth(a, 0, b, 0, len/2);
return (median1+median2)/2;
}
}
public static double findKth(int[] a, int sa, int[] b, int sb, int k) {
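// base cases: k is a 0-based rank among the remaining elements
if (sa == a.length)
return b[sb + k];
else if (sb == b.length)
return a[sa + k];
else if (k == 0)
return Math.min(a[sa], b[sb]);
// clamp ka so that both a[sa+ka] and b[sb+kb] stay in bounds
int ka = Math.max(k - (b.length - sb), Math.min(k/2, a.length - sa - 1));
int kb = k - ka - 1;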
if (a[sa+ka] >= b[sb+kb])
return findKth(a, sa, b, sb+kb+1, ka);
return findKth(a, sa+ka+1, b, sb, kb);
}
No extra space allowed…
Can someone explain solution1..how are we dropping the elements?
The complexity is not correct. If k == 1, then this is O(max(M, N)).
What about using a min and a max heap here ? similar to finding the mean of streaming data ?
For 1st solution, it prints 7 which is correct !!
int[] nums1 = new int[]{1,2, 7, 8};
int[] nums2 = new int[]{3, 4, 5, 6, 9, 10, 11, 12, 13};
System.out.println(medianTwoSortedArrays.findMedianSortedArrays(nums1, nums2));
Prints 9.
How is any one of the solutions correct. This makes no sense.
Why do you initiate aMid as aMid = aLen * k / (aLen + bLen) ?? What’s the point??
>> Answer is this post:
http://articles.leetcode.com/find-k-th-smallest-element-in-union-of/
If the two arrays are of length ‘n’, the complexity would be O(n), in that case.
The above solution has a better complexity
(1) aMid = aLen / 2, (2) k = (aLen + bLen) / 2
2k = (aLen + bLen) which means (3) 2 = (aLen + bLen) / k
then, just substitute (3) to (1)
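Spelled out, the substitution described above looks like this. This is only a sketch of the algebra in this comment, ignoring integer rounding, and the identity is exact only when k is the median rank; for other values of k the formula is best read as splitting k in proportion to the array lengths.

```latex
\begin{aligned}
aMid &= \frac{aLen}{2}, \qquad k = \frac{aLen + bLen}{2}
  \;\Longrightarrow\; \frac{1}{2} = \frac{k}{aLen + bLen} \\
aMid &= aLen \cdot \frac{1}{2} = aLen \cdot \frac{k}{aLen + bLen}
\end{aligned}
```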
I used an extra array and stored the merged elements up to index (n+m)/2. Is this fine?
public class Solution {
public double findMedianSortedArrays(int[] nums1, int[] nums2) {
int le=nums1.length+nums2.length;
int l =(le)/2;
int[] arr =new int[l+1];
int j=0;
int k=0;
for(int i=0;i<=l;i++){
if(j>=nums1.length || k>=nums2.length){
if(j>=nums1.length){
arr[i]=nums2[k];
k++;
}else{
arr[i]=nums1[j];
j++;
}
}else{
if(nums1[j]<nums2[k]){
arr[i]=nums1[j];
j++;
}else if(nums1[j]>nums2[k]){
arr[i]=nums2[k];
k++;
}else{
arr[i]=nums2[k];
if(i+1<=l){
i++;
arr[i]=nums1[j];
j++;
}
k++;
}
}
}
if(le%2==0){
return (arr[l]+arr[l-1])/2.0;
}else
return arr[l];
}
}
I’m not seeing the connection between the two formulas…and how those two is used to derive the other.
aMid = aLen * k/(aLen + bLen)
from
aMid = aLen / 2 and k = (aLen + bLen)/2
at best you get:
aMid = aLen / 2 = (2k-bLen)/2 which is not aLen * k/(aLen + bLen)
Can you expound on this?
Thank you. This is the best solution among all online!
Step 5 is correct if and only if both arrays have the same length. Only in that case is the median of the two subarrays equal to the median of the two original arrays.
solution 2 when N>M
public class Solution {
public double findMedianSortedArrays(int[] nums1, int[] nums2) {
if(nums1.length>nums2.length){
int[] tem= nums1;
nums1=nums2;
nums2=tem;
}
int s1 = nums1.length;
int s2 = nums2.length;
if(s1==0&&s2==0) return 0;
int m1=(s1-1)/2;
int m2=(s2-1)/2;
if(s1==0){
double median= (s2%2==0)? (nums2[m2]+nums2[m2+1])/2.0: nums2[m2];
return median;
}
double mv1 =0;
double mv2=0;
mv1= (s1%2==0)? (nums1[m1]+nums1[m1+1])/2.0: nums1[m1];
mv2= (s2%2==0)? (nums2[m2]+nums2[m2+1])/2.0: nums2[m2];
if (mv1==mv2) return mv1;
if(s1==1) {
if(s2%2==0){
if(nums1[0]nums2[m2+1]) return nums2[m2+1];
else return nums1[0];
}
}else{
if (s2==1) return (nums1[0]+nums2[0])/2.0;
if(nums1[0]<nums2[m2]){
if (nums1[0]nums2[m2+1]) return (nums2[m2+1]+nums2[m2])/2.0;
else return (nums1[0]+nums2[m2])/2.0;
}
}
}
int s=s1+s2;
if (s % 2 == 0) {
if(s1==s2)
return findMedianSortedArrays(nums1,0,s1-1,nums2,0,s1-1,false);
else{
if(nums1[0]>=nums2[(s2+s1)/2]) return (nums2[(s2+s1)/2-1]+nums2[(s2+s1)/2])/2.0;
if(nums1[s1-1]nums1[s1-1])return nums2[(s2-s1)/2];
return findMedianSortedArrays(nums1,0,s1-1,nums2,(s2-s1)/2+1,(s2+s1)/2,true);
}
}
public double findMedianSortedArrays(int[] n1, int nb1,int ne1,int[] n2,int nb2, int ne2, boolean odd) {
int s = ne1 - nb1 + 1 ;
if(s==2){
if (odd) return Integer.min(Integer.max(n1[nb1],n2[nb2]),Integer.min(n1[ne1],n2[ne2]));
return (Integer.max(n1[nb1],n2[nb2])+Integer.min(n1[ne1],n2[ne2]))/2.0;
}
int mid = (s-1)/2;
if (n1[nb1+mid]==n2[nb2+mid]) return n1[nb1+mid];
if (n1[nb1+mid]>n2[nb2+mid]){
if(s%2==0) ne1=nb1+mid+1;
else ne1=nb1+mid;
return findMedianSortedArrays(n1,nb1,ne1,n2,nb2+ mid,ne2,odd);
}
if(s%2==0) ne2=nb2+mid+1;
else ne2=nb2+mid;
return findMedianSortedArrays(n1,nb1+mid,ne1,n2,nb2,ne2,odd);
}
}
When N > M, trim N down to M; then solution 2 is O(log M).
solution 2 when nM
public static double findMedianSortedArrays(int[] nums1, int[] nums2) {
int s1 = nums1.length;
int s2 = nums2.length;
if(s1==0&&s2==0) return 0;
if(nums1.length>nums2.length){
int[] tem= nums1;
nums1=nums2;
nums2=tem;
}
int m1=(s1-1)/2;
int m2=(s2-1)/2;
if(s1==0)
return (s2%2==0)? (nums2[m2]+nums2[m2+1])/2.0: nums2[m2];
double mv1 =0;
double mv2=0;
mv1= (s1%2==0)? (nums1[m1]+nums1[m1+1])/2.0: nums1[m1];
mv2= (s2%2==0)? (nums2[m2]+nums2[m2+1])/2.0: nums2[m2];
if (mv1==mv2) return mv1;
if (s2==1&&s1==1) return (nums1[0]+nums2[0])/2.0;
if(s1==1) {
if(s2%2==0){
if(nums1[0]nums2[m2+1]) return nums2[m2+1];
else return nums1[0];
}
}else{
if(nums1[0]<nums2[m2]){
if (nums1[0]nums2[m2+1]) return (nums2[m2+1]+nums2[m2])/2.0;
else return (nums1[0]+nums2[m2])/2.0;
}
}
}
int s=s1+s2;
if (s % 2 == 0) {
if(s1==s2)
return findMedianSortedArrays(nums1,0,s1-1,nums2,0,s1-1,false);
else{
if(nums1[0]>=nums2[(s2+s1)/2]) return (nums2[(s2+s1)/2-1]+nums2[(s2+s1)/2])/2.0;
if(nums1[s1-1]nums1[s1-1])return nums2[(s2-s1)/2];
return findMedianSortedArrays(nums1,0,s1-1,nums2,(s2-s1)/2+1,(s2+s1)/2,true);
}
}
public static double findMedianSortedArrays(int[] n1, int nb1,int ne1,int[] n2,int nb2, int ne2, boolean odd) {
int s = ne1 - nb1 + 1 ;
if(s==2){
if (odd) return Integer.min(Integer.max(n1[nb1],n2[nb2]),Integer.min(n1[ne1],n2[ne2]));
return (Integer.max(n1[nb1],n2[nb2])+Integer.min(n1[ne1],n2[ne2]))/2.0;
}
int mid = (s-1)/2;
if (n1[nb1+mid]==n2[nb2+mid]) return n1[nb1+mid];
if (n1[nb1+mid]>n2[nb2+mid]){
if(s%2==0) ne1=nb1+mid+1;
else ne1=nb1+mid;
return findMedianSortedArrays(n1,nb1,ne1,n2,nb2+mid,ne2,odd);
}
if(s%2==0) ne2=nb2+mid+1;
else ne2=nb2+mid;
return findMedianSortedArrays(n1,nb1+mid,ne1,n2,nb2,ne2,odd);
}
Thanks very much for the explanation. You are a lifesaver. Can you please also explain the logic for int bMid = k - aMid - 1; // b’s middle count
In this case, a={2,3,4} and b={1}, I think the answer does not work.
No. It won’t work if the length of the 2 arrays is different.
implementation at:
http://www.geeksforgeeks.org/median-of-two-sorted-arrays/
In Java, 1/2*3=0 and 1*3/2=1. Reason is that in the first expression, 1/2 is evaluated as 0.
I am not sure why it says index out of bound when I put (aLen/(aLen+bLen)*k) instead of (aLen*k/(aLen+bLen)). Anyone can explain?
Thank you for the great code. Especially for sweet variables’ names 🙂
(However, explanation of algorithm by Gunner86 is not related to code!)
Could you please explain those initiating aMid and bMid? I cannot get it.
Is that wrong to put:
aMid = aLen / 2 and bMid = bLen / 2
Do we get wrong output?
Good idea! But there is a problem: if we find the (k-1)th value, we still can’t decide which array the kth value is in,
because A[aMid] could still be the kth number, so it should be included.
Well done! But there is a small problem with your method: if aLen=7, bLen=16, k=7, you can get aMid=0. I think you’d better cast aLen to double first; that is to say, you’d better write it like this:
aMid = (int) ( (double)aLen / (aLen + bLen) * k)
Because in the original post, function findKth is a general function to find kth number in two sorted arrays, not only median of two sorted arrays.
I suppose your calculation is not right. the result of this algorithm is 6.
if k==m+n, the program can recursively hit the condition k==0
scwinji is right. you can read his answer.
Your analysis is more general and awesome, and I think you are right. But there is one small problem in your answer: I suppose “If A[aMid] is less than B[bMid], ” should be “If A[aMid] is greater than B[bMid], ”. Anyway, thank you very much.
I love this site’s solutions, they are so clean and easy to understand! 🙂
you can do like this: aMid=(int)(1.0*aLen/(aLen+bLen)*k);
int aMid = aLen * k / (aLen + bLen);
Why this magic formula? I tried others, but only this one works. Why?
when you arrange the two array it would be 25, 28 ,30 ,31. So the median would be average of 28 and 30. 🙂
Right, to use the code’s terms, aMid + bMid + 1 = k must be satisfied to be able to make the conclusions it does when A[aMid] > B[bMid] (expanded on later). They also must all be integers >= 0. That line is just how X Wang decided to determine these values, and it has certain runtime properties associated with it. You *could* replace the lines with randomly generating these numbers within bounds:
int aMid, bMid;
Random r = new Random();
if (aLen > bLen) {
bMid = (int)(Math.min(bLen,k) * r.nextDouble());
aMid = k - bMid - 1;
} else {
aMid = (int)(Math.min(aLen,k) * r.nextDouble());
bMid = k - aMid - 1;
}
… which would still get you the right answer, but have pretty different runtime properties.
As for why aMid + bMid + 1 = k is significant: If A[aMid] is less than B[bMid], you know that any elements in after A[aMid] in A can’t be the kth element since there are too many elements in B lower than it (and would exceed k elements). You also know that B[bMid] and any element before B[bMid] in B can’t be the kth element since there are too few elements in A lower than it (there wouldn’t be enough elements before B[bMid] to be the kth element).
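To make the invariant concrete, here is a minimal sketch of a kth-element search built on aMid + bMid + 1 = k. It is not the original post's code: the class name KthBySplit and the method kth are made up for illustration, k is a 0-based rank, the product is computed in long as a precaution, and the comparison is written in the "A[aMid] is greater than B[bMid]" direction.

```java
final class KthBySplit {
    // Element at 0-based rank k in the union of sorted a[aLo..aHi] and b[bLo..bHi].
    // Assumes 0 <= k < (number of elements in both ranges).
    static int kth(int[] a, int aLo, int aHi, int[] b, int bLo, int bHi, int k) {
        int aLen = aHi - aLo + 1;
        int bLen = bHi - bLo + 1;
        if (aLen == 0) return b[bLo + k];
        if (bLen == 0) return a[aLo + k];
        if (k == 0) return Math.min(a[aLo], b[bLo]);

        // Any split with aMid + bMid + 1 == k and both offsets in range is valid;
        // the proportional split is just one choice.
        int aMid = (int) ((long) aLen * k / (aLen + bLen));
        int bMid = k - aMid - 1;

        if (a[aLo + aMid] > b[bLo + bMid]) {
            // b[..bMid] all rank below k: drop them and lower k accordingly.
            // Everything after a[aMid] ranks above k: drop it, but keep a[aMid].
            return kth(a, aLo, aLo + aMid, b, bLo + bMid + 1, bHi, k - bMid - 1);
        }
        // Symmetric case: drop a[..aMid] and everything after b[bMid].
        return kth(a, aLo + aMid + 1, aHi, b, bLo, bLo + bMid, k - aMid - 1);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 7, 8};
        int[] b = {3, 4, 5, 6, 9, 10, 11, 12, 13};
        // 13 elements in total, so the median is the element at rank 6.
        System.out.println(kth(a, 0, a.length - 1, b, 0, b.length - 1, 6)); // prints 7
    }
}
```

The inclusive end (keeping a[aMid] in the first branch) is what the invariant buys: a[aMid] has at least k elements at or below it, so it may still be the answer, while everything after it cannot be.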
It’s really odd how this was picked out as the pseudocode to X Wang’s solution, but it doesn’t say anything about the kth element. This algorithm doesn’t have a solution for when an array is of size 1, and also produces an incorrect solution for inputs ar1=[1,2,3] and ar2=[100,100]. On step 4, you get to a point where ar1=[2,3] and ar2=[100,100], and then step 6 calculates the median as (max(2,100) + min(3,100))/2 = 51.5, when it should be actually 3. Sorry, but this is not the same algorithm as the original post.
Which means that you can just do aMid = aLen / 2
Also, why int aMid = aLen * k / (aLen + bLen); // a’s middle count
int bMid = k - aMid - 1; // b’s middle count to calculate mid element. Why not aMid = aLen / 2 and bMid = bLen / 2?
I think when k == m + n is also a special case that should be handled similar to k == 0
considering [1,2,3] and [4,5,6,7,8,9,10,11]
The result of this algorithm is 3.5, whereas the real median is 6. Am I missing something here, or does this solution only apply when the two lists are the same size?
What if the two subarrays in the end are {25, 30} and {28, 31}? Here the median should have been average of 25 and 28. This is not attainable going by the logic mentioned by you. Can you please put some light on it?
Why is this inclusive? Could any one elaborate on this?
aEnd = aMid; // not aEnd = aMid - 1?
same question here, can anyone give a better answer?
in the even case you have to run the algo twice. If your findkth func just returns the index and the value, then you can just find the kth -1 value to then find the average without searching for the next value
If you merge them and find the median element then time complexity would jump to O(n+m).
Guru!
Hmm, not really. It should be okay if the length for both arrays or both sub-arrays are different. Open to discuss:)
aMid = aLen / 2 and k = (aLen + bLen)/2, so aMid = aLen * k/(aLen + bLen)
I don’t think it matters as long as the invariant i + j = k - 1 is satisfied.
The reason is it might be able to guess the kth element quicker if you weight the arrays.
You have to do it in O(log(length of A + length of B))
I might be missing something but since A and B are two sorted arrays, why not just merge them and return the middle element?
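For reference, the merge idea can be sketched as below. It walks both arrays only as far as the middle, so it needs no extra array, but it is still O(m+n) and does not meet the O(log(m+n)) requirement; it is mostly useful as a simple check against the faster solutions. The class name MergeMedian is made up for this example.

```java
final class MergeMedian {
    // Median of two sorted arrays by walking them in merged order up to the middle.
    static double median(int[] a, int[] b) {
        int total = a.length + b.length;
        if (total == 0) {
            throw new IllegalArgumentException("both arrays are empty");
        }
        int i = 0, j = 0;
        int prev = 0, curr = 0;
        // After the loop, curr is the element at rank total/2 and prev the one before it.
        for (int step = 0; step <= total / 2; step++) {
            prev = curr;
            if (j >= b.length || (i < a.length && a[i] <= b[j])) {
                curr = a[i++];
            } else {
                curr = b[j++];
            }
        }
        // Odd total: single middle element. Even total: average of the two middle ones.
        return total % 2 == 1 ? curr : ((long) prev + curr) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new int[]{25, 30}, new int[]{28, 31}));                        // prints 29.0
        System.out.println(median(new int[]{1, 2, 3}, new int[]{4, 5, 6, 7, 8, 9, 10, 11}));     // prints 6.0
    }
}
```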
This is the best solution I have ever seen, simple, efficient and elegant. But I spend a lot of time understanding initiating aMid and bMid. Thank you for sharing this!
Can you guys tell me if this one is OK ?
public static void main(String[] args) {
int[] arr1 = { 1, 3, 4, 7, 8, 11, 44, 55, 62 };
int[] arr2 = { 2, 4, 5, 7, 33, 56, 77 };
double median = getMedian(arr1, arr2);
System.out.println("calculated Median : " + median);
}
private static double getMedian(int[] arr1, int[] arr2) {
int[] medianIndices = getMedianIndices(arr1.length, arr2.length);
double median = 0;
int currIndx1 = 0, currIndx2 = 0, currArr = 0;
if (medianIndices.length == 2) {
for (int i = 0; i <= medianIndices[1]; i++) {
if (arr1[currIndx1] < arr2[currIndx2])
currArr = 1;
else
currArr = 2;
if (i == medianIndices[0] || i == medianIndices[1]) {
if (currArr == 1)
median += arr1[currIndx1];
else
median += arr2[currIndx2];
}
if (currArr == 1)
currIndx1++;
else
currIndx2++;
}
median = median / 2;
} else {
for (int i = 0; i <= medianIndices[0]; i++) {
if (arr1[currIndx1] < arr2[currIndx2])
currArr = 1;
else
currArr = 2;
if (i == medianIndices[0]) {
if (currArr == 1)
median += arr1[currIndx1];
else
median += arr2[currIndx2];
}
if (currArr == 1)
currIndx1++;
else
currIndx2++;
}
}
return median;
}
private static int[] getMedianIndices(int l1, int l2) {
int[] medianIndices;
if ((l1 + l2) % 2 == 0) {
medianIndices = new int[2];
medianIndices[0] = (l1 + l2) / 2 - 1;
medianIndices[1] = (l1 + l2) / 2;
} else {
medianIndices = new int[1];
medianIndices[0] = (l1 + l2) / 2;
}
return medianIndices;
}
can anyone answer it?
To use this algorithm, Array A and B need to be the same length
Perhaps the logic will help to understand it better..
Algorithm:
1) Calculate the medians m1 and m2 of the input arrays ar1[]
and ar2[] respectively.
2) If m1 and m2 both are equal then we are done.
return m1 (or m2)
3) If m1 is greater than m2, then median is present in one
of the below two subarrays.
a) From first element of ar1 to m1 (ar1[0…|_n/2_|])
b) From m2 to last element of ar2 (ar2[|_n/2_|…n-1])
4) If m2 is greater than m1, then median is present in one
of the below two subarrays.
a) From m1 to last element of ar1 (ar1[|_n/2_|…n-1])
b) From first element of ar2 to m2 (ar2[0…|_n/2_|])
5) Repeat the above process until size of both the subarrays
becomes 2.
6) If size of the two arrays is 2 then use below formula to get
the median.
Median = (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1]))/2
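The steps above can be sketched in Java roughly as follows, under the assumption that both arrays have the same length n >= 1 (the steps do not cover arrays of different sizes). The class and method names are made up, the n == 1 case is an added assumption because the steps do not list it, and the sums are assumed to fit in an int.

```java
final class EqualLengthMedian {
    // Median of the n elements a[lo .. lo+n-1] of a sorted array.
    static double medianOf(int[] a, int lo, int n) {
        return n % 2 == 0 ? (a[lo + n / 2 - 1] + a[lo + n / 2]) / 2.0 : a[lo + n / 2];
    }

    // Median of the union of a[la .. la+n-1] and b[lb .. lb+n-1]; both ranges have length n.
    static double getMedian(int[] a, int la, int[] b, int lb, int n) {
        if (n == 1) {
            return (a[la] + b[lb]) / 2.0;   // added base case, not in the steps above
        }
        if (n == 2) {
            return (Math.max(a[la], b[lb]) + Math.min(a[la + 1], b[lb + 1])) / 2.0;
        }
        double m1 = medianOf(a, la, n);
        double m2 = medianOf(b, lb, n);
        if (m1 == m2) {
            return m1;
        }
        // Drop the same number of elements from the low end of one array
        // and the high end of the other, so both ranges stay equal in length.
        int drop = (n - 1) / 2;
        if (m1 < m2) {
            return getMedian(a, la + drop, b, lb, n - drop);   // upper part of a, lower part of b
        }
        return getMedian(a, la, b, lb + drop, n - drop);       // lower part of a, upper part of b
    }

    public static void main(String[] args) {
        int[] ar1 = {1, 12, 15, 26, 38};
        int[] ar2 = {2, 13, 17, 30, 45};
        System.out.println(getMedian(ar1, 0, ar2, 0, ar1.length)); // prints 16.0
    }
}
```

Dropping equal counts from opposite ends is what keeps the median unchanged, which is also why this approach does not carry over directly to arrays of different lengths.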
Thanks
Kartrace
points out
do the ratio first.
aMid = aLen / (aLen + bLen) * k
NO!!! do the * first. or you will fail. e.g {2, 3, 4} and {1}.
3/4*2=0 3*2/4=1.
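A small sketch of the difference, using the {2, 3, 4} and {1} example with k = 2. The class and method names are made up for illustration; the two expressions are the ones quoted in this thread.

```java
final class DivisionOrder {
    // Ratio first: the integer division aLen / (aLen + bLen) truncates to 0
    // whenever bLen > 0, so the result is always 0 in that case.
    static int ratioFirst(int aLen, int bLen, int k) {
        return aLen / (aLen + bLen) * k;
    }

    // Multiply first: the truncation happens only once, at the end.
    static int multiplyFirst(int aLen, int bLen, int k) {
        return aLen * k / (aLen + bLen);
    }

    public static void main(String[] args) {
        int aLen = 3, bLen = 1, k = 2;
        System.out.println(ratioFirst(aLen, bLen, k));    // prints 0
        System.out.println(multiplyFirst(aLen, bLen, k)); // prints 1
    }
}
```

With aMid = 0, bMid = k - aMid - 1 becomes 1, which is past the end of a one-element B; that would explain the index-out-of-bounds error reported for the ratio-first form.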
Why do you initiate aMid as aMid = aLen * k / (aLen + bLen) ?? What’s the point??
You flatter me. Keep it up.
Your algorithm is amazing~ I am working through leetcode right now.
The edge conditions are incorrect: start + k is not the kth element. For example, when start == 0 and we are looking for the 8th element, we would end up with array[8], whereas we need array[7]:
// Handle special cases
if (aLen == 0)
return B[bStart + k];
if (bLen == 0)
return A[aStart + k];
I have never used java until today but when I tried this code, I noticed that this line:
int aMid = aLen * k / (aLen + bLen);
overflows very easily when aLen, k > 65536
I used a very ugly hack:
int aMid = k * 4 / (aLen + bLen) * aLen / 4
But what is the best practice?
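One common way to avoid the overflow, offered here as a suggestion rather than as the original author's fix, is to do the multiplication in long and cast the quotient back to int. The quotient never exceeds aLen, so the cast is safe. The class and method names below are made up.

```java
final class SafeSplit {
    // Original form: aLen * k is computed in int and can wrap around.
    static int aMidInt(int aLen, int bLen, int k) {
        return aLen * k / (aLen + bLen);
    }

    // Widened form: the product and the sum are computed in long.
    static int aMidLong(int aLen, int bLen, int k) {
        return (int) ((long) aLen * k / ((long) aLen + bLen));
    }

    public static void main(String[] args) {
        int aLen = 100_000, bLen = 100_000, k = 100_000;
        System.out.println(aMidLong(aLen, bLen, k)); // prints 50000
        System.out.println(aMidInt(aLen, bLen, k));  // a wrapped, incorrect value
    }
}
```

Unlike the k * 4 / (aLen + bLen) * aLen / 4 workaround, this keeps the full precision of the proportional split.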