I would appreciate any help.
I have 7 separate arrays with approximately 90,000 numbers in each (call them array1 through array7). There are no duplicate numbers within any single array, but there can be duplicates between arrays. For example, array2 has no internal duplicates, but it may share numbers with array3 and array4.
The problem: I am trying to identify all of the numbers that appear exactly 3 times once all 7 arrays are merged.
I must do this calculation 1,000 times, and it takes 15 minutes; that is not acceptable, because I then have to run the whole thing 40 times. The code:
If you know of another language better suited to this type of calculation, please let me know. Suggestions for extensions such as Redis or Gearman are also welcome.
for ($kj = 1; $kj <= 1000; $kj++)
{
    // Merge all 7 arrays (~630,000 elements) and count occurrences.
    $result = array_merge($files_array1, $files_array2, $files_array3,
                          $files_array4, $files_array5, $files_array6,
                          $files_array7);
    $result = array_count_values($result);

    // Write out every number that occurs exactly 3 times.
    $fp_lines = fopen("equalTo3.txt", "w");
    foreach ($result as $key => $val)
    {
        if ($val == 3)
        {
            fwrite($fp_lines, $key."\r\n");
        }
    }
    fclose($fp_lines);
}
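One observation may help: because each of the 7 arrays is internally duplicate-free, a count of 3 in the merged data means "present in exactly 3 of the 7 arrays". That count can be built directly with an associative tally, without materializing the ~630,000-element merged array. A minimal sketch (tiny stand-in arrays here instead of the real 90,000-element ones; requires PHP 7+ for the `??` operator):

```php
<?php
// Stand-in data for $files_array1..$files_array7 from the question.
$arrays = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [10, 11],
    [11, 12],
    [12, 13],
    [13, 14],
];

// Tally how many arrays each number appears in. Since each array has no
// internal duplicates, a tally of 3 means "in exactly 3 of the 7 arrays".
$counts = [];
foreach ($arrays as $arr) {
    foreach ($arr as $n) {
        $counts[$n] = ($counts[$n] ?? 0) + 1;
    }
}

// Numbers present in exactly 3 of the 7 arrays (here: only 3).
$triplicates = array_keys($counts, 3, true);
sort($triplicates);

file_put_contents("equalTo3.txt", implode("\r\n", $triplicates));
```

Also worth checking: if the 7 arrays do not change between iterations, the merge-and-count work inside the 1,000-iteration loop is identical every time and could be computed once outside it.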
I have also tried the string-based code below, but the array_map() call and the array_count_values() call together take 17 minutes:
for ($kj = 1; $kj <= 1000; $kj++)
{
    $result = '';
    for ($ii = 0; $ii < 7; $ii++) {
        // Note: the separator must be double-quoted; '\r\n' in single
        // quotes appends the literal characters \r\n, not a CRLF.
        $result .= $files_array[$hello_won[$ii]]."\r\n";
    }
    $result2 = explode("\n", $result);          // 5 mins
    $result2 = array_map("trim", $result2);     // 11 mins
    $result2 = array_count_values($result2);    // 4-6 mins

    $fp_lines = fopen("equalTo3.txt", "w");
    foreach ($result2 as $key => $val)
    {
        if ($val == 3)
        {
            fwrite($fp_lines, $key."\r\n");
        }
    }
    fclose($fp_lines);
    unset($result2);
}
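In the string-based version, much of the time goes to the array_map("trim") pass, which exists only to strip the "\r" left on each element when CRLF-terminated content is split on "\n". Splitting on the full "\r\n" separator removes that pass entirely. A minimal sketch with tiny stand-in data (the real $files_array entries hold ~90,000 CRLF-separated numbers each):

```php
<?php
// Stand-in file contents: CRLF-separated number lists.
$chunks = ["1\r\n2\r\n3", "2\r\n3\r\n4", "3\r\n4\r\n5"];

// Join with a real CRLF (double quotes!) and split on the full "\r\n",
// so no per-element trim() pass is needed afterwards.
$result = implode("\r\n", $chunks);
$lines  = explode("\r\n", $result);

// Count occurrences and keep the values that appear exactly 3 times.
// (PHP converts numeric string keys to integers, so the result holds ints.)
$counts = array_count_values($lines);
$triplicates = array_keys($counts, 3, true);
```

The same idea applies to the original code: one explode("\r\n", ...) replaces both the explode("\n", ...) and the array_map("trim", ...) steps.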
Comment: Are the numbers small, like 34, 47, 21, or huge, like 337894294529243592, 38439434949238, 39859893922242? What percentage of the combined 7 arrays would you guess are duplicates? Triplicates, etc.? And what do you mean by needing to do this 1,000 times, and 40 times?