I need help reading a text file in PHP. The first part of the file consists of variable definitions, and the second part consists of data spread over multiple rows; each row is limited to 79 characters.
I want to read the data and store it in a MySQL database.
The file is an EpiData .rec file.
The file is structured as below:
44 1
_COUNTRY 1 3 30 57 3 3 3 112 COUNTRY Country ...............................
_IDCODE 1 4 30 57 4 3 20 112 IDCODE EPID No................................
#HOTCASE 1 5 30 57 5 0 1 112 HOTCASE Hot case...............................
_DISTRICT 1 6 30 57 6 3 20 112 DISTRICT District...............................
_PROVINCE 1 7 30 57 7 3 20 112 PROVINCE Province...............................
_DOB 1 8 30 57 8 11 10 112 DOB Date of birth..........................
#AGE 1 9 30 57 9 0 3 112 AGE Age (in months)........................
#SEX 1 10 30 57 10 0 1 112 SEX Sex....................................
_DONSET 1 11 30 57 11 11 10 112 DONSET Date of onset of paralysis.............
_DNOT 1 12 30 57 12 11 10 112 DNOT Date of notification...................
_DOI 1 13 30 57 13 11 10 112 DOI Date of case investigation.............
_DSTCOLL1 1 14 30 57 14 11 10 112 DSTCOLL1 Date stool collected:1.................
_DSTCOLL2 1 15 30 57 15 11 10 112 DSTCOLL2 Date stool collected:2 ................
#DOSESR 1 16 30 57 16 0 1 112 DOSESR Routine doses of OPV...................
#DOSESN 1 17 30 57 17 0 2 112 DOSESN Doses of OPV during NID/SIA............
#DOSES 1 18 30 57 18 0 2 112 DOSES Total polio doses......................
_DLOPV 1 19 30 57 19 11 10 112 DLOPV Date of last OPV.......................
#FEVER 1 20 30 57 20 0 1 112 FEVER Fever..................................
#PROGRESS 1 21 30 57 21 0 1 112 PROGRESS Progression............................
#ASYM 1 22 30 57 22 0 1 112 ASYM Asymmetric paralysis...................
_DFUP 1 23 30 57 23 11 10 112 DFUP Date of follow-up......................
#FUP 1 24 30 57 24 0 1 112 FUP Findings at follow-up..................
_DSTLAB 1 25 30 57 25 11 10 112 DSTLAB Date stool(s) received in lab..........
_DTRES 1 26 30 57 26 11 10 112 DTRES Date prelim results received by EPI....
_DIRES 1 27 30 57 27 11 10 112 DIRES Date ITD results received by EPI.......
#STCOND 1 28 30 57 28 0 1 112 STCOND Stool condition... ....................
#L20B 1 29 30 57 29 0 1 112 L20B L20B isolated..........................
#P1 1 30 30 57 30 0 1 112 P1 P1 (lab results).......................
#P2 1 31 30 57 31 0 1 112 P2 P2 (lab results).......................
#P3 1 32 30 57 32 0 1 112 P3 P3 (lab results).......................
#ENTERO 1 33 30 57 33 0 1 112 ENTERO Entero (lab results)...................
#CLASS 1 34 30 57 34 0 1 112 CLASS Classification.........................
#FDIAG 1 35 30 57 35 0 1 112 FDIAG Final diagnosis... ....................
_OTHDIAG 1 36 30 57 36 3 6 112 OTHDIAG Diagnosis (if FDIAG=Other).............
_SDIAG 1 37 30 57 37 1 40 112 SDIAG Specify diagnosis (if OTHDIAG=Other)...
#CONTACT 1 38 30 57 38 0 1 112 CONTACT Number of contacts.....................
#ELIGCONT 1 39 30 57 39 0 1 112 ELIGCONT AFP case eligible for contacts.........
#INADAFP 1 40 30 57 40 0 1 112 INADAFP Reason for contact - inadequate........
#HOTAFP 1 41 30 57 41 0 1 112 HOTAFP Reason for contact - hot AFP...........
#HARDAREA 1 42 30 57 42 0 1 112 HARDAREA Reason for contact - area..............
#OTHREAS 1 43 30 57 43 0 1 112 OTHREAS Reason for contact - other.............
_SOTHREAS 1 44 30 57 44 1 30 112 SOTHREAS Other reason, specify..................
#WILDCONT 1 45 30 57 45 0 1 112 WILDCONT Wild poliovirus from contacts..........
#VDPVCONT 1 46 30 57 46 0 1 112 VDPVCONT VDPV isolated from contacts............
AFGAFG/06/06/725 2KABUL KABUL 242!
18/09/2006 22/09/200623/09/20060 9918/09/2006 !
26/09/2006 12444134 !
!
AFGAFG/05/11/370 2MUSAYI KABUL 602!
22/11/2014 25/11/201427/11/20140 9927/10/2014 !
29/11/201411/12/2014 12444234 !
!
AFGAFG/05/07/101 2BAMYAN BAMYAN 9001!
0 9905/08/2007 !
!
!
AFGAFG/05/17/005 2SAYDABAD WARDAK 541!
02/01/201704/01/201704/01/201704/01/201705/01/20175131818/10/2016111 !
10/01/201721/01/2017 12444235G04 !
!
AFGAFG/05/17/007 1WARAS BAMYAN 61!
01/01/201704/01/201704/01/201704/01/201705/01/20174 3 718/10/2016111 !
10/01/201721/01/2017 12444235B34 !
!
AFGAFG/05/17/002 2KABUL KABUL 181!
01/01/201701/01/201701/01/201702/01/201707/01/20175 61117/10/2016111 !
07/01/201719/01/2017 12444235G81 !
!
AFGAFG/05/17/003 2SHEKHALI PARWAN 441!
01/01/201703/01/201703/01/201703/01/201705/01/20174141816/10/2016112 !
07/01/201719/01/2017 12444235E87.6 !
!
AFGAFG/05/17/008 2NERKH WARDAK 482!
03/01/201704/01/201704/01/201705/01/201706/01/20175121718/10/2016111 !
10/01/201721/01/2017 12444235B34 !
!
AFGAFG/05/17/001 2KHENJ (HES-E- AWAL) PANJSHER 142!
01/01/201702/01/201702/01/201702/01/201703/01/20175 4 917/10/2016111 !
05/01/201716/01/2017 12444235B34 !
!
AFGAFG/05/17/004 2KABUL KABUL 362!
01/01/201702/01/201702/01/201702/01/201704/01/20175 71214/12/2016111 !
06/01/201717/01/2017 12444235B34 !
!
The first line of the file gives the number of variables (44), and the 8th column of each definition line (the one just before the 112 column) is the length of that variable. I managed to read the variables into an array, but I am stuck on how to read the data for each variable.
Here is what I have so far:
<?php
$file_name = 'CTRYAFP10.rec';
if (file_exists($file_name)) {
    $file = fopen($file_name, "r");
    $first_row = fgets($file);
    // fgets() is used to read the file row by row
    $first_row_array = explode(" ", trim($first_row));
    $numberOfVariables = intval($first_row_array[0]);
    $total_length_for_all_varibles = 0;
    $number_of_data_rows = 0;
    $j = 0;
    $heads = array();
    $last_end = null;
    for ($i = 0; $i < $numberOfVariables; $i++) {
        $result = fgets($file);
        $variable_name = strtolower(trim(substr($result, 1, 11)));
        $current_item_length = intval(trim(substr($result, 36, 4)));
        $total_length_for_all_varibles += $current_item_length;
        $last_end = $current_item_length + intval($last_end);
        if ($current_item_length > 0) {
            if ($i === 0) {
                // first iteration
                $heads[$i]['start'] = 0; // variable starts at position 0 of the row
            } else {
                $prev_start = $heads[$i - 1]['start'];
                $prev_item_length = $heads[$i - 1]['field_length'];
                $x = $prev_start + $prev_item_length;
                $data_row_limit = 79 - $x; // each data row is limited to 79 characters
                if ($data_row_limit > $current_item_length) {
                    $heads[$i]['start'] = $x;
                } else {
                    $heads[$i]['start'] = 0;
                }
            }
            $heads[$i]['field_length'] = $current_item_length;
            $heads[$i]['variable_name'] = strtolower(trim($variable_name));
        } // end if length > 0
    } // end of the loop that collects the variables
    $number_of_data_rows = ceil($total_length_for_all_varibles / 79); // in this case 4
    $total_length_for_all_varibles += $number_of_data_rows; // in this case 283
    //while (!feof($file)){
    $data = '';
    $insert_data = array();
    for ($i = 1; $i <= $number_of_data_rows; $i++) {
        $data .= fgets($file);
    }
    /*
    Output of the previous loop, which holds the data (or empty values) for the 44 variables:
    AFGAFG/06/06/725 2KABUL KABUL 242!
    18/09/2006 22/09/200623/09/20060 9918/09/2006 !
    26/09/2006 12444134 !
    !
    */
    foreach ($heads as $key => $val) {
        $item_val = trim(substr($data, $val['start'], $val['field_length']));
        $insert_data[$val['variable_name']] = $item_val;
    }
    var_dump($insert_data);
    exit();
    //}
}
The output of the above code is shown below. It reads the correct values up to the ["sex"] variable, but it does not continue to the second row after character position 79:
array(44) {
["country"]=>
string(3) "AFG"
["idcode"]=>
string(13) "AFG/06/06/725"
["hotcase"]=>
string(1) "2"
["district"]=>
string(5) "KABUL"
["province"]=>
string(5) "KABUL"
["dob"]=>
string(0) ""
["age"]=>
string(2) "24"
["sex"]=>
string(1) "2"
["donset"]=>
string(10) "AFGAFG/06/"
["dnot"]=>
string(6) "06/725"
["doi"]=>
string(6) "2KABUL"
["dstcoll1"]=>
string(0) ""
["dstcoll2"]=>
string(5) "KABUL"
["dosesr"]=>
string(0) ""
["dosesn"]=>
string(0) ""
["doses"]=>
string(0) ""
["dlopv"]=>
string(0) ""
["fever"]=>
string(0) ""
["progress"]=>
string(0) ""
["asym"]=>
string(0) ""
["dfup"]=>
string(3) "242"
["fup"]=>
string(1) "A"
["dstlab"]=>
string(10) "FGAFG/06/0"
["dtres"]=>
string(5) "6/725"
["dires"]=>
string(6) "2KABUL"
["stcond"]=>
string(0) ""
["l20b"]=>
string(0) ""
["p1"]=>
string(0) ""
["p2"]=>
string(0) ""
["p3"]=>
string(0) ""
["entero"]=>
string(0) ""
["class"]=>
string(0) ""
["fdiag"]=>
string(0) ""
["othdiag"]=>
string(1) "K"
["sdiag"]=>
string(29) "AFGAFG/06/06/725 2KABUL"
["contact"]=>
string(0) ""
["eligcont"]=>
string(0) ""
["inadafp"]=>
string(0) ""
["hotafp"]=>
string(0) ""
["hardarea"]=>
string(1) "K"
["othreas"]=>
string(1) "A"
["sothreas"]=>
string(30) "BUL 2"
["wildcont"]=>
string(1) "4"
["vdpvcont"]=>
string(1) "2"
}
This is just a sample; the actual files consist of thousands of rows.
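For reference, here is a minimal sketch of one way the wrapped rows could be handled. It rests on assumptions that are not confirmed above: that every physical data line carries at most 78 data characters followed by a `!` marker, and that the fields of one record simply continue across line breaks. Under those assumptions, the `!` and newline are stripped from each line, the lines are padded to 78 characters and joined, and each field is then cut at a running offset (instead of resetting the start to 0, which is why the output above repeats values from the first row). The function names `rec_join_lines`, `rec_split_fields` and `rec_read_records` are hypothetical helpers, and the demo uses only the first nine fields with values taken from the sample record; the right-aligned `' 24'` for age is also an assumption.

```php
<?php
// Assumption: every physical data line holds up to 78 data characters
// followed by a "!" marker, and fields run on across line breaks.

/** Join the physical lines of one record into one logical string. */
function rec_join_lines(array $lines, int $lineWidth = 78): string
{
    $record = '';
    foreach ($lines as $line) {
        $chunk = rtrim($line, "\r\n");
        if (substr($chunk, -1) === '!') {
            $chunk = (string) substr($chunk, 0, -1); // drop the end-of-line marker
        }
        // Pad or cut to the fixed width so field offsets stay aligned.
        $record .= substr(str_pad($chunk, $lineWidth), 0, $lineWidth);
    }
    return $record;
}

/** Slice a logical record into fields; $widths maps name => length. */
function rec_split_fields(string $record, array $widths): array
{
    $values = [];
    $offset = 0;
    foreach ($widths as $name => $width) {
        $values[$name] = trim((string) substr($record, $offset, $width));
        $offset += $width; // running offset, never reset per line
    }
    return $values;
}

/** Yield one associative array per record from an open file handle. */
function rec_read_records($handle, array $widths, int $lineWidth = 78): Generator
{
    $linesPerRecord = max(1, (int) ceil(array_sum($widths) / $lineWidth));
    while (true) {
        $lines = [];
        for ($i = 0; $i < $linesPerRecord; $i++) {
            $line = fgets($handle);
            if ($line === false) {
                return; // end of file, or an incomplete trailing record
            }
            $lines[] = $line;
        }
        yield rec_split_fields(rec_join_lines($lines, $lineWidth), $widths);
    }
}

// Demo with the first nine fields only (widths taken from the header above).
$widths = ['country' => 3, 'idcode' => 20, 'hotcase' => 1, 'district' => 20,
           'province' => 20, 'dob' => 10, 'age' => 3, 'sex' => 1, 'donset' => 10];
$line1 = 'AFG' . str_pad('AFG/06/06/725', 20) . '2' . str_pad('KABUL', 20)
       . str_pad('KABUL', 20) . str_pad('', 10) . ' 24' . '2' . "!\n";
$line2 = '18/09/2006' . "!\n";
$row = rec_split_fields(rec_join_lines([$line1, $line2]), $widths);
print_r($row);
```

In the real script, `$widths` would be built from the definition lines (variable name => length), and `rec_read_records($file, $widths)` could be iterated with `foreach` right after the header has been read, so that thousands of records are processed one at a time before being inserted into MySQL.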