Thank you all for participating in discussion.
*** this is my home project to help my wife do extract data from research calculations /// speed up is around 400 times **
file used for extracting data from, contains around 2000 lines,
needed data blocks look like this
and they're repeated 10-20 times in the file.
uiyououy COORDINATES
NR ATOM CCCCC X Y Z
1 O1 8.00 0.000000000 0.882236820 -0.789494235
2 O2 8.00 0.000000000 -1.218250722 -1.644061652
3 O3 8.00 0.000000000 1.218328524 0.400260050
4 O4 8.00 0.000000000 -0.882314622 2.033295837
Text text text text
tons of text
to extract 4 lines I used expression below
grep -A4 --no-group-separator -e "1 O1" $from_file | cut -c23-64
>xyz_temp.txt
# grep 4 lines at once to txt
sed -i '/^[ \t]*$/d' xyz_temp.txt
#del empty lines from xyz txt
next is to convert string in to numbers (should use '| bc -l' for arithmetic)
while IFS= read line
do
IFS=' ' read -r -a arr_line <<< "$line"
# break line of xyz into 3 numbers
s1=$(echo "${arr_line[0]}" \* 0.529177249 | bc -l)
# some math convertion
s2=$(echo "${arr_line[1]}" \* 0.529177249 | bc -l)
s3=$(echo "${arr_line[2]}" \* 0.529177249 | bc -l)
#-------to array non sorted ------------
arr[$n]=${n}";"${from_file}";"${gd_}";"${frt[count_4s]}";"${n4}";"${s1}";"${s2}";"${s3}
echo ${arr[n]}
#--------------------------------------------
done <"$from_file_txt"
sort array
IFS=$'\n' sorted=($(sort -t \; -k4 -k5 -g <<<"${arr[*]}"))
# -t separator ';' -k column -g generic * to get new line output
#-k4 -k5 sort by column 4 then5
#printf "%s\n" "${sorted[*]}"
unset IFS
There is Last part which will combine data to result view
echo "$n"
n2=1
n42=1
count_4s2=1
i=0
echo "============================== sorted =============================="
################### loop for empty 4s lines
printf "%s" ";" ";" ";" ";" ";" "${count_4s2}" ";"
printf "%s\n"
printf "%s\n" "${sorted[i]}"
while [ $i -lt $((n-2)) ]
do
i=$((i+1))
if [ "$n42" = "4" ] # 1234
then n42=0
count_4s2=$((count_4s2+1))
printf "%s" ";" ";" ";" ";" ";" "${count_4s2}" ";"
printf "%s\n"
fi
#--------------------------------------------
n2=$((n2+1))
n42=$((n42+1))
printf "%s\n" "${sorted[i]}"
done ############# while
#00000000000000000000000000000000000000
printf "%s\n"
echo ==END===END===END==
Output looks like this
============================== sorted ==============================
;;;;;1;
17;A-13_A1+.out;1.3;0.4;1;0;.221176355474853043;-.523049776514580244
18;A-13_A1+.out;1.3;0.4;2;0;-.550350051428402955;-.734584881824005358
19;A-13_A1+.out;1.3;0.4;3;0;.665269869069959489;.133910683627893251
20;A-13_A1+.out;1.3;0.4;4;0;-.336096173116409577;1.123723974181515102
;;;;;2;
13;A-13_A1+.out;1.3;0.45;1;0;.279265277182782148;-.504490787956469897
14;A-13_A1+.out;1.3;0.45;2;0;-.583907412327951988;-.759310392973448167
15;A-13_A1+.out;1.3;0.45;3;0;.662538493711206290;.146829200993661293
16;A-13_A1+.out;1.3;0.45;4;0;-.357896358566036450;1.116971979936256771
;;;;;3;
9;A-13_A1+.out;1.3;0.5;1;0;.339333719743262501;-.482029749553797105
10;A-13_A1+.out;1.3;0.5;2;0;-.612395507070451545;-.788968880150283253
11;A-13_A1+.out;1.3;0.5;3;0;.658674809217196345;.163289820251690233
12;A-13_A1+.out;1.3;0.5;4;0;-.385613021360830052;1.107708808923212876
==END===END===END==
*note : some code might not shown here
next step is to paste it to excel with ; separator.
grepcommand to know where a previousgrepcommand found a match. They don't communicate (and even if they could, not searching from the beginning of the file if you have grepped the same file before would be highly annoying most of the time).