I am working with the following bash function for CSV processing:
sort_data () {
for csv in "${rescore}"/${str_name}/*.csv; do
csv_name=$(basename "$csv" .csv)
if [ "${MY_SORT_METHOD}" = "2" ]; then
LC_ALL=C sort -k2,2g ${csv} > "${rescore}"/${str_name}/${csv_name}_std.csv
# run awk script to take 5% of data
awk -v lines="$(wc -l < "${rescore}"/${str_name}/${csv_name}_std.csv)" '
BEGIN{
top=int(lines/20)
}
FNR>(top){exit}
1
' "${rescore}"/${str_name}/${csv_name}_std.csv >> "${rescore}"/${str_name}/${csv_name}_TOP.csv
# remove input csv with all lines
rm "${rescore}"/${str_name}/${csv_name}_std.csv
else
echo "Debug: data was not sorted correctly!"
fi
rm $csv
done
}
I would like to fix the layout of the AWK part of the function so that it sits visibly inside the enclosing IF block (indented, as in Python). Is it possible in Visual Studio Code to select part of the code and shift the whole selection to the right by a chosen number of tabs?
Why quote only the $rescore variable? You should be quoting all of them. E.g. don't do "${rescore}"/${str_name}/${csv_name}_std.csv. Instead do: "${rescore}/${str_name}/${csv_name}_std.csv". And you should definitely be quoting $csv in the rm $csv line. Double-quote variables every time, except when you explicitly want shell word-splitting to occur (and even then, you should consider using an array instead).
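For illustration, here is one way the function might look once it is indented and every expansion is double-quoted. This is only a sketch that keeps the original logic; the helper variables dir and sorted are names introduced here for readability and are not part of the original code. In VS Code, selecting the lines and pressing Tab (or Ctrl+] on Windows/Linux, Cmd+] on macOS) indents the whole selection one level, and Shift+Tab outdents it, so the awk block can be shifted in a few keystrokes.

```bash
sort_data () {
    # "dir" and "sorted" are illustrative helpers, not in the original code
    local dir="${rescore}/${str_name}"
    local csv csv_name sorted
    for csv in "${dir}"/*.csv; do
        csv_name=$(basename "$csv" .csv)
        if [ "${MY_SORT_METHOD}" = "2" ]; then
            sorted="${dir}/${csv_name}_std.csv"
            LC_ALL=C sort -k2,2g "$csv" > "$sorted"
            # keep the first 5% of the sorted lines
            awk -v lines="$(wc -l < "$sorted")" '
                BEGIN { top = int(lines / 20) }
                FNR > top { exit }
                1
            ' "$sorted" >> "${dir}/${csv_name}_TOP.csv"
            # remove the intermediate sorted file
            rm "$sorted"
        else
            echo "Debug: data was not sorted correctly!"
        fi
        rm "$csv"
    done
}
```

The whitespace inside the single-quoted awk program is ignored by awk, so it can be indented freely to match the surrounding if block. With the quoting in place, file names containing spaces should also pass through sort, awk and rm intact.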