Hello, I am trying to read a line into an array in a ksh script, but some values end up stored more than once, in adjacent array elements, when the field value contains a comma. How can this be avoided? My delimiter is ~.

input

17~4~~~char~Y~\"[_a-zA-Z0-9\.]\+@[a-zA-Z0-9]\+\.[a-z]\{2,\}\"~40~email id

code

while IFS= read -r line
  do
   
        if [ $n != 1 ]; then
                IFS="~"
  
                set -A star $line
                col_pos=${star[1]}
                col_patt=${star[6]}
                col_len=${star[7]}
                col_file_id=${star[0]}

value of $line

 17 4   char Y \"[_a-zA-Z0-9\.]\+@[a-zA-Z0-9]\+\.[a-z]\2\" \"[_a-zA-Z0-9\.]\+@[a-zA-Z0-9]\+\.[a-z]\\\" 40 email id

The issue is that the pattern ("[_a-zA-Z0-9.]+@[a-zA-Z0-9]+.[a-z]\2") is duplicated when the line is read, although the pattern field is defined only once in the actual input file.

2 Answers

1

In:

set -A star $line

as $line is unquoted, you're invoking the split+glob operator. Here, you do want the split part, but not the glob part, so you should disable it first with:

set -o noglob

As @Isaac correctly identified, your issue here is not so much with globbing but with brace expansion that is done by ksh (and only ksh¹) in addition to globbing but is also disabled when globbing is disabled.
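A minimal sketch of the split-without-glob idea, using the sample record from the question. It uses set -- and the positional parameters instead of set -A so that it also runs in shells without ksh arrays; the names old_ifs and field_count are only for illustration. In shells other than ksh the brace expansion would not happen on $line anyway, so there the sketch only shows that the fields come out intact.

```shell
# Sample record from the question; single quotes keep every character literal.
line='17~4~~~char~Y~\"[_a-zA-Z0-9\.]\+@[a-zA-Z0-9]\+\.[a-z]\{2,\}\"~40~email id'

old_ifs=$IFS         # remember the current separators (illustrative name)
set -o noglob        # no globbing (and, in ksh, no brace expansion) on $line
IFS='~'              # split on "~" only, so "email id" stays one field
set -- $line         # deliberately unquoted: the split is wanted here
IFS=$old_ifs
set +o noglob

field_count=$#
col_patt=$7          # positional parameters are 1-based, so ${star[6]} is $7
printf '%s fields, pattern: %s\n' "$field_count" "$col_patt"
```

The pattern field should come out once, with {2,} still in it.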

set -A array x y z was the ksh88 / pdksh way to set arrays as a whole. Newer versions of ksh including ksh93 and mksh have adopted the zsh-style array=(x y z).

With the set -A syntax, you need:

set -A array -- values

(except in zsh when not emulating ksh), as otherwise, it would not work properly if the first value started with - or +.
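A quick way to see why the -- matters, sketched with set -- and positional parameters rather than set -A so that it runs in any POSIX shell. The sample value and the variable names are made up for illustration.

```shell
# Hypothetical record whose first field starts with "-".
line='-x~4~char'

old_ifs=$IFS
set -o noglob
IFS='~'
# Without "--", the first word "-x" would be taken as an option to set
# (xtrace) instead of being stored as a value.
set -- $line
IFS=$old_ifs
set +o noglob

first=$1
count=$#
printf 'first=%s count=%s\n' "$first" "$count"
```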

So:

set -o noglob; IFS='~'
while IFS= read -r line
do
  if [ "$n" != 1 ]; then
    set -A star -- $line
    col_pos=${star[1]}
    col_patt=${star[6]}
    col_len=${star[7]}
    col_file_id=${star[0]}

Though here, I would avoid all those problems and use this standard sh syntax:

while IFS='~' read -r col_file_id col_pos x x x x col_patt col_len x
do
  if [ "$n" != 1 ]; then
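As a self-contained sketch of that read-based approach, applied to the sample record from the question: parse_record is a hypothetical helper name, and it feeds a single record to read through a here-document instead of looping over a file.

```shell
# parse_record is a hypothetical helper, not part of the original script.
# It splits one "~"-delimited record into named variables using read alone,
# so the record is never expanded unquoted.
parse_record() {
  # IFS='~' applies only to this read; x is a throwaway for unused fields.
  IFS='~' read -r col_file_id col_pos x x x x col_patt col_len x <<EOF
$1
EOF
}

line='17~4~~~char~Y~\"[_a-zA-Z0-9\.]\+@[a-zA-Z0-9]\+\.[a-z]\{2,\}\"~40~email id'
parse_record "$line"
printf 'id=%s pos=%s len=%s patt=%s\n' \
  "$col_file_id" "$col_pos" "$col_len" "$col_patt"
```

Because ~ is not a whitespace character, the two empty fields between 4 and char are preserved and the later columns stay aligned.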

¹ unless POSIX mode is enabled in mksh and some recent versions of ksh93 or when brace expansion is disabled altogether at compile time

0

Issues:

  1. The script is incomplete (Missing done, missing fi, etc.).
  2. Variables should be quoted (always, by default).

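But your main problem lies in the brace expression {2,}. When the unquoted $line is expanded in set -A star $line, ksh turns the word containing it into two words: one with 2 in place of the braces and one with nothing in their place. That is why the pattern shows up twice.

An unquoted $line is subject to several expansions in ksh, one of which is brace expansion ({,}).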

To split the string, without expanding it unquoted, using ~ as delimiter and place the values in an array, this might work:

while IFS= read -r onestar; do star+=("$onestar"); done <<<"${line//~/$'\n'}"

This assumes that line holds a single line (no embedded newlines), which it should when it comes from read.
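If here-strings and += are not available, the same idea (split on the delimiter without ever expanding the string unquoted) can be sketched with plain parameter expansion. This is a different technique from the loop above, and split_fields, rest, sep and field are illustrative names only; it prints one field per line instead of filling an array.

```shell
# Print each sep-delimited field of $1 on its own line, empty fields included.
# Illustrative helper; rest, sep and field are plain globals (POSIX sh has no local).
split_fields() {
  rest=$1
  sep=$2
  while :; do
    case $rest in
      *"$sep"*)
        field=${rest%%"$sep"*}   # text before the first separator
        rest=${rest#*"$sep"}     # drop that field and its separator
        printf '%s\n' "$field"
        ;;
      *)
        printf '%s\n' "$rest"    # last field: no separator left
        break
        ;;
    esac
  done
}

line='17~4~~~char~Y~\"[_a-zA-Z0-9\.]\+@[a-zA-Z0-9]\+\.[a-z]\{2,\}\"~40~email id'
split_fields "$line" '~'
```

Since the string is only used in assignments, in a case word and inside double quotes, neither globbing nor brace expansion is applied to it.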

  • I tried set -A star "$line", but it does not split on "~": ${star[0]} gives me the value '17~1~~~char~Y~[0-9]+~10~number'. Commented Jun 9, 2021 at 20:18
  • Splitting strings on a delimiter is quite difficult to do right in the shell. I have edited a correct solution into my answer. @daturmgirl Commented Jun 9, 2021 at 20:30
