I have an HTML table in which the first row is the header and the remaining rows make up the body. I want to extract the value in the third column of each row. How can I proceed?
1 Answer
Try the awk command below:
awk 'NR>1{print $3}' file
This prints the third whitespace-separated field of every line except the first one (the header).
Update:
awk -v RS='</tr>' -v F='<td>' '{gsub(/<[^<>]*>/,"",$3);print $3}' file
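A few awk details are worth keeping in mind here. gsub() edits its target in place and returns the number of replacements, so its return value should not be assigned back to the field. F is an ordinary variable with no special meaning; the field separator is FS (or the -F option). A multi-character RS is handled as a regular expression by gawk and mawk, but POSIX leaves it unspecified. Below is a minimal sketch that avoids the multi-character RS. It assumes each <tr>...</tr> row sits on a single line, that cells are written as plain <td>...</td> or <th>...</th>, and that the first row is the header. The function name third_column is made up for illustration.

```shell
# Hypothetical helper: print the text of the third cell of every body row.
# Assumptions: one <tr>...</tr> row per line, the first row is the header.
third_column() {
  awk -F'</t[dh]>' '
    # Count table rows and skip the first one (the header).
    /<tr/ && ++rows > 1 {
      cell = $3                      # third chunk ending in </td> or </th>
      gsub(/<[^<>]*>/, "", cell)     # strip remaining tags; gsub edits cell in place
      print cell
    }
  ' "$@"
}
```

It can be called as third_column file, or fed through a pipe. For anything beyond simple, regularly formatted tables, a real HTML parser is a safer choice than awk.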
4 Comments
Ciprian Vintea
No. It doesn't work. I think a delimiter (<td>) should be used in this case.
Avinash Raj
Could you provide an example along with the expected output?
Ciprian Vintea
awk -v RS='</tr>' -v F='<td>' '{print $3}' - this will print <td>value</td>. How can I extract the value from here?
Avinash Raj
awk -v RS='</tr>' -v F='<td>' '{gsub(/<[^<>]*>/,"",$3);print $3}' file