I need to parse a large C source file and extract all structure definitions. A typical format is:
typedef struct structure1 {
field1;
field2;
.....
structure2 new_strut;
};
struct structure2 {
field3;
}my_struct;
How can I extract these structures?
awk is a fairly good fit for the job:
awk '
BEGIN { in_struct=0; }
/^(typedef )?struct .*/ { in_struct=1; }
/^}/ && in_struct { print; in_struct=0; }
in_struct == 1 { print; }
'
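As a quick check of how that state machine behaves, here is a minimal sketch that feeds the same awk program a made-up sample. The wrapper name `extract_structs`, the helper `sample_input`, and the sample C code are assumptions for illustration only; they are not part of the question.

```shell
#!/bin/sh
# extract_structs is a hypothetical wrapper name; the awk program is unchanged.
extract_structs() {
  awk '
    BEGIN { in_struct=0; }
    /^(typedef )?struct .*/ { in_struct=1; }   # a definition starts at column 0
    /^}/ && in_struct { print; in_struct=0; }  # a column-0 "}" closes it
    in_struct == 1 { print; }                  # echo every line in between
  ' "$@"
}

# Made-up sample input: two struct definitions around an unrelated function.
sample_input() {
  cat <<'EOF'
#include <stdio.h>
typedef struct point {
    int x;
    int y;
} point_t;
int main(void) {
    return 0;
}
struct node {
    struct node *next;
} head;
EOF
}

output=$(sample_input | extract_structs)
printf '%s\n' "$output"
```

With this input only the two struct definitions are printed. The body of `main` is skipped because its closing brace is seen while `in_struct` is 0, and the indented `struct node *next;` member does not restart anything because the start pattern is anchored to column 0. Note that this relies on the layout: a struct that opens and closes on a single line would leave the flag set until the next column-0 `}`.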
However, you could also do it in native bash with no external tools whatsoever:
#!/bin/bash
# ^^^^- bash, not /bin/sh
struct_start_re='^(typedef )?struct '
struct_end_re='^}'
filter_for_structs() {
in_struct=0
while IFS= read -r line; do
[[ $line =~ $struct_start_re ]] && in_struct=1
if (( in_struct )); then
printf '%s\n' "$line"
[[ $line =~ $struct_end_re ]] && in_struct=0
fi
done
}
...used akin to the following:
cat *.[ch] | filter_for_structs
First preprocess with gcc or similar and normalize the code layout using indent or similar, e.g.
sed 's/a/aA/g; s/__/aB/g; s/#/aC/g' file.c | gcc -P -E - | sed 's/aC/#/g; s/aB/__/g; s/aA/a/g' | indent - | awk 'above script'
See stackoverflow.com/a/35708616/1745001 for what the seds are doing.
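Judging from the commands alone (this is an inference, not a summary of the linked answer), the two sed calls look like a reversible escape: the first hides every `#` and `__` so the preprocessor has no directives or double-underscore names to act on, and the second restores them afterwards. A minimal sketch of just that round trip, on a made-up sample line and without gcc in the middle:

```shell
#!/bin/sh
# Made-up sample line containing all three affected sequences: "#", "__" and "a".
original='#define __MAX a_b /* banana */'

# Encode: "a" becomes "aA" first, so "aB" and "aC" are free to stand for "__" and "#".
encoded=$(printf '%s\n' "$original" | sed 's/a/aA/g; s/__/aB/g; s/#/aC/g')

# Decode: undo the substitutions in reverse order.
decoded=$(printf '%s\n' "$encoded" | sed 's/aC/#/g; s/aB/__/g; s/aA/a/g')

printf 'encoded: %s\n' "$encoded"
printf 'decoded: %s\n' "$decoded"
```

The encoded line contains no `#` and no `__`, and decoding returns the original text unchanged. Escaping `a` first is what makes the scheme unambiguous: after that step every original `a` is followed by `A`, so `aB` and `aC` can only be markers.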