1

Trying to extract xml data from multiple string outputs dynamically (the data changes) into a column format.

About 100 of these XML bits echo out when I run a query against an SQL database.

<?xml version="1.0"?>
<Connection>
  <ConnectionType>Putty</ConnectionType>
  <CreatedBy>Someone</CreatedBy>
  <CreationDateTime>2014-10-27T11:53:59.8993492-04:00</CreationDateTime>
  <Events>
    <OpenCommentPrompt>true</OpenCommentPrompt>
    <WarnIfAlreadyOpened>true</WarnIfAlreadyOpened>
  </Events>
  <Group>Cloud Services Client Delivery\Willis\Linux\Test - SJC</Group>
  <ID>77e96d52-f165-482f-8389-ffb95b9d8ccd</ID>
  <KeyboardHook>InFullScreenMode</KeyboardHook>
  <MetaInformation />
  <Name>Hostname-H-A10D</Name>
  <OpenEmbedded>true</OpenEmbedded>
  <PinEmbeddedMode>False</PinEmbeddedMode>
  <Putty>
    <PortFowardingArray />
    <Scripting />
    <SessionHost>10.0.0.100</SessionHost>
    <SessionName>10.0.0.100</SessionName>
    <TelnetEncoding>IBM437</TelnetEncoding>
  </Putty>
  <ScreenColor>C24Bits</ScreenColor>
  <SoundHook>DoNotPlay</SoundHook>
  <Stamp>771324d1-0c59-4f12-b81e-96edb5185ef7</Stamp>
</Connection>

And what I need is the and in a column format. And essentially where the hostname equal Hostname-H-A10D, I want to be able to match the D at the end and mark the First column with Dev, Q as Test and no letter at the end as Prod. So the output would look like -->

Dev Hostname-H-A10D 10.0.0.100
Dev Hostname-H-A11D 10.0.0.101
Prod Hostname-H-A12 10.0.0.201
Test Hostname-H-A13Q 10.0.0.10

I have played around with sed/awk/etc and not just cannot get the format I want without writing out temp flat files. I would prefer to get this into an array using something like xmlstarlet or xmllint. Of course better suggestions can be made and that is why I am here :) Thanks folks.

2 Answers 2

3

It would be better to use an XML parser.

Using awk:

$ awk -F'[<>]' 'BEGIN{a["D"]="Dev";a["Q"]="Test"} /Name/{name=$3; type=a[substr(name,length(name))]; if (length(type)==0) type="Prod";} /SessionHost/{print type, name, $3;}' s.xml
Dev Hostname-H-A10D 10.0.0.100

How it works

  • BEGIN{a["D"]="Dev";a["Q"]="Test"}

    This defines associative array a.

  • /Name/{name=$3; type=a[substr(name,length(name))]; if (length(type)==0) type="Prod";}

    On the line that has the host name, this captures the host name and, from it, determines the host type.

  • /SessionHost/{print type, name, $3;}

    On the line that contains the host IP, this prints the type, name, and IP.

Sign up to request clarification or add additional context in comments.

2 Comments

Perfect! Very nice. I was not aware of the array building within awk. Thank you sir.
Would also be able to help here as well? Thanks. stackoverflow.com/questions/26694555/…
0

You have not mentioned any parameter in XML file whether the host is Dev or Prod or Test.

But from the above XML file you can get the name using the following way.

$cat test.xml |grep Name |awk -F '[<,>]'  '{print $3}' |xargs
Hostname-H-A10D 10.0.0.100

1 Comment

Here you do several wrong thing. 1. You do not need to use cat with program that can read data itself, like grep and awk. 2. You do not need to use both grep and awk. Here is all done with one command: awk -F '[<,>]' '/Name/ {printf "%s ",$3} END {print ""}' test.xml

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.