0

I try to parse some xml file, with this code:

   function parseXml() {
      var url = 'http://www.inpo.ru/documents/pricelists/pricelist.xml';
      var xml = UrlFetchApp.fetch(url).getContentText();

      var parseregexp = new RegExp (/.*em><no>(\d+)<\/no><title>(.+?)<\/title><price vat="\w+">(\d+.\d+|\d+)<\/price><unit>(.+?)<\/unit><free>(\d+)<\/free>(.|\s)*?<it/g)
      var parsedData = '$1 $2 $3 $4 $5 '
      var rangeRegex = [];

      var Pdata = xml.replace(parseregexp,parsedData)

      var ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet()
      ss.getRange(1, 1).setValue(Pdata)
      }

And here is xml fragment:

<item><no>48514</no><title>The workpiece is the rod d 8x150mm P6AM5 HRC 64-66" CNIC"</title><price vat="yes">154.58</price><unit>Pcs</unit><free>59</free><remarks>Used to make an axial tool.
Hardness HRC64-66</remarks><img thumbnail="http://www.inpo.ru/index/I:48528/THUMBNAIL:0.jpg">http://www.inpo.ru/index/I: 48528 / PREVIEW: 0.jpg</img></item><item><no>48515</no><title>The workpiece is the rod d 8x200mm P6AM5 HRC 64-66"CNIC"</title><price vat="Yes">198.24</price><unit>pcs</unit><free>32</free><remarks>Used to make an axial tool.
Hardness HRC64-66</remarks><img thumbnail="http://www.inpo.ru/index/I:48528/THUMBNAIL:0.jpg">http://www.inpo.ru/index/I: 48528 / PREVIEW: 0.jpg</img></item>

And result in Pdata is:

48514 The workpiece is the rod d 8x150mm P6AM5 HRC 64-66" CNIC" 154.58 Pcs 59 48515 The workpiece is the rod d 8x200mm P6AM5 HRC 64-66"CNIC" 198.24 pcs 32

In this example in Pdata I have 1 long string from all regex matches. How I can make an array with 5 columns from all matches? I think to push every match to array with "for" cycle, but dont know how it's mades. Would be pleasefull for any help

5
  • You should post the value assigned to the variable Pdata. Otherwise we must run your code. We shouldn't need to do that. It seems like you may need to split the data into elements in an array. Commented Apr 8, 2017 at 21:44
  • @SandyGood Yes, you are right, I added it in question Commented Apr 8, 2017 at 21:57
  • Where do you want to split the content? at 48514 and 48515? Commented Apr 8, 2017 at 22:16
  • I want to store 48514 and 48515 in cells A2 and A3. And next matches from $2 $3 $4 $5 entires in B2 C2 D2 E2 and B3 C3 D3 E3 Commented Apr 8, 2017 at 22:40
  • Have you tired using this! Xmlservice Commented Apr 8, 2017 at 23:04

1 Answer 1

1

Alternate Solution: Since you are trying to access xml data you can use XMLservice.parser

However there seems to a problem in the fetch call, I was unable to get the whole data (fetch gives a truncated file, perhaps it is timing out, 16mb file) , so I downloaded the data file and uploaded it into google drive. This file could be used to parse XML data like so:

function parseXml() {

      var file = DriveApp.getFileById("Xml File ID")   //Get the id of the uploaded file and replace it for "Xml File ID"
      var xml = file.getBlob().getDataAsString()
      // The below code gave a error for XML parser
     /*var url = 'http://www.inpo.ru/documents/pricelists/pricelist.xml';
       var options = {
       'method' : 'get',
       'contentType': 'application/xml',
       }
      var xml = UrlFetchApp.fetch(url,options).getBlob().getDataAsString()
      Logger.log(xml)*/
     // End of code with gave an error

 var arrayItems = []
 var XmlElem = ["no","title","price","unit","free"]       //Elements to look for
 var document = XmlService.parse(xml);
 var RCounter = 0
 var groups = document.getRootElement().getChildren();   //GetGroup Element
    for(var k = 0; k< groups.length; k++){               // Loop through each group element
      var main = groups[k].getChildren()                // Get sub groups in each group
  for (var j=0 ; j < main.length; j++){                 // Loop through each subGroups

   var mainChilds = main[j].getChildren()               //Get items in each subGroups
   for (var l = 0 ; l < mainChilds.length; l++){        // Loop through each items
   var items = mainChilds[l].getChildren();             // Get  elemetns like "no","title","price","units","free" in each item 
     arrayItems[RCounter] = []
   var total = 0;
    for (var i = 0; i < items.length; i++) {
     // Logger.log(items[i].getName())
      var index = XmlElem.indexOf(items[i].getName())       //Look for items and place the value at corresponding index
      if(index != -1)
        arrayItems[RCounter][index] = items[i].getValue()
     }  //End Loope for elements
     if(arrayItems[RCounter].length > 0)                   //in case the array is empty, reuse it
      RCounter++
    }   // End loop for items
    }   // End loop for sub Groups
    }   // End loop for Groups
  Logger.log(arrayItems) 
}

Hope that helps!

Sign up to request clarification or add additional context in comments.

3 Comments

Yes, it works perfect, Thak you Jack! In combination with download funktion, it gaves great tool for refreshing data. But I have only one question, is this code applyeble for other xml files with different structure, if I just change XmlElem var?
If the structure of the XML file remains the same, yes! The above solution looks for the following structure {group} {group}{item}{XML element}{/item}{/group}{/group}
Hey again, Jack! Script, that you create, worked perfect till today.. I have an errore: XML document structures must start and end within the same entity. (at string: "var document = XmlService.parse(xml);") I assume that the file could be loaded with an error, how can I avoid it in the future? If it makes it difficult for you then it's okay if my comment remains unanswered

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.