
I want to create a list of lists. Each list should contain all the keywords from one specific XML file in my folder, so the number of lists equals the number of files. The problem is that I don't know in advance how many files are in the folder. I tried building the list of lists in a loop, like this:

my_keywords <-
  list(my_keywords,m) 

But the result has too many nested lists. So I tried creating a matrix of lists and, after the loop, converting the matrix to a list of lists. This is my code:

  #read all xml files in folder data
    f <- list.files(path = "C:\\data\\", pattern = "*.xml", all.files = FALSE,
               full.names = TRUE, recursive = FALSE)

    keywords_matrix <- matrix("", ncol=1, nrow = length(f))
    i<-0

    #for each xml file read and save all keywords
    for (sig in f) {
        i<-i+1
        data_xml <- xmlTreeParse(sig,useInternalNodes=TRUE)
        xml_list <- xpathApply(data_xml, "//keyword", xmlValue)
        #every keyword is in own list, give all keywords to one list
        m <- unlist(xml_list)
        #every keywords list in one row of matrix
        keywords_matrix[i,] <-list(m)

    }

    print (keywords_matrix)
    mylist <- apply(keywords_matrix, 1, as.list)

But my code doesn't work. It gives me these errors:

> Error in keywords_matrix[i, ] <- list(m) : 
>  incorrect number of subscripts on matrix

---

>    Error in apply(keywords_matrix, 1, as.list) : 
>       dim(X) must have a positive length

And my matrix looks like this:

[[1]]
[1] "a11" "a12" "a13"        

[[2]]
[1]  ""

[[3]]
[1]  "" 

What I want is mylist to look like this:

[[1]]
"a11" "a12" "a13"        

[[2]]
"b11"        "b12"       "b13" "b14" 

[[3]]
 "c11"        "c12"      

Any help? I have no idea why this doesn't work. The matrix indexing looks OK to me.

My XML files look like this:

<rule id="1">
    <date>2018-01-12</date>
    <name>name of A element</name>
    <allkeywords>
        <keyword>a11</keyword>
        <keyword>a12</keyword>
        <keyword>a13</keyword>
    </allkeywords>
</rule>

And:

<rule id="2">
    <date>2018-01-12</date>
    <name>name of B element</name>
    <allkeywords>
        <keyword>b11</keyword>
        <keyword>b12</keyword>
        <keyword>b13</keyword>
        <keyword>b14</keyword>
    </allkeywords>
</rule>
  • What you're talking about doesn't appear to be a list of lists, just a regular list with elements. Can you make your example minimal and reproducible? A cursory look at your code makes me think you might want to convert your loop into an lapply and that would do it? Commented Jun 3, 2018 at 15:14
  • Well, I don't know how to minimize the code; this already is the minimal code. My original code reads much more information from the XML files, such as the names of specific keywords. It can be reproduced if you create a data folder in C: and put 2 XML files in it. Commented Jun 3, 2018 at 15:45

2 Answers

4

Your intended result isn't a nested list, it's just a regular list of vectors. Your code can work simply by initializing an empty list and adding each element to it as you loop through.

library(XML) 

#pattern is a regular expression, so match file names ending in .xml
f <- list.files(pattern = "\\.xml$", all.files = FALSE,
                  full.names = TRUE, recursive = FALSE)
i<-0
mylist<-list() #initialize list 

for (sig in f) {
  i<-i+1
  data_xml <- xmlTreeParse(sig,useInternalNodes=TRUE)
  xml_list <- xpathApply(data_xml, "//keyword", xmlValue)
  m <- unlist(xml_list)
  mylist[[i]]<-m #add each element to list
}

mylist 

[[1]]
[1] "a11" "a12" "a13"

[[2]]
[1] "b11" "b12" "b13" "b14"

Or you can do it with lapply:

mylist<-lapply(f, function(x){
  data_xml <- xmlTreeParse(x,useInternalNodes=TRUE)
  xml_list <- xpathApply(data_xml, "//keyword", xmlValue)
  unlist(xml_list) #return this file's keywords as one character vector
})
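
To see why the `list(my_keywords, m)` attempt from the question nests too deeply while `mylist[[i]] <- m` does not, here is a small sketch that needs no XML files. The vectors in `per_file` are hypothetical stand-ins for the keywords read from each file, and the names `nested`, `flat` and `appended` are only for illustration.

```r
# Hypothetical keyword vectors standing in for the per-file results
per_file <- list(c("a11", "a12", "a13"),
                 c("b11", "b12", "b13", "b14"),
                 c("c11", "c12"))

# Wrapping the accumulator in list() on every pass pushes the
# previous result one level deeper each time
nested <- list()
for (m in per_file) {
  nested <- list(nested, m)
}
# nested now has two elements: the old accumulator and the newest vector

# Assigning by position keeps one flat list of character vectors
flat <- list()
for (i in seq_along(per_file)) {
  flat[[i]] <- per_file[[i]]
}

# Appending a one-element list with c() gives the same flat result
appended <- list()
for (m in per_file) {
  appended <- c(appended, list(m))
}

str(flat)
```

Either of the last two forms works when the number of files is not known in advance, because the list simply grows by one element per file.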

1

Simply run lapply on the list of XML files using XML's xpathSApply. There is no need to use a matrix as an in-between helper container.

library(XML)

#read all xml files in folder data
f <- list.files(path = "C:\\data\\", pattern = "\\.xml$", all.files = FALSE,
                full.names = TRUE, recursive = FALSE)

mylist <- lapply(f, function(i){
  data_xml <- xmlTreeParse(i, useInternalNodes=TRUE)
  xml_list <- xpathSApply(data_xml, "//keyword", xmlValue)
})

mylist

# [[1]]
# [1] "a11" "a12" "a13"

# [[2]]
# [1] "b11" "b12" "b13" "b14"
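
For anyone who wants to try this without a C:\data folder, the following is a rough reproducible sketch. It assumes the XML package is installed; the temporary folder, the file names a.xml and b.xml, and the trimmed-down file contents are made up for the example and are not the asker's real data.

```r
library(XML)

# Hypothetical stand-in for C:\data: a fresh temporary folder
xml_dir <- tempfile("keyword_demo")
dir.create(xml_dir)

# Two small files shaped like the ones in the question (assumed content)
writeLines(c("<rule id=\"1\"><allkeywords>",
             "<keyword>a11</keyword><keyword>a12</keyword><keyword>a13</keyword>",
             "</allkeywords></rule>"),
           file.path(xml_dir, "a.xml"))
writeLines(c("<rule id=\"2\"><allkeywords>",
             "<keyword>b11</keyword><keyword>b12</keyword>",
             "<keyword>b13</keyword><keyword>b14</keyword>",
             "</allkeywords></rule>"),
           file.path(xml_dir, "b.xml"))

# Works for any number of files; pattern is a regular expression
f <- list.files(xml_dir, pattern = "\\.xml$", full.names = TRUE)

mylist <- lapply(f, function(path) {
  doc <- xmlParse(path)
  # one character vector of keyword values per file
  xpathSApply(doc, "//keyword", xmlValue)
})

# Optional: label each element with the file it came from
names(mylist) <- basename(f)
mylist
```

Naming the elements by file is a design choice rather than something the question asked for; it just makes it easier to tell which keywords came from which file.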
