Input file:

<?xml version="1.0" encoding="UTF-8"?> 
        <books>
            <book id="6636551">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">72771KAM3</xref>
                        <xref type="Non_Fiction" type_id="2">US72771KAM36</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>24.95</price>
                    <publish_date>2000-10-01</publish_date>
                    <description>An in-depth look at creating applications with XML.</description>
                </book_details>
            </book>
            <book id="119818569">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">070185UL5</xref>
                        <xref type="Non_Fiction" type_id="2">US070185UL50</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>19.25</price>
                    <publish_date>2002-11-01</publish_date>
                    <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
                </book_details>
            </book>
            <book id="119818568">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">070185UK7</xref>
                        <xref type="Non_Fiction" type_id="2">US070185UK77</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>5.95</price>
                    <publish_date>2004-05-01</publish_date>
                    <description>After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.</description>
                </book_details>
            </book>
            <book id="119818567">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">070185UJ0</xref>
                        <xref type="Non_Fiction" type_id="2">US070185UJ05</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>4.95</price>
                    <publish_date>2000-09-02</publish_date>
                    <description>When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled.</description>
                </book_details>
            </book>
        </books>

I wrote the following XQuery to display the count of a specific element:

for $x in //book_xref
let $c := string-join(('name of element:', count($x)), '&#10;')
return $c

The intended output:

name of element: 4

but the output comes out as:

name of element:
1
name of element:
1
name of element:
1
name of element:
1

I now understand why it does that. I tried aggregating the count value, but that didn't work. I also couldn't find a function that fetches the name of the element, so that the name is included in the string automatically.

Ideally, the target output is

book_xref:4

What do I need to achieve that? What am I missing?

Thanks! I appreciate your response.

2 Answers

How about just concat('book_xref:', count(//book_xref))?

You're getting 4 separate results because the for loop iterates over every occurrence of book_xref, so count($x) is evaluated once per element and returns 1 each time.

Also, you could've gotten the name by using $x/name(), but since you already know what you're selecting it's not necessary.
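
If you do want the name taken from the data rather than typed in, here is a minimal sketch. It assumes every selected node has the same name, so it reads the name from the first one only; passing the whole sequence to name() or concat() would raise a type error.

```xquery
(: Select all matches once, then report a single name and the total count. :)
let $x := //book_xref
return concat(name($x[1]), ':', count($x))
```

For the input file in the question this returns book_xref:4.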

One easy, but not very efficient, way to get all element names with their number of occurrences would be:

let $names := distinct-values(//*/name())
for $x in $names
let $c := concat($x, ':', count(//*[name()=$x]), '&#10;')
return $c

which produces:

books:1
book:4
master_information:4
book_xref:4
xref:8
book_details:4
price:4
publish_date:4
description:4
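
The query above rescans the whole document once per distinct name. As a rough sketch of a single-pass alternative, assuming your processor supports the XQuery 3.0 group by clause, you could group the elements by name and count each group:

```xquery
(: After "group by", $node is bound to all elements sharing the same $name. :)
for $node in //*
group by $name := local-name($node)
order by $name
return concat($name, ':', count($node))
```

This returns one string per distinct element name, for example book_xref:4. It uses local-name() rather than name(), which makes no difference here because the input has no namespace prefixes.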

6 Comments

Makes sense. Thanks! The complicated part - I wanted to know the name thingy so that I can expand the query to apply to all nodes in the XML file. But now it seems harder than I thought. How can I select all and only distinct nodes and show their count of occurrences?
I tried this code to get the name automatically: let $x := //book_xref let $c := count($x) return concat($x/name(), $c). But it doesn't work. What's wrong?
@Fenil - $x would be 4 occurrences of book_xref, so when you do $x/name() in the concat you should get an error about the first argument to concat being a sequence of more than one item.
Makes sense. I was able to make it work now. Your code for fetching all nodes worked like a charm on my smaller test files but failed on a regular large file because it ran out of memory. Regarding what you said about efficiency, how does one optimize a query like that? What would be an efficient way to do it?
@Fenil: Please accept this answer if it's helped, and ask your new questions as actual questions themselves -- they're too significant for comment-based follow-up. Thanks.

For your initial goal, @daniel-haley already has a concise solution.

If you want to efficiently count the occurrences of all element names in a document, you can combine maps with the fn:fold-left(...) function (fold-left is available from XQuery 3.0; maps and map:merge were standardized in XQuery 3.1). The fold processes the elements one at a time and keeps only the running counts, so in XQuery processors that support iterative evaluation the elements never have to be in memory all at once, not even the ones that share a name:

fold-left(
  //*,
  map{},
  function($map, $node) {
    let $name := $node/local-name()
    return map:merge((map { $name: ($map($name), 0)[1] + 1 }, $map))
  }
)

The simpler solution using group by is a lot more readable, but potentially less memory efficient:

map:merge(
  for $node in //*
  group by $name := $node/node-name()
  return map { $name: count($node) }
)

The result is the same for both queries:

map {
  'price': 4,
  'book': 4,
  'books': 1,
  'book_details': 4,
  'master_information': 4,
  'book_xref': 4,
  'xref': 8,
  'description': 4,
  'publish_date': 4
}
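
To turn such a map into the name:count lines asked for in the question, one option is map:for-each. This is only a sketch, assuming an XQuery 3.1 processor; the variable name $counts is made up for illustration, and the order of map entries is implementation-dependent.

```xquery
(: Build the counts map, then emit one "name:count" string per entry. :)
let $counts := map:merge(
  for $node in //*
  group by $name := local-name($node)
  return map { $name: count($node) }
)
return map:for-each($counts, function($name, $count) {
  concat($name, ':', $count)
})
```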

1 Comment

Thanks, Leo! I'm new to XQuery and this is helpful to know! Appreciate your input.
