I am having trouble identifying the correct element in Python. What I actually want is the most recently accessed file in recently-used.xbel, so I want to iterate over every entry to find the one with the latest modified or visited timestamp. This is what the XML file looks like:

<?xml version="1.0" encoding="UTF-8"?>
<xbel version="1.0"
      xmlns:bookmark="http://www.freedesktop.org/standards/desktop-bookmarks"
      xmlns:mime="http://www.freedesktop.org/standards/shared-mime-info"
>
  <bookmark href="file:///tmp/google-chrome-stable_current_amd64.deb" added="2021-09-14T12:09:05Z" modified="2021-09-14T12:09:05Z" visited="2021-09-15T09:12:13Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.debian.binary-package"/>
        <bookmark:applications>
          <bookmark:application name="Firefox" exec="&apos;firefox %u&apos;" modified="2021-09-14T12:09:05Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Git/testprog" added="2021-09-15T09:12:13Z" modified="2021-09-15T09:12:13Z" visited="2021-09-15T09:12:13Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="inode/directory"/>
        <bookmark:applications>
          <bookmark:application name="code" exec="&apos;code %u&apos;" modified="2021-09-15T09:12:13Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/.local/share/recently-used.xbel" added="2021-09-15T09:51:57Z" modified="2021-09-15T09:51:57Z" visited="2021-09-15T09:51:57Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/x-xbel"/>
        <bookmark:applications>
          <bookmark:application name="code" exec="&apos;code %u&apos;" modified="2021-09-15T09:51:57Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///tmp/slack-desktop-4.19.2-amd64.deb" added="2021-09-15T11:45:49Z" modified="2021-09-15T11:45:49Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.debian.binary-package"/>
        <bookmark:applications>
          <bookmark:application name="Firefox" exec="&apos;firefox %u&apos;" modified="2021-09-15T11:45:49Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Downloads/google-chrome-stable_current_amd64.deb" added="2021-09-15T11:52:39Z" modified="2021-09-15T11:52:39Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.debian.binary-package"/>
        <bookmark:applications>
          <bookmark:application name="Firefox" exec="&apos;firefox %u&apos;" modified="2021-09-15T11:52:39Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Documents/libretest" added="2021-09-15T11:58:53Z" modified="2021-09-15T11:58:53Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/octet-stream"/>
        <bookmark:applications>
          <bookmark:application name="LibreOffice 6.4" exec="&apos;soffice %u&apos;" modified="2021-09-15T11:58:53Z" count="1"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Documents/libretest.odt" added="2021-09-15T11:58:53Z" modified="2021-09-15T15:42:04Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="application/vnd.oasis.opendocument.text"/>
        <bookmark:applications>
          <bookmark:application name="LibreOffice 6.4" exec="&apos;soffice %u&apos;" modified="2021-09-15T15:42:04Z" count="12"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
  <bookmark href="file:///home/test/Git/node-socket" added="2021-09-16T13:26:25Z" modified="2021-09-16T13:26:49Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <mime:mime-type type="inode/directory"/>
        <bookmark:applications>
          <bookmark:application name="code" exec="&apos;code %u&apos;" modified="2021-09-16T13:26:49Z" count="2"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
</xbel>

In my code I am trying to access bookmark:applications but with no success.

    home = str(Path.home())
    root = ET.parse(home + '/.local/share/recently-used.xbel').getroot()
    print(root)
    print('lower')
    for bookmark in root.iter('bookmark'):
        print(bookmark)
        for applications in bookmark.find('applications'):
            print(applications)

What would be the correct way to access bookmark:applications and find the last visited?

  • try using xmltodict Commented Sep 29, 2021 at 13:29
  • The bookmark:applications element is bound to the http://www.freedesktop.org/standards/desktop-bookmarks namespace (by way of xmlns:bookmark="http://www.freedesktop.org/standards/desktop-bookmarks"). See docs.python.org/3/library/… Commented Sep 29, 2021 at 13:35
  • If you only need the last-modified bookmark, you can use the modified attribute of the bookmark tag, since the modified attribute of bookmark and the modified attribute of bookmark:application have the same value. Commented Sep 30, 2021 at 4:31
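
Following up on the last comment, here is a minimal sketch that uses only the standard library. It assumes every bookmark has `modified` and `visited` attributes in the same `YYYY-MM-DDTHH:MM:SSZ` form as the sample (a real file might use a different precision). The `bookmark` elements carry no prefix and are in no namespace, so no namespace mapping is needed at this level. `SAMPLE`, `parse_ts` and `latest_bookmark` are illustrative names, not part of any library.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed-down stand-in for recently-used.xbel (two entries from the question).
SAMPLE = """<xbel version="1.0"
      xmlns:bookmark="http://www.freedesktop.org/standards/desktop-bookmarks">
  <bookmark href="file:///home/test/Git/testprog" added="2021-09-15T09:12:13Z" modified="2021-09-15T09:12:13Z" visited="2021-09-15T09:12:13Z"/>
  <bookmark href="file:///home/test/Git/node-socket" added="2021-09-16T13:26:25Z" modified="2021-09-16T13:26:49Z" visited="2021-09-16T13:26:26Z"/>
</xbel>
"""


def parse_ts(value):
    # Assumption: timestamps look like 2021-09-16T13:26:26Z, as in the sample.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def latest_bookmark(root, attr="visited"):
    # Return the whole <bookmark> element with the newest timestamp,
    # so all of its other attributes stay available.
    return max(root.iter("bookmark"), key=lambda b: parse_ts(b.get(attr)))


root = ET.fromstring(SAMPLE)
newest = latest_bookmark(root, "modified")
print(newest.get("href"), newest.attrib)
```

Because `max()` returns the element itself rather than just the date, `newest.attrib` gives access to `href`, `added`, `modified` and `visited` in one go.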

2 Answers

from lxml import etree

NS = {"n": "http://www.freedesktop.org/standards/desktop-bookmarks"}

root = etree.parse("book.xml")
bookmarks = root.xpath("//bookmark")
most_recent_bookmark = max(
    bookmarks,
    key=lambda bmark: bmark.xpath(
        "string(.//n:application/@modified)",
        namespaces=NS,
    ),
)

print("Most recent href: " + most_recent_bookmark.xpath("string(@href)"))
print(
    "Most recent modified: "
    + most_recent_bookmark.xpath("string(.//n:application/@modified)", namespaces=NS)
)

Output:

Most recent href: file:///home/test/Git/node-socket
Most recent modified: 2021-09-16T13:26:49Z

The problem you're running into is the namespace, which is bound to the prefix bookmark: in the original XML and to n: in the code sample. The prefix itself is arbitrary; what has to match is the URI. The xpath(), find() and findall() methods all let you pass a dictionary that maps prefixes to namespace URIs.

https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.findall
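
The same namespace mapping also works with the standard library, without lxml. The sketch below is one possible way to fix the loop from the question, assuming the file has the structure shown there. `SAMPLE` is a shortened stand-in for the real file and `app_rows` is just an illustrative name.

```python
import xml.etree.ElementTree as ET

# Prefix -> URI mapping. The prefix used in the code is local to the code
# and does not have to match the prefix used in the file; the URI does.
NS = {"bookmark": "http://www.freedesktop.org/standards/desktop-bookmarks"}

SAMPLE = """<xbel version="1.0"
      xmlns:bookmark="http://www.freedesktop.org/standards/desktop-bookmarks">
  <bookmark href="file:///home/test/Documents/libretest.odt" modified="2021-09-15T15:42:04Z" visited="2021-09-16T13:26:26Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <bookmark:applications>
          <bookmark:application name="LibreOffice 6.4" exec="&apos;soffice %u&apos;" modified="2021-09-15T15:42:04Z" count="12"/>
        </bookmark:applications>
      </metadata>
    </info>
  </bookmark>
</xbel>
"""

root = ET.fromstring(SAMPLE)
app_rows = []
for bookmark in root.iter("bookmark"):
    # A bare tag name in find()/findall() only matches direct children,
    # so ".//" is needed to reach the nested <bookmark:application>.
    for app in bookmark.findall(".//bookmark:application", NS):
        app_rows.append((bookmark.get("href"), app.get("name"), app.get("modified")))

print(app_rows)
```

Note that the original loop failed for two reasons: `find('applications')` ignores the namespace, and it only searches direct children of `bookmark`, while `bookmark:applications` sits three levels down.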

If, as you say, the namespace may change, you can use the XPath .//*[local-name() = 'application']/@modified in place of .//n:application/@modified, with no namespaces parameter. (However, I would be surprised to see the producer arbitrarily changing the namespace, because that would break everything that consumes the data. The URI is as much a part of the node name as "application" is.)
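
For comparison, the standard library offers a similar namespace-agnostic match: since Python 3.8, `xml.etree.ElementTree` accepts `{*}tag` in `find()`/`findall()` paths to match a tag in any namespace. This is a different technique from the `local-name()` XPath above (ElementTree does not support that function), shown here only as a sketch. The namespace URI in `SAMPLE` is deliberately made up to illustrate that the code does not depend on it, and `application_modified` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The URI below is hypothetical, chosen only to show that the lookup
# does not depend on the namespace value.
SAMPLE = """<xbel version="1.0" xmlns:bm="http://example.org/some-other-namespace">
  <bookmark href="file:///home/test/Git/node-socket" modified="2021-09-16T13:26:49Z">
    <info>
      <metadata owner="http://freedesktop.org">
        <bm:applications>
          <bm:application name="code" modified="2021-09-16T13:26:49Z" count="2"/>
        </bm:applications>
      </metadata>
    </info>
  </bookmark>
</xbel>
"""


def application_modified(bookmark):
    # "{*}application" matches <application> in any namespace (Python 3.8+).
    return [app.get("modified") for app in bookmark.findall(".//{*}application")]


root = ET.fromstring(SAMPLE)
stamps = [application_modified(b) for b in root.iter("bookmark")]
print(stamps)
```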


4 Comments

My problem is that I will not always know the namespace. If it changes, my implementation will not work. Is there no way to get it from the file? I would not want to hardcode it.
I believe .//*:application/@modified should appropriately match any namespace. Will test when I have a moment.
That said, I would be very surprised to see the namespace arbitrarily change.
Have rewritten function to be clearer, and have provided a way around a variable namespace.

This accesses bookmark:application, and the DataFrame helps you get the latest visited/modified bookmark together with the application name.

import xml.etree.ElementTree as ET
import pandas as pd

root = ET.parse('/content/sample.xml').getroot()
lst = []

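# <bookmark> elements are in no namespace, so the plain tag name matches.
for bookmark in root.iter('bookmark'):
  bookmark_lst = []
  print(bookmark.attrib)
  bookmark_lst.append(bookmark.attrib['href'])
  bookmark_lst.append(bookmark.attrib['modified'])
  bookmark_lst.append(bookmark.attrib['visited'])
  for ele in bookmark.iter():
    # Namespaced tags look like '{uri}application'; of the matching
    # elements only <bookmark:application> has a name attribute.
    if 'application' in ele.tag:
      if 'name' in ele.attrib:
        bookmark_lst.append(ele.attrib['name'])
  lst.append(bookmark_lst)

# Assumes exactly one application per bookmark, as in the sample.
df = pd.DataFrame(lst, columns=['href', 'modified', 'visited', 'application_name'])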

df['modified'] = pd.to_datetime(df['modified'])
df['visited'] = pd.to_datetime(df['visited'])

least_recent_date = df['visited'].min()
most_recent_date = df['visited'].max()

3 Comments

What was the problem in the question? Don't just post code with no explanation. There is nothing about pandas in the question.
I actually want the element and all its attributes, not only the date. This would only show me the oldest and the newest date. I would like to fetch the element with the latest date and check its other attributes.
It works quite well up to the part with pandas. columns is not defined.
