I have a piece of code to read two files, convert them to sets, and then subtract one set from the other. I would like to use a string variable (installedPackages) for "a" instead of a file. I would also like to write to a variable for "c".

a = open("/home/user/packages1.txt")
b = open("/home/user/packages.txt")
c = open("/home/user/unique.txt", "w")

for line in set(a) - set(b):
    c.write(line)

a.close()
b.close()
c.close()

I have tried the following and it does not work:

for line in set(installedPackages) - set(b):

I have tried to use StringIO, but I think I am using it improperly.

Here, finally, is how I have created installedPackages:

stdout, stderr = p.communicate()
installedPackages = re.sub('\n$', '', re.sub('install$', '', re.sub('\t', '', stdout), 0,re.MULTILINE))

Sample of packages.txt:

humanity-icon-theme
hunspell-en-us
hwdata
hyphen-en-us
ibus
ibus-gtk
ibus-gtk3
ibus-pinyin
ibus-pinyin-db-android
ibus-table
  • So what is your exact question? Commented May 15, 2012 at 23:28
  • How can I read a set from a variable instead of a file? Commented May 15, 2012 at 23:43
  • You... you've just gone and scrapped the newline characters. What are you wanting to make your set from? Lines? Words? Commented May 15, 2012 at 23:44
  • One word on each line. Each line should be a new entry in the set. Commented May 15, 2012 at 23:48
  • Sorry, actually re.sub('\n$', '', ...) will not change the input. So that just leaves it as dead (and misleading) code. Commented May 15, 2012 at 23:51

3 Answers

If you want a file-like string buffer that you can write to, use StringIO:

>>> from StringIO import StringIO
>>> installed_packages = StringIO()
>>> installed_packages.write('test')
>>> installed_packages.getvalue()
'test'
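
A StringIO object can also be read like a file, so one can stand in for "a" and a second one can collect the output for "c". The following is a minimal sketch, assuming Python 3 (where the class lives in io rather than in the StringIO module used above). The sample strings and the names installed_text, known_file, and result are made up for illustration:

```python
from io import StringIO  # Python 3 location; Python 2 code above imports from StringIO

# Hypothetical sample data standing in for the command output and packages.txt.
installed_text = "hwdata\nibus\nvim\n"
known_file = StringIO("hwdata\nibus\n")  # stands in for open("packages.txt")

# Iterating over a StringIO yields lines (newline included), just like a file,
# so the entries compare equal to lines read from a real file.
installed_file = StringIO(installed_text)
unique = set(installed_file) - set(known_file)

# Collect the result in a string buffer instead of writing unique.txt.
out = StringIO()
for line in sorted(unique):
    out.write(line)
result = out.getvalue()
print(repr(result))
```

Note that set(installed_text) on the bare string would instead build a set of single characters, because iterating over a string yields characters rather than lines.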

2 Comments

installed_packages.getvalue() produces what I want, but "print set(installed_packages.getvalue())" returns set(['\n', '+', '-', '.', '1', '0', '3', '2', '5', '4', '7', '6', '9', '8', ':', 'a', 'c', 'b', 'e', 'd', 'g', 'f', 'i', 'h', 'k', 'j', 'm', 'l', 'o', 'n', 'q', 'p', 's', 'r', 'u', 't', 'w', 'v', 'y', 'x', 'z'])
If the user controls the code, StringIO is often not appropriate. Its main use is with code that expects a file object and can't easily be changed. In this case it sounds like @Poweruser32 just wants to switch from files to strings.

Something like the following?

Edit: after several iterations:

from subprocess import Popen, PIPE

DEBUG = True
if DEBUG:
    def log(msg, data):
        print(msg)
        print(repr(data))
else:
    def log(msg, data):
        pass

def setFromFile(fname):
    with open(fname) as inf:
        return set(ln.strip() for ln in inf)

def setFromString(s):
    return set(ln.strip() for ln in s.split("\n"))

def main():
    # get list of installed packages
    p = Popen(['dpkg', '--get-selections'], stdout=PIPE, stderr=PIPE)
    stdout, stderr = p.communicate()
    installed_packages = setFromString(stdout)

    # get list of expected packages
    known_packages = setFromFile('/home/john/packages.txt')

    # calculate the difference
    unknown_packages = installed_packages - known_packages
    unknown_packages_string = "\n".join(unknown_packages)

    log("Installed packages:", installed_packages)
    log("Known packages:", known_packages)
    log("Unknown packages:", unknown_packages)

if __name__=="__main__":
    main()
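
One thing to watch for: each line printed by dpkg --get-selections contains the package name, some tabs, and a selection state such as install, so a line that is only stripped will never equal a bare name from packages.txt. A possible way to handle this, sketched under the assumption that only packages in the install state are wanted, is to keep just the first whitespace-separated field. The helper name package_names and the sample text are hypothetical:

```python
def package_names(selections_text):
    """Return the set of package names found in dpkg --get-selections style text.

    Assumption: each line looks like "<name><tabs><state>", and only lines
    whose state is "install" should be kept.
    """
    names = set()
    for line in selections_text.splitlines():
        fields = line.split()  # split on any run of tabs or spaces
        if len(fields) >= 2 and fields[1] == "install":
            names.add(fields[0])
    return names


# Made-up sample in the same shape as the real command output.
sample = "hwdata\t\t\t\t\tinstall\nibus\t\t\t\t\tinstall\nvim\t\t\t\t\tdeinstall\n"
known = {"hwdata"}

unknown = package_names(sample) - known
print(sorted(unknown))
```

On Python 3, communicate() returns bytes unless the Popen call is given universal_newlines=True, so the output would need to be decoded before being passed to a helper like this.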

6 Comments

Sorry, I just started to learn Python today. I do not understand why I would get an indentation error. File "test.py", line 16 installed_packages = setFromString(stdout) ^ IndentationError: unindent does not match any outer indentation level
I unindented the main() function, but now it is saying stdout is not defined. installed_packages = setFromString(stdout) NameError: name 'stdout' is not defined
@Poweruser32: I've included the subprocess creation, if you set the name of your script properly it ought to work.
I tried this: pastebin.com/TnStVk5w and it just gives me the raw output of the dpkg command I give it.
@Poweruser32: try printing installed_packages and known_packages, make sure they are populating correctly? If known_packages is empty, unknown_packages will be identical to installed_packages (which sounds like what you are describing).

The set constructor takes an iterable as its argument, so if installedPackages is a string containing multiple items, you need to split it on the delimiter first; otherwise you get a set of its individual characters. For example, the following code splits the string on commas:

for line in set(installedPackages.split(',')) - set(b):
    c.write(line)
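
When the delimiter is a newline, there is a further catch: lines read from a file keep their trailing "\n", while the pieces produced by split do not, so the two sets share no equal entries until the newlines are stripped from the file side. A small sketch of that idea, assuming Python 3 and using made-up sample data (a StringIO stands in for the open file b):

```python
from io import StringIO

installedPackages = "hwdata\nibus\nvim"   # hypothetical sample string
b = StringIO("hwdata\nibus\n")            # stands in for open("packages.txt")

# Entries from the string have no newline; strip it from the file lines too.
installed = set(installedPackages.splitlines())
known = set(line.rstrip("\n") for line in b)

# Build the result as a string variable rather than writing it to a file.
unique = "\n".join(sorted(installed - known))
print(unique)
```

Without the rstrip call, "hwdata" and "hwdata\n" would be treated as different entries and nothing would be subtracted.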

2 Comments

I just tried "for line in set(installedPackages.split('\n')) - set(b):" and it produces no output
Edit your question to show the code where you assign the string to installedPackages.
