I have a class, com_and_url, containing two variables, com and url. For example:

a = com_and_url(com, url)

And a file I want to read whose content is like this:

Google
www.google.com
yahoo
www.yahoo.com
facebook
www.facebook.com

So here is the pseudocode:

list_com_and_url = []

for com, url in f:
    list_com_and_url.append( com_and_url( com, url ) )

Can Python do this? Thanks for the help!

4 Answers

2

Try this:

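list_com_and_url = []
with open('my_file') as f:
    for line in f:
        # line is the company name; next(f) pulls the following line (the URL)
        name, url = line.strip(), next(f, '').strip()
        # zip(name, url) would pair individual characters, so build the class directly
        list_com_and_url.append(com_and_url(name, url))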
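For anyone who wants to try the idea without a file on disk, here is a minimal self-contained sketch. It makes a few assumptions that are not in the question: ComAndUrl is a hypothetical namedtuple standing in for the asker's class, read_pairs is an illustrative helper name, and io.StringIO plays the role of the opened file. A trailing line without a partner is simply dropped.

```python
import io
from collections import namedtuple

# Hypothetical stand-in for the asker's com_and_url class
ComAndUrl = namedtuple("ComAndUrl", ["com", "url"])

def read_pairs(f):
    """Pair each line with the one after it; a trailing unpaired line is dropped."""
    pairs = []
    it = iter(f)
    for name in it:
        url = next(it, None)  # None when the file has an odd number of lines
        if url is None:
            break
        pairs.append(ComAndUrl(name.strip(), url.strip()))
    return pairs

# In-memory sample with the same layout as the file in the question
sample = io.StringIO(
    "Google\nwww.google.com\nyahoo\nwww.yahoo.com\nfacebook\nwww.facebook.com\n"
)
pairs = read_pairs(sample)
print(pairs[0].com, pairs[0].url)
```

Passing a default to next() avoids a StopIteration escaping the loop when the last name has no URL after it.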

11 Comments

@richmondwang It does work, but I don't find the use of next(f) inside the loop over f to be particularly readable.
@njzk2 next() advances to the next line and returns its text. What do you suggest?
@Hackaholic I have posted an answer. What I don't like about this is that the loop itself is manipulated, and the contract of that for loop, which is basically "do this for each line in the file", is changed halfway through. (I agree that the contract of the loop is really more like "read a line of the file and, if the line exists, enter the loop body", but since it is the generic for loop structure, this is the kind of thing that is easily overlooked.)
I like your way, which is really clear and readable. However, I don't know what zip is. Is it my class named 'com_and_url', or did you rename my class to 'zip'?
@MarsLee zip is a built-in Python function.
1

The shortest and (I find) clearest way is using slices:

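with open('my_file') as f:
    lines = f.read().splitlines()  # whole file in memory, newlines removed
# Even-indexed lines are names, odd-indexed lines are URLs
for com, url in zip(lines[::2], lines[1::2]):
    print(com, url)  # Do stuff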

(Note: do not attempt this with files that don't fit in memory.)
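If memory is a concern, one possible alternative is to pass the same iterator to zip() twice, so each tuple takes two successive lines and nothing is loaded up front. This is only a sketch: iter_pairs is a hypothetical helper name, and io.StringIO stands in for a real open file.

```python
import io

def iter_pairs(f):
    """Lazily yield (com, url) tuples from consecutive lines of an open file."""
    lines = (line.strip() for line in f)
    # Both arguments are the same iterator, so zip draws two successive lines per tuple
    return zip(lines, lines)

# In-memory sample standing in for the real file
sample = io.StringIO("Google\nwww.google.com\nyahoo\nwww.yahoo.com\n")
result = list(iter_pairs(sample))
print(result)
```

In Python 3, zip() is lazy, so only two lines are held at a time while iterating. A final line without a partner is silently dropped.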

1 Comment

This is a precise approach and is efficient in terms of runtime. From a memory perspective, though, it is really expensive, especially when dealing with huge files.
0

You can just read two lines at a time:

with open('file.txt') as f:
    lines = f.read().splitlines()
    com_and_url = (lines[i:i+2] for i in range(0, len(lines), 2))

This creates a generator, so to print the pairs, convert it to a list:

print(list(com_and_url))

Output:

[['Google', 'www.google.com'],
 ['yahoo', 'www.yahoo.com'],
 ['facebook', 'www.facebook.com']]

0

For a memory-efficient approach (using iterators and avoiding loading all the lines into memory), you can use itertools.tee() to create two independent iterators from your file object (which is itself an iterator). Then use itertools.islice() to put the even lines in f and the odd lines in next_f, and finally use zip() (itertools.izip() in Python 2) to create an iterator over the paired lines.

from itertools import tee, islice

with open('my_file') as f:
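    # Two independent iterators over the same file
    f, next_f = tee(f)
    # f yields lines 0, 2, 4, ... (names); next_f yields lines 1, 3, 5, ... (URLs)
    f, next_f = islice(f, 0, None, 2), islice(next_f, 1, None, 2)
    list_com_and_url = [com_and_url(com.strip(), url.strip()) for com, url in zip(f, next_f)]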
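As a quick sanity check of the tee()/islice() pairing, here is a small sketch run against an in-memory file. The helper name pair_with_tee is hypothetical, io.StringIO stands in for the real file, and plain tuples are used in place of the asker's class.

```python
import io
from itertools import islice, tee

def pair_with_tee(f):
    """Pair even-indexed lines with the odd-indexed lines that follow them."""
    evens, odds = tee(f)               # two independent iterators over f
    evens = islice(evens, 0, None, 2)  # lines 0, 2, 4, ...
    odds = islice(odds, 1, None, 2)    # lines 1, 3, 5, ...
    return [(a.strip(), b.strip()) for a, b in zip(evens, odds)]

# Four input lines: 1, 2, 3, 4
sample = io.StringIO("1\n2\n3\n4\n")
tee_pairs = pair_with_tee(sample)
print(tee_pairs)
```

With a step of 2 on both slices, the pairs do not overlap: each line lands in exactly one tuple.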

3 Comments

Why do you consume the first line?
Thank you for your answer! This is a whole new method for me. I need a little time to digest it! Thank you :)
That does not do what the OP wants. E.g., for an input of [1,2,3,4], the expected output is [[1,2], [3,4]], but this returns [[1,2], [2,3], [3,4]].
