I'm new to Python on OS X and need to plot data from two columns of a .txt file. On Windows I used 'x[:,0]' indexing to select columns, but this does not seem to work on Mac. I have tried the following:

f = open(os.path.expanduser("~/Desktop/a.txt.rtf"))

lines=f.readlines()

result=[]

for x in lines:
    result.append(x.split(' ')[0])

for y in lines:
    result.append(y.split(' ')[1]) 

f.close()

plt.plot(x,y)
plt.show()

But it says that the list index is out of range, even though the test file just reads:

1  2
3  4
5  6
7  8

How can that be? Please help!

After solving this I need to know the Mac alternative to the "skip_header =" function (as the file I want to use has the data I need starting 25 rows down...)

Thanks in advance, and sorry if these are easy queries but I just can't make it work :(

  • Thanks for your comment! The full error message is "list index out of range", referring to the line containing [1]. Commented Jan 24, 2018 at 15:41
  • I know. Tracebacks contain more valuable information than the bare error message, such as the module and the line where the error happened. This is why SO asks you to post the full traceback in the question. I suggest you edit your question and add this information. Commented Jan 24, 2018 at 15:45
  • I can't make sense of the full traceback, so I'll attempt to learn the OSX approach from elsewhere. I do hope others' first posts on Stack Overflow are not met with such keyboard aggression. Commented Jan 24, 2018 at 15:57
  • Whoa there, OGTW. It can feel like that, but this is helpful instruction you're getting on the basic skills of posting questions and having them resolved. What you received from Piinthesky in this instance is helpful instruction, not bullying or criticism. As newbies here, we have all been through it. The request to put the full traceback in the body of the question is standard. People read the question, not often the comments, so that is where the info needs to be. Be patient, and I hope you stick around. Commented Jan 28, 2018 at 6:28
  • Further comment. I like your question, and you have composed and presented it well overall. Most newcomers do not grasp the notion of code blocks, for example. Well done. Commented Jan 28, 2018 at 6:30

1 Answer


This is not an easy question at all. It is a very good question, and many people face the same problem in their daily work. Your question will help others as well!

The error occurs because you are trying to read a so-called Rich Text Format (RTF) file. The real content of the file is not what you see on screen, but encoded markup strings.

Instead of

['1  2', '3  4',...]

f.readlines() actually returns something like

['{\\rtf1\\adeflang1025\\ansi\\ansicpg1252\\uc1\\adeff31507\\deff0\\stshfdbch31505\\stshfloch31506\\stshfhich31506\\stshfbi31507\\...]

Therefore, when you try to index the split line, you get an index out of range error.
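To see why the index fails, here is a minimal sketch. It reuses the start of the RTF header shown above; `second_column` is a hypothetical helper added only for illustration, not part of the original code.

```python
# A chunk of RTF markup with no spaces splits into a single item,
# so asking for item [1] raises IndexError.
rtf_line = r"{\rtf1\adeflang1025\ansi\ansicpg1252\uc1"
rtf_fields = rtf_line.split(' ')

# With plain text the split succeeds, but split(' ') keeps an empty
# string between the two spaces, while split() collapses whitespace.
plain_line = "1  2\n"
fields_single_space = plain_line.split(' ')
fields_whitespace = plain_line.split()


def second_column(line):
    """Return the second whitespace-separated field, or None if it is missing."""
    fields = line.split()
    return fields[1] if len(fields) > 1 else None
```

Note that even on a plain-text file, `split(' ')[1]` would pick up the empty string between the two spaces, which is why calling `split()` with no argument is the safer choice for this data.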

There are three ways to solve this problem. First, you can convert the RTF to plain text and read the text file with readlines(), as you did. Second, you can read the RTF with a third-party parser. Third, you can parse the RTF yourself with regular expressions. Here are some useful links:

convert RTF

parse RTF
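Before choosing one of these routes, it may help to confirm that the file really is RTF. The sketch below relies on the fact that RTF files begin with the characters `{\rtf`; the function name `looks_like_rtf`, the helper `write_temp`, and the sample contents are assumptions made for this example.

```python
import os
import tempfile


def looks_like_rtf(path):
    # RTF documents start with an opening brace followed by \rtf
    with open(path, "rb") as fh:
        return fh.read(5) == b"{\\rtf"


def write_temp(content, suffix):
    # Hypothetical helper: write bytes to a temporary file and return its path
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as fh:
        fh.write(content)
    return path


rtf_path = write_temp(b"{\\rtf1\\ansi 1  2}", ".rtf")
txt_path = write_temp(b"1  2\n3  4\n", ".txt")

rtf_result = looks_like_rtf(rtf_path)
txt_result = looks_like_rtf(txt_path)

os.remove(rtf_path)
os.remove(txt_path)
```

If the check returns True for your data file, the file needs converting first. On macOS, TextEdit can do this through Format > Make Plain Text before saving.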

Hope it is helpful.

Update

Though it is not entirely clear what you want to plot, I guess you want a scatter plot of the 1st column against the 2nd column of your data file. If so, you need to modify your code a bit. Below is an example.

Assume your a.txt file (not RTF) has the content

1  2
3  4
5  6
7  8

You can do this to draw an x-y scatter plot with the 1st column as x and the 2nd column as y.

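import os
import matplotlib.pyplot as plt

x, y = [], []

# The with block closes the file automatically
with open(os.path.expanduser("a.txt")) as f:
    for line in f:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        # Convert to float so matplotlib plots numbers, not strings
        x.append(float(fields[0]))
        y.append(float(fields[1]))

print(x, y)

plt.plot(x, y, 'o')  # 'o' draws markers only, giving a scatter plot
plt.show()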

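Or, as a one-liner:

with open(os.path.expanduser("a.txt")) as f:
    lines = f.readlines()

# Transpose the rows into two column tuples, converting each field to float
x, y = zip(*((float(v) for v in line.split()) for line in lines if line.strip()))

print(x, y)

plt.plot(x, y, 'o')
plt.show()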
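On the follow-up about header rows: `skip_header` is a parameter of `numpy.genfromtxt`, and NumPy is not platform-specific, so both it and the `x[:,0]` indexing should carry over to the Mac once the file is plain text. If you prefer to stay with the plain-Python approach, the same effect can be sketched with the standard library. The function name `read_columns` and the sample lines below are assumptions for illustration only.

```python
from itertools import islice


def read_columns(lines, skip_header=0):
    """Skip the first `skip_header` lines, then parse two numeric columns."""
    x, y = [], []
    for line in islice(lines, skip_header, None):
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or incomplete lines
        x.append(float(fields[0]))
        y.append(float(fields[1]))
    return x, y


# Stand-in for a file with two header lines followed by the data
sample = ["title\n", "units\n", "1  2\n", "3  4\n", "5  6\n", "7  8\n"]
x, y = read_columns(sample, skip_header=2)
```

An open file object can be passed in place of the list, for example `read_columns(f, skip_header=25)` for a file whose data starts 25 rows down.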

7 Comments

Thank you for your response, and for your reassurance that this matter is not so trivial! I converted the file to plain text and I have a new error message: Unrecognized character 7 in format string. I am yet to Google the error, but any guidance will be appreciated. Thanks.
I guess this might be due to a decoding error. Try a different converter to see if anything changes...
Hi englealuze, I tried a different converter and the same error was returned. I'm astounded that a simple task such as this is presently so challenging...
Hi @OGTW, then it is not a decoding problem. Further evidence is that the error message shows the character correctly decoded as 7: "Unrecognized character 7...". I think plt.plot(x,y) might be doing something you are not expecting. What exactly are this module and method?
@OGTW, you may also want to change your code a bit; see my update.