
Foreword:

I'd prefer to avoid lengthy processes if possible. As a beginner, I find one line packed with syntax a little overwhelming, so if I need to use something like that, please add a basic note on what each part does. It's not vital that I know, it just takes the edge off. Please point out where I'm using inefficient code and suggest better functions and/or modules; as I said, I have little knowledge of Python.

Situation:

I'm a newbie to pandas, but I've taken the time to play around with df.iloc[y, x] and df.loc[y, x] (where df is pd.read_excel('/my/file/name.xlsx', sheet_name='sheet1')) and, at least for the given formats, I understand what makes them tick. I know that these are going to be useful to me for my macros. I'm on Linux, so VBA isn't an easy option, and PyUNO for LibreOffice is a project I'm putting off for a while. I expect that the above functions aren't the best way to select a cell in Excel from Python.

What I've found:

Too much. For a beginner like me, most of the tutorials are very complex with little explanation; I can make the code there work, I just have no clue why it works that way. I've mostly found information relating to standard 'in-house' Python databases, and the Excel-related articles seem few and far between. The ones I've read unfortunately cover more advanced functions. I could probably learn them, but I'm not currently interested.

The issue:

Let's take a look at this code I wrote earlier, with a little help from pythonbasics.org:

import pandas as pd
import xlrd #not sure if this is needed, thonny assistant says its not, website says it is
df = pd.read_excel('/home/myname/Desktop/sheetname.xlsx', sheet_name='sheet1')
p = df.loc[5, 5]
p = str(p) #unsure if this does anything, I haven't got a write to A.txt either way
path = "/home/myname/Desktop/A.txt"
text_file = open(path, "w")
text_file.write('%s' % p)
text_file.close

Let's get rid of the mess. First, I read sheetname.xlsx and assign it to df:

df = pd.read_excel('/home/myname/Desktop/sheetname.xlsx', sheet_name='sheet1')

Now I try reading cell F6; let's keep the string conversion for p:

p = df.loc[5, 5]
p = str(p) 

Now that we've got p, let's open up a text file on my desktop:

path = "/home/myname/Desktop/A.txt"
text_file = open(path, "w")

All that's left is to 'paste' p into the text file. We opened it with 'w', so we can write over the file. p is a string, so we write with ('%s' % p):

text_file.write('%s' % p)
text_file.close

Now we should have the value of F6 (let's say it's "hello") in A.txt! Let's see:

A.txt;
   

..Oh

What I know:

All the write stuff works in a second program I have. The only difference is that p is replaced by another string variable, so I would guess that isn't the issue. However, when I call print(p) after converting with p = str(p), it gives me what I want, with the headers in place. I would like to remove the headers, but that's for later.

My question:

Given spreadsheet 'sheetname.xlsx' and worksheet 'sheet1', using pandas (or a better module for spreadsheet work if there is one), how can I assign the value of cell F6 (or any cell, switching up my selection is easy) to the variable p?

2 Answers


Solution to your problem:

You're going to be livid at how silly the fix is: you forgot to put () after text_file.close, so the close() method is never actually called. It doesn't throw a runtime error because text_file.close on its own is a valid expression; Python just looks up the method, discards it, and moves on to the following lines. Since the file is never closed, the text you wrote can stay in the write buffer instead of reaching A.txt.

Please try this:

path = "/home/myname/Desktop/A.txt"
text_file = open(path, "w")
text_file.write('%s' % p)
text_file.close()
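
As an aside, a with block is one common way to avoid this kind of mistake, because Python closes the file for you when the block ends. The sketch below is only illustrative: it writes to a temporary folder instead of the Desktop path from the question, and the value of p is a stand-in for the cell value. Both are assumptions.

```python
import os
import tempfile

# Stand-in for the cell value read from the spreadsheet (assumed for this sketch).
p = "hello"

# A temporary folder is used so the example runs anywhere;
# swap in your own path, e.g. the A.txt file on your Desktop.
path = os.path.join(tempfile.mkdtemp(), "A.txt")

# "with" closes the file automatically when the indented block ends,
# so there is no close() call to forget.
with open(path, "w") as text_file:
    text_file.write(str(p))

# Read the file back to confirm the text was written.
with open(path) as text_file:
    contents = text_file.read()

print(contents)
```

str(p) does the same job as '%s' % p here; either is fine.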

Additional:

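  • For the functionality you're using, pandas needs an Excel reader module installed in your environment, but you don't need to import it yourself. That module was xlrd when this was written; xlrd 2.0 and later no longer read .xlsx files, and recent pandas versions use openpyxl for them instead.
  • If you want to use integer indices freely on both dimensions with df.loc[], I suggest you pass the argument header=None to pd.read_excel().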

Excel File:

    +----+----+----+
    | A  | B  | C  |
+------------------+
|   |    |    |    |
| 1 | hi | hi | hi |
|   |    |    |    |
+------------------+
|   |    |    |    |
| 2 | hi | hi | hi |
|   |    |    |    |
+------------------+
|   |    |    |    |
| 3 | hi | hi | hi |
|   |    |    |    |
+---+----+----+----+

With automatic headers: (The first row becomes the column labels, which are strings. You must use a label, e.g. df.loc[0, "hi.1"])

import pandas as pd
df = pd.read_excel('Book1.xlsx', sheet_name='Sheet1')
df.head()

Output:

    +--------------+
    | hi |hi.1|hi.2|
+------------------+
|   |    |    |    |
| 0 | hi | hi | hi |
|   |    |    |    |
+------------------+
|   |    |    |    |
| 1 | hi | hi | hi |
|   |    |    |    |
+------------------+

Without headers: (These are integers. You can safely do df.loc[0, 2])

import pandas as pd
df = pd.read_excel('Book1.xlsx', sheet_name='Sheet1', header=None)
df.head()

Output:

    +----+----+----+
    | 0  | 1  | 2  |
+------------------+
|   |    |    |    |
| 0 | hi | hi | hi |
|   |    |    |    |
+------------------+
|   |    |    |    |
| 1 | hi | hi | hi |
|   |    |    |    |
+------------------+
|   |    |    |    |
| 2 | hi | hi | hi |
|   |    |    |    |
+---+----+----+----+
  • The reason I showed that difference is your remark that "when I call print(p) after converting p = str(p) it gives me what I want, with the headers in place". I'm not sure what you mean by 'with headers'. p is supposed to be a single cell value converted to a string. If headers appear in what you print, you may be looking at a whole row rather than one cell.

2 Comments

Well, time to go jump off a bridge. Thank you for pointing out the parentheses, that's probably why Thonny was saying that line did nothing. As for the rest of your answer, I'll give it a read after lunch!
The method you showed with pandas is what I was trying to do all morning. I pasted your code, made a few tweaks, and it does what I wanted it to! I think I'll use xlrd as naccode suggested because it seems more like VBA, which I'm used to, but thank you for the explanation!
0

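You can do it without pandas, using the xlrd module (note that xlrd 2.0 and later read only .xls files, so an .xlsx workbook needs an older xlrd or a different library):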

import xlrd

workbook = xlrd.open_workbook('sheetname.xlsx')
worksheet = workbook.sheet_by_name('sheet1')

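# Rows and columns are indexed as Python does (from 0), so cell 'A1' is (0, 0)
row, column = 5, 5  # cell 'F6'

# Read a specific cell and store its contents in a variable.
# worksheet.cell(row, column) would return a Cell object; cell_value() returns what is in it.
value = worksheet.cell_value(row, column)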
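
Since the question asks for cell F6 by its spreadsheet name, a small helper can translate an A1-style reference into the zero-based (row, column) pair used above. The function name cell_to_indices is hypothetical and not part of xlrd or pandas; treat this as a rough sketch rather than the standard way to do it.

```python
import re


def cell_to_indices(ref):
    """Convert an A1-style reference such as 'F6' to zero-based (row, column)."""
    # One or more letters followed by a row number that does not start with 0.
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a valid cell reference: {ref!r}")
    letters, digits = match.groups()

    # Column letters work like a base-26 number: A=1 ... Z=26, AA=27, ...
    column = 0
    for letter in letters.upper():
        column = column * 26 + (ord(letter) - ord("A") + 1)

    # Subtract 1 from both because Python counts from 0.
    return int(digits) - 1, column - 1


row, column = cell_to_indices("F6")
print(row, column)
```

The resulting pair can be passed to the xlrd call above, or to df.iloc[row, column] on a DataFrame read with header=None.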

1 Comment

I can't believe how easy that is! Seems exactly the same as VBA's Range("LetterNumber") function. I'll read up on xlrd and get my system up and running. Thank you!
