6

I wrote a python script that will pull excel files from a folder and write them into a SQL table. I got the code to work, but only if I delete the first line of the excel file which contains the headers. I'm new to Python so this is probably something simple, but I looked at a lot of different techniques and couldn't figure out how to insert it into my code. Any ideas would be greatly appreciated!

# Import arcpy module
from xlrd import open_workbook ,cellname
import arcpy
import pyodbc as p

# Database Connection Info
server = "myServer"
database = "my_Tables"
connStr = ('DRIVER={SQL Server Native Client 10.0};SERVER=' + server + ';DATABASE=' + database + ';' + 'Trusted_Connection=yes')

# Assign path to Excel file
file_to_import = '\\\\Location\\Report_Test.xls'

# Assign column count
column_count=10

# Open entire workbook
book = open_workbook(file_to_import)

# Use first sheet
sheet = book.sheet_by_index(0)

# Open connection to SQL Server Table
conn = p.connect(connStr)

# Get cursor
cursor = conn.cursor()

# Assign the query string without values once, outside the loop
query = "INSERT INTO HED_EMPLOYEE_DATA (Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"

# Iterate through each row

for row_index in range(sheet.nrows):

    row_num          = row_index
    Company          = sheet.cell(row_index,0).value
    Contact          = sheet.cell(row_index,1).value
    Email            = sheet.cell(row_index,2).value
    Name             = sheet.cell(row_index,3).value
    Address          = sheet.cell(row_index,4).value
    City             = sheet.cell(row_index,5).value
    CentralCities    = sheet.cell(row_index,6).value
    EnterpriseZones  = sheet.cell(row_index,7).value
    NEZ              = sheet.cell(row_index,8).value
    CDBG             = sheet.cell(row_index,9).value

    values = (Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG)

    cursor.execute(query, values)

# Close cursor
cursor.close()

# Commit transaction
conn.commit()

# Close SQL server connection
conn.close()
2
  • 2
    Sounds like you just need to change for row_index in range(sheet.nrows): to for row_index in range(1, sheet.nrows): to skip the first row. Commented May 31, 2013 at 14:51
  • There is nothing in Excel which says "this is a header row" so you either have to know how many lines to skip (Aya's comment) or guess. Commented May 31, 2013 at 14:52

2 Answers 2

10

You can initialize the iteration at the second row. Try the following:

for row_index in range(1,sheet.nrows):

Edit: If you need to iterate over a list of .xls files, as you asked in the comments, the basic idea is to perform an external loop over the files. Here it comes some hints:

# You need to import the os library. At the beinning of your code
import os

...
# Part of your code here
...

# Assign path to Excel file
#file_to_import = '\\\\Location\\Report_Test.xls'
folder_to_import = '\\\\Location'
l_files_to_import = os.listdir(folder_to_import)
for file_to_import in l_files_to_import:
    if file_to_import.endswith('.xls'):
        # The rest of your code here. Be careful with the indentation!
        column_count=10
        ...
Sign up to request clarification or add additional context in comments.

9 Comments

Aya, it worked brilliantly! Thank you! I figured it was something really obvious and I was just over-thinking it.
If I wanted to take it further and select all the excel files located in a folder.... would it be something like '*.xls' ? I tried this but it doesn't seem to like my syntax..
You need an external loop to iterate over the files. You can execute a command to list all files and then iterate over them. This would be an external loop to the one you already have
Where in the script would I put the external loop? I was thinking of changing the file_to_import to only point to the folder containing all the XLS files then start the external loop right after that. I'm sorry, i'm very new to python.
Hey Miquel, thank you for the help! The only thing that is keeping the script from running is that i'm getting an "IOError: [Errno 13] Permission denied". I have all the permissions correctly set to the folder where the XLS files are that it's trying to read from, so i'm not sure what the issue is.
|
0

or if you're lazy:

rows = 0
for row in ws.rows :
    if rows == 0 : 
        rows += 1
        continue
    ...

The advantage is that you won't have to fool with complicated cell dereferencing: you have the row object and can just do a row[col].value. Tidier.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.