
Finding Excel Cell Reference Using Python

Here is the Excel file in question: Context: I am writing a program which can pull values from a PDF and put them in the appropriate cell in an Excel file. Question: I want to writ

Solution 1:

Consider a solution translated from VBA, since Excel's Match function handles this well. Python can access the Excel object library through a COM interface using the win32com module (part of pywin32). Note that this solution assumes you are running Excel on Windows. The counterpart VBA function is included below for reference.

VBA Function (native interface)

If the function below is placed in a standard module in Excel, it can be called from a worksheet cell as =FindCell(..., ###)

' MATCHES ROW AND COL INPUT FOR CELL ADDRESS OUTPUT
Function FindCell(item As String, year As Integer) As String
    FindCell = Cells(Application.Match(item, Range("A1:A5"), 0), _
                     Application.Match(year, Range("A1:E1"), 0)).Address
End Function

Debug.Print FindCell("COGS", 2014)
' $C$3

Python Script (foreign interface, requiring all objects to be declared)

A try/except/finally block is used so that the Excel process is closed properly whether the script succeeds or fails.

import win32com.client

# MATCHES ROW AND COL INPUT FOR CELL ADDRESS OUTPUT
def FindCell(item, year):
    return xlWks.Cells(xlApp.WorksheetFunction.Match(item, xlWks.Range("A1:A5"), 0),
                       xlApp.WorksheetFunction.Match(year, xlWks.Range("A1:E1"), 0)).Address

# Initialize so the finally block is safe even if Dispatch or Open fails
xlApp = xlWbk = xlWks = None

try:
    xlApp = win32com.client.Dispatch("Excel.Application")
    xlWbk = xlApp.Workbooks.Open('C:/Path/To/Workbook.xlsx')
    xlWks = xlWbk.Worksheets("SHEETNAME")

    print(FindCell("COGS", 2014))
    # $C$3

except Exception as e:
    print(e)

finally:
    # Close without saving, then quit Excel (Quit is a method and must be called)
    if xlWbk is not None:
        xlWbk.Close(False)
    if xlApp is not None:
        xlApp.Quit()

    # Release the COM references
    xlWks = None
    xlWbk = None
    xlApp = None
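
For anyone without Excel at hand, here is a rough plain-Python sketch of what the two Match calls compute. The sample labels and years are made up for illustration (only "COGS" and 2014 come from the question), and match_exact and find_cell_position are hypothetical helper names. Note that Excel's MATCH with match type 0 also ignores case and accepts wildcards for text, which this sketch does not attempt.

```python
def match_exact(value, values):
    """Return the 1-based position of value in values, like MATCH(value, range, 0)."""
    for position, candidate in enumerate(values, start=1):
        if candidate == value:
            return position
    raise ValueError(f"{value!r} not found")


def find_cell_position(item, year, first_column, first_row):
    """Return (row, column) as 1-based numbers for the matching item and year."""
    row = match_exact(item, first_column)   # search down column A
    column = match_exact(year, first_row)   # search across row 1
    return row, column


# Hypothetical sample data standing in for A1:A5 and A1:E1
column_a = [None, "Revenue", "COGS", "Gross Profit", "Net Income"]
row_1 = [None, 2013, 2014, 2015, 2016]

print(find_cell_position("COGS", 2014, column_a, row_1))
# (3, 3), i.e. row 3, column 3, which Excel reports as $C$3
```

The row number comes from searching the label column and the column number from searching the header row, which is the same pairing the Cells(...) call uses.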

Solution 2:

There are a surprising number of details you need to get right to manipulate Excel files this way with openpyxl. First, it's worth knowing that the xlsx file contains two representations of each cell - the formula, and the current value of the formula. openpyxl can return either, and if you want values you should specify data_only=True when you open the file. Also, openpyxl is not able to calculate a new value when you change the formula for a cell - only Excel itself can do that. So inserting a MATCH() worksheet function won't solve your problem.
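
To make the formula-versus-value point concrete, here is a small sketch that writes a throwaway workbook with openpyxl and reads it back both ways. The file name demo.xlsx, the sheet contents, and the helper name formula_and_cached_value are assumptions for illustration only. Because the file was never opened and recalculated by Excel, there should be no cached value stored for the formula cell.

```python
import os
import tempfile

import openpyxl


def formula_and_cached_value(path, sheet, ref):
    """Return (formula text, cached value) for one cell of an xlsx file."""
    wb_formulas = openpyxl.load_workbook(path)                # formulas as text
    wb_values = openpyxl.load_workbook(path, data_only=True)  # last saved values
    try:
        return wb_formulas[sheet][ref].value, wb_values[sheet][ref].value
    finally:
        wb_formulas.close()
        wb_values.close()


# Build a throwaway workbook containing one formula (hypothetical data)
wb = openpyxl.Workbook()
ws = wb.active
ws.title = "Sheet1"
ws["A1"] = 2
ws["A2"] = 3
ws["A3"] = "=SUM(A1:A2)"
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
wb.save(path)

formula, cached = formula_and_cached_value(path, "Sheet1", "A3")
print(formula)  # =SUM(A1:A2)
print(cached)   # None, since openpyxl did not calculate the formula
```

Once the same file has been opened and saved in Excel, the data_only=True read would be expected to return the calculated number instead of None.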

The code below does what you want, mostly in Python. It uses the "A1" reference style, and does some arithmetic to turn column numbers into column letters. This breaks down past column Z, because columns from AA onward need more than one letter. In that case, you may want to switch to numeric references to rows and columns. But hopefully this will get you on your way.

Note: This code assumes you are reading a workbook called 'test.xlsx', and that 'COGS' is in a list of items in 'Sheet1!A2:A5' and 2014 is in a list of years in 'Sheet1!B1:E1'.

import openpyxl

def get_xlsx_region(xlsx_file, sheet, region):
    """ Return a rectangular region from the specified file.
    The data are returned as a list of rows, where each row contains a list
    of cell values"""

    # 'data_only=True' tells openpyxl to return values instead of formulas
    # 'read_only=True' makes openpyxl much faster (fast enough that it
    # doesn't hurt to open the file once for each region).
    wb = openpyxl.load_workbook(xlsx_file, data_only=True, read_only=True)

    reg = wb[sheet][region]
    values = [[cell.value for cell in row] for row in reg]

    # read-only workbooks keep the file open until explicitly closed
    wb.close()

    return values

# cache the lists of years and items
# get the first (only) row of the 'B1:E1' region
years = get_xlsx_region('test.xlsx', 'Sheet1', 'B1:E1')[0]
# get the first (only) column of the 'A2:A5' region
items = [r[0] for r in get_xlsx_region('test.xlsx', 'Sheet1', 'A2:A5')]

def find_correct_cell(year, item):
    # find the indexes for 'COGS' and 2014
    year_col = chr(ord('B') + years.index(year))   # only works in A:Z range
    item_row = 2 + items.index(item)

    cell_reference = year_col + str(item_row)

    return cell_reference

print(find_correct_cell(year=2014, item='COGS'))
# C3
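
If the year headers might run past column Z, the single-letter arithmetic above can be swapped for a general conversion from column number to letters. The following is a minimal sketch in plain Python; column_letter and cell_reference are hypothetical helper names, and the 30-year header list is invented to force a two-letter column. openpyxl also ships a helper for this, openpyxl.utils.get_column_letter, which may be preferable in real code.

```python
def column_letter(index):
    """Convert a 1-based column number to Excel letters (1 -> 'A', 27 -> 'AA')."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        # Excel columns are a bijective base-26 system, hence the index - 1
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_reference(years, items, year, item, first_col=2, first_row=2):
    """Build an A1-style reference; years start in column B, items in row 2."""
    col = first_col + years.index(year)
    row = first_row + items.index(item)
    return column_letter(col) + str(row)


# Hypothetical data: 30 year headers starting in column B, two item labels
years = list(range(1990, 2020))
items = ["Revenue", "COGS"]

print(cell_reference(years, items, year=2016, item="COGS"))
# AB3
```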
