Automating Excel Reports with Python

29 Apr 2018

guys - this is life-changing. AUTOMATE ALL YOUR EXCEL REPORTS WITH PYTHON!!

1) Basic writing of dataframe from pandas into an excel sheet

see here: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_excel.html AND http://xlsxwriter.readthedocs.io/example_pandas_multiple.html

# writing one dataframe to one excel file
df.to_excel()


# writing multiple dataframas into different sheets
writer = pd.ExcelWriter('pandas_multiple.xlsx', engine='xlsxwriter')

df1.to_excel(writer, sheet_name='Sheet1')
df2.to_excel(writer, sheet_name='Sheet2')
df3.to_excel(writer, sheet_name='Sheet3')

writer.save()

2) Write dataframe from pandas into excel sheet with number and cell color formatting:

# write to excel   
writer = pd.ExcelWriter(destination_filepath,engine='xlsxwriter')   
workbook=writer.book
worksheet=workbook.add_worksheet('sheetname')
writer.sheets['sheetname'] = worksheet

# define formats
format_num = workbook.add_format({'num_format': '_(* #,##0_);_(* (#,##0);_(* "-"??_);_(@_)'})
format_perc = workbook.add_format({'num_format': '0%;-0%;"-"'})
format_header = workbook.add_format({'bold': True, 'bg_color': '#C6EFCE'})

# write individual cells
worksheet.write(0, 2, "sheetname", format_header)
worksheet.write(1, 2, sum(df['abc']))
worksheet.write(1, 3, sum(df['def']))
worksheet.write(1, 4, sum(df['ghi']))

# write dataframe in
df.to_excel(writer,sheet_name='sheetname', startrow=2, startcol=0, index=False)  
worksheet.set_column(1, 1, 45)
worksheet.set_column(2, 4, 15, format_num)
worksheet.set_column(5, 5, 15, format_perc)
writer.save

3) Write your dataframe into pre-formatted Excel sheets

see the docs for more info: https://docs.xlwings.org/en/stable/datastructures.html

import xlwings as xw

list_of_values = [1, 2, 3] # this pastes as a row. to paste as a column, use [[1], [2], [3]]
workbook_path = 'C:/abc.xlsx' # make sure it's the FULL path, otherwise you will hit a pop-up prompt (manual input required) while overwriting existing files when saving
wb = xw.Book(workbook_path)
ws = wb.sheets['sheet1']
ws.range('E35').value = list_of_values # this can be a list or dataframe - just pick the top left cell to paste
wb.save() 
wb.close()

see here for an alternative method (which also throws an error for me - strange): https://stackoverflow.com/questions/9920935/easily-write-formatted-excel-from-python-start-with-excel-formatted-use-it-in

3) Creating a PivotTable in Excel

see here: https://stackoverflow.com/questions/22532019/creating-pivot-table-in-excel-using-python

import win32com.client
Excel   = win32com.client.gencache.EnsureDispatch('Excel.Application') # Excel = win32com.client.Dispatch('Excel.Application')

win32c = win32com.client.constants

wb = Excel.Workbooks.Add()
Sheet1 = wb.Worksheets("Sheet1")

TestData = [['Country','Name','Gender','Sign','Amount'],
             ['CH','Max' ,'M','Plus',123.4567],
             ['CH','Max' ,'M','Minus',-23.4567],
             ['CH','Max' ,'M','Plus',12.2314],
             ['CH','Max' ,'M','Minus',-2.2314],
             ['CH','Sam' ,'M','Plus',453.7685],
             ['CH','Sam' ,'M','Minus',-53.7685],
             ['CH','Sara','F','Plus',777.666],
             ['CH','Sara','F','Minus',-77.666],
             ['DE','Hans','M','Plus',345.088],
             ['DE','Hans','M','Minus',-45.088],
             ['DE','Paul','M','Plus',222.455],
             ['DE','Paul','M','Minus',-22.455]]

for i, TestDataRow in enumerate(TestData):
    for j, TestDataItem in enumerate(TestDataRow):
        Sheet1.Cells(i+2,j+4).Value = TestDataItem

cl1 = Sheet1.Cells(2,4)
cl2 = Sheet1.Cells(2+len(TestData)-1,4+len(TestData[0])-1)
PivotSourceRange = Sheet1.Range(cl1,cl2)

PivotSourceRange.Select()

Sheet2 = wb.Worksheets(2)
cl3=Sheet2.Cells(4,1)
PivotTargetRange=  Sheet2.Range(cl3,cl3)
PivotTableName = 'ReportPivotTable'

PivotCache = wb.PivotCaches().Create(SourceType=win32c.xlDatabase, SourceData=PivotSourceRange, Version=win32c.xlPivotTableVersion14)

PivotTable = PivotCache.CreatePivotTable(TableDestination=PivotTargetRange, TableName=PivotTableName, DefaultVersion=win32c.xlPivotTableVersion14)

PivotTable.PivotFields('Name').Orientation = win32c.xlRowField
PivotTable.PivotFields('Name').Position = 1
PivotTable.PivotFields('Gender').Orientation = win32c.xlPageField
PivotTable.PivotFields('Gender').Position = 1
PivotTable.PivotFields('Gender').CurrentPage = 'M'
PivotTable.PivotFields('Country').Orientation = win32c.xlColumnField
PivotTable.PivotFields('Country').Position = 1
PivotTable.PivotFields('Country').Subtotals = [False, False, False, False, False, False, False, False, False, False, False, False]
PivotTable.PivotFields('Sign').Orientation = win32c.xlColumnField
PivotTable.PivotFields('Sign').Position = 2

DataField = PivotTable.AddDataField(PivotTable.PivotFields('Amount'))
DataField.NumberFormat = '#\'##0.00'

Excel.Visible = 1

wb.SaveAs('ranges_and_offsets.xlsx')
Excel.Application.Quit()

Special feature for xlwings

see here: http://docs.xlwings.org/en/stable/quickstart.html

Bonus: Datables

Not excel but get interactive tables on your webpage!! Simple, impressive and FREE lol.

https://datatables.net/


Update (2018-10-08):

So I’ve been approached to share this link here: https://www.pyxll.com/blog/tools-for-working-with-excel-and-python/

Mostly it introduces PyXLL, by comparing against various tools for working with python and excel.

It’s always helpful to know about alternative solutions out there and their pros and cons. PyXLL seems like there’s lots of features and seems fairly powerful - if you can convince your company to fork out $250 a year, per user.

For me… I think I will stick to the free and open source alternatives for now :p