Why Combine Python and Excel?

Why should you combine Python and Excel? In short, Python removes the limitations set by Excel’s built-in formulas. The same reasons that make Python one of the most popular languages among application developers apply to those working with Python in Excel. Working with Python in Excel brings Python’s advanced analytics libraries (pandas, NumPy), automation capabilities, data visualization tools (Matplotlib) and scalability to Excel.

Prerequisites and Setup

Installing Python and pip

Download the latest version of Python from python.org.
During installation, check the box for “Add Python to PATH.”
Install or upgrade pip (Python’s package manager):
View the code on Gist.
Test the installation:
View the code on Gist.
Output: Most recent version of Python and pip

Once you have recent versions of Python and pip installed on your machine, we can move over to setting up Excel.

Enabling Excel-Python Add-Ins (xlwings, PyXLL)

xlwings

xlwings is a free, open source Excel-Python bridge that lets you call Python scripts from Excel. It’s ideal for automating reports, cleaning data and building custom functions without Visual Basic for Applications (VBA).

First, install on your machine using the terminal:
View the code on Gist.
Then enable it from Excel:
Open Excel, then Developer Tab, then Excel Add-Ins, and check “xlwings.”
Call RunPython from a VBA macro to execute Python code.

PyXLL

PyXLL is a professional-grade (paid) Excel-Python integration tool designed for production environments. It’s more advanced than xlwings.
It allows you to create custom Excel functions in Python, run Python code when workbooks open and integrate with advanced data science libraries.

Download from pyxll.com.
Install and activate your license.
Configure your pyxll.cfg (or .ini) file to set Python paths and options.
Restart Excel and start using Python functions directly from cells.

Use xlwings for free, open source Excel automation and quick Python integrations (e.g., pulling data from a CSV, cleaning it with pandas and sending it back to Excel). Choose PyXLL for advanced, production-grade solutions that require high performance, custom functions and professional support (e.g., creating a live market data feed in Excel with real-time calculations).

This tutorial uses xlwings rather than PyXLL since xlwings is the open source option.

Trying Microsoft 365 “Python in Excel” Preview

Microsoft 365’s “Python in Excel” (beta) is useful when you need immediate access to Python inside Excel without any installs. It’s not a true replacement for xlwings or PyXLL, as those tools are more full-featured. Good use cases for Microsoft 365 “Python in Excel” are:
Simple or exploratory tasks.
Experimenting with or learning Python in Excel.

Microsoft 365 “Python in Excel” gives you immediate access to Python libraries like pandas and Matplotlib using the =PY() function.

You can try the beta if you’re part of the Microsoft 365 Insiders Beta Channel. Open Excel and enter a formula like this:
View the code on Gist.
This function will run inside the cell.

Running Python Code From Excel

There are a few ways to do this based on your needs and preferences.

Inline Python Cells

Run small Python scripts using the =PY() function directly inside Excel cells. Think lightweight calculations and simple data transformations.
View the code on Gist.

User-Defined Functions (UDFs)

UDFs are custom functions that integrate Python logic into spreadsheets. You can create UDFs when using either xlwings or PyXLL.
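
As a rough illustration of what such a UDF can look like with xlwings, here is a minimal sketch. It is not the article's original code: the function name add_numbers and the fallback decorator are assumptions made for illustration. Only xw.func, the xlwings decorator that exposes a Python function to Excel, comes from the library itself.

```python
# Hypothetical UDF module: a sketch, not the article's original code.
try:
    import xlwings as xw

    excel_udf = xw.func  # xlwings decorator that exposes a function to Excel
except ImportError:
    # Fallback (an assumption for illustration): without xlwings installed,
    # leave the function undecorated so it can still be imported and tested.
    def excel_udf(function):
        return function


@excel_udf
def add_numbers(a, b):
    """Return the sum of two numbers passed in from worksheet cells."""
    return a + b
```

With the xlwings add-in pointed at this module and its functions imported, a cell formula such as =add_numbers(10, 5) would be expected to show 15. Note that xlwings UDFs are a Windows-only feature. Keeping the logic in a plain function also means it can be unit-tested outside Excel.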
UDFs are great for real-time data integration, custom financial calculations, and data validation and cleaning.

How to create a UDF:

Build a Python code file:
  Open a text editor or IDE (e.g., VS Code, PyCharm or Sublime).
  Save a new file, for example udfs.py.
  Write the code (e.g., basic addition code).
  View the code on Gist.
Connect to Excel (you must have xlwings installed and enabled in Excel):
  Point the add-in to your udfs.py file in the xlwings settings in Excel.
Call it from Excel:
  In any cell, type the function name as defined in udfs.py.
  View the code on Gist.
  Excel will run the Python code and return 15.

VBA-To-Python Interoperability

If you’re already using VBA macros, you can leverage them to trigger Python scripts using xlwings or PyXLL. This allows you to introduce Python to your VBA workflow without rebuilding everything from scratch.

Similar to the UDF example above, you need to have xlwings installed (or PyXLL, depending on which tool you’re using) and enabled in Excel.

Create a Python code file

The file needs to contain a main() function to match the VBA call. Let’s call the file vba_file.py.
View the code on Gist.

Configure the path in xlwings settings (or in your VBA code) to make sure Python knows where your script is located.

When you run the VBA macro inside Excel, it will call Python through xlwings (or PyXLL). After Python executes the code, the results can be sent back into Excel or used to update your workbook.

Data Analysis Workflows

These workflows do not use inline code. They follow the UDF example above closely (though not all of these functions are UDFs). That means you’ll have to point xlwings to your Python script file in Excel or configure it using your VBA RunPython calls.

Importing Data With pandas

Use the pandas library to import and analyze data from Excel files.
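
A minimal sketch of that import-and-analyze step, assuming pandas (1.3 or later) and openpyxl are installed. The function name summarize_sheet, the sheet names "Sales" and "Summary", and the "Revenue" column are hypothetical. Unlike the xlwings workflow described in this section, the sketch works on a saved .xlsx file on disk rather than on an open workbook.

```python
import pandas as pd


def summarize_sheet(path, sheet_name="Sales", value_column="Revenue"):
    """Read one sheet, compute basic statistics and write them to a 'Summary' sheet.

    The sheet and column names are hypothetical placeholders.
    """
    data = pd.read_excel(path, sheet_name=sheet_name)

    # count, sum and mean of the chosen column, as a one-column DataFrame
    summary = data[value_column].agg(["count", "sum", "mean"]).to_frame(name=value_column)

    # Append to the existing workbook; replace the Summary sheet if it already exists
    with pd.ExcelWriter(path, mode="a", engine="openpyxl", if_sheet_exists="replace") as writer:
        summary.to_excel(writer, sheet_name="Summary")

    return summary
```

Calling summarize_sheet("sales.xlsx") on a workbook with a "Sales" sheet would return the summary and also store it in the file, so the results are visible the next time the workbook is opened.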
The example below reads data from an Excel file, performs basic analysis and writes the results back to the workbook.
View the code on Gist.

Visualizing With Matplotlib

Matplotlib is a data visualization library. The example below reads data from an Excel file, generates a bar chart, saves it as an image and inserts the chart into the workbook.
View the code on Gist.

The machine learning (ML) example below uses pandas to handle Excel data and joblib to load a pre-trained model for making predictions. The script loads data from the Excel file, applies the pre-trained model to predict outcomes and writes the results back to the workbook.
View the code on Gist.

Reading and Writing Excel Files From Python

openpyxl and XlsxWriter are popular Python libraries for working with Excel files.

openpyxl Basics

openpyxl is a good fit when you need to read or modify existing Excel files. Think editing spreadsheets, working with formulas and reading data.

openpyxl works with Excel files on disk, meaning it reads from or writes to .xlsx files saved on your computer. You don’t need xlwings for this.

The example below opens a file, updates it and saves the changes.
View the code on Gist.

XlsxWriter Basics

XlsxWriter creates new Excel files with advanced formatting and charts. It can’t modify existing files. Think generating a report or file from scratch.

XlsxWriter is a standalone Python library. It writes files directly to disk but does not interact with an open Excel application.

The example below creates a formatted Excel file.
View the code on Gist.

Styling and Conditional Formatting

Styling makes reports visually appealing and easier to read.

You can use openpyxl to customize your Excel file’s appearance. You can set things like colors, fonts and borders.
View the code on Gist.

Conditional formatting highlights trends or important data automatically based on cell values.
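
To make the idea concrete, here is a minimal sketch of a conditional formatting rule in openpyxl. It is not the article's original code. The function name build_highlighted_workbook, the sample layout (values in column A) and the fill color are assumptions, and the list of values is assumed to be non-empty.

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import PatternFill


def build_highlighted_workbook(values, threshold=150):
    """Return a workbook whose column A cells above `threshold` get a red fill.

    Assumes `values` is a non-empty list of numbers.
    """
    workbook = Workbook()
    sheet = workbook.active
    for value in values:
        sheet.append([value])  # one value per row in column A

    red_fill = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid")
    rule = CellIsRule(operator="greaterThan", formula=[str(threshold)], fill=red_fill)

    # Attach the rule to the populated range; Excel evaluates it when the file is opened
    sheet.conditional_formatting.add(f"A1:A{len(values)}", rule)
    return workbook
```

Saving the result with build_highlighted_workbook([120, 180, 95, 210]).save("report.xlsx") would produce a file in which Excel shades 180 and 210. openpyxl only stores the rule; it does not evaluate it.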
Use conditional formatting rather than static styling to call out outliers and data thresholds, or to make analysis more intuitive.

The conditional formatting below highlights all cells greater than 150.
View the code on Gist.

Automating Reports and Dashboards

Automation is a huge reason why developers love Python, and it translates well to Excel users.

Scheduling Scripts

This is a familiar workflow for anyone who’s scheduled scripts before. You can use cron on macOS/Linux or Task Scheduler on Windows to run Python scripts at set times.

Emailing Generated Workbooks

Python’s built-in smtplib library lets your script connect to an SMTP email server and send emails with attachments, including your workbooks. This integration makes your Excel workflows automated end to end.

The example below sends an Excel file as an email:
View the code on Gist.

Performance and Security Considerations

Managing Memory and Large Datasets

As with any large file, performance can degrade quickly when working with large Excel files if you load everything into memory at once. Here are some tips:
Read files in chunks: pandas.read_csv(..., chunksize=50000) processes a CSV in smaller portions, which reduces memory use and helps prevent crashes. pandas.read_excel() has no chunksize option, but its nrows and skiprows parameters can limit how much of a sheet is loaded at once.
Leverage databases: You can offload heavy data processing or aggregation to a SQL database (such as PostgreSQL) instead of doing it in Python. Then you can fetch summarized or filtered data into Python for analysis or reporting.
Avoid unnecessary copies: Favor in-place operations over deep copies unless absolutely necessary when working with DataFrames.

Sandboxing and Macro Security

Running Python scripts and macros in Excel introduces security risks, especially if the files come from unknown or untrusted sources.
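
One inexpensive first check for an untrusted file is whether it carries a VBA project at all. The sketch below relies on the fact that modern Excel files (.xlsx, .xlsm) are ZIP archives and that macro-enabled workbooks store their VBA code in the part xl/vbaProject.bin. The function name contains_vba_project is hypothetical. Finding a VBA project only means macros are present, not that they are malicious, so this complements proper scanning tools rather than replacing them.

```python
import zipfile

VBA_PART = "xl/vbaProject.bin"  # where macro-enabled workbooks keep their VBA code


def contains_vba_project(path):
    """Return True if the workbook at `path` embeds a VBA project.

    Only ZIP-based Excel formats are handled; legacy .xls files are not
    ZIP archives and raise ValueError here.
    """
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a ZIP-based Excel file")
    with zipfile.ZipFile(path) as archive:
        return VBA_PART in archive.namelist()
```

A script could call this before handing a file to xlwings or opening it in Excel, and route any workbook that returns True to a stricter review step.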
Here are some tips:
Validate and scan files: Always scan Excel files for malicious macros or embedded scripts before running them.
Use virtual environments: In corporate or shared environments, run Python code inside isolated virtual environments (venv or conda) to contain dependencies and reduce the risk of system-wide impacts.
Restrict macro settings: Configure Excel’s macro security settings to disable macros from untrusted sources or to prompt before running them.

Conclusion

After reading this tutorial, you should be well on your way to working with Python in Excel. Though we only scratched the surface of what Python and Excel can do together, this is a solid start. Now you can work smarter, handle bigger data and create more insightful spreadsheets than ever before.

The post Why Combine Python and Excel? appeared first on The New Stack.