Working with Excel Files in Python Using `openpyxl`

Table of Contents

  1. Introduction
  2. Prerequisites
  3. Installing openpyxl
  4. Working with Excel Files
    1. Creating a New Excel File
    2. Loading an Existing Excel File
    3. Reading Data from Excel
    4. Writing Data to Excel
    5. Modifying Excel Files
    6. Saving and Closing Excel Files
  5. Conclusion

Introduction

Excel files are widely used for storing and analyzing data. With the openpyxl library in Python, we can easily manipulate Excel files programmatically. This tutorial will guide you through the process of working with Excel files using openpyxl. By the end of the tutorial, you will be able to create, read, write, and modify Excel files using Python.

Prerequisites

Before getting started, you should have a basic understanding of Python programming concepts. It would also be helpful to have some knowledge of how Excel files are structured and some familiarity with using Excel.

Installing openpyxl

To begin working with Excel files in Python, we need to install the openpyxl library. Open your terminal or command prompt and run the following command to install it using pip: `pip install openpyxl`. Once the installation is complete, we can start using the library in our Python programs.
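A quick way to confirm that the installation worked is to check whether Python can find the package. The sketch below uses the standard library for the lookup; the helper name `openpyxl_available` is made up for this illustration and is not part of openpyxl.

```python
from importlib.util import find_spec


def openpyxl_available() -> bool:
    """Return True if the openpyxl package can be imported (hypothetical helper)."""
    return find_spec("openpyxl") is not None


if openpyxl_available():
    import openpyxl

    print("openpyxl version:", openpyxl.__version__)
else:
    print("openpyxl is not installed; run: pip install openpyxl")
```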

Working with Excel Files

Creating a New Excel File

To create a new Excel file, we first need to import the `Workbook` class from openpyxl. Open a new Python script and add the following code:

```python
from openpyxl import Workbook

# Create a new workbook
workbook = Workbook()

# Select the active sheet
sheet = workbook.active

# Add data to the sheet
sheet["A1"] = "Hello"
sheet["B1"] = "World!"

# Save the workbook
workbook.save("new_excel_file.xlsx")
```

In this example, we imported the `Workbook` class from `openpyxl` and created a new workbook using the `Workbook()` constructor. We then selected the active sheet using the `active` attribute of the workbook. Next, we added data to cells `A1` and `B1` of the sheet. Finally, we saved the workbook using the `save()` method, specifying the file name as "new_excel_file.xlsx".
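A new workbook starts with a single sheet, but it can hold several. As a minimal sketch, the following renames the default sheet and adds a second one; the sheet names "Data" and "Summary" are arbitrary choices for illustration.

```python
from openpyxl import Workbook

workbook = Workbook()

# Rename the default sheet and add a second one
sheet = workbook.active
sheet.title = "Data"
summary = workbook.create_sheet("Summary")

# Each sheet has its own independent grid of cells
sheet["A1"] = "Hello"
summary["A1"] = "World!"

print(workbook.sheetnames)              # ['Data', 'Summary']
print(workbook["Summary"]["A1"].value)  # World!
```

Sheets can be looked up by name with `workbook["Summary"]`, which is handy once a workbook has more than one of them.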

Loading an Existing Excel File

To load an existing Excel file, we can use the `load_workbook()` function from openpyxl. Let’s modify our previous example to load an existing Excel file instead of creating a new one:

```python
from openpyxl import load_workbook

# Load an existing workbook
workbook = load_workbook("existing_excel_file.xlsx")

# Select the active sheet
sheet = workbook.active

# Display data from the sheet (the outputs shown assume the file
# contains the data written in the previous example)
print(sheet["A1"].value)  # Output: Hello
print(sheet["B1"].value)  # Output: World!
```

In this example, we imported the `load_workbook` function from `openpyxl` and used it to load an existing Excel file named "existing_excel_file.xlsx". We then selected the active sheet and printed the values of cells `A1` and `B1` to verify that the data was loaded correctly.
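When a file contains several sheets, it is often safer to select one by name than to rely on whichever sheet happens to be active. The sketch below wraps that in a small helper; `open_sheet` and the sheet name "Greetings" are hypothetical, and the example builds its own temporary file so it can run on its own.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def open_sheet(path, sheet_name):
    """Load a workbook and return the named sheet (hypothetical helper)."""
    workbook = load_workbook(path)
    if sheet_name not in workbook.sheetnames:
        raise KeyError(
            f"{sheet_name!r} not found; available sheets: {workbook.sheetnames}"
        )
    return workbook[sheet_name]


# Build a small file to load, so the example does not need an existing one
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "existing_excel_file.xlsx")
    source = Workbook()
    source.active.title = "Greetings"
    source.active["A1"] = "Hello"
    source.save(path)

    sheet = open_sheet(path, "Greetings")
    greeting = sheet["A1"].value

print(greeting)  # Hello
```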

Reading Data from Excel

Now that we know how to load an existing Excel file, let’s learn how to read data from it. We can access the value of a specific cell using its coordinates. Consider the following example:

```python
from openpyxl import load_workbook

# Load an existing workbook
workbook = load_workbook("data.xlsx")

# Select the active sheet
sheet = workbook.active

# Read data from individual cells
cell_a1 = sheet["A1"]
print(cell_a1.value)  # Output: John Doe

cell_b1 = sheet["B1"]
print(cell_b1.value)  # Output: the email address stored in B1

# Read data from a range of cells
data_range = sheet["A2:B4"]
for row in data_range:
    for cell in row:
        print(cell.value)
```

In this example, we loaded an Excel file named "data.xlsx" and selected the active sheet. We then accessed the values of cells `A1` and `B1` individually. Finally, we read the data from a range of cells (in this case, `A2:B4`) using nested loops. The outer loop iterates over the rows of the range, and the inner loop iterates over each cell in the current row, allowing us to access and print its value.
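For larger ranges, the `iter_rows()` method can replace the nested loops. With `values_only=True` it yields plain tuples of values rather than cell objects. The following is a minimal sketch on an in-memory workbook with made-up sample data, so it does not depend on a file being present.

```python
from openpyxl import Workbook

# Build a small in-memory sheet so the example does not depend on a file
workbook = Workbook()
sheet = workbook.active
sheet["A1"], sheet["B1"] = "Name", "Age"
sheet["A2"], sheet["B2"] = "Alice", 25
sheet["A3"], sheet["B3"] = "Bob", 30

# Skip the header row and get plain values instead of cell objects
rows = list(sheet.iter_rows(min_row=2, values_only=True))
for name, age in rows:
    print(name, age)
```

Each row arrives as a tuple, so it can be unpacked directly in the `for` statement.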

Writing Data to Excel

Writing data to Excel can be done by assigning a value to a specific cell or a range of cells. Let’s see how we can do it:

```python
from openpyxl import Workbook

# Create a new workbook
workbook = Workbook()

# Select the active sheet
sheet = workbook.active

# Write data to individual cells
sheet["A1"] = "Python"
sheet["B1"] = "Programming"

# Write data to a range of cells
data = [
    ["Alice", 25],
    ["Bob", 30],
    ["Charlie", 35],
]

for row_index, row_data in enumerate(data, start=2):
    for column_index, value in enumerate(row_data, start=1):
        sheet.cell(row=row_index, column=column_index, value=value)

# Save the workbook
workbook.save("data.xlsx")
```

In this example, we created a new workbook and selected the active sheet. We then assigned values to cells `A1` and `B1` individually. Additionally, we wrote data to a range of cells using nested loops. The outer loop iterates over each row in the `data` list, and the inner loop iterates over each value in the current row. We used the `cell()` method to specify the row and column indices and assign the corresponding value.
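When rows are simply added one after another, the sheet's `append()` method is a shorter alternative to tracking row and column indices by hand. A minimal sketch, reusing the same sample data with an assumed "Name"/"Age" header row:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active

data = [
    ["Alice", 25],
    ["Bob", 30],
    ["Charlie", 35],
]

# Header first, then one append() call per data row;
# each call writes to the row below the last used one
sheet.append(["Name", "Age"])
for row_data in data:
    sheet.append(row_data)

print(sheet.max_row)      # 4
print(sheet["A4"].value)  # Charlie
```

The `cell()` approach remains the better fit when values must land at specific coordinates rather than at the end of the sheet.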

Modifying Excel Files

Apart from reading and writing data, openpyxl allows us to modify various aspects of Excel files, such as formatting, merging cells, and creating charts. Let’s take a look at an example:

```python
from openpyxl import load_workbook
from openpyxl.chart import BarChart, Reference
from openpyxl.styles import Font

# Load an existing workbook
workbook = load_workbook("data.xlsx")

# Select the active sheet
sheet = workbook.active

# Formatting cells
sheet["A1"].font = Font(size=14, bold=True)
sheet["B1"].font = Font(size=14, italic=True)

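# Merging cells (row 6 is unused in data.xlsx as written earlier,
# so no existing values are overwritten)
sheet.merge_cells("A6:B6")
sheet["A6"] = "Personal Information"

# Creating a chart from the ages in B2:B4, labeled with the names in A2:A4
values = Reference(sheet, min_col=2, min_row=2, max_col=2, max_row=4)
names = Reference(sheet, min_col=1, min_row=2, max_row=4)
chart = BarChart()
chart.add_data(values)
chart.set_categories(names)
chart.title = "Ages"
chart.x_axis.title = "Name"
chart.y_axis.title = "Age"
sheet.add_chart(chart, "D2")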

# Save the modified workbook
workbook.save("modified_data.xlsx")
```

In this example, we loaded an existing Excel file named "data.xlsx" and selected the active sheet. We then applied a bold font to cell `A1` and an italic font to cell `B1`, and merged two adjacent cells to hold a heading. Next, we created a bar chart and added it to the sheet at cell `D2`. Finally, we saved the modified workbook as "modified_data.xlsx".
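One detail worth knowing before merging cells that already hold data: only the top-left cell of a merged range keeps its value. The sketch below demonstrates this on a fresh in-memory workbook with made-up values, and also shows how to list and undo merges.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "Name"
sheet["B1"] = "Age"

# Merge A1:B1; only the top-left cell keeps its value
sheet.merge_cells("A1:B1")
print(sheet["A1"].value)  # Name
print(sheet["B1"].value)  # None

# List the merged ranges, then undo the merge
merged = [cell_range.coord for cell_range in sheet.merged_cells.ranges]
print(merged)  # ['A1:B1']
sheet.unmerge_cells("A1:B1")
```

Unmerging does not bring back the discarded value, so it is safest to merge only cells that are empty or whose contents are no longer needed.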

Saving and Closing Excel Files

To save and close an Excel file, we use the `save()` and `close()` methods, respectively. Here’s an example:

```python
from openpyxl import load_workbook

# Load an existing workbook
workbook = load_workbook("data.xlsx")

# Select the active sheet
sheet = workbook.active

# Modify data or perform other operations

# Save the modified workbook
workbook.save("modified_data.xlsx")

# Close the workbook
workbook.close()
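```

In this example, we loaded an existing workbook, performed some operations, such as modifying data or applying formatting, and then saved the modified workbook using the `save()` method. Finally, we closed the workbook using the `close()` method. Note that `close()` only has an effect on workbooks opened in read-only or write-only mode; for a workbook loaded normally, as here, calling it is harmless but optional.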
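Closing matters most in read-only mode, where openpyxl keeps the file open while rows are read. As a sketch under that assumption, the standard library's `contextlib.closing` can make sure `close()` runs even if reading fails. The example writes its own temporary file with made-up data so it can run on its own.

```python
import os
import tempfile
from contextlib import closing

from openpyxl import Workbook, load_workbook

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.xlsx")

    # Create and save a small workbook to read back
    workbook = Workbook()
    workbook.active.append(["Alice", 25])
    workbook.save(path)

    # In read-only mode the file stays open until close() is called;
    # closing() guarantees that happens even if reading raises an error
    with closing(load_workbook(path, read_only=True)) as readonly_workbook:
        rows = list(readonly_workbook.active.iter_rows(values_only=True))

print(rows)  # [('Alice', 25)]
```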

Conclusion

In this tutorial, we explored how to work with Excel files in Python using the openpyxl library. We learned how to create new Excel files, load existing ones, read data from cells and ranges, write data to cells, and modify Excel files. We also covered additional features such as formatting cells, merging cells, and creating charts. With this knowledge, you can start automating tasks related to Excel files and efficiently process data using Python.