Let’s do photo editing with Python using NumPy and Matplotlib

Abdella Solomon
7 min read · Apr 9, 2024

Prerequisites: This article assumes a basic understanding of the Python programming language and algebra.

Hello, my fellow readers. I am here with another article, this time to show you how photo editing software works by editing some photos programmatically with Python. I will keep releasing articles about applied mathematics to simplify concepts for my audience. Follow my page if you haven’t done it yet. That being said, let’s dive into the article.

What is this article about?

This article aims to explain how photo editing software works in the background. It will help you understand how mathematics is applied in this field.

First of all, I want to mention that I am going to use Python along with two of its libraries: NumPy (for the mathematics part) and Matplotlib (to draw the edited pictures).

Getting back to our point, images are represented as matrices (collections of vectors). Transforming a matrix is called a matrix transformation in mathematics, a topic covered in linear algebra. No worries if you don’t have any idea about linear algebra; you can still enjoy the article. In general, photo editing is an effective application of linear algebra.

The applications included in this article are:
1. cropping, 2. resizing, 3. rotating, and 4. flipping a photo. Some of them involve only logic and basic mathematics, and others involve linear algebra. I will detail each of them below.

Cropping an image

First, let’s import the required libraries, namely numpy and matplotlib, and load the image that we are going to work with.

import numpy as np
import matplotlib.pyplot as plt
img = plt.imread('cat.jpg')
plt.imshow(img)
Output

We are going to play around with this cute cat image in this article. To show you what I said earlier (an image is a matrix), look at the picture below. As you can see, each innermost array holds the RGB color code of one pixel. A collection of pixels makes up the image.
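To make the "image is a matrix" idea concrete without the cat photo, here is a minimal sketch that builds a tiny synthetic image by hand. The array `tiny` and its pixel values are made up for illustration only.

```python
import numpy as np

# A tiny synthetic 2x3 RGB image (a hypothetical stand-in for the cat photo).
tiny = np.array(
    [
        [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
        [[0, 0, 0], [128, 128, 128], [255, 255, 255]],
    ],
    dtype=np.uint8,
)

print(tiny.shape)  # (rows, columns, channels) -> (2, 3, 3)
print(tiny[0, 1])  # the RGB triple of the pixel in row 0, column 1 (pure green)
```

The first index picks a row, the second picks a column, and what you get back is the three-element RGB vector of that pixel.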

Fascinating, right? Each row of the matrix is a vector of pixels, and the number of rows and the number of pixels per row give the image its height and width. The image has a dimension of 2560x1731.
Anyway, let’s now crop this image to 1200 rows by 2000 columns, keeping the top-left region.

crop_x, crop_y = 1200, 2000
img_crop = img[:crop_x, :crop_y, :]
plt.imshow(img_crop)
Output

As you can see, the crop keeps the top-left region and cuts away everything below and to the right of it. Now let’s crop equally from all sides toward the center instead.

x, y, z = img.shape # original image shape
center_x, center_y = x // 2, y // 2
img_crop = img[center_x - crop_x // 2: center_x + crop_x // 2, center_y - crop_y // 2: center_y + crop_y // 2, :]
plt.imshow(img_crop)
Output

Cropping does not involve linear algebra. It is just plain array slicing: we drop some of the rows and columns from the image matrix, which reduces its height and width. In the first case, we kept the top-left region and removed the bottom and right parts; in the second case, we removed an equal amount from every side and kept the center.
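The center crop can also be wrapped in a small reusable function. This is only a sketch: `center_crop` and the `demo` array are hypothetical names that are not part of the original notebook, and the function works out the offsets from the size difference, so odd crop sizes come out exactly.

```python
import numpy as np

def center_crop(img, crop_rows, crop_cols):
    """Return the central crop_rows x crop_cols window of img (hypothetical helper)."""
    rows, cols = img.shape[:2]
    # Clamp the request so a size larger than the image returns the whole axis.
    crop_rows = min(crop_rows, rows)
    crop_cols = min(crop_cols, cols)
    # Offsets of the window's top-left corner.
    top = (rows - crop_rows) // 2
    left = (cols - crop_cols) // 2
    return img[top:top + crop_rows, left:left + crop_cols]

# A small synthetic 6x8 RGB array stands in for a real photo.
demo = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)
cropped = center_crop(demo, 2, 4)
print(cropped.shape)  # (2, 4, 3)
```

Slicing returns a view of the original array, so no pixel data is copied until you modify or save the result.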

Resizing an image

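Let’s first write a function that does the resizing for us, given the image and the number of rows and columns we want.

def resize(img, new_rows, new_cols):
    # Nearest-neighbor resize: each output pixel copies the closest source pixel.
    # The first size argument is the number of rows (height), the second the number of columns (width).
    rows, cols, _ = img.shape
    img_resized = []
    for r in range(new_rows):
        row = []
        for c in range(new_cols):
            # Map the output position (r, c) back to a position in the original image.
            row.append(img[int(rows * r / new_rows)][int(cols * c / new_cols)])
        img_resized.append(row)
    return np.array(img_resized)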

Again, this does not involve linear algebra either. The function is based on logic and basic mathematics. It stretches or shrinks the image by copying pixels when the requested size is larger than the original and by skipping pixels when the requested size is smaller. Below is the math equation for the above function.

The resizing equation
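Read directly off the function above, the mapping can be written as follows. The symbols are my own notation: I is the original image with R rows and C columns, and I' is the resized image with R' rows and C' columns (the function's second and third arguments).

```latex
I'[r][c] \;=\; I\!\left[\left\lfloor \frac{R \cdot r}{R'} \right\rfloor\right]\!\left[\left\lfloor \frac{C \cdot c}{C'} \right\rfloor\right],
\qquad 0 \le r < R', \quad 0 \le c < C'
```

Each output pixel simply copies the source pixel whose position is the output position scaled by R/R' and C/C', rounded down.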

Let’s resize the image to 400 rows by 700 columns.

new_x, new_y = 400, 700
img_resized = resize(img, new_x, new_y)
plt.imshow(img_resized)
Output

Let’s now make the image larger than the original: 5000 rows by 3000 columns.

new_x, new_y = 5000, 3000
img_resized = resize(img, new_x, new_y)
plt.imshow(img_resized)
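The nested Python loops get slow for large targets such as the 5000x3000 case. As a sketch of an alternative, the same nearest-neighbor mapping can be expressed with NumPy index arrays. `resize_nearest` and `demo` are hypothetical names, not part of the original notebook.

```python
import numpy as np

def resize_nearest(img, new_rows, new_cols):
    """Nearest-neighbor resize using index arrays (hypothetical vectorized variant)."""
    rows, cols = img.shape[:2]
    # Source row/column for every output row/column, using integer floor division.
    row_idx = np.arange(new_rows) * rows // new_rows
    col_idx = np.arange(new_cols) * cols // new_cols
    # Broadcasting a (new_rows, 1) index against a (new_cols,) index selects the full grid.
    return img[row_idx[:, None], col_idx]

# A small synthetic 4x6 RGB array stands in for a real photo.
demo = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
bigger = resize_nearest(demo, 8, 12)   # upscale: pixels are repeated
smaller = resize_nearest(demo, 2, 3)   # downscale: pixels are skipped
print(bigger.shape, smaller.shape)     # (8, 12, 3) (2, 3, 3)
```

The pixel selection is the same as in the loop version; only the looping is handed over to NumPy.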

Rotating an image

Rotating an image requires a linear algebra tool called a rotation matrix. We are going to apply that concept here programmatically. Let’s create a Python function for the rotation.

def rotate(img, angle):
    # angle is in radians; the rotation is performed about the image center.
    rows, cols, _ = img.shape
    img_rotated = np.zeros((rows, cols, 3), dtype=np.uint8)
    # Compute cos and sin once, outside the loops, instead of once per pixel.
    cos_a = np.cos(angle)
    sin_a = np.sin(angle)
    for r in range(rows):
        for c in range(cols):
            # Shift coordinates so that the image center becomes the origin.
            x = r - rows // 2
            y = c - cols // 2
            # The two lines below are the linear algebra part.
            x_new = x * cos_a - y * sin_a
            y_new = x * sin_a + y * cos_a
            # Shift back and truncate to integer pixel indices.
            x_new = int(x_new + rows // 2)
            y_new = int(y_new + cols // 2)
            # Copy the pixel only when the source position lies inside the image.
            if 0 <= x_new < rows and 0 <= y_new < cols:
                img_rotated[r][c] = img[x_new][y_new]
    return img_rotated

This function is fairly long, so it is harder to write all of it as a single formula. The linear algebra part is the pair of lines marked with a comment in the function. Let me show you the mathematical representation of that part.

Rotation matrix
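Written out from the two commented lines of the function, with θ standing for the `angle` argument and (x, y) for the coordinates measured from the image center, the rotation is:

```latex
\begin{bmatrix} x_{\text{new}} \\ y_{\text{new}} \end{bmatrix}
=
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
```

Multiplying out the first row gives x·cos θ − y·sin θ and the second row gives x·sin θ + y·cos θ, which are exactly the two expressions in the code.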

The above matrix is what performs the rotation: multiplying a coordinate vector by it turns that vector through the given angle. That being said, let’s put this function to the test.

Let’s rotate the image by 180 degrees, which equals π radians. The function expects the angle in radians, so we pass π to it.

angle180 = np.pi
img_rotated = rotate(img, angle180)
plt.imshow(img_rotated)
Output

Amazing, right? Roughly speaking, we are just moving every pixel around the center by the required angle. One more try: let’s rotate it by just 60 degrees now.

angle60 = np.pi / 3
img_rotated = rotate(img, angle60)
plt.imshow(img_rotated)
Output
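On a full-size photo the double loop visits millions of pixels one at a time. A possible vectorized sketch applies the same rotation to all coordinates at once. `rotate_vectorized` and `demo` are hypothetical names, and this is one way to restructure the idea, not the code from the original notebook.

```python
import numpy as np

def rotate_vectorized(img, angle):
    """Rotate about the image center, angle in radians (hypothetical vectorized sketch)."""
    rows, cols = img.shape[:2]
    out = np.zeros_like(img)
    # Coordinates of every output pixel, shifted so the center is the origin.
    r, c = np.indices((rows, cols))
    x = r - rows // 2
    y = c - cols // 2
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    # Apply the 2x2 rotation matrix to all coordinates at once.
    x_new = x * cos_a - y * sin_a
    y_new = x * sin_a + y * cos_a
    # Shift back, truncate toward zero like int(), and keep in-bounds sources only.
    src_r = np.trunc(x_new + rows // 2).astype(int)
    src_c = np.trunc(y_new + cols // 2).astype(int)
    valid = (src_r >= 0) & (src_r < rows) & (src_c >= 0) & (src_c < cols)
    out[valid] = img[src_r[valid], src_c[valid]]
    return out

# A small all-white 9x9 RGB array stands in for a real photo.
demo = np.full((9, 9, 3), 255, dtype=np.uint8)
turned = rotate_vectorized(demo, np.pi / 4)
print(turned.shape)  # (9, 9, 3)
```

Pixels whose source position falls outside the image stay black, which is why the corners of a rotated photo come out dark.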

Flipping an image

Flipping an image doesn’t involve linear algebra. Rather, it just involves basic index arithmetic. For a horizontal flip, we move each pixel to the mirrored position in its row, so a pixel at some distance from the left border ends up at the same distance from the right border. For example, the top-right corner pixel moves to the top-left corner. Let’s write a Python function to do the flipping.

def flip(img):
    # Horizontal flip: mirror every row left to right.
    rows, cols, _ = img.shape
    img_flipped = np.zeros((rows, cols, 3), dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            # Column c takes the pixel at the same distance from the right border.
            img_flipped[r][c] = img[r][cols - c - 1]
    return img_flipped

Basically, this function relocates pixels as I described above. First, here is the original, unflipped image.

Let’s flip it now!

img_flip = flip(img)
plt.imshow(img_flip)
Output
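As a side note, the same mirroring can be sketched with a reversed slice instead of explicit loops. The `demo` array below is a made-up stand-in for the photo; `np.fliplr` and `np.flipud` are NumPy's built-in helpers for the two flip directions.

```python
import numpy as np

# A small synthetic 2x4 RGB array stands in for a real photo.
demo = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)

flipped_lr = demo[:, ::-1]  # reverse the column axis: horizontal (left-right) flip
flipped_ud = demo[::-1]     # reverse the row axis: vertical (up-down) flip

# NumPy's built-in helpers produce the same results.
print(np.array_equal(flipped_lr, np.fliplr(demo)))  # True
print(np.array_equal(flipped_ud, np.flipud(demo)))  # True
```

Reversing the column axis corresponds to the `cols - c - 1` index in the loop version.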

Look, the image is perfectly flipped. Amazing, isn’t it? That being said, let’s wrap up this article here.

That’s all for now. I will try to add more applications in the future. If you want the whole Jupyter notebook, you can find it on my GitHub page here. Let me know your suggestions and feedback in the comment section.
If you enjoyed the article, I guess I deserve a follow and a clap. Please share this article with your friends and family. Stay tuned!

My pages are Twitter Medium LinkedIn Telegram GitHub
