
Choosing between lossy and lossless conversion for PNG Published on: Oct 23, 2021

Png transformation

PNG and JPEG are the two oldest image formats that web developers use to add graphics to their web pages. Web standards came a long way, and often we use next-generation formats, such as WebP/AVIF/JpegXL, to deliver an image to the end-user. But JPEG and PNG are still the source of truth most of the time.

In this article, I’ll explain why converting PNG to the next-gen format is not as straightforward and how we went about it.

It’s also the 1st of November, a very fitting day to talk about images!

Today is Nov 1. That means we can celebrate RFC 1866, or when the <IMG> tag was introduced to the HTML spec.

🔸 "The <IMG> element refers to an image or icon via a hyperlink (see 7.3, "Simultaneous Presentation of Image Resources")."

A few links to keep you busy today: 🧵 ⬇️

— Henri Helvetica v2.2 👨🏾‍🚀 🇭🇹 (@HenriHelvetica) November 1, 2021

We use Go and the MagickWand library in the code snippets. You can get the latest version of the library using the ImageMagick Docker image that I maintain.

Our open-source image CDN transformimgs, available on GitHub, uses the approach from this blog post, and it works quite well in production.

Let’s get to it!

PNG

Let’s see what the main differences between JPEG and PNG are and when to use each.

Firstly, PNG is a lossless format, meaning the image doesn’t change when you compress and uncompress it. It’s like a ZIP archive that leaves the source data intact after you extract it. With JPEG, on the other hand, the resulting image is not the same as the original. So, if it’s important to display exactly the same image, then PNG is our choice. However, that’s rarely a requirement for the Web; rather, we want our images to look good. JPEG can introduce visual artifacts, and usually images with fewer details and colours are affected the most. Generally, it makes sense to use lossless compression for sharp images with few colours, such as logos, banners and illustrations. We’ll look at some examples in the next section.

Secondly, PNG supports transparency. If you’d like to show the page background through some parts of the image, you make those parts transparent. A popular use case is product images in online shops.

Keeping the above in mind, it shouldn’t be a problem to choose the format for the source image. However, the modern formats we use to deliver the image support both lossy and lossless encoding, so we can pick which one to use.

So, here comes the elephant in the room:

When is it better to use lossy or lossless compression when converting PNG to the next generation format?

Lossy or Lossless?

Lossy compressed images are smaller compared to lossless, but they also might have visual artifacts and glitches. Let’s take a look at some examples.

We use ImageMagick with the WebP format here because it’s supported by all browsers: magick image.png -define webp:lossless=[true/false] image.webp.

Example 1. Illustration

Original PNG image:

illustration with people

Original PNG - 294Kb

When converting to WebP:

  • Lossy - 100Kb
  • Lossless - 108Kb

The difference is 8Kb, which is 3% of the original image. Now, let’s zoom in a bit and see what happened to the quality:

zoomed in lossy image

Lossy WebP - 100Kb

zoomed in lossless image

Lossless WebP - 108Kb

The lossless version is the same as the original, while the lossy image became smoother and has some artifacts around the red notebook and the shadow from the sweater.

Our conclusion here was that 8Kb isn’t worth the quality loss, so we would prefer lossless compression over lossy in this case.

Example 2. Photo

Let’s look at another example, where PNG is used for transparency:

photo of a chair and a desk

Original PNG - 276Kb

When converting to WebP:

  • Lossy - 16Kb
  • Lossless - 143Kb

The difference here is 127Kb, which is 46% of the original image.

Let’s compare zoomed-in fragments:

zoomed in lossy image

Lossy WebP - 16Kb

zoomed in lossless image

Lossless WebP - 143Kb

There is a visible glitch in the plant pot’s texture, but it will most likely be invisible to the human eye due to the number of details and colours. The difference in size is huge, so the verdict is that lossy compression is much preferable in this case.

After running the experiments above on a bunch of PNGs, the requirement distilled itself:

We would like to distinguish between photos and illustrations/logos, then use lossy compression for the former and lossless for the latter.

Solution

After a brainstorming session, we came up with two conceptually different approaches:

  1. Use machine learning (we are a startup, after all). We have a good dataset, so we could train a model and use it. There are a few cons:
    • We don’t have an ML expert on the team.
    • Deployment would be complicated: training the ML model, then using it in the application.
    • How do you fix bugs in it?
    • What if it breaks free and takes over our servers/planet??!!!
  2. Write a boring algorithm (we are a profitable startup, after all) that would use image statistics. There are some cons here as well:
    • Might be less accurate.
    • Analysing images is memory- and CPU-intensive, which could be a showstopper for an Image CDN, where images should be processed in a reasonable time on the first request.

We decided to go with option number 2 and fall back to the first one if performance becomes a problem.

Implementation

The next two sections are all about the technical implementation. TL;DR: the code works and is currently deployed in production. You can jump to the results section if going through walls of code is not your cup of tea.

Before diving into the implementation we picked different types of PNGs that we want to classify. Using them we wrote a table unit test:

var isIllustrationTests = []*testIsIllustration{
	{"illustration-1.png", true},
	{"illustration-2.png", true},
	{"illustration-3.png", true},
	{"logo-1.png", true},
	{"logo-2.png", true},
	{"banner-1.png", false},
	{"screenshot-1.png", false},
	{"photo-1.png", false},
	{"photo-2.png", false},
	{"photo-3.png", false},
	{"product-1.png", false},
	{"product-2.png", false},
}

func TestImageMagick_IsIllustration(t *testing.T) {
	for _, tt := range isIllustrationTests {
		imgFile := tt.file

		f := fmt.Sprintf("%s/%s", "./test_files/is_illustration", imgFile)

		orig, err := ioutil.ReadFile(f)
		if err != nil {
			t.Errorf("Can't read file %s: %+v", f, err)
		}

		image := &img.Image{
			Id:       imgFile,
			Data:     orig,
			MimeType: "",
		}
		info, err := proc.LoadImageInfo(image)
		if err != nil {
			t.Errorf("could not load image info %s: %s", imgFile, err)
		}

		if info.Illustration != tt.isIllustration {
			t.Errorf("Expected [%t] for [%s], but got [%t]", tt.isIllustration, imgFile, info.Illustration)
		}
	}
}

Now, we need to make it green :)

After digging through the Internet, we found a very good article on image classification using ImageMagick, including a solution from Jim Van Zandt.

We’ve also reached out to the ImageMagick community and had quite a few very useful suggestions in this discussion thread.

That was a good starting point, so we implemented the algorithm but got mixed results. The original approach was intended for cartoon images, whereas we also wanted to include illustrations that have more colours and can be more complex than drawings.

Still, the idea of looking at 50% of the image felt like a step in the right direction; the statistic we based the decision on just didn’t behave exactly as we wished. After several hours of digging deeper and looking at the numbers, we figured that instead of comparing the number of pixels to the number of colours, it would be better to look at the ratio of the number of colours needed to cover 50% of the pixels to the total number of colours. Here is the first implementation:

func isIllustration(img []byte) (bool, error) {
	mw := imagick.NewMagickWand()

	err := mw.ReadImageBlob(img)
	if err != nil {
		return false, err
	}

	colorsCnt, colors := mw.GetImageHistogram()

	// Sorting colors by number of occurrences. 
	colorsCounts := make([]int, colorsCnt)
	for i, c := range colors {
		colorsCounts[i] = int(c.GetColorCount())
	}

	sort.Sort(sort.Reverse(sort.IntSlice(colorsCounts)))

	var (
		colorIdx         int
		count            int
		imageWidth       = mw.GetImageWidth()
		imageHeight      = mw.GetImageHeight()
		pixelsCount      = 0
		totalPixelsCount = float32(imageHeight * imageWidth)
		fiftyPercent     = int(totalPixelsCount * 0.5)
	)

	// Going through colors until reach 50% of all pixels
	for colorIdx, count = range colorsCounts {
		if pixelsCount > fiftyPercent {
			break
		}

		pixelsCount += count
	}

	colorsCntIn50Pct := colorIdx + 1

	// Calculate ratio between number of colors used for 50% of the image and 
	// make a decision based on that.
	return (float32(colorsCntIn50Pct)/float32(colorsCnt)) <= 0.02, nil

}

It worked for all test images except those with a background, which we wanted to exclude from the calculation. So we wrote a simple algorithm that removes the background colour.

That solved our unit tests, and we were ready to take a new version for a spin on the bigger scale!

We downloaded the 300 most popular PNGs currently optimised through Pixboost and ran them through the function. Then we compared the results manually and made some minor tweaks to increase accuracy, which is now between 98% and 99%.

Ready for production, we thought. However, we had run all the tests on powerful laptops, and once we moved to servers and put the code under load, we realised one thing:

It ate all the memory!

Performance

Image processing is a resource-intensive task. The MagickWand library we use builds an in-memory tree (cube) to calculate the image histogram (the number of colours), and the tree grows proportionally to the number of colours used. At first, we thought there was a memory leak, and we spent a lot of time trying to fix it. We failed in the end because there was no memory leak; Golang and Linux are just very smart about when to release a process’s memory.

But we still had a memory problem to solve. We identified two hotspots where the memory footprint increased dramatically: loading the image and calculating the histogram. To mitigate them, we skip images that are unlikely to be illustrations without decoding them, and downscale large images before calculating the histogram:

	// PNGs heavier than 1 byte per pixel compress poorly, which almost
	// always means a photo, so we skip them without decoding.
	if float32(len(imgData))/float32(imgWidth*imgHeight) > 1.0 {
		return false, nil
	}

	err := mw.ReadImageBlob(imgData)
	if err != nil {
		return false, err
	}

	// Downscaling to 500px caps the size of the histogram tree.
	if imgWidth*imgHeight > 500*500 {
		aspectRatio := float32(imgWidth) / float32(imgHeight)
		err = mw.ScaleImage(500, uint(500/aspectRatio))
		if err != nil {
			return false, err
		}
	}

	colorsCnt, colors := mw.GetImageHistogram()

We also added a few more optimisations that reduced the execution time and memory consumption further. Here is the final result, which you can also find on GitHub:

// isIllustration returns true if image is cartoon like, including
// icons, logos, illustrations.
//
// It returns false for banners, product images, photos.
//
// We use this function to decide on lossy or lossless conversion for PNG when converting
// to the next generation format.
//
// The initial idea is from here: https://legacy.imagemagick.org/Usage/compare/#type_reallife
func (p *ImageMagick) isIllustration(src *img.Image, info *img.Info) (bool, error) {
	if len(src.Data) < 20*1024 {
		return true, nil
	}

	if len(src.Data) > 1024*1024 {
		return false, nil
	}

	if float32(len(src.Data))/float32(info.Width*info.Height) > 1.0 {
		return false, nil
	}

	var (
		colors    []*imagick.PixelWand
		colorsCnt uint
	)

	mw := imagick.NewMagickWand()

	err := mw.ReadImageBlob(src.Data)
	if err != nil {
		return false, err
	}

	if info.Width*info.Height > 500*500 {
		aspectRatio := float32(info.Width) / float32(info.Height)
		err = mw.ScaleImage(500, uint(500/aspectRatio))
		if err != nil {
			return false, err
		}
	}

	colorsCnt, colors = mw.GetImageHistogram()
	if colorsCnt > 30000 {
		return false, nil
	}

	colorsCounts := make([]int, colorsCnt)
	for i, c := range colors {
		colorsCounts[i] = int(c.GetColorCount())
	}

	sort.Sort(sort.Reverse(sort.IntSlice(colorsCounts)))

	var (
		colorIdx         int
		count            int
		imageWidth       = mw.GetImageWidth()
		imageHeight      = mw.GetImageHeight()
		pixelsCount      = 0
		totalPixelsCount = float32(imageHeight * imageWidth)
		tenPercent       = int(totalPixelsCount * 0.1)
		fiftyPercent     = int(totalPixelsCount * 0.5)
		hasBackground    = false
	)

	for colorIdx, count = range colorsCounts {
		if colorIdx == 0 && count >= tenPercent {
			hasBackground = true
			fiftyPercent = int((totalPixelsCount - float32(count)) * 0.5)
			continue
		}

		if pixelsCount > fiftyPercent {
			break
		}

		pixelsCount += count
	}

	colorsCntIn50Pct := colorIdx + 1
	if hasBackground {
		colorsCntIn50Pct--
	}

	return colorsCntIn50Pct < 10 || (float32(colorsCntIn50Pct)/float32(colorsCnt)) <= 0.02, nil
}

The code we ended up with is straightforward, which means it will be easy to maintain and improve in the future.

At the end of the day, we still had to bump the instance type in our clusters from 8Gb to 16Gb of memory, which made everyone happy.

Results

In the grand scheme of things, PNGs account for only 5% of all Pixboost traffic, but the results were still impressive:

Once we released a canary version to production, we ran two tests on different data sets that again included the most popular processed images, comparing the output of the current production version with the new one:

Converted PNGs became 3 times smaller.

Conclusion

It’s been fun working on this feature, which involved a lot of discovery, testing and performance tuning. After all, our main goal is to deliver the best images to our users, and we accomplished that in this case. Happy days, and we look forward to improving the service further.

And you can all try it yourself using our open-source version or SaaS offering!