Programming lesson
Run-Length Encoding for Image Compression in Python: A Hands-On Guide for COP3502C
Learn run-length encoding (RLE) in Python with real image data. This tutorial covers encoding, decoding, hex conversion, and menu-driven programs—perfect for COP3502C homework 3 & 4.
Introduction to Run-Length Encoding for Image Compression
Run-length encoding (RLE) is a fundamental lossless compression technique used in many industry applications, including image storage and video game sprite rendering. In this tutorial, you will learn how to implement RLE encoding and decoding in Python, working with pixel data represented as lists of integers. This guide directly supports the COP3502C homework assignments on RLE with images, providing clear explanations and practical examples without giving away the full assignment solution.
What is Run-Length Encoding?
RLE replaces consecutive identical values (runs) with a pair: the run length and the value. For example, the flat data [0, 0, 2, 2, 2] becomes [2, 0, 3, 2]—meaning two zeros followed by three twos. This is especially effective for images with large uniform areas, such as pixel art in retro games or simple icons. Think of it like compressing a repeating pattern in a TikTok video filter or a Minecraft block texture: instead of storing every repeated pixel, you store the color once and count repeats.
Why RLE Matters in Modern Computing
RLE is used in image formats like BMP and PCX, and in early video game consoles. Understanding RLE gives you insight into how data compression works at a low level, a skill that transfers to more advanced algorithms like Huffman coding or LZW. With the rise of AI-generated images and high-resolution streaming, efficient compression remains critical. Even your favorite social media apps use similar principles to reduce image file sizes before upload.
Core Python Methods for RLE
Your assignment requires implementing eight methods. Below, we break down each with examples and common pitfalls.
1. to_hex_string(data)
Converts a list of integers (0-15) into a hexadecimal string without delimiters. For example, [3, 15, 6, 4] becomes "3f64". In Python, you can use format(value, 'x') to get a hex digit. Remember that values above 9 become letters a-f. This method is useful for debugging and for displaying flat or RLE data in hex.
def to_hex_string(data):
return ''.join(format(val, 'x') for val in data)
2. count_runs(flat_data)
Returns the number of runs (consecutive equal elements) in the flat data. For [15,15,15,4,4,4,4,4,4], there are two runs: three 15s and six 4s, so it returns 2. This is used to estimate the length of the RLE list. Implement by iterating through the list and counting changes.
def count_runs(flat_data):
if not flat_data:
return 0
runs = 1
for i in range(1, len(flat_data)):
if flat_data[i] != flat_data[i-1]:
runs += 1
return runs
3. encode_rle(flat_data)
Encodes flat data into RLE format. It returns a list where even indices are run lengths and odd indices are values. For the example above, it returns [3, 15, 6, 4]. Be careful: the run length must be stored as an integer, and values are 0-15.
def encode_rle(flat_data):
if not flat_data:
return []
rle = []
count = 1
for i in range(1, len(flat_data)):
if flat_data[i] == flat_data[i-1]:
count += 1
else:
rle.append(count)
rle.append(flat_data[i-1])
count = 1
rle.append(count)
rle.append(flat_data[-1])
return rle
4. get_decoded_length(rle_data)
Given RLE data, returns the total number of pixels after decoding. For [3,15,6,4], it returns 3+6=9. This is simply the sum of all run lengths (every even index).
def get_decoded_length(rle_data):
return sum(rle_data[::2])
5. decode_rle(rle_data)
Decodes RLE data back to flat data. Iterate over the RLE list in steps of 2; for each pair (length, value), append the value length times to the result.
def decode_rle(rle_data):
flat = []
for i in range(0, len(rle_data), 2):
length = rle_data[i]
value = rle_data[i+1]
flat.extend([value] * length)
return flat
6. string_to_data(data_string)
Converts a hex string (e.g., "3f64") into a list of integers. Each two characters represent one byte, but here each hex digit is a pixel value (0-15), so you can iterate over each character. Use int(char, 16).
def string_to_data(data_string):
return [int(ch, 16) for ch in data_string]
7. to_rle_string(rle_data)
Converts RLE data to a human-readable string: run length in decimal, run value in hex, separated by colons. For [15,15,6,4], the output is "15f:64". Note that the length can be 1-2 digits, and the value is always one hex digit.
def to_rle_string(rle_data):
parts = []
for i in range(0, len(rle_data), 2):
length = rle_data[i]
value = rle_data[i+1]
parts.append(f"{length}{format(value, 'x')}")
return ':'.join(parts)
8. string_to_rle(rle_string)
Inverse of #7: parses a string like "15f:64" into RLE data list. Split by ':', then for each part, parse the number (may be 1-2 digits) and the last hex digit.
def string_to_rle(rle_string):
rle = []
for part in rle_string.split(':'):
length = int(part[:-1])
value = int(part[-1], 16)
rle.append(length)
rle.append(value)
return rle
Building the Menu-Driven Program
Your program must present a menu with options to load data, display the image, and show RLE/flat representations. The menu should loop until the user chooses to exit. Use a variable to store the current image data (as a flat list of integers). Here's a skeleton:
def main():
print("Welcome to RLE Image Processor!")
ConsoleGfx.test_rainbow()
image_data = None
while True:
print_menu()
choice = input("Select a Menu Option: ")
if choice == '1':
filename = input("Enter name of file to load: ")
image_data = ConsoleGfx.load_file(filename)
elif choice == '2':
image_data = ConsoleGfx.test_image
print("Test image data loaded.")
elif choice == '3':
rle_str = input("Enter an RLE string to be decoded: ")
rle_data = string_to_rle(rle_str)
image_data = decode_rle(rle_data)
elif choice == '4':
hex_str = input("Enter the hex string holding RLE data: ")
rle_data = string_to_data(hex_str)
image_data = decode_rle(rle_data)
elif choice == '5':
hex_str = input("Enter the hex string holding flat data: ")
image_data = string_to_data(hex_str)
elif choice == '6':
if image_data:
ConsoleGfx.display_image(image_data)
elif choice == '7':
if image_data:
rle_data = encode_rle(image_data)
print("RLE representation:", to_rle_string(rle_data))
elif choice == '8':
if image_data:
rle_data = encode_rle(image_data)
print("RLE hex values:", to_hex_string(rle_data))
elif choice == '9':
if image_data:
print("Flat hex values:", to_hex_string(image_data))
elif choice == '0':
break
Common Pitfalls and Tips
- Data Types: Ensure all pixel values are integers 0-15. When converting hex strings, each character is a separate value.
- Run Length Limits: In standard RLE, runs can be longer than 15. The assignment does not impose a limit, but be aware that some implementations cap runs at 127 or 255.
- Delimiters: In
to_rle_string, the run length is in decimal, the value in hex, and they are concatenated without a delimiter between length and value; only runs are separated by colons. - Testing: Use the provided test files (like
uga.gfx) and the smiley example to verify your output matches exactly.
Connecting to Real-World Trends
RLE isn't just for homework—it's used in retro gaming emulators to compress sprite sheets, in PDF file compression, and even in AI model weight storage when many weights are zero (sparse matrices). For example, the popular game Minecraft uses a form of RLE to store chunk data efficiently. Understanding RLE gives you a foundation for more advanced topics like data compression in streaming services (Netflix, YouTube) where similar run-length ideas appear in video codecs.
Conclusion
By implementing these eight methods and the menu-driven program, you will gain hands-on experience with loops, lists, string manipulation, and type conversion in Python. RLE is a classic algorithm that remains relevant in modern applications. Use this tutorial as a reference to build your own solution for COP3502C homework 3 and 4, but remember to write your own code and test thoroughly. Good luck!