I'm building a Huffman compressor in Java.
I already have: the original text, the Huffman code table (Map&lt;Character, String&gt;), and the order in which characters appear.
My current goal is to write the compressed result into a .bin file.
However, the output file is larger than the original text, because each '0' or '1' bit is being stored as a full byte instead of being packed into real bits.
Here’s my current implementation:
    private static byte[] convertBitsToBytes(String bits) {
        int len = bits.length();
        int numBytes = (int) Math.ceil(len / 8.0);
        byte[] bytes = new byte[numBytes];
        for (int i = 0; i < len; i++) {
            if (bits.charAt(i) == '1') {
                // set bit i, MSB first within each byte; trailing bits stay 0
                bytes[i / 8] |= (byte) (1 << (7 - (i % 8)));
            }
        }
        return bytes;
    }
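As far as I can tell the packing helper itself behaves correctly; here is a quick sanity check (the class name `BitPackDemo` is just for the demo, the method is copied from above):

```java
public class BitPackDemo {
    // Same helper as above: packs a '0'/'1' string into bytes, MSB first,
    // padding the final byte with 0-bits on the right.
    static byte[] convertBitsToBytes(String bits) {
        int len = bits.length();
        byte[] bytes = new byte[(int) Math.ceil(len / 8.0)];
        for (int i = 0; i < len; i++) {
            if (bits.charAt(i) == '1') {
                bytes[i / 8] |= (byte) (1 << (7 - (i % 8)));
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] packed = convertBitsToBytes("10110");
        // "10110" is padded to "10110000" = 0xB0, i.e. -80 as a signed byte
        System.out.println(packed.length + " " + packed[0]); // prints "1 -80"
    }
}
```

So the size problem is not in this method but in how its output is written per character.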
    public static void saveBinaryFile(File file, String originalText, Map<Character, String> huffmanTable) {
        try (FileOutputStream fos = new FileOutputStream(file)) {
            for (char c : originalText.toCharArray()) {
                String huffmanCode = huffmanTable.get(c);
                if (huffmanCode != null) {
                    fos.write((byte) c); // one full byte for the character itself
                    byte[] compressedBytes = convertBitsToBytes(huffmanCode);
                    fos.write(compressedBytes); // each code padded up to whole bytes
                }
            }
            System.out.println("Binary file saved successfully.");
        } catch (IOException ex) {
            System.out.println("Error saving binary file: " + ex.getMessage());
        }
    }
I tried writing both the character and its Huffman code into the binary file. In an earlier attempt, each bit ('0' or '1') went out as a full byte via DataOutputStream.writeByte(); the code above instead packs each code word on its own with convertBitsToBytes().
I expected the resulting .bin file to contain each character followed by its compressed bit sequence, and overall to be smaller than the original text file.
However, the file ended up larger, because every code word is padded up to at least one whole byte and is preceded by a full byte for the character.
I’m trying to find a way to make it truly compressed by packing the bits efficiently. The result should be the equivalent of "h111e10l10l10o0" in the binary file, imagining I had those binary codes.
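One way to get real compression is to concatenate all the code words into a single bit string first and pack that once, so codes share bytes across character boundaries, and to record the bit count so a decoder can ignore the padding in the last byte. A sketch of that idea (the class name `HuffmanWriter` and the writeInt-header layout are my own choices, not anything from the original code; the helper is the one from above):

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Map;

public class HuffmanWriter {

    // Packs a '0'/'1' string into real bits, MSB first, zero-padded at the end.
    static byte[] convertBitsToBytes(String bits) {
        byte[] bytes = new byte[(bits.length() + 7) / 8];
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                bytes[i / 8] |= (byte) (1 << (7 - (i % 8)));
            }
        }
        return bytes;
    }

    public static void saveBinaryFile(File file, String originalText,
                                      Map<Character, String> huffmanTable) throws IOException {
        // 1. Concatenate ALL code words into one bit string first...
        StringBuilder bits = new StringBuilder();
        for (char c : originalText.toCharArray()) {
            String code = huffmanTable.get(c);
            if (code != null) {
                bits.append(code);
            }
        }
        // 2. ...then pack once, so codes are not padded per character.
        byte[] packed = convertBitsToBytes(bits.toString());
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            out.writeInt(bits.length()); // total bit count, so the decoder can drop the padding
            out.write(packed);
        }
    }
}
```

With a table like h→"111", e→"110", l→"10", o→"0", the text "hello" becomes the 11-bit string "11111010100", which fits in 2 packed bytes plus the 4-byte length header, instead of one-plus bytes per character. The characters themselves are no longer stored in the payload; a decoder rebuilds them by walking the Huffman tree (or table) over the bit stream, which is why the table has to be saved alongside the file anyway.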