
Questions tagged [compression]

5 votes
5 answers
771 views

I often see the claim that various data warehouse/analytical database systems derive significant performance benefits from compressing their data stores. On the face of it, though, this seems to be ...
Mason Wheeler's user avatar
4 votes
2 answers
754 views

We store zipped files in the storage of a cloud provider which contain certain fields (metadata). These files are derived from other, larger files. Every time we (re)generate these files, their 'last ...
MPIchael's user avatar
  • 269
0 votes
5 answers
293 views

We have a huge number of queries hitting our API that request a minor or major extract of some huge files lying around on our mounted hard drives. The data needs to be extracted from the files and ...
glades's user avatar
  • 493
30 votes
7 answers
17k views

If all data is essentially just a bit string, then all data can be represented as a number. Because a compression algorithm, c(x), must reduce or keep the same length of the input, then the compressed ...
Mercury's user avatar
  • 475
1 vote
1 answer
2k views

I have a JAR file, for example foo.jar. My code depends on a lot of libraries (almost 75 JAR dependencies). I am not using anything like Maven or Gradle; I'm just using pure Java with plain JAR files as ...
Day Trip's user avatar
2 votes
1 answer
1k views

I'm trying to make a program that produces PDF files. I've been studying the PDF format specification and specific PDF files whose format I'm trying to mimic. I found the line /FlateDecode in these ...
Zoltán Király's user avatar
0 votes
1 answer
388 views

I am writing a C# program where I need to print a lot of small barcodes in a 100x100 grid on a piece of paper. I then scan/photograph the paper and read the barcodes again. Each barcode only needs to ...
DrDress's user avatar
  • 127
4 votes
3 answers
283 views

Regarding cryptography and the issue of collisions, I posed a question as to whether it was ever possible to store every single possible combination of a bit array of a particular size, in a bit array ...
Anon's user avatar
  • 3,649
7 votes
2 answers
645 views

What is the difference between the average length of codes and the average length of codewords in the Huffman algorithm? Do both mean the same thing? I am stuck on some facts: I see a fact that is marked as False: for a ...
Emma Nic.'s user avatar
  • 183
0 votes
1 answer
90 views

I have multiple files (one per CountryCode), each of which gets ~5000 entries added to it per day. Each entry in the file looks like (256 chars max): {countryCode_customerId:{"ownerId": "...
sync101's user avatar
7 votes
2 answers
887 views

Software libraries targeting resource-constrained environments like embedded systems use conditional compilation to allow consumers to shave space by removing unused features from the final binaries ...
TZubiri's user avatar
  • 443
-2 votes
2 answers
590 views

I have 3 number arrays that I need to encode into a URL through query parameters. Example: http://localhost:3000/?r=133223333302302040&y=10000000000000000000&b=13333332002100122331 This is a ...
Simon's user avatar
  • 127
2 votes
2 answers
571 views

Today I came across a weird case for which I have no explanation, so here I am. I have two files with identical content, but one is encoded in UTF-8 and the other one is in IBM EBCDIC. Both of them ...
rodripf's user avatar
  • 137
11 votes
4 answers
2k views

From my experience, sql code changes almost always tend to be NOT incremental: someone creates a new stored procedure, or modifies an entire embedded sql query for optimization purposes, or creates a ...
CEGRD's user avatar
  • 235
0 votes
2 answers
139 views

I have been thinking about this idea for over 5 years and I don't have the complete technical knowledge to fully grasp the idea I'm having. The premise of the idea is to have an extremely high base number ...
Necro's user avatar
  • 105
6 votes
1 answer
9k views

Two things you can do in Java: (1) send a gzipped JSON body in response to an HTTP request; (2) send a StreamingOutput response to an HTTP request, where you begin sending a response before you know the ...
Malcolm Crum's user avatar
30 votes
5 answers
8k views

This question is about how many bits are required to store a range. Or put another way, for a given number of bits, what is the maximum range that can be stored and how? Imagine we want to store a ...
rghome's user avatar
  • 688
6 votes
1 answer
321 views

I am currently attempting to create a gravitational n-body simulation using a modified Barnes-Hut algorithm, to be more amenable to GPU computation. This is primarily a learning project. My goal is ...
john01dav's user avatar
  • 889
3 votes
2 answers
135 views

I'm a high school student interested in topics of computer programming. Recently I became interested in file compression, and in my head I tried to combine this with a completely different part of ...
Goel Nimi's user avatar
0 votes
1 answer
547 views

A short introduction to the problem: I'm working with a small database where I have a table of strings (web URLs, to be precise) as pairs: hash|string. Another table references these strings by hash ...
Violet Giraffe's user avatar
2 votes
3 answers
2k views

I am looking for text compression algorithms (natural language compression, rather than compression of arbitrary binary data). I have seen for example An Efficient Compression Code for Text ...
Lance Pollard's user avatar
5 votes
2 answers
16k views

I wrote a websocket server in Spring Boot and a client in Javascript. These work fine. I also wrote a second client in Java. When this one attempts to handle a frame after connecting to the host, I ...
Jeff's user avatar
  • 1,874
-2 votes
1 answer
197 views

Is it possible to compress a truly random permutation using low-order polynomial interpolation? If yes, how can it be achieved?
user9340043's user avatar
7 votes
4 answers
9k views

I have an array of unique integers, for example: {1,3,7,9,31,46,...}, which I want to compress. I have found compression techniques and algorithms for the list of integers, where some integers are ...
Angela's user avatar
  • 99
0 votes
1 answer
492 views

I was wondering whether what I have in mind already exists in any known compression programs/algorithms. We know that a seed gives us a constant sequence of random numbers, so if we are able to find a seed ...
M.kazem Akhgary's user avatar
2 votes
1 answer
398 views

I ran a test today to see how well WinRAR can compress a folder containing several copies of the same picture. For that I just put one 300 kB picture into a folder and copied it there 11 times, so that I ...
jusaca's user avatar
  • 175
4 votes
1 answer
371 views

This question was inspired by MessagePack, but I'm looking for a general answer about the advantages of in-app vs. external compression. For network I/O, doesn't the transport protocol (at least ...
Kevin Krumwiede's user avatar
36 votes
5 answers
25k views

As a web developer I have very little understanding of binary data. If I take the sentence "Hello world.", convert it to binary, and store it as binary in an SQL database, it seems like the 1s and ...
john doe's user avatar
  • 1,007
2 votes
1 answer
105 views

I have a stream of binary data. Assume no prior knowledge about the expected pattern in input data. The symbols can represent binary data or other symbols, hence hierarchical. The output should ...
Quark's user avatar
  • 37
2 votes
0 answers
107 views

Consider an alphabet of k symbols and a requirement to optimally encode a series of values of known frequency. The obvious choice for this is to use Huffman coding, which is known to be optimal for ...
Periata Breatta's user avatar
4 votes
5 answers
639 views

A while ago I asked a question about custom text data formats, instead of using existing tools such as XML, JSON, YAML, etc. Now, in favor of converting our custom format to a relational database and ...
Chris Cirefice's user avatar
-1 votes
2 answers
9k views

I'm working on a project that requires a TCP connection between a client and server. The current protocol encodes the data into hex and then sends it. However, hex increases the length of the payload ...
Awn's user avatar
  • 155
3 votes
1 answer
521 views

Is there a test to check whether a PDF file contains text or was created by scanning paper sheets? Text: plain text that, for example, I can copy & paste while I am reading the PDF. Not ...
Massimo's user avatar
  • 131
14 votes
2 answers
4k views

I have a video coming from a stationary camera. Both the resolution and the FPS are quite high. The data I get is in Bayer format and uses 10 bits per pixel. As there's no 10-bit data type on my ...
Headcrab's user avatar
  • 243
5 votes
2 answers
585 views

I have a collection of strings which have a lot of common substrings, and I'm trying to find a good way to define tokens to compress them. For instance, if my strings are: s1 = "String" s2 = "Bool" ...
ErikR's user avatar
  • 296
2 votes
1 answer
94 views

I have the following problem: I need to save a lot of XML strings of variable length and structure. As is usual with XML, a lot of substrings are the same (some element, attribute, and value combinations). Often the ...
mietzekotze's user avatar
9 votes
2 answers
1k views

I have a folder containing about 9,000 JPEG photos (about 30 GB), which I want to archive with some sort of compression. I understand that compressing JPEGs is not normally very effective, but these ...
Stephen's user avatar
  • 201
4 votes
2 answers
986 views

The Burrows-Wheeler transform takes a string of length n and creates a matrix with n rows by shifting this string one position to the left for each row. Then the rows are sorted by the first column in ...
ooorndtski's user avatar
1 vote
1 answer
1k views

I need to compress an ID for marketing campaigns. The current campaign ID is a 32-bit integer, but obviously this is too long for a customer to type by hand. I would like to compress this to minimum ...
user594883's user avatar
1 vote
0 answers
326 views

In Run Length Encoding (RLE), a large set of information is encoded by storing the quantity of consecutive sequences. A canonical example is: ...
Matt Johnson-Pint's user avatar
3 votes
2 answers
277 views

I am looking for an algorithm or idea for the following problem. Suppose we have a data type, say a 64-bit integer. Now we have a relatively small set of such items, say a few hundred at most. The simplest ...
haael's user avatar
  • 133
4 votes
1 answer
447 views

Note: this question has been re-written to simplify and generalize the problem. The original is available below. Suppose I created a simple compression scheme for lists of 2-digit numbers. It has 2 ...
Duke Nukem's user avatar
1 vote
0 answers
2k views

I am writing a Python program which parses zip (currently only zlib, using DEFLATE compression) files and verifies the correctness of their headers and data. One of the things I'm trying to achieve is ...
S B's user avatar
  • 11
2 votes
2 answers
4k views

I have an application running behind a proxy, both on the same machine. Which approach is better suited regarding compression, while preserving reasonable performance? Turn on compression at the ...
redben's user avatar
  • 263
2 votes
2 answers
2k views

When using Golomb/Rice codes in image compression, it is inevitable that we will meet large values. Golomb coding uses a tunable parameter M to divide an input value N into two parts: q, the result of a ...
dongbao wu's user avatar
1 vote
4 answers
228 views

Recently I was looking for a program that will run as a daemon and find files that have the same size/type, check if they're the same, then make both a hard link to a single copy if they are. And I ...
Patrick Jeeves's user avatar
2 votes
2 answers
2k views

I've recently come across an application by Yahoo called SmushIt. Apparently it does lossless compression on images. Sometimes the image size is reduced by as much as 90%. This of course has major ...
Alternatex's user avatar
  • 1,031
5 votes
2 answers
11k views

Let's start out with an example [1,1,1,5,3,1,1,2,78,2,3,1,1,...,1] As you can tell by the example, 1 is repeated a lot, but there will be outliers (like 78, and really anything that isn't 1). The ...
Michael King's user avatar
1 vote
2 answers
2k views

I have a question regarding compression and the calculation of a checksum/hash of data. I would like to know whether the checksum has to be calculated before or after the data is compressed for transmission. ...
redDragon's user avatar
  • 105
-1 votes
2 answers
410 views

Here are some examples of 5x5 magic squares found by some good solvers: Magic Square Generator by Marcel Roos; this program states that, using a 2.4 GHz Intel, it takes about 95 hours to generate all solutions. ...
Fereydoon Shekofte's user avatar