0

What is the best way to convert a char array (containing bytes from a file) into a decimal representation so that it can be converted back later?

E.g. "test" -> 18951210 -> "test".

EDITED

2
  • 18951210 is not the representation of "test" Commented Nov 8, 2011 at 21:16
  • I know; I was giving it as an example of what I want. Commented Nov 8, 2011 at 21:19

6 Answers

2

It can't be done without a bignum class, since there are more possible letter combinations than there are values in an unsigned long long. (A 64-bit unsigned long long holds at most 8 characters.)

If you have some sort of bignum class:

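#include <climits>  // UCHAR_MAX
#include <string>

// biguint is a placeholder for whatever arbitrary-precision unsigned type you have.
biguint string_to_biguint(const std::string& s) {
    biguint result(0);
    for (std::size_t i = 0; i < s.length(); ++i) {
        result *= UCHAR_MAX + 1;  // base 256: one "digit" per byte
        result += (unsigned char)s[i];
    }
    return result;
}
std::string biguint_to_string(biguint u) {  // by value, since u is consumed below
    std::string result;
    do {
        // Prepend the lowest byte; assumes biguint converts to a built-in integer.
        result.insert(result.begin(), (char)(unsigned char)(u % (UCHAR_MAX + 1)));
        u /= UCHAR_MAX + 1;
    } while (u > 0);
    return result;
}

Note: leading NUL bytes are lost in the round trip, because they contribute nothing to the number. An empty string comes back as a single NUL byte. Store the length separately if that matters.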
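If no bignum class is at hand, the same base-256 idea can be sketched with plain digit-vector arithmetic. The function names bytes_to_decimal and decimal_to_bytes are made up for this illustration, and the decoder assumes its input contains only the characters '0' to '9'. Treat it as a minimal sketch rather than a tuned implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: read the bytes as one big base-256 number
// (most significant byte first) and return it as a decimal string.
std::string bytes_to_decimal(const std::string& bytes) {
    std::vector<int> digits(1, 0);  // decimal digits, least significant first
    for (unsigned char byte : bytes) {
        int carry = byte;
        // digits = digits * 256 + byte
        for (int& d : digits) {
            int v = d * 256 + carry;
            d = v % 10;
            carry = v / 10;
        }
        while (carry > 0) {
            digits.push_back(carry % 10);
            carry /= 10;
        }
    }
    std::string out;
    for (auto it = digits.rbegin(); it != digits.rend(); ++it)
        out.push_back(static_cast<char>('0' + *it));
    return out;
}

// Hypothetical helper: the inverse conversion.
// Assumes `decimal` holds only the characters '0'..'9'.
std::string decimal_to_bytes(const std::string& decimal) {
    std::vector<int> bytes(1, 0);  // base-256 digits, least significant first
    for (char c : decimal) {
        int carry = c - '0';
        // bytes = bytes * 10 + digit
        for (int& b : bytes) {
            int v = b * 10 + carry;
            b = v % 256;
            carry = v / 256;
        }
        while (carry > 0) {
            bytes.push_back(carry % 256);
            carry /= 256;
        }
    }
    std::string out;
    for (auto it = bytes.rbegin(); it != bytes.rend(); ++it)
        out.push_back(static_cast<char>(*it));
    return out;
}
```

With these, bytes_to_decimal("test") gives "1952805748", and decimal_to_bytes("1952805748") gives "test" again. Leading NUL bytes are still lost, for the same reason as with a bignum class.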


2 Comments

Thanks, exactly what I needed. BTW, you probably forgot to change the second function to "convert_to_string".
Yes. I did the old copy/paste/forget-to-change.
1

I'm not sure what exactly you mean, but characters are stored in memory as their "representation", so you don't need to convert anything. If you still want to, you have to be more specific.

EDIT: You can

  • Try to read byte by byte, shifting the result 8 bits left and OR-ing it with the next byte.
  • Try to use mpz_inp_raw
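For inputs of at most 8 bytes, the shift-and-OR approach fits in a built-in 64-bit integer. The sketch below uses made-up names (pack_bytes, unpack_bytes), and the caller has to remember the original length, since the integer alone does not record leading zero bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: pack up to 8 bytes into one integer,
// most significant byte first.
std::uint64_t pack_bytes(const std::string& s) {
    if (s.size() > sizeof(std::uint64_t))
        throw std::length_error("at most 8 bytes fit in a 64-bit integer");
    std::uint64_t value = 0;
    for (unsigned char byte : s)
        value = (value << 8) | byte;  // shift left 8 bits, OR in the next byte
    return value;
}

// Hypothetical helper: the inverse. `length` is the original byte count.
std::string unpack_bytes(std::uint64_t value, std::size_t length) {
    std::string s(length, '\0');
    for (std::size_t i = length; i-- > 0; ) {
        s[i] = static_cast<char>(value & 0xFF);  // lowest byte goes last
        value >>= 8;
    }
    return s;
}
```

For example, pack_bytes("test") yields 1952805748, and unpack_bytes(1952805748, 4) yields "test". Anything longer than 8 bytes needs an arbitrary-precision type such as mpz_t.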

6 Comments

Sorry didn't really explain it well. I mean to have a decimal representation of a string.
I think I almost understand. Well, you can character by character multiply your mpz by 256 (or shift 8 bits, there's something like _mul_2exp or whatever) and add the next char?
I can still use mpz_t so I'll try it out. You can edit your answer and I'll accept ;)
Alternatively, it's likely that mpz_inp_raw from the file will do what you want.
I'll use mpz_inp_raw for now, but Mooing Duck's will be better for later. Thanks anyway ;)
0

You can use a tree similar to the one built by the Huffman compression algorithm, and then represent the path in the tree as numbers.

You'll have to keep the dictionary somewhere, but you can just create a constant dictionary that covers the whole ASCII table, since the compression is not the goal here.


0

There is no conversion needed. You can just use pointers.

Example:

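char array[4 * NUMBER];
int *pointer = (int *)array;  // view the same bytes as ints (assumes sizeof(int) == 4)

Keep in mind that the "length" of pointer is NUMBER, i.e. it addresses NUMBER ints rather than 4 * NUMBER chars.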
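Reading a char buffer through an int pointer can run into alignment and aliasing problems in C++, so a copy with std::memcpy is a commonly used alternative. The helper names load_u32 and store_u32 below are invented for this sketch. The numeric value you get depends on the machine's byte order, but the round trip on one machine returns the original bytes.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy 4 raw bytes into a 32-bit integer.
// The resulting value depends on the platform's endianness.
std::uint32_t load_u32(const char* bytes) {
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Hypothetical helper: copy the integer's 4 bytes back into a buffer.
void store_u32(std::uint32_t value, char* bytes) {
    std::memcpy(bytes, &value, sizeof value);
}
```

Calling store_u32(load_u32("test"), buffer) with a 4-byte buffer reproduces the bytes 't', 'e', 's', 't'.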


0

As mentioned, character strings are already ranges of bytes (and hence easily rendered as decimal numbers) to start with. Number your bytes from 000 to 255 and string them together and you've got a decimal number, for whatever that is worth. It would help if you explained exactly why you would want to be using decimal numbers, specifically, as hex would be easier.

If you care about compression of the underlying arrays forming these numbers for Unicode Strings, you might be interested in:

http://en.wikipedia.org/wiki/Standard_Compression_Scheme_for_Unicode

If you want some benefits of compression but still want fast random-access reads and writes within a "packed" number, you might find my "NSTATE" library to be interesting:

http://hostilefork.com/nstate/

For instance, if you just wanted a representation that only accommodated the 26 English letters, you could store "test" in:

NstateArray<26> myString (4);

You could read and write the letters without going through a compression or decompression process, in a smaller range of numbers than a conventional string. Works with any radix.


0

Assuming you want to store the integers (which I'm reading as ASCII codes) in a string, this adds the leading zeros you need to get back to the original string. A character is a byte with a maximum value of 255, so it needs three digits in numeric form. It can be done without the STL fairly easily too, but why not use the tools you have?

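#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

using namespace std;

char array[] = "test";

int main()
{
   stringstream out;
   string s = array;

   out.fill('0');
   for (size_t i = 0; i < s.size(); ++i)
   {
      // setw applies to the next output only, so set it for every value;
      // go through unsigned char so bytes above 127 do not print as negative
      out << setw(3) << (int)(unsigned char)s[i];
   }
   cout << s << " -> " << out.str();
   return 0;
}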

output:

test -> 116101115116

Added:

change line to

out << (int)s[i] << ",";

output

test -> 116,101,115,116,
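Going back from the fixed-width form is a matter of reading three digits at a time. The sketch below pairs an encoder with a decoder. The names encode_fixed_width and decode_fixed_width are invented here, and the decoder assumes its input contains only decimal digits.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: write each byte as exactly three decimal digits.
std::string encode_fixed_width(const std::string& s) {
    std::ostringstream out;
    out << std::setfill('0');
    for (unsigned char byte : s)
        out << std::setw(3) << static_cast<int>(byte);  // setw covers one output only
    return out.str();
}

// Hypothetical helper: the inverse. Assumes `digits` holds only '0'..'9'.
std::string decode_fixed_width(const std::string& digits) {
    if (digits.size() % 3 != 0)
        throw std::invalid_argument("length must be a multiple of 3");
    std::string s;
    for (std::size_t i = 0; i < digits.size(); i += 3) {
        int value = std::stoi(digits.substr(i, 3));
        if (value > 255)
            throw std::out_of_range("group does not fit in one byte");
        s.push_back(static_cast<char>(value));
    }
    return s;
}
```

Here encode_fixed_width("test") gives "116101115116" and decode_fixed_width("116101115116") gives "test". A newline character encodes as "010", which is where the padding matters.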

2 Comments

This is just appending each char's value to a string, so I can't know whether this is 116,101,115,116 or 11,61,011,15,116...
Each ASCII code for a character is exactly 3 digits long (padded with zeros). You can easily add a comma if you want: out << (int)s[i] << ","; and get rid of out.width(3) if you don't want them all to be 3 digits. Most readable characters will be 3 digits anyway, and assuming all of them are takes less space than adding commas.
