
Background:

I am using Clojure, on Java 7.0 on Ubuntu Linux 12.10, with an Ext4 file system.

Problem:

I have an arbitrary Clojure string. I would like to encode it into a valid filename.

Question:

What is the optimal way / what is a good builtin for doing this?

Note:

The encoded file name does not have to be human readable. I just need to be able to recover the original string from the filename.

EDIT:

Although if strings that are already valid file names get mapped to something human-readable (and close to the original value), that would be nice too.

Thanks!

EDIT:

encode: takes arbitrary string as input; creates valid filename as output

decode: takes file name from encode, recovers original string

  • Your question is incomplete... We don't know what "encode" means in this context. Commented Dec 8, 2012 at 6:46
  • @RobertHarvey: Thanks. Added an edit. Is it clear now? Commented Dec 8, 2012 at 6:48

1 Answer

If it doesn't need to be human-readable, just base64-encode it. That will remove any characters that are invalid in a file name.

http://richhickey.github.com/clojure-contrib/base64-api.html

If there still isn't a decoder in native Clojure, use a Java base64 decode function.
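A minimal sketch of the Java-interop route, under one loud assumption: it uses `java.util.Base64`, which only exists in Java 8 and later, so it would not run on the Java 7 mentioned in the question. The names `encode-filename` and `decode-filename` are made up for illustration. It picks the URL-safe alphabet (`-` and `_` in place of `+` and `/`), since `/` is not allowed in a file name on Linux.

```clojure
;; Assumes Java 8+ for java.util.Base64 (not available on Java 7).
(import '(java.util Base64))

(defn encode-filename
  "Hypothetical helper: string -> URL-safe base64 text without '=' padding."
  [^String s]
  (.encodeToString (.withoutPadding (Base64/getUrlEncoder))
                   (.getBytes s "UTF-8")))

(defn decode-filename
  "Hypothetical helper: recovers the original string from encode-filename's output."
  [^String fname]
  (String. (.decode (Base64/getUrlDecoder) fname) "UTF-8"))

;; Round trip on a string that contains '/'.
(decode-filename (encode-filename "a/b c?"))
```

Two caveats with this sketch: the empty string encodes to an empty name, which is not a valid file name, and long inputs can exceed the file system's name-length limit (255 bytes on ext4).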


4 Comments

That wouldn't quite work because base64 includes '/', so you would generate invalid filenames. You'd have to replace '/' by another character outside the base64 alphabet.
Then you'll need a Java function that can do base62 encoding, like this one.
lol, should I just drop down to base 16, and use the builtin functions for representing numbers in hex? :-)
You mean encode the string as two hex digits per byte? In most file systems there is a limit on how many characters you can use in a file name, and that will double the size of your string, whereas base62 should only increase the size slightly for ordinary text.
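For comparison, the hex idea from the comments can be sketched as follows, using only `format` and `Integer/parseInt`, both of which are available on Java 7. The names `hex-encode` and `hex-decode` are hypothetical. Every byte becomes two characters, so the output is exactly twice the UTF-8 byte length of the input.

```clojure
(defn hex-encode
  "Hypothetical helper: string -> two lowercase hex digits per UTF-8 byte."
  [^String s]
  (apply str (map #(format "%02x" (bit-and % 0xff))
                  (.getBytes s "UTF-8"))))

(defn hex-decode
  "Hypothetical helper: inverse of hex-encode."
  [^String h]
  (String. (byte-array
             (map #(unchecked-byte (Integer/parseInt (apply str %) 16))
                  (partition 2 h)))
           "UTF-8"))

;; Round trip; the encoded form contains only 0-9 and a-f.
(hex-decode (hex-encode "a/b"))
```

With a 255-byte name limit such as ext4's, this caps the original string at 127 bytes of UTF-8, which is the size trade-off raised in the last comment.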
