2

Is there a way to embed a (per translation unit) string into an optimized binary?

In our development environment, it is fairly simple to produce a build. It happens that customers sometimes end up with a custom build, and when they experience a problem with that build, it's difficult to figure out what code base that binary came from.

I'm attempting to find a reasonable strategy to be able to reconstruct the build environment when only a forensic analysis of the binaries is possible.

Embed version info into each translation unit

One idea I had was to embed version info into each cpp file. Something like

static const char *__FILEVERSIONINFO = __FILE__ "___$Revision$___Build:" PRODUCT_VER_STR;

and then add svn:keywords property to the file so that when the build was made, I could at least know what SVN revision the file was at. This wouldn't tell me if it had been modified from that revision, but it would at least enable me to get close.

The downside of this is that I cannot get these symbols into a Release build. Because they're not referenced, they get optimized away (and I want to keep that optimization as a global setting). I attempted to use Force Symbol References (with __FILEVERSIONINFO as the value for "Force Symbol References" in the linker settings), but the linker complained that it couldn't find the symbol.

error LNK2001: unresolved external symbol __FILEVERSIONINFO

dumpbin listed the symbol in the object file as

09E 00000004 SECT5  notype       Static       | ___FILEVERSIONINFO

but using this name in "Force Symbol References" didn't work either.

Microsoft's documentation doesn't actually give any examples of its use, and my google-fu didn't turn up anything pertinent.

I tried using a #pragma to force its export,

__pragma(comment (linker, "/export:__FILEVERSIONINFO"))

but got the same error from the linker. How do I specify the name of this symbol?

It would be convenient if I didn't have to randomize/uniquify the name of the symbol for each translation unit, but I don't know if the /INCLUDE argument to the linker is able to include all symbols with the same name.

I don't actually care if the symbol itself is accessible, I was just wanting to be able to suck the strings out of a binary and grep for "Revision" to see what revision the binary files were at/near at the time of the build.
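To make the "suck the strings out and grep" step concrete, here is a minimal sketch of a scanner that collects runs of printable ASCII from a binary and keeps only those containing a marker such as "Revision". The names find_tagged_strings and scan_file, and the minimum run length of 4, are my own assumptions for illustration; narrow string literals are assumed, so UTF-16 text would not be found.

```cpp
#include <cctype>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Collect runs of printable ASCII that are at least min_len characters long
// and contain `needle` (similar in spirit to `strings file | grep needle`).
// Hypothetical helper, not part of any existing tool.
std::vector<std::string> find_tagged_strings(const std::string& data,
                                             const std::string& needle,
                                             std::size_t min_len = 4) {
    std::vector<std::string> hits;
    std::string run;
    auto flush = [&]() {
        if (run.size() >= min_len && run.find(needle) != std::string::npos)
            hits.push_back(run);
        run.clear();
    };
    for (unsigned char c : data) {
        if (std::isprint(c))
            run.push_back(static_cast<char>(c));
        else
            flush();  // a non-printable byte ends the current run
    }
    flush();  // handle a run that reaches the end of the data
    return hits;
}

// Read a whole file in binary mode and scan it; returns no hits if the file
// cannot be opened.
std::vector<std::string> scan_file(const std::string& path,
                                   const std::string& needle) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    return find_tagged_strings(data, needle);
}
```

Calling scan_file on the executable or DLL with "Revision" as the needle would then list every surviving per-file version string, provided the strings made it into the binary in the first place.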

Embed a report into the binary

It seems possible to construct a report (e.g. svn diff) and embed some form of that into the binary. This would have the advantage of encapsulating any uncommitted, source-level changes made to the files/project.

This seems a little tedious, but it could be done.

Other solutions?

Changing how the developers/support people do builds is not a feasible option for us. I'm looking for something I can do that will not depend on them reading more documentation and following procedural rules. I realize this is not ideal, but adding more human-driven policies just isn't worth the cost.

For all practical purposes that I foresee, I'm probably only going to use this information to look for problems in a specific revision of a source file, not actually attempt to patch a custom build.

Are there better ways to tag/mark binaries that will enable me to locate a close revision of the source file(s) in question for examination?

  • softwareengineering.stackexchange.com is probably a better fit for this question. Commented Jul 26, 2017 at 15:12
  • Forcing the inclusion of symbols is the answer I really want, whether or not it's a great software engineering solution. I'm open to other ideas, but that's only an expression of humility, not my primary focus. Commented Jul 26, 2017 at 15:20
  • @RSahu when referring to other sites, it is often helpful to point out that cross-posting is frowned upon. Commented Jul 26, 2017 at 15:26
  • Why static? That has internal linkage. Commented Jul 26, 2017 at 15:33
  • @KlitosKyriacou I didn't want the symbol name to have to be globally unique, so I thought making it static would allow multiple translation units to have the same symbol with different values. Commented Jul 26, 2017 at 15:35

2 Answers

3

The problem is the keyword static. That gives the variable internal linkage, which means “This will only be used in this file; feel free to leave it out of the compiled object file or otherwise optimize it”. If you remove that keyword, the variable gets external linkage, the compiler will include it in the object file, and you can use Force Symbol References to keep it.

EDIT: Incorrect reference to extern, which is for declarations, not definitions.

EDIT 2: Another way to force the symbol reference, which works for other compilers and might be more useful, is to actually reference it. Add a command line flag --build-info which prints the string and exits, or print it on startup if you have the verbose flag set, or something like that. Then you don’t need to inspect the binary and you know the string won’t be optimized away.


2 Comments

When I don't declare it static, I still get linker errors attempting to export/include the name. I don't understand why internal linkage would suggest that the compiler should leave it out of the object file, though. That would be like declaring something as "completely useless and unnecessary."
Internal linkage means that nobody else will use it. Then you don’t use it. If you don’t use something, and nobody else uses it, it is unused and is just adding bloat to the object file. In this case, that’s what you want, but usually it can just be optimized out. I can’t reproduce the linker error you mention; can you give more details about it?
1

What about SVN tags? You could tag each build.


UPDATE: Maybe do something like this:

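// Declare an identification string that should survive optimization.
#ifndef SRC_ID
  #ifdef __GNUC__
    // GCC-compatible compilers: "used" forces the variable to be emitted
    // even if nothing references it.
    #define SRC_ID(X, Y) static const char* const __attribute__((used)) X = Y
  #else
    // Other compilers: no attribute applied, so the variable may be dropped.
    #define SRC_ID(X, Y) static const char* const X = Y
  #endif
#endif

SRC_ID(build_id, "SOMETHING TO IDENTIFY YOUR BUILD");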

The __attribute__((used)) variable attribute tells the compiler that this variable is used even if it appears to be unused:

This attribute, attached to a variable with static storage, means that the variable must be emitted even if it appears that the variable is not referenced.

Source: gnu.org

4 Comments

Our builds are not all "official" builds. Anyone can invoke the build script multiple times a day, and my goal is to include enough info in "unofficial" builds to go back to a set of close-enough source files.
__attribute__((used)) doesn’t say that the attribute is used (it is the attribute); it says the variable is used. But GCC provides almost enough information by itself anyway.
Does MSVC use __attribute__ like gcc? I'm stuck with MSVC in this case.
I haven't found anything like it for MSVC in the last 20 minutes.. :(
