2

I am working on a project that contains two servers: one written in Python, the other in C. To maximize the capacity of the servers, we defined a proprietary binary protocol that the two use to talk to each other.

The protocol is defined in a C header file as a set of C structs. So far I have used Vim substitutions to convert this file into Python code, but that means repeating the conversion by hand every time the protocol is modified.
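The substitution step can in principle be scripted. Here is a minimal sketch, assuming the header contains only flat structs built from a few fixed-width types and `char` arrays. The struct `login_request`, the `C_TO_FMT` table, and the little-endian, unpadded layout are illustrative assumptions, not the real protocol, and the output is a format string for the standard `struct` module rather than a construct definition.

```python
import re
import struct

# Hypothetical header text; the real protocol header is not shown here.
HEADER = """
struct login_request {
    uint32_t user_id;
    uint16_t flags;
    char name[16];
};
"""

# Assumed mapping from C field types to struct-module format codes.
C_TO_FMT = {"uint8_t": "B", "uint16_t": "H", "uint32_t": "I",
            "int32_t": "i", "char": "s"}

STRUCT_RE = re.compile(r"struct\s+(\w+)\s*\{(.*?)\}\s*;", re.S)
FIELD_RE = re.compile(r"(\w+)\s+(\w+)\s*(?:\[(\d+)\])?\s*;")


def header_to_formats(text, byte_order="<"):
    """Return {struct_name: (format_string, field_names)} for flat structs."""
    result = {}
    for name, body in STRUCT_RE.findall(text):
        fmt, fields = byte_order, []
        for ctype, field, count in FIELD_RE.findall(body):
            # An array length becomes the repeat count of the format code.
            fmt += (count or "") + C_TO_FMT[ctype]
            fields.append(field)
        result[name] = (fmt, fields)
    return result


fmt, fields = header_to_formats(HEADER)["login_request"]
packed = struct.pack(fmt, 7, 1, b"alice")
print(fmt, fields, len(packed))
```

Regular expressions like these break on nested structs, pointers, comments, and preprocessor macros, and arrays of non-`char` types would need extra handling, which is why a real parser is attractive.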

So I believe a parser that can read the C header file would be a better choice. However, there are at least a dozen Python parser generators, and I don't know which one is most suitable for this particular task.

Any suggestions? Thanks a lot.


EDIT:

Of course, I am not asking anyone to write the code for me.

The code is already finished. I converted the header file into Python code in a form that construct, a Python library for parsing binary data, can recognize.

I am also not looking for an existing C parser. I am asking because a book I am reading talks a little about parser generators, which inspired me to learn how to use a real one.


EDIT Again:

When we designed the system, I suggested using Google Protocol Buffers, ZeroC ICE, or some other multi-language network middleware so that we would not have to implement a protocol ourselves.

However, not every programmer can read English documentation or is willing to try new things, especially when they have plenty of experience doing it the old way, which is simple if a little clumsy.

4
  • So the question is: (1) recommend a python parser generator to work on C struct definitions and (2) write me the code? Just trying to clarify what you want. Commented Oct 19, 2010 at 14:24
  • possible duplicate of Extract the fields of a C struct Commented Oct 19, 2010 at 14:27
  • 1
    Note that that's exactly the use case that prompted Google to create protocol buffers (code.google.com/p/protobuf), which had Python, Java and C++ generators from the start. I think there are C bindings now too. Commented Oct 19, 2010 at 14:30
  • I doubt that parsing binary formats in python would "maximize the capacity" Commented Oct 19, 2010 at 15:23

4 Answers

3

An alternative that might feel a bit over-ambitious at first, but could serve you very well in the long term:

  • Redefine the protocol in some higher-level language, for instance some custom XML
  • Generate both the C struct definitions and any required Python versions from the same source.
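The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: a Python dictionary stands in for the custom XML, the message `login_request` and the `TYPES` table are made up, and the C side is assumed to accept `#pragma pack` so that its layout matches the unpadded little-endian format used on the Python side.

```python
import struct

# Hypothetical protocol description; message and field names are made up.
# Each field is (name, kind, count), where kind is a key of TYPES.
PROTOCOL = {
    "login_request": [
        ("user_id", "u32", 1),
        ("flags", "u16", 1),
        ("name", "char", 16),
    ],
}

# kind -> (C type, struct-module format code)
TYPES = {"u8": ("uint8_t", "B"), "u16": ("uint16_t", "H"),
         "u32": ("uint32_t", "I"), "char": ("char", "s")}


def emit_c(protocol):
    """Render the C struct definitions as header text."""
    # Assumption: the compiler honours #pragma pack, so no padding is added.
    lines = ["#include <stdint.h>", "#pragma pack(push, 1)"]
    for msg, fields in protocol.items():
        lines.append(f"struct {msg} {{")
        for name, kind, count in fields:
            suffix = f"[{count}]" if count > 1 else ""
            lines.append(f"    {TYPES[kind][0]} {name}{suffix};")
        lines.append("};")
    lines.append("#pragma pack(pop)")
    return "\n".join(lines)


def emit_python_format(fields, byte_order="<"):
    """Build a struct-module format string for one message."""
    fmt = byte_order
    for _name, kind, count in fields:
        fmt += (str(count) if count > 1 else "") + TYPES[kind][1]
    return fmt


c_header = emit_c(PROTOCOL)
py_format = emit_python_format(PROTOCOL["login_request"])
print(c_header)
print(py_format, struct.calcsize(py_format))
```

Because both outputs come from `PROTOCOL`, a protocol change means editing one description and regenerating, rather than converting the header by hand.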

1 Comment

that's exactly what protocol buffers do
1

I would personally use PLY:

http://www.dabeaz.com/ply/

And there is already a C parser written with PLY:

http://code.google.com/p/pycparser/
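As a rough sketch of what that looks like with pycparser (which must be installed separately), the snippet below collects struct fields from a header. The header text and the `StructCollector` and `type_name` names are hypothetical. pycparser expects already-preprocessed C, so the sample avoids `#include` lines, comments, and typedef'd types such as `uint32_t`.

```python
from pycparser import c_ast, c_parser

# Hypothetical, already-preprocessed header text.
HEADER = """
struct login_request {
    unsigned int user_id;
    unsigned short flags;
    char name[16];
};
"""


def type_name(node):
    """Render a field's declared type as text, e.g. 'char[16]'."""
    if isinstance(node, c_ast.ArrayDecl):
        return f"{type_name(node.type)}[{node.dim.value}]"
    if isinstance(node, c_ast.TypeDecl):
        return type_name(node.type)
    if isinstance(node, c_ast.IdentifierType):
        return " ".join(node.names)
    # Pointers, nested structs, etc. are not handled in this sketch.
    return type(node).__name__


class StructCollector(c_ast.NodeVisitor):
    """Collect {struct_name: [(field_name, type_text), ...]}."""

    def __init__(self):
        self.structs = {}

    def visit_Struct(self, node):
        if node.decls:  # skip forward declarations and bare references
            self.structs[node.name] = [
                (decl.name, type_name(decl.type)) for decl in node.decls
            ]


ast = c_parser.CParser().parse(HEADER)
collector = StructCollector()
collector.visit(ast)
print(collector.structs)
```

From the collected field list, emitting Python definitions for whichever binary-parsing library is in use is a matter of string formatting.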


1

If I were doing this, I would use IDL as the structure definition language. The main problem you will have with C structs is that C has pointers, particularly char* for strings. Using IDL restricts the data types and imposes some semantics.

Then you can do whatever you want. Most parser generators are going to have IDL as a sample grammar.


0

A C struct is unlikely to be portable enough to send between machines. Differences in endianness, word size, and compiler all change how the structure is mapped to bytes.

It would be better to use a properly portable binary format that is designed for communications.
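To illustrate the layout differences mentioned above, here is a small sketch using Python's standard `struct` module. The two-field message (a `uint8_t` followed by a `uint32_t`) is a made-up example. The `<` and `>` prefixes fix the byte order and disable padding, while `@` uses whatever the host's C compiler would use.

```python
import struct

# Hypothetical two-field message: one uint8_t followed by one uint32_t.
values = (1, 2)

little = struct.pack("<BI", *values)  # explicit little-endian, no padding
big = struct.pack(">BI", *values)     # explicit big-endian, no padding
native = struct.pack("@BI", *values)  # host byte order and host alignment

print(little.hex())  # 0102000000
print(big.hex())     # 0100000002
# The native length depends on the platform's alignment rules.
print(len(little), len(big), len(native))
```

The explicit formats produce the same bytes on every machine, which is the property a wire format needs. The native one may insert padding after the first byte and follows the host's byte order.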

2 Comments

My colleagues are well trained to write code to handle endian translation, word size padding, etc.
Then the endian-defined padding-defined binary format is what you should be coding against. And the C-struct is just a temporary representation of that.
