Contextualisation
I am writing a program that reads data from a sensor and then does something with it. Currently I want the data to be sent to a server. I have two processes that communicate through sockets: one reads the data and stores it in a temporary file, and the other reads the temporary file and sends the data to the server.
Problem
The problem has never actually shown up in testing. However, I have realised that if the sampling frequency is high, it is quite possible for both processes to try to read from and write to the file at the same time (not that they request it at exactly the same instant, but that one tries to open it before the other has closed it).
Even if this does not raise an error (from what I read online, some operating systems do not lock the file), the reader may see a partially written or outdated version of the file, leading to lost pieces of data. For this reason, this way of handling the data does not look very appropriate.
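To make the concern concrete, here is a minimal sketch of one common way to keep a reader from ever seeing a half-written file: write to a temporary file and then rename it over the real one. The function names, the JSON format and the file name are hypothetical and not part of my current code; `os.replace` is documented as atomic on POSIX systems, and I am assuming both files live on the same filesystem.

```python
import json
import os
import tempfile


def write_snapshot(path, samples):
    """Write samples so a reader sees either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(samples, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Do not leave a stray temporary file behind if anything failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_snapshot(path):
    """Read back whatever complete snapshot is currently at path."""
    with open(path) as f:
        return json.load(f)
```

This only protects whole-file snapshots; it does not give the append-at-the-end, remove-from-the-front behaviour I describe below.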
My own idea/approach
I thought of using a file-like object in memory (a data buffer). I have no experience with this concept in Python, so I have researched a bit and I understand that [a buffer] is like a file that is kept in memory while the program is executing and that has properties very similar to those of a standard system file. I thought it might be a good idea to use it; however, I could not find a solution to some of these inconveniences:
Since it is still like a file (a file-like object), could it not be the case that if the two processes coincide in their operations on the object, the same kind of inconsistency errors/bugs could arise? I only need to append data at the end with one process and remove data from the beginning with the other (as some sort of queue). Does this Python functionality permit this, and if so, which methods exactly should I look into in the docs?
Following the explanation above, I thought about literally using queues; however, this might be inefficient in terms of execution time (appending to a list is rather fast, but appending to a pandas object is around 1000 times slower according to a test I did on my own machine to see which object type would fit best). Is there an object, if not a file-like one, that lets me do this and is efficient? I know efficiency is subjective, so let's say 100 appends per second with no noticeable lag (timestamps are important in this case).
Since I am using two different processes and these do not share memory in Python, is it still possible to point to the same memory address while operating on the file-like object? I communicate between them with sockets, as I said, but as far as I know that method passes data by value, not by reference; so this looks like a serious problem to me (maybe it is necessary to merge them into two threads of one process instead of separate Python processes?)
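To illustrate the "merge them into two threads" option from the last point, here is a minimal sketch using the standard library's `queue.Queue`, which is thread-safe and first-in, first-out. The function names, the fake sensor values and the `SENTINEL` marker are all hypothetical stand-ins; the real reading and sending code would go where the comments indicate.

```python
import queue
import threading
import time

SENTINEL = None  # hypothetical marker meaning "no more data"


def read_sensor(buffer, n_samples):
    """Producer: stands in for the sensor-reading code."""
    for i in range(n_samples):
        sample = (time.time(), float(i))  # (timestamp, value)
        buffer.put(sample)                # append at the end of the queue
    buffer.put(SENTINEL)


def send_to_server(buffer, sent):
    """Consumer: stands in for the code that uploads to the server."""
    while True:
        sample = buffer.get()  # blocks until an item exists, removes it from the front
        if sample is SENTINEL:
            break
        sent.append(sample)    # a real version would send over the network here


def run(n_samples=100):
    buffer = queue.Queue()
    sent = []
    producer = threading.Thread(target=read_sensor, args=(buffer, n_samples))
    consumer = threading.Thread(target=send_to_server, args=(buffer, sent))
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    return sent
```

If the two programs had to stay separate processes, `multiprocessing.Queue` offers a similar `put`/`get` interface, but as far as I understand it only works between processes started through the `multiprocessing` module, not between two independently launched scripts.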
Please comment if you need any other detail; I will be very happy to answer.
Edits: questions asked in comments:
How are you creating these processes? Through a Python module like multiprocessing or subprocess, or some other way?
I am running them as two completely separate programs. Each has a different main Python file that is called by a shell script; however, I am open to changing this behaviour if needed.
On the other hand, the process that reads the data from the sensors has two threads: one that actually reads the data, and another that listens for socket requests.
What type of data are you getting from the sensor and sending to the server?
I am generally sending tables that contain floats; however, sensors may also produce video streams or other sorts of data structures.
Misconception of Queue | pandas
I know a queue has nothing to do with a dataframe; I am just saying I tried to use a dataframe and it did not perform well, because it is designed to pre-allocate the memory space it needs (if I am right). I am just expressing my concerns about the performance of the solution.
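For reference, this is roughly the kind of timing test I mean, sketched here with `collections.deque` as the container instead of a dataframe. The function name and the sample layout (a timestamp paired with a float) are hypothetical, and the numbers will differ from machine to machine, so I am not quoting any output.

```python
import time
from collections import deque


def time_appends(n):
    """Append n (timestamp, value) samples to a deque and time the loop."""
    buffer = deque()
    start = time.perf_counter()
    for i in range(n):
        buffer.append((time.time(), float(i)))  # append at the right end
    elapsed = time.perf_counter() - start
    return buffer, elapsed


buffer, elapsed = time_appends(100_000)
print(f"{len(buffer)} appends took {elapsed:.4f} s")

# Removing from the front is the other half of queue-like use.
oldest = buffer.popleft()
```

The idea would be to collect samples in a cheap container like this and only build a table once, when a batch is ready to be sent.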
[...] read or write and line-by-line iteration), regardless of the underlying mechanism behind the object. It could be an in-memory storage like io.StringIO, or it could be an ordinary file, or it could be a pipe or a wrapper around a socket or plenty of other things.
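A minimal sketch of what that comment describes, assuming comma-separated text lines as the sample format (my own choice for illustration): the helper `count_lines` is hypothetical and only relies on line-by-line iteration, so it does not care whether it receives an `io.StringIO` or an ordinary open file.

```python
import io


def count_lines(stream):
    """Works on any file-like object that supports line-by-line iteration."""
    return sum(1 for _ in stream)


buffer = io.StringIO()       # in-memory text buffer with a file-like interface
buffer.write("0.00,1.5\n")   # writing moves the position to the end
buffer.write("0.01,1.6\n")

buffer.seek(0)               # rewind before reading, as with a real file
first = buffer.readline()

buffer.seek(0)
n = count_lines(buffer)
```

Note that the buffer lives inside a single process, so by itself it does not answer my question about sharing data between two separate programs.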