1
no  time  scores
1    10    123
2    11    22
3    12    22
4    50    55
5    60    22
6    70    66
.    .     .
.    .     .
n    n     n 

Above is the content of my txt file (thousands of lines).

1st column - number of samples
2nd column - time (cumulative, from beginning to end)
3rd column - scores

I want to create a new file that contains, for every three samples, the sum of their scores divided by the time difference between the last and first sample of that group.

e.g. (123+22+22)/ (12-10) = 167/2 = 83.5
     (55+22+66)/(70-50) = 143/20 = 7.15

new txt file

83.5
7.15
.
.
.
n

so far I have this code:

fid=fopen('data.txt')
data = textscan(fid,'%*d %d %d')
time = (data{1})
score= (data{2})
for sample=1:length(score)
     % ..... I'm stuck here ..
end
....
2
  • Can you guarantee n is a multiple of 3? Commented Jun 2, 2010 at 13:20
  • No. If fewer than 3 samples remain at the end (e.g. 1 or 2), they can simply be left out. Commented Jun 2, 2010 at 13:32

4 Answers

7

If you are feeling adventurous, here's a vectorized one-line solution using ACCUMARRAY (assuming you have already read the file into a matrix variable data, as the other answers show):

NUM = 3;
result = accumarray(reshape(repmat(1:size(data,1)/NUM,NUM,1),[],1),data(:,3)) ...
    ./ (data(NUM:NUM:end,2)-data(1:NUM:end,2))

Note that here the number of samples NUM=3 is a parameter and can be substituted by any other value.

Also, reading your comment above, if the number of samples is not a multiple of this number (3), then simply discard the remaining samples by doing this beforehand:

data = data(1:fix(size(data,1)/NUM)*NUM,:);

I'm sorry, here's a much simpler one :P

result  = sum(reshape(data(:,3), NUM, []), 1)' ./ (data(NUM:NUM:end,2)-data(1:NUM:end,2));
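
To make the two steps above concrete, here is a minimal sketch that trims the leftover rows and then applies the reshape version. It uses the six sample rows from the question plus one made-up seventh row (an assumption, added only to show the trimming), and the variable names nFull, groupSum and dt are illustrative.

```matlab
% Sample rows from the question, plus one hypothetical leftover row (row 7)
data = [1 10 123; 2 11 22; 3 12 22; 4 50 55; 5 60 22; 6 70 66; 7 80 5];
NUM = 3;

% Drop trailing rows that do not fill a complete group of NUM samples
nFull = floor(size(data,1)/NUM) * NUM;
data = data(1:nFull,:);

% One column per group, summed down each column, then turned into a column vector
groupSum = sum(reshape(data(:,3), NUM, []), 1)';

% Time of the last sample minus time of the first sample in each group
dt = data(NUM:NUM:end,2) - data(1:NUM:end,2);

result = groupSum ./ dt;
disp(result)
```

For the sample rows this gives 167/2 = 83.5 and 143/20 = 7.15, matching the values worked out in the question.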

1 Comment

+1: I've never come across accumarray before, much neater solution than mine. Thanks @Amro.
2
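%# Easier to load with importdata
file = importdata('data.txt',' ',1);
%# With a header line importdata usually returns a struct; the numbers are in .data
if isstruct(file), data = file.data; else, data = file; end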
%# Get the number of rows
n = size(data,1);
%# Column IDs
time = 2;score = 3;
%# The interval size (3 in your example)
interval = 3;
%# Pre-allocate space
new_data = zeros(numel(interval:interval:n),1);
%# For each new element in the new data
index = 1;
%# This will ignore elements past the closest (floor) multiple of 3 as requested
for i = interval:interval:n
    %# First and last elements in a batch
    a = i-interval+1;
    b = i;
    %# Compute the new data
    new_data(index) = sum( data(a:b,score) )/(data(b,time)-data(a,time));
    %# Increment
    index = index+1;
end
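
For anyone who would rather stay with the textscan approach from the question, the following is a rough sketch of the reading step only. It assumes the file layout shown in the question (one header line, whitespace-separated columns) and writes the sample rows to a temporary file so it can run on its own; the names fname and cols are made up for the example. Note that the %d specifier makes textscan return int32 values, and integer division in MATLAB rounds (167/2 would come out as 84), so the sketch reads the columns with %f instead.

```matlab
% Write the question's sample rows to a temporary file so the sketch is self-contained
fname = [tempname '.txt'];
rows = [1 10 123; 2 11 22; 3 12 22; 4 50 55; 5 60 22; 6 70 66];
fid = fopen(fname, 'w');
fprintf(fid, 'no  time  scores\n');
fprintf(fid, '%d    %d    %d\n', rows');   % transpose so values are written row by row
fclose(fid);

% Read it back: skip the header line and the sample-number column,
% and read time and scores as doubles
fid = fopen(fname, 'r');
cols = textscan(fid, '%*d %f %f', 'HeaderLines', 1);
fclose(fid);
delete(fname);

time  = cols{1};
score = cols{2};
```

From here, time and score can be fed to the loop above (or to any of the vectorized expressions) in place of the matrix columns.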

13 Comments

I wonder what the difference is between importdata and textscan?
In textscan you need to specify the formatting; importdata figures it out (most of the time).
I wonder if the "data" in file.data refers to the name of the file, data.txt?
I got this error when I implemented the code: "Attempt to reference field of non-structure array" :(
No, the data is the data component of the result returned by importdata. Can you just type file = importdata('data.txt');disp(file) and tell me what the output is?
0

For what it's worth, here is how you would go about doing that in Python. It is probably adaptable to MATLAB.

import numpy
no, time, scores = numpy.loadtxt('data.txt', skiprows=1).T

# Here I assume that n is a multiple of 3; otherwise you have to drop the leftover rows first
sums = scores[::3]+scores[1::3]+scores[2::3]
dt = time[2::3]-time[::3]

result = sums/dt


0

I suggest you use the importdata() function to get your data into your variable called data. Something like this:

data = importdata('data.txt',' ', 1)

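Replace ' ' with the delimiter your file uses; the 1 tells MATLAB to ignore 1 header line. Depending on the file, importdata may return a struct rather than a plain matrix; in that case take the numeric matrix from its data field first. Then, to compute your results, try this statement: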

(data(1:3:end,3)+data(2:3:end,3)+data(3:3:end,3))./(data(3:3:end,2)-data(1:3:end,2))

This worked on your sample data and should work on your real data, provided the number of rows is a multiple of 3. If you work out why it works yourself, you'll learn some useful MATLAB.

Then use save() to write the results back to a file.
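
As a sketch of that last step, the snippet below writes one value per line with fprintf, which gives plain decimal output like the "new txt file" in the question. The output file name result.txt is an assumption, and the two values are taken from the question's worked example rather than computed here.

```matlab
result = [83.5; 7.15];          % values from the question's worked example
outFile = 'result.txt';         % hypothetical output file name

fid = fopen(outFile, 'w');
if fid == -1
    error('Could not open %s for writing.', outFile);
end
fprintf(fid, '%g\n', result);   % the format is reused for every element
fclose(fid);

% Alternative: save(outFile, 'result', '-ascii') also produces a text file,
% but writes the numbers in scientific notation.
```

The fprintf route is a little longer than save(), but it gives control over the number format.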

PS If you find yourself writing loops in Matlab you are probably doing something wrong.

3 Comments

@Mark: Loops aren't always bad. This has been discussed before on SO. Plus, this solution fixes the interval at 3, ( data(1:3:end,3) + data(2:3:end) + ... ).
@Jessy -- how big is your dataset ?
thousands of lines: 318687 lines
