Tip: in Matlab, when your code is slow, the usual fix is to move work out of your loops and call functions once on whole vectors instead. The improvement is often dramatic.
Normally I don’t post much about work, but I just had a great moment and had to write about it. It all started while I was making a script to merge data from two different formats and remove the overlapping datapoints. It was working, but was way too slow – it took over two minutes to complete, and you know how much I hate waiting.
So like I normally do when I want to speed up my code, I opened up the profile viewer tool. It told me that waitbar and datevec were taking over 50 seconds each, which I found to be totally unacceptable.
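For anyone who hasn’t used the profiler, a minimal sketch of that workflow might look like the following. The dates and the loop here are made-up stand-ins for the real script, not the actual merge code:

```matlab
% Hypothetical stand-in data: 1000 consecutive daily date numbers.
d = datenum(2010, 1, 1) + (0:999)';

profile on                          % start collecting timing data
for i = 1:numel(d)
    v = datevec(d(i));              % the kind of per-row call being measured
end
profile off                         % stop collecting

p = profile('info');                % timing results as a struct
% profile viewer                    % uncomment to open the interactive report

% Print the (up to) three functions with the largest total time.
[~, idx] = sort([p.FunctionTable.TotalTime], 'descend');
for k = idx(1:min(3, numel(idx)))
    fprintf('%-30s %8.3f s\n', p.FunctionTable(k).FunctionName, ...
        p.FunctionTable(k).TotalTime);
end
```

The interactive viewer shows the same numbers with per-line detail, which is usually the quicker way to spot a hot loop.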
The fix: I dropped the waitbar call from the loop, and instead of using datevec line by line, I could use it only once on the whole vector, then reference full columns of the output. That would do everything at once, for all 600k entries. Speed is the whole point of using Matlab in the first place, so why not take advantage of it?
Imagine how happy you would be if you found out you not only gained the ability to brush your teeth in 1/100th the time, but could also schedule a robot to do it in your place – that’s how I feel when I fix stuff like this.
My code went from this:
for i = 1:n
    h = waitbar(0);                  % waitbar call on every iteration
    x = datevec(TS.datenum(i));      % converts one date at a time
    X.year(i) = uint16(x(1));
    X.month(i) = uint8(x(2));
    X.day(i) = uint8(x(3));
    totalvolume = totalvolume + TS.volume(i);
    X.totalvolume(i) = totalvolume;
    X.volume(i) = TS.volume(i);
end;
Run time: 157 seconds
To this:
a = datevec(TS.datenum);
% datevec insta-runs for 600k entries!
X.year = uint16(a(:,1));
X.month = uint8(a(:,2));
X.day = uint8(a(:,3));
for i = 1:n
    totalvolume = totalvolume + TS.volume(i);
    X.totalvolume(i) = totalvolume;
end;
Run time: 1.331 seconds
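The loop that remains is just a running total, so it could probably go too. Here is a minimal sketch, assuming totalvolume starts at 0 and using made-up volumes in place of the real TS data. This variant was not part of the timings above:

```matlab
% Hypothetical stand-in data; the real TS struct comes from the merge script.
TS.volume = [10; 20; 5; 40];
n = numel(TS.volume);

% Loop version, as in the code above (assuming totalvolume starts at 0).
totalvolume = 0;
loopTotal = zeros(n, 1);
for i = 1:n
    totalvolume = totalvolume + TS.volume(i);
    loopTotal(i) = totalvolume;
end

% Vectorized version: cumsum computes the running total in a single call.
X.totalvolume = cumsum(TS.volume);

% Both approaches should agree on the same data.
assert(isequal(loopTotal, X.totalvolume));
```

If totalvolume starts from a nonzero value, adding that starting value to the cumsum result should give the same answer as the loop.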
It’s finding dramatic improvements like these that makes me like my job.