An ordinary programmer from Marburg made apt-get update more than 10 times faster!
The results are impressive:
For APT 1.1.6, updating with PDiffs enabled took 41 seconds.
For APT 1.1.7, updating with PDiffs enabled took 4 seconds.
The author had to rack his brains before a rough architectural plan took shape. A series of complex tests was carried out to expose hidden flaws in the current implementation. The patch was also prepared using the very latest techniques and approaches, which undoubtedly speaks to the team's high level of professionalism.
The reason for this is that our I/O is unbuffered, and we were reading one byte at a time in order to read lines. This changed on December 24, by adding read buffering for reading lines, vastly improving the performance of rred.
But it was still slow, so today I profiled – using gperftools – the rred method running on a 430 MB uncompressed Contents file with a 75 KB patch. I noticed that our ReadLine() method was calling some method that took a long time (google-pprof told me it was some _nss method, but that was wrong [thank you, addr2line]).
After looking further into the code, I noticed that we set the length of the buffer using the length of the line. And whenever we moved some data out of the buffer, we called memmove() to move the remaining data to the front of the buffer.
So, I tried to use a fixed buffer size of 4096 (commit). Now memmove() would spend less time moving memory around inside the buffer. This helped a lot, bringing the run time on my example file down from 46 seconds to about 2 seconds.
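A rough way to see why the buffer size matters: the sketch below drains the same amount of input through a compacting buffer and counts how many bytes memmove() has to shift. The function name, the 64-byte "line" and the 65536-byte comparison size used in the usage note are assumptions made for illustration, not values taken from APT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Simulates draining `total` bytes of input through a buffer of `capacity`
// bytes, consuming `lineLen` bytes at a time and compacting with memmove()
// after every consume. Returns how many bytes memmove() had to shift.
std::size_t BytesMovedByCompaction(std::size_t capacity, std::size_t lineLen,
                                   std::size_t total) {
    assert(lineLen > 0 && lineLen <= capacity);
    std::vector<char> buf(capacity, 'x');
    std::size_t fill = 0;           // valid bytes currently in the buffer
    std::size_t remaining = total;  // input not yet loaded into the buffer
    std::size_t moved = 0;

    while (fill > 0 || remaining > 0) {
        // Refill: top the buffer up to capacity (stands in for a read()).
        const std::size_t got = std::min(capacity - fill, remaining);
        fill += got;
        remaining -= got;

        // Consume one "line" from the front, then shift the rest forward.
        const std::size_t take = std::min(lineLen, fill);
        std::memmove(buf.data(), buf.data() + take, fill - take);
        moved += fill - take;
        fill -= take;
    }
    return moved;
}
```

Calling `BytesMovedByCompaction(4096, 64, 65536)` and `BytesMovedByCompaction(65536, 64, 65536)` shows the smaller buffer shifting far fewer bytes in total, because each memmove() only ever has at most 4096 bytes behind it.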
Later on, I rewrote the code to not use memmove() at all – opting for start and end variables instead; and increasing the start variable when reading from the buffer (commit).
This in turn further improved things, bringing it down to about 1.6 seconds. We could now increase the buffer size again, without any negative effect.
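The start/end idea can be sketched roughly as follows. This is an assumption-laden illustration, not the actual APT commit: `IndexedBuffer`, `Fill`, `Consume` and `Available` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size buffer that tracks its valid region with two
// indices instead of shifting data to the front after every read.
class IndexedBuffer {
    static constexpr std::size_t Capacity = 4096;
    char data[Capacity];
    std::size_t start = 0;  // first unread byte
    std::size_t end = 0;    // one past the last valid byte

public:
    std::size_t Available() const { return end - start; }

    // Append up to `len` bytes from `src`; returns how many were stored.
    std::size_t Fill(const char *src, std::size_t len) {
        if (start == end)
            start = end = 0;  // buffer is empty: rewind without copying
        const std::size_t n = std::min(len, Capacity - end);
        std::memcpy(data + end, src, n);
        end += n;
        return n;
    }

    // Copy up to `len` bytes out and advance `start`; the remaining data
    // stays where it is, so no memmove() is needed.
    std::size_t Consume(char *out, std::size_t len) {
        const std::size_t n = std::min(len, Available());
        std::memcpy(out, data + start, n);
        start += n;
        assert(start <= end);
        return n;
    }
};
```

Since consuming only bumps an index, its cost no longer depends on how much data is still sitting in the buffer, which fits the observation that the buffer size could be raised again without a penalty.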
Meanwhile, you just keep on ***ing away at your computer science!