External sort compo?
Looks good Antoni. I'll have to read up on radix sort; I get the rough idea, though. Can't wait to see your FB one.
yetifoot: Here is my entry, and it's slower than your code.
I can't think of anything that would make it faster. I profiled it, and it spends 55% of its time putting data to the file. I guess my way of putting data at 255 different offsets in the file requires a lot of disk cache flushes, which makes it slow. The QB 1.1 version was fast because there were only 10 offsets to deal with.
Congratulations, yetifoot: unless someone else comes up with more code, you are the winner in the FB category! Big Grin

About radix sort: I posted a little demo of in-memory sorting at The Code Post that's faster than any implementation of quicksort I know of.
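For anyone reading along who hasn't met the algorithm, the idea behind an in-memory LSD radix sort on 32-bit keys can be sketched like this (a minimal Python illustration of the technique, not the actual demo posted at The Code Post):

```python
# Minimal LSD radix sort on unsigned 32-bit integers: four
# byte-wide passes (256 buckets), least significant byte first.
# Each pass is a stable counting sort, and that stability is
# what makes the byte-by-byte passes combine into a full sort.
def radix_sort_u32(keys):
    for shift in (0, 8, 16, 24):            # one pass per byte
        counts = [0] * 256
        for k in keys:                      # tally this byte
            counts[(k >> shift) & 0xFF] += 1
        offsets = [0] * 256                 # prefix sums -> bucket starts
        total = 0
        for b in range(256):
            offsets[b] = total
            total += counts[b]
        out = [0] * len(keys)
        for k in keys:                      # stable scatter into buckets
            b = (k >> shift) & 0xFF
            out[offsets[b]] = k
            offsets[b] += 1
        keys = out
    return keys
```

The tally/prefix-sum/scatter structure is the same one the disk version below uses, except there the buckets are 256 offsets in a temp file instead of slices of an array.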

'Antoni's entry for the FB category
'QBN Forums External Sort Challenge 5 -2006
option explicit

'uses radix sort
CONST regsize = 4
CONST passes = 3
CONST numkeys = 255
CONST maxi = 9999999

dim shared as uinteger ptr s1,s2
DIM SHARED cd(numkeys, passes) AS LONG
DIM SHARED c(numkeys) AS LONG, d(numkeys) AS LONG

dim shared filerecs,cnt

SUB acumbig
'prefix-sum the tallies into file offsets for every pass
dim i,i1,i2,j
FOR i = 0 TO passes
    i1 = 0
    FOR j = 0 TO numkeys
        i2 = cd(j, i): cd(j, i) = (i1*4)+1: i1 = i1 + i2
    NEXT
NEXT
? "A";
END SUB

FUNCTION getform
'read the next chunk, return the highest valid index in s1
    GET #1,,*s1,maxi+1
    Function = iif(eof(1), filerecs-cnt, maxi)
    cnt += maxi+1
END FUNCTION

SUB putform (digit)
'copy 255 mem buffers to 255 offsets of temp file
dim i
FOR i = 0 TO numkeys
    SEEK #2, cd(i, digit)
    PUT #2,,s2[d(i)],c(i) - d(i)
    cd(i, digit)=SEEK(2)
NEXT
END SUB

SUB radix (digit, max)
'scatter s1 into s2 by the given digit
dim p as ubyte ptr,j,t
p = cptr(ubyte ptr,s1) + digit
FOR j = 0 TO max
    t = *p
    s2[c(t)] = s1[j]
    c(t) += 1
    p += regsize
NEXT
END SUB

SUB tallybig (max)
'tally every digit of every key in one scan
dim p as ubyte ptr,j,k
p=cptr(ubyte ptr,s1)
FOR j = 0 TO max
    FOR k = 0 TO passes
        cd(*p, k) += 1
        p += 1
    NEXT
NEXT
END SUB

SUB tallysmall (max, digit)
'tally one digit, then prefix-sum into bucket start indexes
dim p as ubyte ptr,t,j,i1,i2
p=cptr(ubyte ptr,s1)+digit
erase c
FOR j = 0 TO max
    c(*p) += 1
    p += regsize
NEXT
i1 = 0
FOR j = 0 TO numkeys
    i2 = c(j): c(j) = i1: d(j) = i1: i1 = i1 + i2
NEXT
END SUB


dim f1,i,t!
dim f1$,f2$

s1=allocate((maxi+1) *sizeof(integer))
s2=allocate((maxi+1) *sizeof(integer))

f1$ = "temp1.dat"
f2$ = "temp2.dat"
t! = TIMER
'data buffers

OPEN "unsortfb.dat" FOR binary AS #1
filerecs = LOF(1) \ regsize
PRINT "Sorting a file of "; filerecs; " records"

' tally the file offsets
cnt = 0
do
    tallybig (getform)
loop until eof(1)
close #1
acumbig

'radix sort
FOR i = 0 TO passes
  if i=0 then
    OPEN "unsortfb.dat" FOR binary AS #1
  else
    OPEN f1$ FOR binary ACCESS READ AS #1
  end if
  OPEN f2$ FOR binary ACCESS WRITE AS #2
  cnt = 0
  do
    f1 = getform
    tallysmall f1, i
    radix i, f1
    putform i
  loop until eof(1)
  close #1, #2
  PRINT "pass "; i+1; " of 4 "; TIMER - t!
  SWAP f1$, f2$
NEXT

'format back
NAME f1$ AS "sort.dat"
kill f2$
deallocate s1
deallocate s2

PRINT : PRINT "File sorted in "; TIMER - t!; "seconds"
I'll have to check that out (The Code Post). To make the 'classic' version faster, implementing a multi-merge phase that merges more than 2 files at once should do it; also, fixing the buffers in the merge so that they work on the inputs, rather than the output, should make it faster.
Perhaps a radix sort combined with a multiphase merge, keeping all input files open during the merge, would be best: no disk buffer flushes at all...
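The multiway-merge idea can be sketched with a min-heap: keep one cursor per sorted run and always emit the smallest head. Here is an illustrative Python sketch, using in-memory lists to stand in for the open temp files:

```python
import heapq

# Illustrative k-way merge: 'runs' stands in for the sorted
# temp files that would all stay open during the merge.  A
# min-heap of (head value, run index) pairs always yields the
# smallest head, so every record is written exactly once and
# no extra merge passes over the data are needed.
def kway_merge(runs):
    cursors = [iter(r) for r in runs]
    heap = []
    for i, it in enumerate(cursors):
        first = next(it, None)
        if first is not None:
            heap.append((first, i))
    heapq.heapify(heap)
    out = []
    while heap:
        value, i = heapq.heappop(heap)
        out.append(value)                 # "write" the smallest head
        nxt = next(cursors[i], None)      # refill from the same run
        if nxt is not None:
            heapq.heappush(heap, (nxt, i))
    return out
```

With k files open at once the merge finishes in a single phase instead of log2(k) two-way phases, at the cost of one comparison through the heap per record.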

Here is the summary of the profile of YOUR entry
(main)                                          4066.64538  (100.00%)
QUICKSORT_INT32                                 2958.71043  (72.76%)
fb_MemSwap                                      1432.66755  (35.23%)
FILES_MERGE_INT32                               1088.93185  (26.78%)
fread                                           546.10271  (13.43%)
fwrite                                          18.02109  (0.44%)
fopen                                           0.10257  (0.00%)
fb_PrintString                                  0.05647  (0.00%)
fb_FileKill                                     0.02784  (0.00%)
rename                                          0.01669  (0.00%)
free                                            0.01464  (0.00%)
fclose                                          0.00504  (0.00%)
fb_IntToStr                                     0.00063  (0.00%)
fb_PrintVoid                                    0.00059  (0.00%)
fb_ArrayRedim                                   0.00034  (0.00%)
fb_ArrayErase                                   0.00033  (0.00%)
fb_StrConcat                                    0.00028  (0.00%)
fb_StrDelete                                    0.00011  (0.00%)
fseek                                           0.00010  (0.00%)
fb_StrAssign                                    0.00006  (0.00%)
fb_StrAllocTempDescZEx                          0.00005  (0.00%)
fb_ChDir                                        0.00004  (0.00%)
ftell                                           0.00004  (0.00%)
fb_ExePath                                      0.00002  (0.00%)
malloc                                          0.00002  (0.00%)
fb_Timer                                        0.00001  (0.00%)

And this is the profile of MY entry
(main)                                          296.03773  (100.00%)
PUTFORM                                         164.57270  (55.59%)
fb_FilePut                                      162.92538  (55.04%)
GETFORM                                         103.59200  (34.99%)
fb_FileGet                                      103.57253  (34.99%)
RADIX                                           16.54779  (5.59%)
TALLYBIG                                        6.16819  (2.08%)
TALLYSMALL                                      5.07133  (1.71%)
fb_FileSeek                                     1.33148  (0.45%)
fb_FileTell                                     0.19386  (0.07%)
fb_PrintString                                  0.02475  (0.01%)
fb_PrintDouble                                  0.01593  (0.01%)
fb_FileOpen                                     0.01558  (0.01%)
fb_FileKill                                     0.01457  (0.00%)
free                                            0.01294  (0.00%)
rename                                          0.01137  (0.00%)
fb_FileClose                                    0.00199  (0.00%)
fb_PrintInt                                     0.00098  (0.00%)
fb_FileEof                                      0.00087  (0.00%)
fb_PrintVoid                                    0.00025  (0.00%)
fb_ArrayClear                                   0.00021  (0.00%)
ACUMBIG                                         0.00018  (0.00%)
fb_StrAllocTempDescZEx                          0.00016  (0.00%)
fb_Timer                                        0.00003  (0.00%)
malloc                                          0.00002  (0.00%)
fb_FileSize                                     0.00002  (0.00%)
fb_StrSwap                                      0.00001  (0.00%)
fb_StrAssign                                    0.00001  (0.00%)

I think a radix sort would do it...
I never realised fb_MemSwap was taking so much time! I thought I'd profiled my code. Inlining that would save a good amount of time!

However, as that's in the quicksort, it would be better to try to replace it with radix as you suggest.

I'll have a go at implementing the radix sort in mine and see what happens.


I tried putting the radix sort into mine and it seemed slower, or at best the same. Partly this may be due to the fact that it allocates the same amount of memory as the chunk being passed to it (therefore using the swap file). I changed my block size to 48MB, which stopped the swap file being used, but then the merge became slower because there were more files.
Antoni and Yetifoot,

Would you guys please explain the basics of a QB implementation of how you did the merge passes.

I get a headache thinking about it. If I increase the number of files which hold sorted blocks, then I need to use more memory to be able to merge these blocks, which means I might have to decrease the block size, which means more I/O time. How do you figure out this delicate balance between memory space needed and disk I/O time?

Well, in my case the source was a radix sort on disk with no merge phase, so I simply loaded into memory as much data as I could. No delicate balance, just brute force.

Btw: I have found this method in Donald Knuth's TAOCP, so it looks like I reinvented the wheel Big Grin
Quote:How do you figure out this delicate balance between memory space needed and disk I/O time?

For my program I took Antoni's instruction of using only 100MB of memory. This was because beyond that 100MB barrier any memory operations would take place in the swap file, and therefore be much slower.

I therefore chose a block size of 96MB (to allow a little leftover room for program variables). I chose the largest block size I could, because memory reads/writes are always much faster than disk reads/writes.

During the merge phase it is not necessary to use much memory. I read a record from each block file, compare, and dump the lowest value. The input file that had a record put to the output then gets read again, and the cycle continues. I used the memory in the merge phase purely as a file buffer: all the output is put to memory until there is no more room in the buffer, and then it is flushed to file. This was to avoid too much seeking on the hard disc.
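That output-buffering scheme can be sketched like this (a hypothetical Python illustration, not yetifoot's actual code; the buffer size here is tiny just for the demo):

```python
import io
import struct

# Sketch of the merge-phase output buffer described above:
# merged records are appended to an in-memory buffer and
# flushed to the output file in one large sequential write
# when the buffer fills, so the disk head is not dragged back
# and forth between the input and output files per record.
RECORD = struct.Struct("<i")        # one 32-bit record
BUFFER_RECORDS = 4                  # tiny for the demo

class BufferedOutput:
    def __init__(self, f):
        self.f = f
        self.buf = bytearray()

    def put(self, value):
        self.buf += RECORD.pack(value)
        if len(self.buf) >= BUFFER_RECORDS * RECORD.size:
            self.flush()

    def flush(self):
        self.f.write(self.buf)      # one big sequential write
        self.buf.clear()

# Demo: the merge loop would call put() once per output record.
f = io.BytesIO()
out = BufferedOutput(f)
for v in [3, 1, 4, 1, 5, 9, 2, 6]:
    out.put(v)
out.flush()                         # drain whatever is left at the end
```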

Quote:Would you guys please explain the basics of a QB implementation of how you did the merge passes.

I believe that mac's entry is a QB version using merges, so reading it may help you. I will try to write a pseudo-code version of my code today to explain it better. I did comment my code quite a lot, so even though it is for FB it may still help you.

Quote:I get a headache thinking about it

Me too!
OK, I have written an explanation; I hope it's useful! It explains the process I used to merge two files into one, first in verbose pseudocode, then with some generic BASIC code. This omits any buffering code and just attempts to explain the actual merging.

This file outlines a subroutine that merges two input files of sorted data to an output file.


InFile1, InFile2 - The input files, these will be two files that contain sorted records.  They
                   do not have to contain an equal number of records.
OutFile          - The output file.  This will be the result of merging the two input files.
nrl1, nrl2       - The number of records still to process in each input file.
temp1, temp2     - These contain the last read record from each input file.


[1]  Open InFile1 and InFile2 for reading, open OutFile for writing
[2]  Calculate nrl1, and nrl2 by obtaining the LOF (size of file) of each input, and dividing by the record size.
[3]  Read a record from each input into appropriate temp variable.

[4]  While (nrl1 <> 0) AND (nrl2 <> 0) (ie: there is still data to read in both files)
[5]    Compare temp1 to temp2
[6]    If temp1 < temp2 Then           (if the record from InFile1 compares as less than than the record from InFile2)
[7]      Put temp1 to OutFile
[8]      Decrement nrl1                (we have now put the record with the lowest value (temp1 from InFile1) to the output,
                                        and decremented the counter for InFile1)
[9]      If nrl1 <> 0 Then Read from InFile1 to temp1 (because we have put the record from InFile1 we now need to refill it)
[10]   Else                            (record from InFile1 compares as greater than or equal to the record from InFile2)
                                       (Note: in the case of equality it doesn't really matter which is put to file first)
[11]     Put temp2 to OutFile
[12]     Decrement nrl2                (we have now put the record with the lowest value (temp2 from InFile2) to the output,
                                        and decremented the counter for InFile2)
[13]     If nrl2 <> 0 Then Read from InFile2 to temp2 (because we have put the record from InFile2 we now need to refill it)
[14]   End If
[15] Wend

[16] At this point one of the input files has been exhausted.  Therefore we can safely put all the remaining records
     in the non-exhausted file to the output (because both inputs are already sorted).  Only one of the following
     loops will actually run.  We should also remember to flush temp1 or temp2, depending on which still holds
     leftover data.

[17]  While nrl1 > 0
[18]    Read a record from InFile1 and put it to OutFile
[19]    Decrement nrl1
[20]  Wend

[21]  While nrl2 > 0
[22]    Read a record from InFile2 and put it to OutFile
[23]    Decrement nrl2
[24]  Wend

[25]  Close all 3 files.


' Note integer type is assumed 4 bytes.
Sub MergeFiles(InFile1 As String, InFile2 As String, OutFile As String)
Dim nrl1 As Integer
Dim nrl2 As Integer
Dim temp1 As Integer
Dim temp2 As Integer
Dim record_size As Integer
  record_size = 4 ' Size of Integer
  Open InFile1 For Binary As #1
  Open InFile2 For Binary As #2
  Open OutFile For Binary As #3

    nrl1 = LOF(1) \ record_size
    nrl2 = LOF(2) \ record_size
    If (nrl1 = 0) OR (nrl2 = 0) Then Exit Sub ' Just to show that zero record files should not be passed to this sub.
                                              ' As they will cause failure.  This could be coded for, but we don't do
                                              ' that here.

    Get #1, , temp1
    Get #2, , temp2

    ' The main loop

    While (nrl1 <> 0) AND (nrl2 <> 0)
      If temp1 < temp2 Then
        Put #3, , temp1
        nrl1 = nrl1 - 1
        If nrl1 <> 0 Then Get #1, , temp1
      Else
        Put #3, , temp2
        nrl2 = nrl2 - 1
        If nrl2 <> 0 Then Get #2, , temp2
      End If
    Wend
    ' Only one of these actually runs.

    While nrl1 > 0
      Put #3, , temp1 ' This line appears first because there will be data left from the main loop
      nrl1 = nrl1 - 1
      If nrl1 <> 0 Then Get #1, , temp1
    Wend

    While nrl2 > 0
      Put #3, , temp2 ' This line appears first because there will be data left from the main loop
      nrl2 = nrl2 - 1
      If nrl2 <> 0 Then Get #2, , temp2
    Wend

  Close ' All
End Sub
Quote:Well, in my case the source was a radix sort on disk with no merge phase, so I simply loaded into memory as much data as I could. No delicate balance, just brute force.

Btw: I have found this method in Donald Knuth's TAOCP, so it looks like I reinvented the wheel Big Grin
I don't understand. The whole idea of the challenge was that the data does not fit in memory. To me, that implies doing merge passes on blocks of records which were sorted in memory. Must be something magical about a radix sort. I don't know how it works.

But, thanks for the tip, I have a copy of Knuth's TAOCP Searching and Sorting book, so I'll look it up. Funny how many things are in Knuth's books, and we look all over except there. Like, I looked all over for 10 years trying to find the algorithm for the computation of Easter. Finally, there it was in his Fundamental Algorithms book.
