Login

yetifoot · 05-07-2006, 10:13 PM

Looks good Antoni. I'll have to read up on radix sort, i get the rough idea though. Can't wait to see your FB one.

Antoni Gual · 05-08-2006, 01:38 AM

yetifoot: Here it is my entry, and it's slower than your code.
Can' t think of anything that would make it faster. I profiled it and it stays a 55% of the time putting data to the file. I guess my way of putting data at 255 different offsets in the file requires a lot of disk cache flushes, that makes it slow. QB1.1 version was fast because there were only 10 offsets to deal with.
Congratulations, yetifoot: unless some one else comes with more code you are the winner in the FB category! Big Grin

About radix sort I posted a litle demo of in-memory sorting at The Code Post that's faster than any implementation of quicksort i know of.

Code:
'Antoni's entry for the FB category

'QBN Forums External Sort Challenge 5 -2006

'---------------------------------------------

option explicit

'uses radix sort

CONST regsize = 4

CONST passes = 3

CONST numkeys = 255

CONST maxi = 9999999

'buffers

dim shared as uinteger ptr s1,s2

'tallies

DIM SHARED c(numkeys)  AS INTEGER

DIM SHARED d(numkeys) AS INTEGER

DIM SHARED cd(numkeys, passes) AS LONG

dim shared filerecs,cnt

'-------------------------------------------------

SUB acumbig

dim i,i1,i2,j

FOR i = 0 TO passes

i1 = 0

    FOR j = 0 TO numkeys

        i2 = cd(j, i): cd(j, i) = (i1*4)+1: i1 = i1 + i2

  NEXT

NEXT

? "A";

END SUB

'--------------------------------------------

FUNCTION getform

    GET #1,,*s1,maxi+1

        cnt+=maxi

        Function=iif (eof(1),filerecs-cnt ,maxi)

    ?"G";

END FUNCTION

'---------------------------------------------

SUB putform (digit)

 'copy 255 mem buffers to 255 offsets of temp file

dim i

 FOR i = 0 TO numkeys

    SEEK #2, cd(i, digit)

      PUT #2,,s2[d(i)],c(i) - d(i) 

       cd(i, digit)=SEEK(2)

 NEXT

 ?"P";

END SUB

'------------------------------------------------

SUB radix (digit, max)

 dim p as ubyte ptr,j,t

 p = cptr(ubyte ptr,s1) + digit

 FOR j = 0 TO max

    t = *p

    s2[c(t)] = s1[j]

    c(t) += 1

    p+=regsize

 NEXT

 ?"x";

END SUB

'--------------------------------------------------

SUB tallybig (max)

 dim p as ubyte ptr,j,k

 p=cptr(ubyte ptr,s1)

 FOR j = 0 TO max

     FOR k = 0 TO passes

         cd(*p, k) += 1

     p+=1    

     NEXT

      'p+=regsize

 NEXT

 ?"T";

END SUB

'--------------------------------------------------

SUB tallysmall (max, digit)

 dim p as ubyte ptr,t,j,i1,i2

 p=cptr(ubyte ptr,s1)+digit

 erase c

 FOR j = 0 TO max

         c(*p) +=  1

         p+=regsize

 NEXT

 i1 = 0

 FOR j = 0 TO numkeys

        i2 = c(j): c(j) = i1: d(j) = i1: i1 = i1 + i2

 NEXT

 ?"t";

END SUB

'-----------------------------------------------------

dim f1,i,t!

dim f1$,f2$

s1=allocate((maxi+1) *sizeof(integer))

s2=allocate((maxi+1) *sizeof(integer))

f1$ = "temp1.dat"

f2$ = "temp2.dat"

t! = TIMER

'data buffers

OPEN "unsortfb.dat" FOR binary  AS #1 

filerecs=lof(1)\regsize

PRINT "Sorting a file of "; filerecs; " records"

' tally the file offsets

cnt=1

do 

  tallybig (getform)

loop until eof(1)

acumbig

CLOSE

PRINT TIMER - t!

'radix sort

FOR i = 0 TO passes

  if i=0 then 

    OPEN "unsortfb.dat" FOR binary  AS #1

  else

      OPEN f1$ FOR binary ACCESS READ AS #1 

  end if

    OPEN f2$ FOR binary ACCESS WRITE AS #2 

  cnt=1

    WHILE NOT EOF(1)

        f1 = getform

        tallysmall f1, i

        radix i, f1

        putform i

    WEND

    CLOSE

    PRINT "pass "; i+1; " of 4 "; TIMER - t!

    SWAP f1$, f2$

NEXT

'format back

NAME f1$ AS "sort.dat"

kill f2$

deallocate s1

deallocate s2

PRINT : PRINT "File sorted in "; TIMER - t!; "seconds"

END

yetifoot · 05-08-2006, 08:50 PM

I'll have to check that out (The Code Post). To make the 'classic' version faster implementing a multi-merge phase that merges more than 2 files at once should do it, also fixing the buffers in the merge so that they work on the inputs, rather than the output should make it faster.

Antoni Gual · 05-09-2006, 12:20 AM

yetifoot:
Perhaps a radix sort combined with a multiphase merge keeping all input files opened during the merge would be the best, no disk buffer flushes at all...

Here is the summary of the profile of YOUR entry

Code:
(main)                                          4066.64538  (100.00%)

QUICKSORT_INT32                                 2958.71043  (72.76%)

fb_MemSwap                                      1432.66755  (35.23%)

FILES_MERGE_INT32                               1088.93185  (26.78%)

fread                                           546.10271  (13.43%)

fwrite                                          18.02109  (0.44%)

fopen                                           0.10257  (0.00%)

fb_PrintString                                  0.05647  (0.00%)

fb_FileKill                                     0.02784  (0.00%)

rename                                          0.01669  (0.00%)

free                                            0.01464  (0.00%)

fclose                                          0.00504  (0.00%)

fb_IntToStr                                     0.00063  (0.00%)

fb_PrintVoid                                    0.00059  (0.00%)

fb_ArrayRedim                                   0.00034  (0.00%)

fb_ArrayErase                                   0.00033  (0.00%)

fb_StrConcat                                    0.00028  (0.00%)

fb_StrDelete                                    0.00011  (0.00%)

fseek                                           0.00010  (0.00%)

fb_StrAssign                                    0.00006  (0.00%)

fb_StrAllocTempDescZEx                          0.00005  (0.00%)

fb_ChDir                                        0.00004  (0.00%)

ftell                                           0.00004  (0.00%)

fb_ExePath                                      0.00002  (0.00%)

malloc                                          0.00002  (0.00%)

fb_Timer                                        0.00001  (0.00%)

And this is the profile of MY entry

Code:
(main)                                          296.03773  (100.00%)

PUTFORM                                         164.57270  (55.59%)

fb_FilePut                                      162.92538  (55.04%)

GETFORM                                         103.59200  (34.99%)

fb_FileGet                                      103.57253  (34.99%)

RADIX                                           16.54779  (5.59%)

TALLYBIG                                        6.16819  (2.08%)

TALLYSMALL                                      5.07133  (1.71%)

fb_FileSeek                                     1.33148  (0.45%)

fb_FileTell                                     0.19386  (0.07%)

fb_PrintString                                  0.02475  (0.01%)

fb_PrintDouble                                  0.01593  (0.01%)

fb_FileOpen                                     0.01558  (0.01%)

fb_FileKill                                     0.01457  (0.00%)

free                                            0.01294  (0.00%)

rename                                          0.01137  (0.00%)

fb_FileClose                                    0.00199  (0.00%)

fb_PrintInt                                     0.00098  (0.00%)

fb_FileEof                                      0.00087  (0.00%)

fb_PrintVoid                                    0.00025  (0.00%)

fb_ArrayClear                                   0.00021  (0.00%)

ACUMBIG                                         0.00018  (0.00%)

fb_StrAllocTempDescZEx                          0.00016  (0.00%)

fb_Timer                                        0.00003  (0.00%)

malloc                                          0.00002  (0.00%)

fb_FileSize                                     0.00002  (0.00%)

fb_StrSwap                                      0.00001  (0.00%)

fb_StrAssign                                    0.00001  (0.00%)

I thing a radix sort would do it...

yetifoot · 05-09-2006, 12:43 PM

I never realised fb_memswap was taking so much time!, i thought i'd profiled my code. Inlining that would save a good amount of time!

However as that's in the quicksort, it would be better to try to replace with radix as you suggest.

I'll have a go at implementing the radix sort into mine, see what happens

EDIT:

I tried putting the radix sort into mine and it seemed slower, or at best the same. Partly this may be due to the fact it allocates the same amount of memory as the chunk being passed to it (therefore using swap file). I changed my blocksize to 48Mb, which stopped the swap file being used, but then the merge became slower because there were more files.

Moneo · 05-10-2006, 05:20 AM

Antoni and Yetifoot,

Would you guys please explain the basics of a QB implementation of how you did the merge passes.

I get a headache thinking about it. If I increase the number of files which hold sorted blocks, then I need to use more memory to be able to merge these blocks, which means I might have to decrease the block size, which means more I/O time. How do you figure out this delicate balance between memory space needed and disk I/O time?

Thanks.
*****

Antoni Gual · 05-10-2006, 01:07 PM

Well, in the case of my source was a radix sort in disk, no merge phase, so simply I loaded to memory as many data as i could. No delicate balance, just brute force.

Btw: I hve found this method in Donald Knuth's TAOCP, so it looks like I reinvented the wheel Big Grin

yetifoot · 05-10-2006, 04:10 PM

Quote:How do you figure out this delicate balance between memory space needed and disk I/O time?

For my program I took Antonis instruction of using only 100Mb of memory. This was because after this 100Mb barrier any memory operations would take place in swap file, therefore being much slower.

So therefore i chose a block size of 96Mb (to allow a little left over room for program variables) I chose the largest block size i could, because memory read/writes are always much faster than disc r/w.

During the merge phase it is not necessary to use memory. I read a record from each block file, compare, and dump the lowest value. The input file that had a record put to output then gets read again, and the cycle continues. I used the memory in the merge phase as purely a file buffer. All the output is put to memory until there is no more room in this buffer, then it is flushed to file. This was to avoid too much seeking on the hard disc.

Quote:Would you guys please explain the basics of a QB implementation of how you did the merge passes.

I believe that mac's entry is a QB version using merges, so reading this may help you. I will try and write a pseudo-code version of my code today to explain it better. I did comment my code quite a lot, so even though it is for FB it may still help you.

Quote:I get a headache thinking about it

Me too!

yetifoot · 05-10-2006, 05:13 PM

OK. I have written an explanation, i hope it's useful! It explains the process I used to merge two files into one, first in a verbose psuedocode, then with some generic BASIC code. This omits any buffering code and just attempts to explain the actual merging.

Code:
This file outlines a subroutine that merges two input files of sorted data to an output file.

Definitions:

InFile1, InFile2 - The input files, these will be two files that contain sorted records.  They

                   do not have to contain an equal number of records.

OutFile          - The output file.  This will be the result of merging the two input files.

nrl1, nrl2       - The number of records still to process in each input file.

temp1, temp2     - These contain the last read record from each input file.

Pseudocode:

[1]  Open InFile1 and InFile2 for reading, open OutFile for writing

[2]  Calculate nrl1, and nrl2 by obtaining the LOF (size of file) of each input, and dividing by the record size.

[3]  Read a record from each input into appropriate temp variable.

[4]  While (nrl1 <> 0) AND (nrl2 <> 0) (ie: there is still data to read in both files)

[5]    Compare temp1 to temp2

[6]    If temp1 < temp2 Then           (if the record from InFile1 compares as less than than the record from InFile2)

[7]      Put temp1 to OutFile

[8]      Decrement nrl1                (we have now put the record with the lowest value (temp1 from InFile1) to the output, 

                                        and decremented the counter for InFile1)

[9]      If nrl1 <> 0 Then Read from InFile1 to temp1 (because we have put the record from InFile1 we now need to refill

                                                       temp1)

[10]   Else                            (record from InFile1 compares as greater than or equal to the record from InFile2)

                                       (Note: in the case of equality it doesn't really matter which is put to file first)

[11]     Put temp2 to OutFile

[12]     Decrement nrl2                (we have now put the record with the lowest value (temp2 from InFile2) to the output, 

                                        and decremented the counter for InFile2)

[13]     If nrl2 <> 0 Then Read from InFile2 to temp2 (because we have put the record from InFile2 we now need to refill

                                                       temp2)

[14]   End If

[15] Wend

[16] At this point one of the input files has been exhausted.  Therefore we can safely put all the remaining records

     in the non-exausted file to the output.  (because both inputs are already sorted).  Only one of the following loops

                                               will actually be run.  We should also remember to flush temp1, or temp2

                                               depending on which still has leftover data)

[17]  While nrl1 > 0

[18]    Read a record from InFile1 and put it to OutFile

[19]    Decrement nrl1

[20]  Wend

[21]  While nrl2 > 0

[22]    Read a record from InFile2 and put it to OutFile

[23]    Decrement nrl2

[24]  Wend

[25]  Close all 3 files.

BASIC Code:

' Note integer type is assumed 4 bytes.

Sub MergeFiles(InFile1 As String, InFile2 As String, OutFile As String)

Dim nrl1 As Integer

Dim nrl2 As Integer

Dim temp1 As Integer

Dim temp2 As Integer

Dim record_size As Integer

  record_size = 4 ' Size of Integer

  Open InFile1 For Binary As #1

  Open InFile2 For Binary As #2

  Open OutFile For Binary As #3

    nrl1 = LOF(1) \ record_size

    nrl2 = LOF(2) \ record_size

    If (nrl1 = 0) OR (nrl2 = 0) Then Exit Sub ' Just to show that zero record files should not be passed to this sub.

                                              ' As they will cause failure.  This could be coded for, but we don't do

                                              ' that here.

    Get #1, , temp1

    Get #2, , temp2

    ' The main loop

    While (nrl1 <> 0) AND (nrl2 <> 0)

      If temp1 < temp2 Then

        Put #3, , temp1

        nrl1 = nrl1 - 1

        If nrl1 <> 0 Then Get #1, , temp1

      Else

        Put #3, , temp2

        nrl2 = nrl2 - 1

        If nrl2 <> 0 Then Get #2, , temp2

      End If

    Wend

    ' Only one of these actually runs.

    While nrl1 > 0

      Put #3, , temp1 ' This line appears first because there will be data left from the main loop

      nrl1 = nrl1 - 1

      If nrl1 <> 0 Then Get #1, , temp1

    Wend

    While nrl2 > 0

      Put #3, , temp2 ' This line appears first because there will be data left from the main loop

      nrl2 = nrl2 - 1

      If nrl2 <> 0 Then Get #2, , temp2

    Wend

  Close ' All

End Sub

Moneo · 05-11-2006, 04:16 AM

Quote:Well, in the case of my source was a radix sort in disk, no merge phase, so simply I loaded to memory as many data as i could. No delicate balance, just brute force.

Btw: I hve found this method in Donald Knuth's TAOCP, so it looks like I reinvented the wheel

I don't understand. The whole idea of the challenge was that the data does not fit in memory. To me, that implies doing merge passes on blocks of records which were sorted in memory. Must be something magical about a radix sort. I don't know how it works.

But, thanks for the tip, I have a copy of Knuth's TAOCP Searching and Sorting book, so I'll look it up. Funny how many things are in Knuth's books, and we look all over except there. Like, I looked all over for 10 years trying to find the algorithm for the computation of Easter. Finally, there it was in his Fundamental Algorithms book.
*****

Login
Username:
Password:	Lost Password?
	Remember me