Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
External sort compo?
#31
Yes , in the case of the test files, the stability is not an issue because there are no more fields in the record(thanks, Moneo) , and no other order than the sorting field's exists.

BTW: My "great" sorting method seems to require double time than yetifoot's implementation of the classical method. I'll post it when I finish debugging it. Now it adds a single record to the file, after 80 operations, go figure where it comes from..
Antoni
Reply
#32
Quote:For an unstable sort, we would have to sort on 2 keys, the zipcode as major and the name as minor --- more time.

Thanks! I focused on the single element sort and completely forgot about possible other elements.

Mac
Reply
#33
Quote:BTW: My "great" sorting method seems to require double time [...] go figure where it comes from..

Microsoft? lol
Reply
#34
Better snoozesoft...

I have redesigned my entry, same method but a single temporal file, it looks promising.....
Antoni
Reply
#35
I've posted my QB1.0 IDE program
http://www.network54.com/Forum/190883/me...146755103/

Anyone gung ho could take the code and modify it to use 8 or more work files rather than the 4 I use. It would cut merge time 50%.

McLazy
Reply
#36
Thanks, Mac!
We have so far 2 entries for different categories, so two winners.


My entries (QBasic and FB) will be ready this week end, I'm a very slow developer...

Is anyone else planning to enter the contest?
Antoni
Reply
#37
Ok, here is my entry for the QB1.1 category.
I'm using a radix sort on the disk, using two auxiliar files, one for input, the other for output. The actual sorting is made in memory, in chunks as big as possible. No aditional merge passes are required.
Due to the ASCII variable length formatting of the input file of this category, there is an aditional de-formatting pass at the start and a reformatting pass at the end.

Code:
'Antoni's entry for the QB1.1 category
'External sort challenge QBN may 2006
'--------------------------------------------------
DEFINT A-Z
DECLARE SUB putform (digit%)
DECLARE SUB radix (i%, max%)
DECLARE SUB tallysmall (max%, i%)
DECLARE SUB tallybig (max%)
DECLARE SUB putunf (max%)
DECLARE SUB putstraight (max%)
DECLARE SUB acumbig ()
DECLARE FUNCTION getunf% ()
DECLARE FUNCTION getform% ()

'Qbasic external sort compo entry by Antoni Gual
'uses radix sort

CLEAR
t! = TIMER

'data buffers
CONST maxi = 10920
DIM SHARED s1(maxi + 1) AS STRING * 6
DIM SHARED s2(maxi + 1) AS STRING * 6

'mem tallies
DIM SHARED c(-1 TO 9)  AS INTEGER
DIM SHARED d(-1 TO 9) AS INTEGER

'file tallies
DIM SHARED cd(9, 5)  AS LONG

f1$ = "temp1.txt"
f2$ = "temp2.txt"

PRINT "sorting a file of 1M registers": PRINT

'reformat data to right-aligned
OPEN "qb1data.txt" FOR INPUT AS #1
OPEN f1$ FOR OUTPUT AS #2

'poke is used to avoid the horrible asc(mid$(a$,i,1))
DEF SEG = VARSEG(s1(0))

'preformat and tally the file offsets

WHILE NOT EOF(1)
   f1 = getunf
   tallybig (f1)
   putstraight (f1)
WEND
acumbig
CLOSE
PRINT TIMER - t!

'radix sort
FOR i = 5 TO 0 STEP -1
    OPEN f1$ FOR INPUT AS #1
    OPEN f2$ FOR OUTPUT AS #2
    WHILE NOT EOF(1)
      f1 = getform
      tallysmall f1, i
      radix i, f1
      putform (i)
    WEND
    CLOSE
    PRINT "pass ";6-i;" of 6 ";TIMER - t!
    SWAP f1$, f2$
NEXT

'format back
KILL f2$

OPEN f1$ FOR INPUT AS #1
OPEN "sort.txt" FOR OUTPUT AS #2
WHILE NOT EOF(1)
   putunf getform
WEND
CLOSE

KILL f1$
PRINT : PRINT "File sorted in "; TIMER - t!; "seconds"
SYSTEM

'-------------------------------------------------
SUB acumbig
FOR i = 0 TO 5
i1& = 0
  FOR j = 0 TO 9
    i2& = cd(j, i): cd(j, i) = i1& * 8 + 1: i1& = i1& + i2&
  NEXT
NEXT
END SUB

'--------------------------------------------
FUNCTION getform
        FOR j = 0 TO maxi
          LINE INPUT #1, s1(j)
          IF EOF(1) THEN EXIT FOR
        NEXT
    IF j < maxi THEN getform = j ELSE getform = maxi
END FUNCTION

'--------------------------------------------
FUNCTION getunf
'gets to the output buffer!!!
FOR j = 0 TO maxi
  LINE INPUT #1, t$
  RSET s1(j) = LTRIM$(RTRIM$(t$))
  IF EOF(1) THEN EXIT FOR
NEXT
'PRINT "R";
IF j < maxi THEN getunf = j ELSE getunf = maxi
END FUNCTION

'---------------------------------------------
SUB putform (digit)
'copy 10 mem buffers to 10 offsets of temp file
FOR i = 0 TO 9
  SEEK #2, cd(i, digit)
  FOR k = d(i) TO c(i) - 1
   PRINT #2 + j, s2(k)
  NEXT
  cd(i, digit) = SEEK(2)
NEXT
END SUB

'----------------------------------------------
SUB putstraight (max)
'copy a sorted memory buffer to disk
FOR k = 0 TO max
   PRINT #2, s1(k)
NEXT
END SUB

'----------------------------------------------
SUB putunf (max)
'copy a sorted memory buffer to disk
CONST sp$ = " "
FOR k = 0 TO max
   PRINT #2, sp$ + LTRIM$(s1(k)) + sp$
NEXT
END SUB

'-----------------------------------------------------
SUB radix (digit, max)
offs& = VARPTR(s1(0)) + digit
FOR j = 0 TO max
  t1 = PEEK(offs&)
  IF t1 > 32 THEN t1 = t1 - 48 ELSE t1 = 0
  s2(c(t1)) = s1(j)
  c(t1) = c(t1) + 1
  offs& = offs& + 6
NEXT
END SUB

'----------------------------------------------
SUB tallybig (max)
offs& = VARPTR(s1(0))
FOR j = 0 TO max
   FOR k = 5 TO 0 STEP -1
     t = PEEK(offs& + k)
     IF t > 32 THEN t = t - 48 ELSE t = 0
     cd(t, k) = cd(t, k) + 1
   NEXT
   offs& = offs& + 6
NEXT
END SUB

'--------------------------------------------------
SUB tallysmall (max, digit)
offs& = VARPTR(s1(0)) + digit
ERASE c
FOR j = 0 TO max
     t = PEEK(offs&)
     IF t > 32 THEN t = t - 48 ELSE t = 0
     c(t) = c(t) + 1
     offs& = offs& + 6
NEXT
i1 = 0
FOR j = 0 TO 9
    i2 = c(j): c(j) = i1: d(j) = i1: i1 = i1 + i2
NEXT
END SUB
Antoni
Reply
#38
My entry for the QB4.5
Same idea, here we skip the reformatting steps, as it is a binary file, and we have only 4 passes of radix sort.
Code:
'Antoni's entry for the QB4.5 category
'QBN Forums External Sort Challenge 5 -2006
'---------------------------------------------
DEFINT A-Z
DECLARE SUB putform (digit%)
DECLARE SUB radix (i%, max%)
DECLARE SUB tallysmall (max%, i%)
DECLARE SUB tallybig (max%)
DECLARE SUB putunf (max%)
DECLARE SUB putstraight (max%)
DECLARE SUB acumbig ()
DECLARE FUNCTION getunf% ()
DECLARE FUNCTION getform% ()

'Qbasic external sort compo entry by Antoni Gual
'uses radix sort

CLEAR
t! = TIMER
'KILL "output.dat"
'data buffers
CONST regsize = 4
CONST passes = 3
CONST numkeys = 255
CONST maxi = 16382
REDIM SHARED s1(maxi) AS LONG
REDIM SHARED s2(maxi) AS LONG

'mem tallies
DIM SHARED c(numkeys)  AS INTEGER
DIM SHARED d(numkeys) AS INTEGER

'file tallies
DIM SHARED cd(numkeys, passes) AS LONG

f1$ = "temp1.dat"
f2$ = "temp2.dat"

'reformat data to right-aligned
OPEN "unsortqb.dat" FOR binary  AS #1
PRINT "Sorting a file of "; LOF(1)\regsize; " records"

'OPEN f1$ FOR binary AS #2

'poke is used to avoid the horrible asc(mid$(a$,i,1))
DEF SEG = VARSEG(s1(0))

'preformat and tally the file offsets

do
     f1 = getform
     tallybig (f1)
     'putstraight (f1)
loop until eof(1)
acumbig
CLOSE
PRINT TIMER - t!
'END

'radix sort
FOR i = 0 TO passes
  if i=0 then
    OPEN "unsortqb.dat" FOR binary  AS #1
  else
      OPEN f1$ FOR binary ACCESS READ AS #1
  end if
    OPEN f2$ FOR binary ACCESS WRITE AS #2
    WHILE NOT EOF(1)
        f1 = getform
        tallysmall f1, i
        radix i, f1
        putform (i)
    WEND
    CLOSE
    PRINT "pass "; i+1; " of 4 "; TIMER - t!
    'END
    SWAP f1$, f2$
NEXT

'format back
NAME f1$ AS "sort.dat"
kill f2$
PRINT : PRINT "File sorted in "; TIMER - t!; "seconds"
SYSTEM

'-------------------------------------------------
SUB acumbig
FOR i = 0 TO passes
i1& = 0
    FOR j = 0 TO numkeys
        i2& = cd(j, i): cd(j, i) = i1&*regsize+1: i1& = i1& + i2&
    NEXT
NEXT
END SUB

'--------------------------------------------
FUNCTION getform
        FOR j = 0 TO maxi
            GET #1,,s1(j)
            IF EOF(1) THEN EXIT FOR
        NEXT
        IF j =< maxi THEN getform = j-1 ELSE getform = maxi
END FUNCTION

'---------------------------------------------
SUB putform (digit)
'copy 10 mem buffers to 10 offsets of temp file
FOR i = 0 TO numkeys
    SEEK #2, cd(i, digit)
    FOR k = d(i) TO c(i) - 1
      PUT #2,,s2(k)
    NEXT
    cd(i, digit)=SEEK(2)
NEXT
END SUB

'----------------------------------------------
SUB putstraight (max)
'copy buffer to disk
FOR k = 0 TO max
     PUT #2,,s1(k)
NEXT
END SUB

'------------------------------------------------
SUB radix (digit, max)
offs& = VARPTR(s1(0)) + digit
FOR j = 0 TO max
    t = PEEK(offs&)
    s2(c(t)) = s1(j)
    c(t) = c(t) + 1
    offs& = offs& + regsize
NEXT
END SUB

'--------------------------------------------------
SUB tallybig (max)
offs& = VARPTR(s1(0))
FOR j = 0 TO max
     FOR k = 0 TO passes
         t = PEEK(offs& + k)
         cd(t, k) = cd(t, k) + 1
     NEXT
     offs& = offs& + regsize
NEXT
END SUB

'--------------------------------------------------
SUB tallysmall (max, digit)
offs& = VARPTR(s1(0)) + digit
ERASE c
FOR j = 0 TO max
         t = PEEK(offs&)
         c(t) = c(t) + 1
         offs& = offs& + regsize
NEXT
i1 = 0
FOR j = 0 TO numkeys
        i2 = c(j): c(j) = i1: d(j) = i1: i1 = i1 + i2
NEXT
END SUB
Antoni
Reply
#39
i was going to do one using binary trees, but it seemed pretty slow. i just gave up
Reply
#40
Well, you can't tell your idea is not worth the pain until you explore it in depth.

My first tries at radix sort a week ago were slower than the "classical" yetifoot's and mac's entries, but after playing with it a little, now I'm beating at least one of them....

It's curious how I found the solution by fiddling with the QB1.1 version, and not with FB's....
Antoni
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)