Posts: 1,407
Threads: 117
Joined: Dec 2002
Yes , in the case of the test files, the stability is not an issue because there are no more fields in the record(thanks, Moneo) , and no other order than the sorting field's exists.
BTW: My "great" sorting method seems to require double time than yetifoot's implementation of the classical method. I'll post it when I finish debugging it. Now it adds a single record to the file, after 80 operations, go figure where it comes from..
Antoni
Posts: 243
Threads: 12
Joined: Aug 2001
Quote:For an unstable sort, we would have to sort on 2 keys, the zipcode as major and the name as minor --- more time.
Thanks! I focused on the single element sort and completely forgot about possible other elements.
Mac
Quote:BTW: My "great" sorting method seems to require double time [...] go figure where it comes from..
Microsoft? lol
Posts: 1,407
Threads: 117
Joined: Dec 2002
Better snoozesoft...
I have redesigned my entry, same method but a single temporal file, it looks promising.....
Antoni
Posts: 243
Threads: 12
Joined: Aug 2001
I've posted my QB1.0 IDE program
http://www.network54.com/Forum/190883/me...146755103/
Anyone gung ho could take the code and modify it to use 8 or more work files rather than the 4 I use. It would cut merge time 50%.
McLazy
Posts: 1,407
Threads: 117
Joined: Dec 2002
Thanks, Mac!
We have so far 2 entries for different categories, so two winners.
My entries (QBasic and FB) will be ready this week end, I'm a very slow developer...
Is anyone else planning to enter the contest?
Antoni
Posts: 1,407
Threads: 117
Joined: Dec 2002
Ok, here is my entry for the QB1.1 category.
I'm using a radix sort on the disk, using two auxiliar files, one for input, the other for output. The actual sorting is made in memory, in chunks as big as possible. No aditional merge passes are required.
Due to the ASCII variable length formatting of the input file of this category, there is an aditional de-formatting pass at the start and a reformatting pass at the end.
Code: 'Antoni's entry for the QB1.1 category
'External sort challenge QBN may 2006
'--------------------------------------------------
DEFINT A-Z
DECLARE SUB putform (digit%)
DECLARE SUB radix (i%, max%)
DECLARE SUB tallysmall (max%, i%)
DECLARE SUB tallybig (max%)
DECLARE SUB putunf (max%)
DECLARE SUB putstraight (max%)
DECLARE SUB acumbig ()
DECLARE FUNCTION getunf% ()
DECLARE FUNCTION getform% ()
'Qbasic external sort compo entry by Antoni Gual
'uses radix sort
CLEAR
t! = TIMER
'data buffers
CONST maxi = 10920
DIM SHARED s1(maxi + 1) AS STRING * 6
DIM SHARED s2(maxi + 1) AS STRING * 6
'mem tallies
DIM SHARED c(-1 TO 9) AS INTEGER
DIM SHARED d(-1 TO 9) AS INTEGER
'file tallies
DIM SHARED cd(9, 5) AS LONG
f1$ = "temp1.txt"
f2$ = "temp2.txt"
PRINT "sorting a file of 1M registers": PRINT
'reformat data to right-aligned
OPEN "qb1data.txt" FOR INPUT AS #1
OPEN f1$ FOR OUTPUT AS #2
'poke is used to avoid the horrible asc(mid$(a$,i,1))
DEF SEG = VARSEG(s1(0))
'preformat and tally the file offsets
WHILE NOT EOF(1)
f1 = getunf
tallybig (f1)
putstraight (f1)
WEND
acumbig
CLOSE
PRINT TIMER - t!
'radix sort
FOR i = 5 TO 0 STEP -1
OPEN f1$ FOR INPUT AS #1
OPEN f2$ FOR OUTPUT AS #2
WHILE NOT EOF(1)
f1 = getform
tallysmall f1, i
radix i, f1
putform (i)
WEND
CLOSE
PRINT "pass ";6-i;" of 6 ";TIMER - t!
SWAP f1$, f2$
NEXT
'format back
KILL f2$
OPEN f1$ FOR INPUT AS #1
OPEN "sort.txt" FOR OUTPUT AS #2
WHILE NOT EOF(1)
putunf getform
WEND
CLOSE
KILL f1$
PRINT : PRINT "File sorted in "; TIMER - t!; "seconds"
SYSTEM
'-------------------------------------------------
SUB acumbig
FOR i = 0 TO 5
i1& = 0
FOR j = 0 TO 9
i2& = cd(j, i): cd(j, i) = i1& * 8 + 1: i1& = i1& + i2&
NEXT
NEXT
END SUB
'--------------------------------------------
FUNCTION getform
FOR j = 0 TO maxi
LINE INPUT #1, s1(j)
IF EOF(1) THEN EXIT FOR
NEXT
IF j < maxi THEN getform = j ELSE getform = maxi
END FUNCTION
'--------------------------------------------
FUNCTION getunf
'gets to the output buffer!!!
FOR j = 0 TO maxi
LINE INPUT #1, t$
RSET s1(j) = LTRIM$(RTRIM$(t$))
IF EOF(1) THEN EXIT FOR
NEXT
'PRINT "R";
IF j < maxi THEN getunf = j ELSE getunf = maxi
END FUNCTION
'---------------------------------------------
SUB putform (digit)
'copy 10 mem buffers to 10 offsets of temp file
FOR i = 0 TO 9
SEEK #2, cd(i, digit)
FOR k = d(i) TO c(i) - 1
PRINT #2 + j, s2(k)
NEXT
cd(i, digit) = SEEK(2)
NEXT
END SUB
'----------------------------------------------
SUB putstraight (max)
'copy a sorted memory buffer to disk
FOR k = 0 TO max
PRINT #2, s1(k)
NEXT
END SUB
'----------------------------------------------
SUB putunf (max)
'copy a sorted memory buffer to disk
CONST sp$ = " "
FOR k = 0 TO max
PRINT #2, sp$ + LTRIM$(s1(k)) + sp$
NEXT
END SUB
'-----------------------------------------------------
SUB radix (digit, max)
offs& = VARPTR(s1(0)) + digit
FOR j = 0 TO max
t1 = PEEK(offs&)
IF t1 > 32 THEN t1 = t1 - 48 ELSE t1 = 0
s2(c(t1)) = s1(j)
c(t1) = c(t1) + 1
offs& = offs& + 6
NEXT
END SUB
'----------------------------------------------
SUB tallybig (max)
offs& = VARPTR(s1(0))
FOR j = 0 TO max
FOR k = 5 TO 0 STEP -1
t = PEEK(offs& + k)
IF t > 32 THEN t = t - 48 ELSE t = 0
cd(t, k) = cd(t, k) + 1
NEXT
offs& = offs& + 6
NEXT
END SUB
'--------------------------------------------------
SUB tallysmall (max, digit)
offs& = VARPTR(s1(0)) + digit
ERASE c
FOR j = 0 TO max
t = PEEK(offs&)
IF t > 32 THEN t = t - 48 ELSE t = 0
c(t) = c(t) + 1
offs& = offs& + 6
NEXT
i1 = 0
FOR j = 0 TO 9
i2 = c(j): c(j) = i1: d(j) = i1: i1 = i1 + i2
NEXT
END SUB
Antoni
Posts: 1,407
Threads: 117
Joined: Dec 2002
My entry for the QB4.5
Same idea, here we skip the reformatting steps, as it is a binary file, and we have only 4 passes of radix sort.
Code: 'Antoni's entry for the QB4.5 category
'QBN Forums External Sort Challenge 5 -2006
'---------------------------------------------
DEFINT A-Z
DECLARE SUB putform (digit%)
DECLARE SUB radix (i%, max%)
DECLARE SUB tallysmall (max%, i%)
DECLARE SUB tallybig (max%)
DECLARE SUB putunf (max%)
DECLARE SUB putstraight (max%)
DECLARE SUB acumbig ()
DECLARE FUNCTION getunf% ()
DECLARE FUNCTION getform% ()
'Qbasic external sort compo entry by Antoni Gual
'uses radix sort
CLEAR
t! = TIMER
'KILL "output.dat"
'data buffers
CONST regsize = 4
CONST passes = 3
CONST numkeys = 255
CONST maxi = 16382
REDIM SHARED s1(maxi) AS LONG
REDIM SHARED s2(maxi) AS LONG
'mem tallies
DIM SHARED c(numkeys) AS INTEGER
DIM SHARED d(numkeys) AS INTEGER
'file tallies
DIM SHARED cd(numkeys, passes) AS LONG
f1$ = "temp1.dat"
f2$ = "temp2.dat"
'reformat data to right-aligned
OPEN "unsortqb.dat" FOR binary AS #1
PRINT "Sorting a file of "; LOF(1)\regsize; " records"
'OPEN f1$ FOR binary AS #2
'poke is used to avoid the horrible asc(mid$(a$,i,1))
DEF SEG = VARSEG(s1(0))
'preformat and tally the file offsets
do
f1 = getform
tallybig (f1)
'putstraight (f1)
loop until eof(1)
acumbig
CLOSE
PRINT TIMER - t!
'END
'radix sort
FOR i = 0 TO passes
if i=0 then
OPEN "unsortqb.dat" FOR binary AS #1
else
OPEN f1$ FOR binary ACCESS READ AS #1
end if
OPEN f2$ FOR binary ACCESS WRITE AS #2
WHILE NOT EOF(1)
f1 = getform
tallysmall f1, i
radix i, f1
putform (i)
WEND
CLOSE
PRINT "pass "; i+1; " of 4 "; TIMER - t!
'END
SWAP f1$, f2$
NEXT
'format back
NAME f1$ AS "sort.dat"
kill f2$
PRINT : PRINT "File sorted in "; TIMER - t!; "seconds"
SYSTEM
'-------------------------------------------------
SUB acumbig
FOR i = 0 TO passes
i1& = 0
FOR j = 0 TO numkeys
i2& = cd(j, i): cd(j, i) = i1&*regsize+1: i1& = i1& + i2&
NEXT
NEXT
END SUB
'--------------------------------------------
FUNCTION getform
FOR j = 0 TO maxi
GET #1,,s1(j)
IF EOF(1) THEN EXIT FOR
NEXT
IF j =< maxi THEN getform = j-1 ELSE getform = maxi
END FUNCTION
'---------------------------------------------
SUB putform (digit)
'copy 10 mem buffers to 10 offsets of temp file
FOR i = 0 TO numkeys
SEEK #2, cd(i, digit)
FOR k = d(i) TO c(i) - 1
PUT #2,,s2(k)
NEXT
cd(i, digit)=SEEK(2)
NEXT
END SUB
'----------------------------------------------
SUB putstraight (max)
'copy buffer to disk
FOR k = 0 TO max
PUT #2,,s1(k)
NEXT
END SUB
'------------------------------------------------
SUB radix (digit, max)
offs& = VARPTR(s1(0)) + digit
FOR j = 0 TO max
t = PEEK(offs&)
s2(c(t)) = s1(j)
c(t) = c(t) + 1
offs& = offs& + regsize
NEXT
END SUB
'--------------------------------------------------
SUB tallybig (max)
offs& = VARPTR(s1(0))
FOR j = 0 TO max
FOR k = 0 TO passes
t = PEEK(offs& + k)
cd(t, k) = cd(t, k) + 1
NEXT
offs& = offs& + regsize
NEXT
END SUB
'--------------------------------------------------
SUB tallysmall (max, digit)
offs& = VARPTR(s1(0)) + digit
ERASE c
FOR j = 0 TO max
t = PEEK(offs&)
c(t) = c(t) + 1
offs& = offs& + regsize
NEXT
i1 = 0
FOR j = 0 TO numkeys
i2 = c(j): c(j) = i1: d(j) = i1: i1 = i1 + i2
NEXT
END SUB
Antoni
i was going to do one using binary trees, but it seemed pretty slow. i just gave up
Posts: 1,407
Threads: 117
Joined: Dec 2002
Well, you can't tell your idea is not worth the pain until you explore it in depth.
My first tries at radix sort a week ago were slower than the "classical" yetifoot's and mac's entries, but after playing with it a little, now I'm beating at least one of them....
It's curious how I found the solution by fiddling with the QB1.1 version, and not with FB's....
Antoni
|