Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
External sort compo?
#11
Here are a set of rules. Excuse the delay. Any suggestion?
And if everyone agrees, let's go!
Quote:Rules for QBN external sort compo May 2006

The goal is to design the fastest program to sort an array of data that does'nt fit in the computer's memory.

No prizes, unless we find a sponsor Big Grin

Deadline, may 29 2006.

There are two categories:
-QuickBasic 4.5
-FreeBasic

QuickBasic entries must sort an array of 1,000,000 (1E6) LONGs without using EMS or XMS or a RAM drive.

FreeBASIC entries must sort an array of 100,000,000 (1E8) INTEGERS, not being allowed to use more than 100Mb of memory (use it at your risk, my PC could start swapping to memory after that). No memory RAM allowed.

No external libraries are allowed, except QB.LIB for QB and CRT.LIB for FB.

No shelling to external programs is allowed.

The entries must be in BASIC source code (so we can learn from our errors), to be compiled locally with QB4.5 or FB 0.15 for Windows

Details:

The input of the program will be the file "unsortfb.dat" or "unsortqb.dat" containing the array to be sorted, the output will be a different file "sort.dat" containing all the registers sorted in increasing order.

The input file can not be overwritten and the output file must have the same size as the input file.

Auxiliar files and directories up to 2GB can be created and must be erased before the end of the program.

The executable and the input,output and auxiliar files will be in the same disk. Absolute paths in the code will probably not work.

The program must terminate by itself, not requiring the use to press a key (to allow me to use a batch file to run them).

The input files can be generated in place with the following programs:

For QB expect a running time of 10 seconds.
Code:
'qbcrea.bas
DIM T!, i, j, b$
b$ = "|\                                     \| ###% Done "

T! = TIMER
OPEN "unsortqb.dat" FOR RANDOM AS #1 LEN = 1000
PRINT "Creating unsortqb.dat."
DIM a AS STRING * 1000
DEF SEG = VARSEG(a)
s& = VARPTR(a)
FOR i = 1 TO 4000
  
  FOR j& = 0 TO 999
    if (j& and 3&) =3& then
     POKE s& + j&, (RND * 127)
    else
     POKE s& + j&, (RND * 256)
    end if
  NEXT
  PUT #1, i, a
  IF (i MOD 100) = 0 THEN
    LOCATE 2, 1
    PRINT USING b$; STRING$(i \ 100, "."); i \ 40;
   END IF
NEXT
CLOSE
PRINT
PRINT "File created in ";TIMER - T!;" seconds. ";LOF(1);" bytes size"
PRINT "Press a key"
SLEEP


For FreeBasic expext a running time of 1.5 minutes.
Code:
'fbcrea.bas
option explicit
dim t!,i,j ,b$,cnt&,l&
b$="|\                                     \| ###% Done "

t!=timer
open "unsortfb.dat" for random as #1 len=1000
print "Creating unsortfb.dat."
dim a as string*1000
for i=1 to 400000
  for j=0 to 999
    if (j and 3)=3 then
     a[j]=int(rnd*128)
    else
     a[j]=int(rnd*256)
    end if
  next
  put #1,i,a
  if (i mod 10000)=0 then
    locate 2,1
    print using b$;string(I\10000,".");i\4000;
   end if
next
CLOSE
PRINT
PRINT "File created in ";TIMER - T!;" seconds. ";LOF(1);" bytes size"
PRINT "Press a key"
SLEEP


Testing:

-The testing will take place in my (Antoni's) computer, if no one else volunteers.
P4 1,44 with 256Mb RAM, W2000 SP4

-Every entry will be compiled and put in a separate folder with its input data.
-Disk will be defragged, antivirus will be stopped.
-Each entry will be run, timed and the results checked for accuracy.

-If an entry takes all night to run, stops waiting for a key ,forgets to erase auxiiar files or damages my computer it will be disqualified and you will hear me ranting for weeks.

-The testing for the correctness of the output file will be made by comparison of all outputs in the same category using DOS's FC. If one output does not match, it will be suspect...

This program can be used too to test the sorting
Code:
open "sort.dat" for random as #1 len =4
dim l&,l1&,cnt&
get #1,,l&
cnt&=1
do
get #1,,l1&
if l&>l1& then
print "Error: ";l&;" > ";l1&;" at position ";cnt&
exit do
end if
l&=l1&
loop until eof(1)  
close
Antoni
Reply
#12
Hi Antoni, I will probably try and enter. I wondered if you could clarify something for me?

You say that if using FB, that CRT lib can be used. Does this mean I can use the functions such as fopen, which would give me a speed boost over Open As, or do you just say that because CRT lib is always linked anyway?

thanks.
EVEN MEN OF STEEL RUST.
[Image: chav.gif]
Reply
#13
Is the sort supposed to be stable? (if two integers are equal, will their position relative to each other need to be maintained?)
stylin:
Reply
#14
I have written a program now using FreeBASIC. It takes 400 seconds to complete. The resultant file passes the test you posted in the rules. I have not tested if it actually is the correct data. The resultant file has a CRC32 (according to WinRAR/WinZip) of 61779A56, I hope this is correct! I am going to work on the program a bit more tomorrow before I submit it.
EVEN MEN OF STEEL RUST.
[Image: chav.gif]
Reply
#15
Quote:Is the sort supposed to be stable? (if two integers are equal, will their position relative to each other need to be maintained?)
Excellent question, Stylin.
If you were to use a QuickSort type of algorithm, then you cannot guarantee that the sort will be stable. That's the thing about QuickSort that is most annoying to me.
*****
Reply
#16
stylin:
In this case stability is not important: if the registers are just integers, and they are equal, how can you tell which is which?

yetifoot:
Yes you can use the crt input-output.
Antoni
Reply
#17
OK, here is my entry.

http://www.streetcds.co.uk/comp/yetisort.bas

Has anyone got a good sort program to test it against? Someone suggested Ben Bakers QSort, but it looks like its for text only.
EVEN MEN OF STEEL RUST.
[Image: chav.gif]
Reply
#18
Quote:stylin:
In this case stability is not important: if the registers are just integers, and they are equal, how can you tell which is which?
You can't tell the difference in the resulting output file, but you can when you're actually doing the sort. It makes it easier without having to worry about stability.
stylin:
Reply
#19
Well, I got an idea this morning that makes my entry very simple, just 130 lines of code.
I'm debugging it now, I hope it's fast enough.
Antoni
Reply
#20
Quote:The input files can be generated in place with the following programs:

I don't understand "can be". Why isn't this "must be"? Then we are all sorting the same exact file.

Mac

P.S. Also I did not understand someone else's question that seem to reduce to
163532
If I sort this to
123356
does it matter which 3 comes first?

Hunhh? Whoever it was: what are you talking about? 3=3. No such concept in my mind of which comes first.

Mac
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)