Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
quicksort undo help.
#1
I had an interesting idea but first I would like your help on this:

How can I embed exact undo code in here? I was hoping that one could undo the exact quicksort steps with as little memory use as possible..

(I do NOT request an example of a linked array that holds the former position of each variable)

Also, it would be nice if there was a sub that actually performed the undo.

Code:
SUB qsort.integer.lowstart (array1() AS INTEGER, a.max%)
DIM g2(1 TO a.max%) AS INTEGER, h2(1 TO a.max%) AS INTEGER, i AS INTEGER, j AS INTEGER, r AS INTEGER, E AS INTEGER, g AS INTEGER, h AS INTEGER, k AS INTEGER
E = 1: g2(1) = 1: h2(1) = a.max%
f1: g = g2(E): h = h2(E)
f2: i = g: j = h: r = (g + h) \ 2: k = array1(r)
f3: IF array1(i) < k THEN i = i + 1: GOTO f3
f4: IF array1(j) > k THEN j = j - 1: GOTO f4
IF i <= j THEN SWAP array1(i), array1(j): i = i + 1: j = j - 1: IF i <= j THEN GOTO f3
IF j - g + i < h THEN
IF i < h THEN g2(E) = i: h2(E) = h: E = E + 1
h = j
ELSE
IF g < j THEN g2(E) = g: h2(E) = j: E = E + 1
g = i
END IF
IF g < h THEN GOTO f2 ELSE E = E - 1: IF E THEN GOTO f1
ERASE g2, h2
END SUB
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply
#2
Im not really sure I understand what you want to do, but a simple compact way of implementing an undo feature is to have a stack that has each operation pushed onto it as the quicksort runs. For example the stack would contain which two elements were swapped at each step, ie:
Code:
1, 5
3, 7
2, 9

To undo a step, pop the top element of the stack and swap the elements back. Why exactly do you want to undo individual steps in a quicksort run?
esus saves.... Passes to Moses, shoots, he scores!
Reply
#3
Well, I was hoping that it could be somehow a binary stack so that the conversion back is NLOG(N) bits (or rather how many times it takes to swap them)

I had an idea of using quicksort to create a compression:

1) sort everything
2) compress the sort by storing either number of bits added to each succeeding number, or by getting rid of duplicates, depending on the distribution of the numbers. Also have a way to un-sort things without taking up the same amount of space as it takes to store the unsorted numbers...
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply
#4
I think you will need to store at least the two indicies which are swapped at each step. You could of course compress the stack of numbers. You could lower the number of bits you use to store each element of a swap, ie if you are sorting 16 numbers then you only need 4 bits per element, which gives you one byte per swap.

Quote:Also have a way to un-sort things without taking up the same amount of space as it takes to store the unsorted numbers...

Why not just compress the unsorted numbers?
esus saves.... Passes to Moses, shoots, he scores!
Reply
#5
Quote: Why not just compress the unsorted numbers?

Sorted numbers compress much better than unsorted numbers for a multitude of reasons. For instance, assume that you had a few million numbers stored as unsigned integers in your sorted file. Now, unsigned integers are 0 to 65535, so there will likely be a whole range of them. If you get so lucky as to have a minimum difference of 7, 3, 2, or in the best case 1, you would then only need to use 4, 3, 2, or even 1 bit per integer.

Using the method I described in neozones a few days ago, you can further reduce the size.

Quote: which gives you one byte per swap.

One byte per swap? I'm confused. Are you calculating the mem requirements of same method or using a different one?

Now, since a quicksort is N in the best case (I think), you would be saving space if the sort was less than the average sort size, Nlog(N), assuming that the average sort required the same storage amount as the actual file.
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply
#6
Quote:Sorted numbers compress much better than unsorted numbers for a multitude of reasons.

Yes, but you want to store the unsorted numbers. The sorted numbers, plus the steps to swap them back to the unsorted numbers probably won't compress as much as the original unsorted numbers.

Quote:One byte per swap? I'm confused. Are you calculating the mem requirements of same method or using a different one?
Each swap consists of two elements which are the two indicies which are to be swapped. You could just store each element as, say a 32bit integer, but that would be wasteful for small sets of data. In the example I gave, for a set of 16 numbers, you only need 4bits to store each element in the swap, giving a total of one byte per swap. The actual memory requirements will of course vary depending on the size of the set to be sorted.

Quote:Now, unsigned integers are 0 to 65535, so there will likely be a whole range of them. If you get so lucky as to have a minimum difference of 7, 3, 2, or in the best case 1, you would then only need to use 4, 3, 2, or even 1 bit per integer.

Thats not a great assumption to make, in the worst case you would have a difference of 65535, where all of the numbers are either 0 or 65535. They would compress nicely (sorted or unsorted) using conventional compression methods, but using you difference of two consecutive numbers method you wouldn't achieve any compression at all.
esus saves.... Passes to Moses, shoots, he scores!
Reply
#7
Nononono, you don't understand. You can't have such large differences. The whole point is that a sorted array can be compressed much more than a non-sorted array.

***WARNING LONG POST AHEAD***

Example:

0 0 1 2 3 3 3 3 4 4 4 5 6 7 7 7... 255

If each number used 8 bits, and you have 1000 numbers, you'd have 8000 bits.

Now, suppose that you just take the first number:

0

...and then you just record by how much the next number changed:

0 1 1 0 0 0 ... 1

...and also record how many bits the change will cost us, based on the maximum change, which will take 8 bits as well.

So that's 16 bits, and then 998 bits: 1014 bits, or a bit more than 1/8th compression. The difference is huge!

---------------------------------

Now, if one were able to store the swap sequence efficiently, then one would get a very good compression. I don't think that storing the exact swap indexes is particularly efficient, though, although it could allow for some compression. In a quicksort, the range between the swap indexes is smaller than just a random range.

---------------------------------
---------------------------------

Regardless, though, I just came to the conclusion that simply storing the previous un-sorted index array is good enough for a test, provided that the total memory these indexes take up is smaller than the total memory the file takes up.

Let's take a realistic example: let's compress one meg, which is, I think, 8388608 (2^23) bits of data.

Let's have each number take up 32 bits. The actual array and the index array must have 262144 elements. (total bits/total-bits-per-number = 2^23/2^5 = 2^(23-5) = 2^18 = 262144)

The actual array has 32-bit numbers, and the index array has 18-bit numbers. The index array therefore takes up, as a certainty, 4718592 (18*262144) bits of memory.

Follow?

Now suppose the array is sorted.

--------------

Statistically, what is the average jump between the numbers of a "perfect" random array?
The range of each number is 0 to 4294967296 (2^32-1). The average change must be 2^32/2^18 = 2^14 = 16384. (total range/total amount of elements).

16384 takes up 14 bits.

So if we had this perfect "random" array, neglecting some other compression ideas for sorts that I had, you'd have to use 3670052 (64+14*(2^18-2)) bits.

8388644 bits now, and 8388608 bits before.
So we have therefore saved... absolutely nothing. We've actually gained 36 bits.

--------------

HOWEVER, that's absolutely OK! With the knowledge that it was a perfect array, we can assume that the first array was 0 and we have actually gained 0 bits! ('course we could have gained a lot more bits with that knoweledge)

--------------

It is true that we have gained nothing. HOWEVER, we can take advantage of non-random distribution and reduce the size by taking the difference of the difference, and so on! LADIES AND GENTLEMEN, this is the most important part!

The average of the differences was 16384. The average of the average of the differences is 0! ZERO! We can use one bit to represent that difference!

First, we have:

[0 (32-bit) , 16384, 16384....] = 3670034 bits.

NOW, in the second iteration, we have:

[0 (32-bit) , 16384 (32-bit), 0 (14-bit), 0, 0, 0, 0, 0, 0....] = bits.

That is: 262219 (64+14+1*(2^18-3)) bits. Now the compression ratio is:

(262219+4718592 = 4980811) / 8388608 = 0.59 = 41% saved!

----------------

ONE MORE CONSIDERATION:

The iteration could continue until no more bits could be saved. To figure out exactly how many iterations there were, 18 more bits (at least in this example) should be sacrificed.

----------------

CONFUSING? I think so!
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply
#8
blarg never mind.
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply
#9
O_o
Reply
#10
Hey Agamemnus...this is off-topic and all that, and don't take this the wrong way but...you sure do ask some weird questions Big Grin
I'd knock on wood, but my desk is particle board.
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)