The QBasic Forum      Other Subforums, Links and Downloads
  << Previous Topic | Next Topic >>Return to Index  

Shuffle Sort algorithm - unique random numbers with no duplicates

March 23 2008 at 1:10 AM
Solitaire  (Login Solitaire1)
 - 
from IP address 24.90.213.75

DIM x AS INTEGER, y AS INTEGER, z AS INTEGER, E AS STRING
DIM mix AS INTEGER, temp AS INTEGER
DIM MyNum(20) AS INTEGER
CLS : PRINT "Random array of 20 number values without repeats"
PRINT "Enter to list numbers in a different order or Q to quit"

FOR x = 1 TO 20          'assign sequential value to array elements
    MyNum(x) = x
NEXT x
RANDOMIZE TIMER
DO
    FOR y = 1 TO 20          'swap array positions randomly
        mix = INT(RND * 20) + 1
        temp = MyNum(mix)
        MyNum(mix) = MyNum(y)
        MyNum(y) = temp
    NEXT y
    PRINT : PRINT
    FOR z = 1 TO 20          'print array value in new order
        PRINT MyNum(z);
    NEXT z
    E$ = INPUT$(1)
LOOP UNTIL UCASE$(E$) = "Q"
SYSTEM


 
 Respond to this message   
AuthorReply
Solitaire
(Login Solitaire1)
 - 
24.90.213.75

Select & discard algorithm for unique random numbers

March 23 2008, 1:16 AM 

Put the numbers into a hat, mix them up, and pull them out one by one (without looking), discarding each used number.

There are various algorithms for implementing such a scenario. Most involve use of an array. The following techniques are for creating a set of unique random selections from a range without duplicates:


1- One way is a random mix (reordering) of the array indexes and selecting the numbers in order of the rearranged array index. This code is shown in the above post.

2- Another is to test with each new selection whether that selection was among the discards and trying again. This can get very inefficient if you start with a large number and come down to the last few numbers. This technique discards used numbers, makes a random selection from among all the numbers in the range again, and tests the used numbers before keeping the selection.

3- Yet another way might be to make a random selection anywhere within an array of sequential numbers (like pulling numbers out of a hat), then removing that value from the array and moving down all the remaining values past that index to eliminate the gap. The last array index would then be emptied and eliminated by decrementing the range with each new selection. This technique is the opposite of #2 -- it discards used numbers and makes new selections from the remaining unused numbers until only one is left.


I decided to come up with code to implement the third method, which I don't believe has been done before.


This program simulates the action of pulling sequentially numbered tickets randomly out of a hat. The result is a set of unsorted unique numbers without duplicates within the range.

My usual technique is to use the shuffle sort algorithm, which is the most efficient. Some people use the least efficient method of testing the discards to make sure it doesn't duplicate any of the selected numbers. This algorithm is the opposite. It discards selected numbers and picks a new number from the ones remaining. It works by selecting a random index which will contain the selected number in the array, then moves the array values down to overwrite the selected index, and decreases the range with each repeat.

This version of the program allows repeats of a user-defined range, and also allows a new range to be defined, using three nested DO loops. Here is the code:

====================================================

'Set of unique random numbers in a range without duplicates
'Simulate pulling number out of a hat; remove selected number from mix
DIM range AS INTEGER, ran AS INTEGER, x AS INTEGER
DIM index AS INTEGER, rng AS STRING, ans AS STRING
RANDOMIZE TIMER
CLS
PRINT "Press any key to repeat, Tab for a new range; Esc to quit..."
DO
    INPUT "Enter a range up to 22:  ", rng$
    ran = CINT(VAL(rng$))
    IF ran > 20 THEN ran = 22
    IF ran < 2 THEN ran = 2
    REDIM arry(ran) AS INTEGER
    PRINT
    DO
        range = ran
        FOR x = 1 TO range      'assign sequential values to array
            arry(x) = x
        NEXT x
        DO
            index = INT(RND * range + 1)    'select random index
            PRINT arry(index);              'selected number
            'discard arry(index); move array down to overwrite selected index
            FOR x = index TO range - 1
                arry(x) = arry(x + 1)
            NEXT x
            range = range - 1       'decrease array range after discard
        LOOP UNTIL range = 0        'no more numbers in hat
        PRINT : PRINT
        ans$ = INPUT$(1)
    LOOP UNTIL ans$ = CHR$(27) OR ans$ = CHR$(9)
LOOP UNTIL ans$ = CHR$(27)
SYSTEM




 
 Respond to this message   
Solitaire
(Login Solitaire1)
 - 
24.90.213.75

Improved version of Random Select & Discard algorithm

March 23 2008, 4:23 PM 

Instead of printing each selected number as it comes up and discarding it, this version saves the number to the end of the array, which is not used in the next repeat as the range is decremented. When the range gets down to zero, the random selected numbers are now in the array in the reverse order they were selected. This order can then be printed out all at once in another FOR loop. I also improved the user interface.


===================================================================
'Set of unique random numbers in a range without duplicates
'Simulate pulling number out of a hat; remove selected number from mix
'This version saves selected numbers to end of array in random order
'Saved numbers keep moving down as range is diminished with each repeat
DIM range AS INTEGER, ran AS INTEGER, x AS INTEGER, selnum AS INTEGER
DIM index AS INTEGER, rng AS STRING, ans AS STRING
DIM man AS STRING, many AS INTEGER
RANDOMIZE TIMER
DO
    CLS
    PRINT "Press any key to repeat, Tab for a new range; Esc to quit..."
    PRINT : INPUT ; "Enter a range up to 22:  ", rng$
    ran = CINT(VAL(rng$))
    IF ran > 20 THEN ran = 22 : LOCATE , 26 : PRINT "22     ";
    IF ran <= 0 THEN ran = 12 : LOCATE , 26 : PRINT "12     ";
    PRINT : INPUT ; "How many selections? ", man$;
    many = CINT(VAL(man$))
    IF many <= 0 OR many > ran THEN many = ran : LOCATE , 22 : PRINT ran; "  ";
    REDIM arry(ran) AS INTEGER
    DO
        range = ran
        FOR x = 1 TO range      'assign sequential values to array
            arry(x) = x
        NEXT x
        DO
            index = INT(RND * range + 1)    'select random index
            selnum = arry(index)            'selected number
            'move array down to overwrite arry(index)
            FOR x = index TO range - 1
                arry(x) = arry(x + 1)
            NEXT x
            arry(range) = selnum    'save selected number to end of array
            range = range - 1       'decrease array range after selection
        LOOP UNTIL range = 0        'no more numbers in hat
        PRINT : PRINT
        FOR x = 1 TO many            'print list of random numbers
            PRINT arry(x);
        NEXT x
        ans$ = INPUT$(1)
    LOOP UNTIL ans$ = CHR$(27) OR ans$ = CHR$(9)
LOOP UNTIL ans$ = CHR$(27)
SYSTEM

====================================================================
Note: This version saves the selected numbers in reverse order of selection, as it adds that number to the end of the current range, which diminishes with each repeat. In order to save selection in same order, you must add number to end of the full array and move the entire array down with each repetition. This would be less efficient than the above version, which requires fewer repetitions as the array diminishes.

To alter the program to use the full array, simply replace these two lines:

FOR x = index TO range - 1

to: FOR x = index TO ran - 1


and

arry(range) = selnum

to: arry(ran) = selnum

However, it really makes no difference in which order the selections are presented, since they are still random and unique.


    
This message has been edited by Solitaire1 from IP address 24.90.213.75 on Mar 26, 2008 7:43 AM
This message has been edited by Solitaire1 from IP address 24.90.213.75 on Mar 24, 2008 7:24 PM
This message has been edited by Solitaire1 from IP address 24.90.213.75 on Mar 24, 2008 7:23 PM
This message has been edited by Solitaire1 from IP address 24.90.213.75 on Mar 24, 2008 7:17 PM
This message has been edited by Solitaire1 from IP address 24.90.213.75 on Mar 24, 2008 6:54 PM
This message has been edited by Solitaire1 from IP address 24.90.213.75 on Mar 23, 2008 4:50 PM
This message has been edited by Solitaire1 from IP address 24.90.213.75 on Mar 23, 2008 4:28 PM


 
 Respond to this message   
Solitaire
(Login Solitaire1)
 - 
24.90.213.75

Compare this with the Shuffle Sort

March 30 2008, 11:31 PM 

which uses the swap routine and is really the most efficient way to come up with a complete set of unique random numbers without duplicates.

I wanted to emulate removing the selected random numbers, as if pulling a ticket out of a hat until there was none left. I could also have put the selected numbers back into another array and diminished the original array (with a variation of ReDim Preserve, which isn't available in QB 1.1 or 4.5).

With the Shuffle Sort, all numbers need to be swapped even if you only need to select a few out of the range. However, by using the Select & Discard algorithm, you could stop after just a few selections and leave the remaining array as it was, reducing the number of repeats. The original version does that by simply printing the selected number as it comes up, without saving it back to the end of the array.

The whole point of this exercise is just to explore the various possibilities.

 
 Respond to this message   
MONTREALER
(no login)
70.49.200.215

For Example

April 9 2008, 4:05 PM 

This creates a reverse order for the picks, at the end of the array.
But since you know ahead of time how many selections you are making,
you could use the pointer variable accordingly, and move the selections at the start of the array.

=============================================================================

picks% = 10
total% = 40
toppointer% = total%
DIM numbers%(total%)

FOR l% = 1 TO total%: numbers%(l%) = l%: NEXT

PRINT
PRINT
RANDOMIZE TIMER

FOR l% = 1 TO picks%
choice% = INT(RND * toppointer%) + 1
PRINT TAB(6); numbers%(choice%); TAB(10); "goes to position:"; toppointer%
SWAP numbers%(choice%), numbers%(toppointer%)
toppointer% = toppointer% - 1
NEXT

PRINT
PRINT

PRINT "Numbers selected, in sequence:"
FOR l% = total% TO toppointer% + 1 STEP -1
PRINT numbers%(l%);
NEXT

PRINT
PRINT

PRINT "Numbers not selected:"
FOR l% = 1 TO toppointer%
PRINT numbers%(l%);
NEXT

 
 Respond to this message   
Solitaire
(Login Solitaire1)
 - 
74.73.58.57

Interesting variation of partial shuffle sort

April 9 2008, 7:39 PM 

I revised Montrealer's code a bit and added comments:


'by Montrealer  (cleaned up & variables renamed)
DIM picks AS INTEGER, range AS INTEGER, last AS INTEGER
DIM index AS INTEGER, x AS INTEGER
CLS
RANDOMIZE TIMER
range = 40                              'entire range
picks = 10                              'number of selections
DIM numbers(range) AS INTEGER

FOR x = 1 TO range                      'assign sequential values to array
    numbers(x) = x
NEXT x

PRINT "Partial Shuffle Sort:"
PRINT
last = range                            'range will be temporarily diminished
FOR x = 1 TO picks                      'only shuffles up to the number of selections
    index = INT(RND * last) + 1         'random index up to diminished range
    PRINT TAB(10); numbers(index), "goes to position:"; last
    SWAP numbers(index), numbers(last)  'swaps value with end of diminished range
    last = last - 1
NEXT x

PRINT : PRINT "Numbers selected, in sequence:"
FOR x = range TO last + 1 STEP -1
    PRINT numbers(x);
NEXT x

PRINT : PRINT
PRINT "Numbers not selected:"
FOR x = 1 TO last
    PRINT numbers(x);
NEXT x
SYSTEM


 
 Respond to this message   
MONTREALER
(no login)
65.92.177.163

comparison

April 10 2008, 8:25 AM 

The strongest argument for using this method, rather than the traditional

- generate an element
- check if it was previously generated
- add to the selected elements if not previously chosen
- repeat until done

is that this method closer simulated the 'real world' application of

- pick an element at random from a pool of elements
- remove it from the pool
- repeat until done

The traditional method works just as well, of course, but only simulates the application, while the 'remove it from the pool' method is just more human-logical, rather than computer-logical.

In the end, when revising an old program, it's so much easier to figure out exactly what is going on, when it's doing something that's oriented to human-thinking, rather than computer-oriented conception.

 
 Respond to this message   
Solitaire
(Login Solitaire1)
 - 
74.73.58.57

Variation of Shuffle Sort algorithm

April 10 2008, 8:00 PM 

I rearranged the sort so it saves the selected number in ascending order of the array index starting from the beginning, instead of in reverse which started from the end.

You may notice that since only the numbers up to "picks" is shuffled, the numbers that weren't selected are mostly still in sequential order, except for the randomly selected and swapped indexes. In any case, there are no duplicates and every number is included in the entire range. When a small number of selections is needed among a large range, this modified algorithm is more efficient than the original Shuffle Sort, which repeats as many times as the entire range.

Actually, we've come full circle. The only real difference between this modified version and the original version of the Shuffle Sort is that the second FOR loop only goes up to the number of selections ("picks") instead of the full range. The demo version started by Montrealer shows how the mechanism works step by step.

===============================================================================
'Shuffle Sort originally posted by Solitaire.
'Demonstration version started by Montrealer  (Modified and revised by Solitaire)
'Selections are now placed in numbered order of array indexes, not in reverse order
'Note: Same index may be selected randomly more than once, but after being
'      swapped, it will now contain a different value.

DIM picks AS INTEGER, range AS INTEGER
DIM index AS INTEGER, x AS INTEGER
CLS
RANDOMIZE TIMER
range = 40                              'entire range
picks = 10                              'number of selections
DIM numbers(range) AS INTEGER

FOR x = 1 TO range                      'assign sequential values to array
    numbers(x) = x
NEXT x

PRINT "Partial Shuffle Sort:"
PRINT
FOR x = 1 TO picks                      'only shuffles up to the number of selections
    index = INT(RND * range) + 1        'value at random index will be swapped
    PRINT TAB(10); numbers(index), "goes to position:"; x
    SWAP numbers(index), numbers(x)     'swaps value in order of array index
NEXT x

PRINT : PRINT "Numbers selected, in sequence:"
FOR x = 1 TO picks
    PRINT numbers(x);
NEXT x

PRINT : PRINT : PRINT "Numbers not selected:"
FOR x = picks + 1 TO range
    PRINT numbers(x);
NEXT x
SYSTEM



    
This message has been edited by Solitaire1 from IP address 74.73.58.57 on Apr 10, 2008 9:28 PM
This message has been edited by Solitaire1 from IP address 74.73.58.57 on Apr 10, 2008 9:18 PM
This message has been edited by Solitaire1 from IP address 74.73.58.57 on Apr 10, 2008 8:14 PM


 
 Respond to this message   
MONTREALER
(no login)
70.48.99.244

Did you notice...

April 11 2008, 8:25 AM 

That 'IF - THEN' is not used in this program ?

 
 Respond to this message   
Solitaire
(Login Solitaire1)
 - 
74.73.58.57

Notes on the modified shuffle sort:

April 11 2008, 11:37 AM 

The advantages of a shortened FOR block for the number of shuffle repetitions is negligible unless you have a range of several thousand and only a few selections. Suppose you suddenly have one more "prize" to award to a random winner? Then you will need to add something like this on to the code (using an IF block if one more pick is required):

picks = picks + 1
index = INT(RND * range) + 1
SWAP numbers(indx), numbers(picks)

or you could complete the remaining selections by including another FOR loop going to the range:

FOR x = picks + 1 TO range
   'same code as in above post

or having the user enter the number of additional selections needed:

FOR x = picks + 1 TO picks + addon
'same code as in above post

It may make more sense to simply include a larger value to "picks" in the first place rather than resorting to those patches. For a small range, I would shuffle the entire range. For a very large range, I would add on a proportionate percentage of extra "picks" to the shuffle sort FOR loop. (That would require an IF block. You would also need to select an arbitrary percentage to use and come up with the formula for applying it.)



    
This message has been edited by Solitaire1 from IP address 74.73.58.57 on Apr 11, 2008 11:23 PM


 
 Respond to this message   
qbguy
(no login)
75.9.218.72

Randomly Permute an Array

March 23 2008, 1:13 PM 

DECLARE SUB SHUFFLE (A!())
DIM A(1 TO 100)
FOR I = 1 TO 100
A(I) = I
NEXT
CALL SHUFFLE(A())
FOR I = 1 TO 100
PRINT A(I);
NEXT

SUB SHUFFLE (A())
FOR n = UBOUND(A) TO LBOUND(A) + 1 STEP -1
k = INT(RND * (UBOUND(A) - LBOUND(A))) + LBOUND(A)
SWAP A(n - 1), A(k)
NEXT
END SUB

 
 Respond to this message   
Current Topic - Shuffle Sort algorithm - unique random numbers with no duplicates
  << Previous Topic | Next Topic >>Return to Index