10-12-2005, 05:44 AM
I'm not sure if this method has been used before, but it occurred to me as an easy way to do encryption in QBasic. Suppose you want to encrypt a text file, and for ease of explanation, let's assume that the text contains only lowercase letters, digits, and spaces. First, break the plaintext up into groups of 5 characters. Now instead of looking at each group of 5 as text, interpret the group as a 5-digit number with a base (radix) of 37. I say 37 because our character set has 37 members: 0123456789_abcdefghijklmnopqrstuvwxyz. (I'm substituting an underscore for a space.)
If you are familiar with hexadecimal numbers, you know that A-F represent the numbers 10-15. Base 37 works the same way with more digits: in the character set above, "_" represents 10, "a" represents 11, and "z" represents 36. The number 37 is written "10" (the second place from the right is the 37s place), and 100 (base 10) is 2p (base 37), since 2*37 + 26 = 100 and "p" is the digit for 26.
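The digit values described above can be checked with a few lines of QBasic. This is only a minimal sketch for writing a decimal number in base 37; the variable names (n&, d, r$) are illustrative and are not taken from the demo program.

```basic
' Write a decimal number in base 37 using the post's character set.
rx$ = "0123456789_abcdefghijklmnopqrstuvwxyz"
n& = 100
r$ = ""
DO
  d = n& MOD 37                   ' value of the next digit, right to left
  r$ = MID$(rx$, d + 1, 1) + r$   ' digit value d sits at position d + 1
  n& = n& \ 37                    ' integer division drops that digit
LOOP WHILE n& > 0
PRINT r$                          ' prints 2p
```

Changing n& to 37 prints 10, and 36 prints z.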
Now suppose that instead of converting each 5-character group to base 10, we convert it from base 37 to base 38: read the group as a base-37 number, then write that same value out in base 38. To do so, we need one more digit character to the right of "z". I used the vertical bar (|) for this purpose in the demo program listed below. The program lets you encode and decode text strings entered at the keyboard. It ignores characters that are not in the character set listed above (spaces are fine; they become underscores).
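The base 37 to base 38 step can be tried on a short group. What follows is a simplified sketch under stated assumptions, not the demo program: it uses the plain 37-character set plus "|", and it leaves out the "~" digit and the rotation. The names f$ and g$ are just illustrative.

```basic
' Read a group as base 37, then write the same value in base 38.
DEFLNG T
rx$ = "0123456789_abcdefghijklmnopqrstuvwxyz"
f$ = "zz"

' Base 37 -> value, left to right
t = 0
FOR a = 1 TO LEN(f$)
  t = t * 37 + INSTR(rx$, MID$(f$, a, 1)) - 1
NEXT a

' Value -> base 38; the extra digit "|" is worth 37
rxd$ = rx$ + "|"
g$ = ""
DO
  g$ = MID$(rxd$, (t MOD 38) + 1, 1) + g$
  t = t \ 38
LOOP WHILE t > 0
PRINT f$; " -> "; g$              ' prints zz -> z0
```

Here "zz" is 36*37 + 36 = 1368, which is exactly 36*38, so it comes out as "z0" in base 38.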
Using this method, "qbasic" encrypts to "njk_fc", "aaaaa" encrypts to "_37da", and the phrase "train tracks track trains" becomes "qmtzh9k4z_ibi409vit0o0__2". Decoding returns the original strings.
This is symmetric encryption. The decryption key is simply the character set. I used it in sequence here (012...xyz), but in practice it can be scrambled in any way as long as users at both ends have it.
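One way the character set might be scrambled is a seeded shuffle. The sketch below is an assumption of mine, not part of the demo program: the seed 2005 is a made-up stand-in for whatever secret both ends agree on, and it relies on both ends running the same interpreter so that RND produces the same sequence. RND is not designed for cryptographic use.

```basic
' Shuffle the character set with a Fisher-Yates pass.
rx$ = "0123456789_abcdefghijklmnopqrstuvwxyz"
RANDOMIZE 2005                    ' hypothetical shared seed
FOR i = LEN(rx$) TO 2 STEP -1
  j = INT(RND * i) + 1            ' random position from 1 to i
  s$ = MID$(rx$, i, 1)            ' swap the characters at i and j
  MID$(rx$, i, 1) = MID$(rx$, j, 1)
  MID$(rx$, j, 1) = s$
NEXT i
PRINT rx$                         ' same 37 characters, new order
```

Because the shuffle only swaps characters, the result is still a valid 37-member set, and a character's digit value is still its position minus one.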
There are a few refinements in the demo program:
1.) I insert ~ as the leftmost (0 value) character. This eliminates the problem of leading zeros in the plaintext: a leading zero-value digit does not change a number, so it would be lost in conversion, but ~ never appears in the plaintext. So with this added character we're actually converting base 38 to base 39.
2.) Infrequently a 5-digit number will encode to a 4-digit result. This would throw off the 5-character grouping when decoding, so I pad the left side with ~ if the result is less than 5 digits.
3.) If the same 5-character group appears more than once in the plaintext, we wouldn't want it to be encoded the same way each time. To avoid this, I rotate the character set one place to the right after each conversion.
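The effect of refinements 1 and 3 on the digit values can be seen in isolation. This is a small sketch that only looks up values and does no conversion; the loop counter i is illustrative.

```basic
rx$ = "0123456789_abcdefghijklmnopqrstuvwxyz"

' Refinement 1: with "~" in front, the character "0" is no longer worth 0.
PRINT INSTR(rx$, "0") - 1         ' prints 0
PRINT INSTR("~" + rx$, "0") - 1   ' prints 1

' Refinement 3: rotate right after each group, so every value shifts.
FOR i = 1 TO 3
  PRINT "a is worth"; INSTR("~" + rx$, "a") - 1   ' 12, then 13, then 14
  rx$ = RIGHT$(rx$, 1) + LEFT$(rx$, LEN(rx$) - 1)
NEXT i
```

Since "a" is worth a different amount for each group, a repeated group such as "aaaaa" gets a different number, and so a different encoding, each time it appears.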
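Expanding the demo program into a usable one might include:
-- Increase the size of the character set to include punctuation, etc. I believe the practical limit is 73, counting ~ and |, because every 5-digit value has to fit in a QBasic LONG (maximum 2,147,483,647): 73^5 = 2,073,071,593 fits, but 74^5 = 2,219,006,624 does not.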
-- Have the program read and write to files instead of getting input from the keyboard (a must).
-- Use a password routine to scramble the character set. A set with 71 characters would have over 8*10^101 permutations.
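The size limit in the first item can be checked by arithmetic. The sketch below assumes 5-character groups and a LONG total, as in the demo program. It multiplies in double precision so the check itself cannot overflow; the names lim# and p# are illustrative.

```basic
' Every 5-digit number in base b is below b ^ 5.
lim# = 2147483647#                ' largest value a LONG can hold
FOR b = 73 TO 74
  p# = 1
  FOR k = 1 TO 5
    p# = p# * b                   ' exact for integers of this size
  NEXT k
  IF p# <= lim# THEN
    PRINT b; "^5 ="; p#; "fits in a LONG"
  ELSE
    PRINT b; "^5 ="; p#; "is too big for a LONG"
  END IF
NEXT b
```

With ~ and | added, the output base is the size of the plaintext set plus 2, which is why a 71-character set lands exactly on base 73.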
I really don't know what constitutes strong encryption, and I don't claim that this is anything of the sort. I just think that it's a relatively easy way to encrypt text messages and an interesting diversion for those who like numbers and programming.
DEFLNG T
CLS
DO
  PRINT
  ' Base character set; "_" stands in for a space.
  rx$ = "0123456789_abcdefghijklmnopqrstuvwxyz"
  ans$ = "": dnr$ = "": t = 0
  PRINT "Choose: encode(e), decode(d), quit(q)"
  DO
    i$ = LCASE$(INKEY$)
    IF i$ = "q" THEN END
  LOOP WHILE i$ <> "e" AND i$ <> "d"
  IF i$ = "e" THEN
    bai = 38: bao = 39            ' input base, output base
    INPUT "Text to encode"; dn$
    dn$ = LCASE$(dn$)
    ' Keep only characters in the set; spaces become underscores.
    FOR p = 1 TO LEN(dn$)
      p$ = MID$(dn$, p, 1)
      IF p$ = CHR$(32) THEN p$ = CHR$(95)
      IF INSTR(rx$, p$) THEN dnr$ = dnr$ + p$
    NEXT p
  END IF
  IF i$ = "d" THEN
    bai = 39: bao = 38
    INPUT "Text to decode"; dnr$
  END IF
  ' Convert one 5-character group at a time.
  FOR p = 1 TO LEN(dnr$) STEP 5
    f$ = MID$(dnr$, p, 5)
    GOSUB convtodec
    GOSUB dectox
    ' Rotate the character set one place to the right.
    rx$ = RIGHT$(rx$, 1) + LEFT$(rx$, 36)
  NEXT p
  PRINT dnr$ + " = "; ans$
LOOP

REM convtodec converts the input fragment (f$) to base 10 (t)
convtodec:
  ' "~" is the 0 digit; "|" only occurs in encoded text.
  IF i$ = "d" THEN rxe$ = "~" + rx$ + "|" ELSE rxe$ = "~" + rx$
  FOR a = LEN(f$) TO 1 STEP -1
    k = LEN(f$) - a               ' place: 0 for the rightmost character
    b$ = MID$(f$, a, 1)
    c = INSTR(rxe$, b$)
    IF c = 0 THEN PRINT : PRINT b$; " is not in character set!": END
    t = t + (bai ^ k) * (c - 1)
  NEXT
RETURN

REM dectox converts the base 10 number to the output base
dectox:
  IF i$ = "e" THEN rxd$ = "~" + rx$ + "|" ELSE rxd$ = "~" + rx$
  g$ = ""                         ' digits for this group only
  n = FIX(LOG(t) / LOG(bao))      ' highest power of bao needed
  FOR a = n TO 0 STEP -1
    index = INT(t / bao ^ a) + 1
    g$ = g$ + MID$(rxd$, index, 1)
    t = t MOD (bao ^ a)           ' t is back to 0 after the last digit
  NEXT a
  ' Pad each full encoded group to 5 digits. The padding used to test
  ' LEN(ans$), so it could only ever apply to the first group.
  IF i$ = "e" AND LEN(f$) = 5 AND LEN(g$) < 5 THEN g$ = STRING$(5 - LEN(g$), "~") + g$
  ans$ = ans$ + g$
RETURN