Posts: 3,368
Threads: 195
Joined: Jan 2003
Create a function with the following specifications:
*It checks for the existence of any of a set of strings inside one position of a string.
*It takes this form:
INSTRMULTI%(startlocation%, stringarray$(), stringtosearch$)
1) startlocation% is the starting search location inside stringtosearch$.
2) stringarray$() are the strings to search.
3) Two values result:
INSTRMULTI%, the search location.
stringindexfound%, the index position of the first string found. Strings should be sorted alphabetically, A-Z, and on size, smallest first.
Optional:
Make stringindexfound%() an array, so that if there is more than one string found at the first location, the array gives all the index values of all the strings found.
*********
Important
*********
This wouldn't be much of a challenge if you only used INSTR and compared the results inside the function. That's inefficient. The function should not search the same character in a string twice or more!
*******
Prizes
*******
Satisfaction that you created a good time-saving function...
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."
Visit www.neobasic.net to see rubbish in all its finest.
Posts: 1,407
Threads: 117
Joined: Dec 2002
It's a FB challenge, is'nt?
For QB, the trivial case takes less than 1/18th second in my pc...In QB all the strings must fit into the 64K of DGROUP and can't have more than 32700 chars.
Code: 'by Antoni , for 1500 strings searched in a 31000 chars buffer
DECLARE FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
randomize timer
CONST bigslen = 31000
const arrsz=1500
dim i%,a%,b%,bigs$
bigs$ = SPACE$(bigslen)
REDIM ss(1 to arrsz) AS STRING
FOR i = 1 TO bigslen
MID$(bigs$, i, 1) = CHR$(RND * 96+32)
NEXT
PRINT "string built"
a% = bigslen - 10
FOR i% = 1 TO arrsz
ss(i%) = MID$(bigs$, RND * a%, rnd*15+4)
NEXT
PRINT "search strings built"
t!=timer
a% = instrmulti%(1, ss$(), bigs$, b%)
IF a% > 0 THEN
PRINT "First match found at index ";a%;" for string nr ";b%," : " ; ss(b%);" = "; MID$(bigs$, a%,LEN(ss$(b%)))
else
print "Search string not found"
eND IF
print "in"; timer-t!;"seconds"
sleep
END
FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
dim i%,a%,b%,a1%
a% = LEN(bigs$)
FOR i% = LBOUND(ss, 1) TO UBOUND(ss, 1)
a1% = INSTR(bigs$,ss(i%))
IF (a1% > 0) AND (a1% < a%) THEN a% = a1% :strindxfnd% = i%:
'print i%,a1%,a%
NEXT
IF a% = LEN(bigs$) THEN a% = 0
instrmulti% = a%
END FUNCTION
Antoni
Posts: 1,407
Threads: 117
Joined: Dec 2002
Second finding:
Bruteforce in FB is fast too. A 3,1 Mb string, searching for 1500 strings of 2000 to 2200 chars requires 1,4 seconds. FB implements INSTR using Boyer-Moore search...
Can someone do it faster? :???:
Code: DECLARE FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
randomize timer
CONST bigslen = 3100000
const arrsz=1500
dim i%,a%,b%,bigs$
bigs$ = SPACE$(bigslen)
REDIM ss(1 to arrsz) AS STRING
FOR i = 1 TO bigslen
MID$(bigs$, i, 1) = CHR$(RND * 96+32)
NEXT
PRINT "string built"
a% = bigslen - 10
FOR i% = 1 TO arrsz
a%=rnd*200+2000
ss(i%) = MID$(bigs$, RND*(bigslen-a%), a%)
NEXT
PRINT "search strings built"
t!=timer
a% = instrmulti%(1, ss$(), bigs$, b%)
IF a% > 0 THEN
PRINT "First match found at index ";a%;" for string nr ";b%," : " ; ss(b%);" = "; MID$(bigs$, a%,LEN(ss$(b%)))
else
print "Search string not found"
eND IF
print "in"; timer-t!;"seconds"
sleep
END
FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
dim i%,a%,b%,a1%
a% = LEN(bigs$)
FOR i% = LBOUND(ss, 1) TO UBOUND(ss, 1)
a1% = INSTR(bigs$,ss(i%))
IF (a1% > 0) AND (a1% < a%) THEN a% = a1% :strindxfnd% = i%:
'print i%,a1%,a%
NEXT
IF a% = LEN(bigs$) THEN a% = 0
instrmulti% = a%
END FUNCTION
Antoni
Posts: 3,368
Threads: 195
Joined: Jan 2003
Yeah, an FB test would work better. Consider that you might want to search each string several times..
Brute force is ok, but I had in mind this:
Change the strings like so:
apple
pear
mud
apm
peu
pad
pr
l
e
Now search for the Nth character of the big string inside each of these new strings. N starts at 1. If you find anything, add N and search with the next string. Keep doing this until you reach the end of one of them. Abort if string positions are switched mid-way. (ie, you find mud but not the "app" part of apple)
Considerations:
*Input strings must be sorted biggest or smallest first, and alphabetically as well. It needs to be alphabetically sorted because of problems with false aborts, I think. (may not be the case, would need to be tested)
*The new string would need to be preserved somehow.
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."
Visit www.neobasic.net to see rubbish in all its finest.
Posts: 1,407
Threads: 117
Joined: Dec 2002
You provided all the ideas...where is the challenge?
Antoni
Posts: 3,368
Threads: 195
Joined: Jan 2003
Making it work.
:normal:
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."
Visit www.neobasic.net to see rubbish in all its finest.
|