Create a function with the following specifications:
*It checks for the existence of any of a set of strings inside one position of a string.
*It takes this form:
INSTRMULTI%(startlocation%, stringarray$(), stringtosearch$)
1) startlocation% is the starting search location inside stringtosearch$.
2) stringarray$() are the strings to search.
3) Two values result:
INSTRMULTI%, the search location.
stringindexfound%, the index position of the first string found. Strings should be sorted alphabetically, A-Z, and on size, smallest first.
Optional:
Make stringindexfound%() an array, so that if there is more than one string found at the first location, the array gives all the index values of all the strings found.
*********
Important
*********
This wouldn't be much of a challenge if you only used INSTR and compared the results inside the function. That's inefficient. The function should not search the same character in a string twice or more!
*******
Prizes
*******
Satisfaction that you created a good time-saving function...
It's a FB challenge, is'nt?
For QB, the trivial case takes less than 1/18th second in my pc...In QB all the strings must fit into the 64K of DGROUP and can't have more than 32700 chars.
Code:
'by Antoni , for 1500 strings searched in a 31000 chars buffer
DECLARE FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
randomize timer
CONST bigslen = 31000
const arrsz=1500
dim i%,a%,b%,bigs$
bigs$ = SPACE$(bigslen)
REDIM ss(1 to arrsz) AS STRING
FOR i = 1 TO bigslen
MID$(bigs$, i, 1) = CHR$(RND * 96+32)
NEXT
PRINT "string built"
a% = bigslen - 10
FOR i% = 1 TO arrsz
ss(i%) = MID$(bigs$, RND * a%, rnd*15+4)
NEXT
PRINT "search strings built"
t!=timer
a% = instrmulti%(1, ss$(), bigs$, b%)
IF a% > 0 THEN
PRINT "First match found at index ";a%;" for string nr ";b%," : " ; ss(b%);" = "; MID$(bigs$, a%,LEN(ss$(b%)))
else
print "Search string not found"
eND IF
print "in"; timer-t!;"seconds"
sleep
END
FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
dim i%,a%,b%,a1%
a% = LEN(bigs$)
FOR i% = LBOUND(ss, 1) TO UBOUND(ss, 1)
a1% = INSTR(bigs$,ss(i%))
IF (a1% > 0) AND (a1% < a%) THEN a% = a1% :strindxfnd% = i%:
'print i%,a1%,a%
NEXT
IF a% = LEN(bigs$) THEN a% = 0
instrmulti% = a%
END FUNCTION
Second finding:
Bruteforce in FB is fast too. A 3,1 Mb string, searching for 1500 strings of 2000 to 2200 chars requires 1,4 seconds. FB implements INSTR using Boyer-Moore search...
Can someone do it faster? :???:
Code:
DECLARE FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
randomize timer
CONST bigslen = 3100000
const arrsz=1500
dim i%,a%,b%,bigs$
bigs$ = SPACE$(bigslen)
REDIM ss(1 to arrsz) AS STRING
FOR i = 1 TO bigslen
MID$(bigs$, i, 1) = CHR$(RND * 96+32)
NEXT
PRINT "string built"
a% = bigslen - 10
FOR i% = 1 TO arrsz
a%=rnd*200+2000
ss(i%) = MID$(bigs$, RND*(bigslen-a%), a%)
NEXT
PRINT "search strings built"
t!=timer
a% = instrmulti%(1, ss$(), bigs$, b%)
IF a% > 0 THEN
PRINT "First match found at index ";a%;" for string nr ";b%," : " ; ss(b%);" = "; MID$(bigs$, a%,LEN(ss$(b%)))
else
print "Search string not found"
eND IF
print "in"; timer-t!;"seconds"
sleep
END
FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
dim i%,a%,b%,a1%
a% = LEN(bigs$)
FOR i% = LBOUND(ss, 1) TO UBOUND(ss, 1)
a1% = INSTR(bigs$,ss(i%))
IF (a1% > 0) AND (a1% < a%) THEN a% = a1% :strindxfnd% = i%:
'print i%,a1%,a%
NEXT
IF a% = LEN(bigs$) THEN a% = 0
instrmulti% = a%
END FUNCTION
Yeah, an FB test would work better. Consider that you might want to search each string several times..
Brute force is ok, but I had in mind this:
Change the strings like so:
apple
pear
mud
apm
peu
pad
pr
l
e
Now search for the Nth character of the big string inside each of these new strings. N starts at 1. If you find anything, add N and search with the next string. Keep doing this until you reach the end of one of them. Abort if string positions are switched mid-way. (ie, you find mud but not the "app" part of apple)
Considerations:
*Input strings must be sorted biggest or smallest first, and alphabetically as well. It needs to be alphabetically sorted because of problems with false aborts, I think. (may not be the case, would need to be tested)
*The new string would need to be preserved somehow.
You provided all the ideas...where is the challenge?