Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
create a function...
#1
Create a function with the following specifications:

*It checks for the existence of any of a set of strings inside one position of a string.
*It takes this form:

INSTRMULTI%(startlocation%, stringarray$(), stringtosearch$)

1) startlocation% is the starting search location inside stringtosearch$.
2) stringarray$() are the strings to search.
3) Two values result:
INSTRMULTI%, the search location.
stringindexfound%, the index position of the first string found. Strings should be sorted alphabetically, A-Z, and on size, smallest first.

Optional:
Make stringindexfound%() an array, so that if there is more than one string found at the first location, the array gives all the index values of all the strings found.

*********
Important
*********

This wouldn't be much of a challenge if you only used INSTR and compared the results inside the function. That's inefficient. The function should not search the same character in a string twice or more!

*******
Prizes
*******

Satisfaction that you created a good time-saving function...
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply
#2
It's a FB challenge, is'nt?
For QB, the trivial case takes less than 1/18th second in my pc...In QB all the strings must fit into the 64K of DGROUP and can't have more than 32700 chars.
Code:
'by Antoni , for 1500 strings searched in a 31000 chars buffer
DECLARE FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
randomize timer
CONST bigslen = 31000
const arrsz=1500
dim i%,a%,b%,bigs$
bigs$ = SPACE$(bigslen)
REDIM ss(1 to arrsz) AS STRING

FOR i = 1 TO bigslen
MID$(bigs$, i, 1) = CHR$(RND * 96+32)
NEXT
PRINT "string built"
a% = bigslen - 10
FOR i% = 1 TO arrsz
   ss(i%) = MID$(bigs$, RND * a%, rnd*15+4)
NEXT
PRINT "search strings built"

t!=timer
a% = instrmulti%(1, ss$(), bigs$, b%)
IF a% > 0 THEN
  PRINT "First match found at index ";a%;" for string nr ";b%," : " ; ss(b%);" = "; MID$(bigs$, a%,LEN(ss$(b%)))
else
  print "Search string not found"
eND IF
print "in"; timer-t!;"seconds"
sleep
END

FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
dim i%,a%,b%,a1%
a% = LEN(bigs$)
FOR i% = LBOUND(ss, 1) TO UBOUND(ss, 1)
   a1% = INSTR(bigs$,ss(i%))
   IF (a1% > 0) AND (a1% < a%) THEN a% = a1% :strindxfnd% = i%:
   'print i%,a1%,a%
NEXT
IF a% = LEN(bigs$) THEN a% = 0
instrmulti% = a%
END FUNCTION
Antoni
Reply
#3
Second finding:
Bruteforce in FB is fast too. A 3,1 Mb string, searching for 1500 strings of 2000 to 2200 chars requires 1,4 seconds. FB implements INSTR using Boyer-Moore search...

Can someone do it faster? :???:

Code:
DECLARE FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
randomize timer
CONST bigslen = 3100000
const arrsz=1500
dim i%,a%,b%,bigs$
bigs$ = SPACE$(bigslen)
REDIM ss(1 to arrsz) AS STRING

FOR i = 1 TO bigslen
MID$(bigs$, i, 1) = CHR$(RND * 96+32)
NEXT
PRINT "string built"
a% = bigslen - 10
FOR i% = 1 TO arrsz
   a%=rnd*200+2000
   ss(i%) = MID$(bigs$, RND*(bigslen-a%), a%)
NEXT
PRINT "search strings built"

t!=timer
a% = instrmulti%(1, ss$(), bigs$, b%)
IF a% > 0 THEN
  PRINT "First match found at index ";a%;" for string nr ";b%," : " ; ss(b%);" = "; MID$(bigs$, a%,LEN(ss$(b%)))
else
  print "Search string not found"
eND IF
print "in"; timer-t!;"seconds"
sleep
END

FUNCTION instrmulti% (startloc%, ss() AS STRING, bigs$, strindxfnd%)
dim i%,a%,b%,a1%
a% = LEN(bigs$)
FOR i% = LBOUND(ss, 1) TO UBOUND(ss, 1)
   a1% = INSTR(bigs$,ss(i%))
   IF (a1% > 0) AND (a1% < a%) THEN a% = a1% :strindxfnd% = i%:
   'print i%,a1%,a%
NEXT
IF a% = LEN(bigs$) THEN a% = 0
instrmulti% = a%
END FUNCTION
Antoni
Reply
#4
Yeah, an FB test would work better. Consider that you might want to search each string several times..

Brute force is ok, but I had in mind this:
Change the strings like so:
apple
pear
mud

apm
peu
pad
pr
l
e

Now search for the Nth character of the big string inside each of these new strings. N starts at 1. If you find anything, add N and search with the next string. Keep doing this until you reach the end of one of them. Abort if string positions are switched mid-way. (ie, you find mud but not the "app" part of apple)

Considerations:
*Input strings must be sorted biggest or smallest first, and alphabetically as well. It needs to be alphabetically sorted because of problems with false aborts, I think. (may not be the case, would need to be tested)
*The new string would need to be preserved somehow.
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply
#5
You provided all the ideas...where is the challenge? Big Grin
Antoni
Reply
#6
Making it work.

:normal:
Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)