Challenge: 3-way merge: - Printable Version +- Qbasicnews.com (http://qbasicnews.com/newforum) +-- Forum: QbasicNews.Com (http://qbasicnews.com/newforum/forum-3.html) +--- Forum: Challenges (http://qbasicnews.com/newforum/forum-10.html) +--- Thread: Challenge: 3-way merge: (/thread-1400.html) |
Challenge: 3-way merge: - Moneo - 07-04-2003 3-WAY MERGE PROGRAM PURPOSE: Write a program to merge 3 input text files, by key, and generate a merged output text file. GIVEN: * There are 3 input text files: AAA, BBB, and CCC. * Let's call the output file DDD. * The 3 files are always present and contain at least 1 record. * Each file should be in sequence within itself. * The records are variable length, minimum 4 bytes, maximum 100. * The key is in the first 4 bytes of each record. * The key is strictly numeric, unsigned with leading zeros. * The key can have values from 0000 to 9999. * Records with duplicate keys can occur. * Therefore, the output file could have over 30,000 records. TIP: In order not to have to check the sequence of each input file, just check the sequence of the output file. If it's out of sequence, then (1) one of the input files was out of sequence, or (2) your program is not merging correctly. NOTE: This is a problem that has appeared on programming aptitude tests. It is not as simple as it seems. I suggest you do a flowchart first. It wil take about 70-100 lines of code, without comments. Please debug your code before posting. I will prepare some sample input files on my machine for testing your posted programs. Good luck! ***** Challenge: 3-way merge: - Antoni Gual - 07-05-2003 1.-The records are variable length? How are they separed? CR+LF? 2.-We can suppose the records in the input files have been sorted previously? Challenge: 3-way merge: - Moneo - 07-05-2003 Yes, Antoni, the records end with CR LF. You can "suppose" that the records on each input file have been sorted, but you should sequence check them anyway. I ALWAYS check it to make sure. Otherwise, like I said in some other post, my program looks like it failed when the output comes out unsorted. In this particular program, I would only check the sequence of the output. ***** Challenge: 3-way merge: - Moneo - 07-09-2003 Well it looks like nobody jumped in with a solution. You probably figured out that this is a tricky program. Let me give you a few tips that really simplify the logic of this program: COMPARE CONTROL INDICATOR: ---------------------------------------- I call this NEXTLOW. This is a integer whose bits indicate the results of comparing the keys of the 3 files (AAA,BBB,CCC). Bit 1 of NEXTLOW corresponds to file AAA. Bit 2 of NEXTLOW corresponds to file BBB. Bit 4 of NEXTLOW corresponds to file CCC. As a result of the compares, you OR the corresponding bit or bits into NEXTLOW based on which file key or keys are low. Then: NEXTLOW is 1 when key of file AAA compared the lowest. NEXTLOW is 2 when key of file BBB compared the lowest. NEXTLOW is 3 when keys of AAA & BBB compared the lowest. NEXTLOW is 4 when key of file CCC compared the lowest. NEXTLOW is 5 when keys of AAA & CCC compared the lowest. NEXTLOW is 6 when keys of BBB & CCC compared the lowest. NEXTLOW is 7 when keys of AAA, BBB & CCC compared equal. When you're done with the compares and setting up NEXTLOW, which is about 16 lines of code, the rest of the program is trivial because the NEXTLOW indicator tells you what you have to do. Example: If NEXTLOW IS 5, you have to write the records from file AAA and CCC to the output, and then you have to read new records from AAA and CCC, and go back to the compare logic again. HIGHVALUE CONTROL: --------------------------- Using the concept of "highvalue" save you from setting switches to indicate that particular files have gone to end-of-file. How does this work? When the read for a particular file detects an end-of-file, you stuff a "highvalue" into the key for this file. A highvalue is a string or number that you have determined will be higher that any key of any file. In this sample program, we have a 4 byte key, To be absolutely safe we could set up a highvalue of 5 bytes of hex'FF'. Having done this for all 3 files, then at the end of the compare logic we add one line of code: IF NEXTLOW = 7 AND KEY.OF.AAA = HIGHVALUE THEN GOTO EOJ What has happened is that the HIGHVALUE has flushed all the files. When the keys of all 3 files are sitting at HIGHVALUE, the program is finished --- there's no more records to process. The above concepts of NEXTLOW and HIGHVALUE have been used successfully for over 40 years by industry expert. I didn't just pull them out of my hat. ***** Challenge: 3-way merge: - Antoni Gual - 07-12-2003 A little late... Errors I have considered: Input or output files can't be opened Input files out of order Blank lines files with 0 records .... But ,you know, the errors can come in infinite flavours... Code: '3 Way Merge by Antoni Gual 2003 Challenge: 3-way merge: - Moneo - 07-12-2003 Excellent Antoni. I gave it a good test with all 7 posible matching conditions, and it works 100%. Too bad I can't understand your code, as usual. I'm surprised you didn't manage to use recursion. ***** Challenge: 3-way merge: - Antoni Gual - 07-12-2003 No it' does not use recursion, but it has a GOTO, that's even worse. :evil: Challenge: 3-way merge: - Moneo - 07-13-2003 ANTONI, This is your original code which contains the GOTO. Code: DO Code: DO ***** Challenge: 3-way merge: - Moneo - 07-20-2003 Hey guys, This challenge has been open for 2 weeks and we only had one submitted solution, an excellent piece of code by Antoni, which complies with all the specifications and works 100%. The imlementation of this solution is considered a classic programming problem. You all should take a crack at it to appreciate the subtleties. You will definitely learn by doing. I'm going to establish a deadline of 1 week from today, 26-Jul-2003 23:59 GMT for additional solutions, otherwise Antoni will be the winner. ***** |