C Style commenting
#21
Well, // came from C++ and only became part of standard C with C99, although most C compilers won't complain if you use it. Anyhow, it is good practice to use /* and */ if you are coding in C and you are focusing on portability.

Nested comments are not allowed. The C parser is really lazy so it will fail with something like this:

Code:
/* This is a comment
   /* This is a nested comment */
End! */

It will simply ignore the second /*, and when it finds the first */ it will consider that the comment has ended so End! will cause a parse error.
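To make that concrete, here is a minimal sketch of how a comment ends at the first closing delimiter. The function name comment_end is made up for illustration, and it assumes the text handed to it starts right after an opening delimiter.

```c
#include <assert.h>
#include <string.h>

/* src points just past an opening comment delimiter.
   Returns the offset of the first character after the comment,
   or -1 if the comment is never closed.
   Openers found inside the comment are not counted at all. */
long comment_end(const char *src)
{
    const char *close;

    assert(src != NULL);
    close = strstr(src, "*/");          /* first closer wins */
    return close ? (long)(close - src) + 2 : -1;
}
```

Called on the body of the example above, the returned offset points at the line break before End!, so that line is left over as ordinary code.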

As for the challenge, as Moneo has said it is something completely useless yet very easy to code. A simple FSM will do and you could have it in 10 lines of code or less (I mean only the parsing). I'll be entering, but maybe tomorrow. I'm busy right now :)
SCUMM (the band) on Myspace!
ComputerEmuzone Games Studio
underBASIC, homegrown musicians
#22
Nathan,
Thanks for the good info.
Regarding your nested comments example, the comment remover would probably remove the first two lines, leaving the End! line. When you compiled the resulting source, you would get an error like you say, and not having the original 2 comment lines to look at, you would probably be confused and blame it on the comment remover program.
*****
#23
Well, the input C code would not be correct after all, so if the program acts that way it is doing fine. You could always add some kind of warning messages in the remover which would be easy to implement.
#24
Nathan,
What's a simple FSM?

BTW, given the scenario in my post above regarding delimiters found within quoted strings, and quotes found within comments, I seriously doubt that you can do all the parsing required in 10 lines of code. Think about it again.
*****
#25
You can do it.

An FSM is a Finite State Machine. Once the FSM is hardcoded (in an array, for example), an FSM interpreter can be done in 10 lines of code. You just have to check which state you are in and which symbol is coming (one at a time), and jump to a new state depending on both.

Note that I said that the FSM interpreter will take 10 lines of code. Of course you need more lines to check whether the input file exists, open the files, and do misc stuff.
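A rough sketch of that idea, not a contest entry: the FSM lives in an array and the interpreter loop is about ten lines. All the names here (next_state, classify, strip_block_comments) are invented for the example, and it only knows /* */ comments. It does not look at quoted strings at all.

```c
#include <assert.h>
#include <string.h>

enum state  { CODE, SLASH, COMMENT, STAR, NUM_STATES };
enum symbol { SYM_SLASH, SYM_STAR, SYM_OTHER, NUM_SYMBOLS };

/* The hardcoded FSM: next_state[current state][incoming symbol] */
static const enum state next_state[NUM_STATES][NUM_SYMBOLS] = {
    /*              slash    star     other   */
    /* CODE    */ { SLASH,   CODE,    CODE    },
    /* SLASH   */ { SLASH,   COMMENT, CODE    },
    /* COMMENT */ { COMMENT, STAR,    COMMENT },
    /* STAR    */ { CODE,    STAR,    COMMENT },
};

static enum symbol classify(char c)
{
    return c == '/' ? SYM_SLASH : c == '*' ? SYM_STAR : SYM_OTHER;
}

/* Copies src to dst without block comments.
   dst must have room for strlen(src) + 1 bytes.
   Returns the final state: COMMENT or STAR means a comment was left open. */
enum state strip_block_comments(const char *src, char *dst)
{
    enum state s = CODE;

    assert(src != NULL && dst != NULL);
    for (; *src; ++src) {
        enum state n = next_state[s][classify(*src)];
        if (s == SLASH && n != COMMENT) *dst++ = '/';  /* held slash was code */
        if (n == CODE && s != STAR)     *dst++ = *src; /* ordinary character  */
        s = n;
    }
    if (s == SLASH) *dst++ = '/';                      /* trailing lone slash */
    *dst = '\0';
    return s;
}
```

The return value gives the remover a cheap way to warn about a comment with no terminator: if the machine stops in COMMENT or STAR, the input was broken.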
#26
Your FSM sounds like a "step sequencer" like computer --- no branching, just set flags.

I've heard the term FSM, but never saw a real life use of it.
*****
#27
Quote:The first question in my mind is: Why would you want to strip the comments from a C program anyway? Is this a practical requirement or just an exercise?

Just an exercise.

Quote:While scanning over 100 C programs, I couldn't find the // prefix for on-line comments. I only found the leading /* and the trailing */ even if the comment was only on a single line. Are you sure the // option exists on all versions of C?

I didn't know that // wasn't allowed because I've never programmed in C before, as Nek pointed out in an earlier post. I do a lot of PHP programming and // is allowed in PHP, and PHP has (I'm told) a C like syntax, so...

Quote:1) It's possible that /* or */ or // could appear inside of a quoted string, and some quoted strings are delimited by double quotes (") and others by single quotes ('). So every time the program finds one of these supposed comment delimiters, it would have to do several scans to make sure that they were not found within quoted strings, which of course would have to be ignored. This is not as easy as it sounds. Watch out, 'cause the comments themselves could contain either of the quotes or even embedded delimiters themselves.

Well that makes the challenge a little harder then! That way people's entries will vary and I'll be able to judge which program is better.

Quote:2) An absolute "must" requirement is that the C source code has already been successfully compiled. You can appreciate that if the source code has errors, then the comment removing program will go bananas. In any event, the program will have to detect certain fatal errors, like a multi-line comment that has no subsequent */ terminator.

Okay, the programs that I'll test it on will already compile successfully; however, I may throw a couple of wonky ones at your entries just to see what happens ;)

Quote:3) Another "must" is to successfully compile the new uncommented code and then compare the object code to the original. If they don't match, then the comment remover doesn't work.

That's a way to test whether it's worked, yes.

Quote:Since you mentioned "C style" programs, perhaps the program will have to ask the user up front what comment delimiters he wants to use. Maybe this can be on a little parameter file as input.

I've already stated this feature as being worth bonus points ;)

Quote:In summary, if you have a practical use for this program, I'd be interested in taking a crack at it. If it's just an exercise, I'll pass.
*****

Awww go on... pretend it's a "hypothetical exercise" ;)
#28
Quote:Your FSM sounds like a "step sequencer" like computer --- no branching, just set flags.

I've heard the term FSM, but never saw a real life use of it.
*****

You have real-life uses of it in almost every compiler. An FSM is the usual way to go when you are checking syntax at the token level. Basically you write a (regular) grammar, which is then used to build an FSM. I'm pretty sure you've learned about the Turing Machine; it has much to do with the concept.

In an FSM you have a set of states, and you read a queue made up of symbols. At each state, each possible symbol will lead to a new state, which can be the same state. You can associate actions with states, which is very useful.

To approach this task using a FSM, you could have this set:

[Image: fsm.gif]

Quote:
Moneo Wrote:1) It's possible that /* or */ or // could appear inside of a quoted string, and some quoted strings are delimited by double quotes (") and others by single quotes ('). So every time the program finds one of these supposed comment delimiters, it would have to do several scans to make sure that they were not found within quoted strings, which of course would have to be ignored. This is not as easy as it sounds. Watch out, 'cause the comments themselves could contain either of the quotes or even embedded delimiters themselves.

Well that makes the challenge a little harder then! That way people's entries will vary and I'll be able to judge which program is better.

That would be just bad practice. That is not allowed in C/C++ at all, so this parser would allow files that won't compile.

Anyhow, this can be easily done using a single counter. If you find "/*" add one to the counter; if you find "*/" subtract one from it. When counter = 0 you are outside comments. When counter > 0 you are inside a (possibly nested) comment.

Note that if counter < 0 there is an error about unbalanced delimiters.
#29
Eh? In php, echo "/*" would echo, exactly, /*

It wouldn't start a comment... how does it work in C?
#30
Exactly the same way. That's why I added an extra state for strings. If you find " you enter the string state. Inside it everything is accepted, until you find another " and then you exit.
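Here is a minimal sketch of an FSM with that string state, written as a plain switch. It is not a transcription of the diagram (fsm.gif is not reproduced here), so the state names and the function name are guesses for illustration. It goes a bit beyond the description above on two points, both assumptions of the sketch: a backslash inside a string escapes the next character, and single-quoted character constants are treated like strings. It leaves // comments alone.

```c
#include <assert.h>
#include <string.h>

enum state { CODE, SLASH, COMMENT, STAR, STRING, STRING_ESC };

/* Copies src to dst, dropping block comments but leaving anything
   inside quoted strings or character constants untouched.
   dst must have room for strlen(src) + 1 bytes. */
void strip_comments_keep_strings(const char *src, char *dst)
{
    enum state s = CODE;
    char quote = '"';                 /* which quote opened the literal */

    assert(src != NULL && dst != NULL);
    for (; *src; ++src) {
        char c = *src;
        switch (s) {
        case CODE:
            if (c == '/') { s = SLASH; break; }       /* hold the slash */
            if (c == '"' || c == '\'') { quote = c; s = STRING; }
            *dst++ = c;
            break;
        case SLASH:
            if (c == '*') { s = COMMENT; break; }     /* opener: drop both */
            *dst++ = '/';                             /* held slash was code */
            if (c == '/') break;                      /* hold the new slash */
            if (c == '"' || c == '\'') { quote = c; s = STRING; }
            else s = CODE;
            *dst++ = c;
            break;
        case COMMENT:
            if (c == '*') s = STAR;                   /* maybe the closer */
            break;
        case STAR:
            if (c == '/') s = CODE;                   /* comment finished */
            else if (c != '*') s = COMMENT;
            break;
        case STRING:
            if (c == '\\') s = STRING_ESC;            /* next char is escaped */
            else if (c == quote) s = CODE;            /* literal finished */
            *dst++ = c;
            break;
        case STRING_ESC:
            s = STRING;
            *dst++ = c;
            break;
        }
    }
    if (s == SLASH) *dst++ = '/';                     /* trailing lone slash */
    *dst = '\0';
}
```

With this, a line like puts("/* not a comment */"); comes out unchanged, while a real comment after it is removed.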

