CGI debugging problems

ale870 · Post by **ale870** » Sat Feb 02, 2008 11:34 am

Hello,
I'm creating a big application using CGI under Apache.

My problem is debugging.

I still rely on simple "println" calls, but that gets tedious.
Is there a more "productive" approach? (I cannot use TRACE, since I cannot interact with the shell!)
If not, I'm planning to create a remote client console with a server library, so that the CGI can send output to the console (or even interact with it!) over sockets.

Do you know of an efficient way to debug CGI, or should I build such a debugging tool? :-)
Thank you!

cormullion · Post by **cormullion** » Sat Feb 02, 2008 3:10 pm

I don't know the answer...! :(

I also use println when problems have to be resolved during a CGI run. I try to write and test as much code as I can outside of a CGI situation for this reason.

Jeff · Post by **Jeff** » Sat Feb 02, 2008 3:20 pm

Here are two solutions. Using the built-in newlisp web server, you could probably arrange for an instance of newlisp to serve your pages and write a function that outputs debugging info on the shell. The other is to write a debugging function that writes to a file, then open a shell and run `tail -f debug.log`, substituting the name of your log file. Your function could just be something simple like:

(define (debug msg)
  ; append-file opens the log, appends the line, and closes it again,
  ; so earlier entries survive and no file handle is left open
  (append-file "debug.log" (string msg "\n")))

ale870 · Post by **ale870** » Sat Feb 02, 2008 4:05 pm

Well, thank you!
But it seems there is no definitive, or even really good, solution...
I was thinking of creating a set of functions (inside a "debugger" context) to be used like "breakpoints".

OK, here is the whole idea:
1) Remote debugging could be activated with a function like "(remote-trace true client-console-ip client-console-port)".

2) When you want to send some information to the client console (remote console), you could do something like "(remote-println .... )" or "(remote-print ... )".

3) Furthermore, I could create some blocking functions (in order to interact with the running CGI). Something like "(remote-interaction)". This function is blocking (the CGI is stopped until the function completes). In this case the function (remote-interaction) could exchange data with the remote console, and the remote programmer could inspect variables, set their values, etc... We could even create standard functions that send some standard info to the remote client (free memory, variables used, contexts created, etc...).
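
A minimal sketch of points 1 and 2, assuming a plain TCP listener stands in for the client console. The context name `debugger` comes from the idea above; the variables `enabled`, `host`, and `port`, the example address and port, and the one-connection-per-message design are assumptions made for illustration, not an existing library.

```newlisp
; Hypothetical helpers: names follow the post, the implementation is an assumption.
(context 'debugger)

(set 'enabled nil 'host nil 'port nil)

; Switch remote output on or off and remember where the console listens.
(define (remote-trace flag console-ip console-port)
  (set 'enabled flag 'host console-ip 'port console-port))

; Send all arguments as one line to the console. Does nothing when tracing
; is off or the console cannot be reached, so the CGI itself never fails.
(define (remote-println)
  (if enabled
    (let ((sock (net-connect host port)))
      (if sock
        (begin
          (net-send sock (string (apply string (args)) "\n"))
          (net-close sock))))))

(context MAIN)

; Example calls (placeholder address and port):
; (debugger:remote-trace true "192.0.2.10" 4711)
; (debugger:remote-println "myName = " myName)
```

Opening a new connection per message is slow, but it keeps the helper stateless, which suits a short-lived CGI process.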

This is a starting point. I think this great language needs a step forward in this area. Furthermore, server-side debugging is a big problem in almost any language, and I think that if we could create some good tools, we would put newLisp a step ahead of the other languages! :-)

Jeff · Post by **Jeff** » Sat Feb 02, 2008 10:25 pm

You would need to write either a custom server or a plugin for another server (a mod_newlisp for apache, for example) in order to interact with the server on that level.

ale870 · Post by **ale870** » Sat Feb 02, 2008 10:48 pm

I will write a server console for the client programmer, then a small library (that will act as a client of our server console).

EDIT: the library is meant to be used inside the CGI.

Tim Johnson · Post by **Tim Johnson** » Tue Feb 05, 2008 4:59 pm

I happen to be brand new to newlisp, but I have been a CGI programmer
for 12 years. I'm looking forward to the console you are referring to and
also looking forward to using the newlisp web server.
I use several approaches.
First of all, I have a static IP address. That means that, when I have to
work with an already deployed application, I can do things on remote servers
that are "visible" to me but not visible to others.
The key to that is the REMOTE_ADDR environment variable.
I do in fact use print statements all the time; they are my most productive
debugging tool.
Rebol is wonderful for this because it has the ?? and probe functions, so
I can dump complex data structures. A simple breakpoint can be done
via a function using a printout plus a halt statement.
With python's __repr__ and __str__ builtin methods, one can build
debugging directly into objects...
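
The REMOTE_ADDR approach could look roughly like this in newLISP. The address 203.0.113.7 is a placeholder for the developer's static IP and `debug-print` is a hypothetical helper name; the sketch also assumes the HTTP headers have already been sent.

```newlisp
; Assumption: 203.0.113.7 is a placeholder for the developer's static address.
(set 'developer-ip "203.0.113.7")

; Hypothetical helper: prints its arguments only when the request comes from
; the developer's address; every other visitor sees nothing.
(define (debug-print)
  (if (= (env "REMOTE_ADDR") developer-ip)
    (println "<pre>DEBUG: " (apply string (args)) "</pre>")))

(debug-print "user-id = " 42)
```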

Secondly, I use two different types of logging. One is a simple
.dbg file that is produced every time an application runs and contains an
analysis of the CGI environment, including env variables, posted name/value
pairs and path part components, as an example. The other is logfiles with
a rotation mechanism. One thing I haven't done, but should,
is an IP-address contextual logfile.
Example: remote address=111.111.111.111, script=myscript.py => myscript.111_111_111_111.dbg
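
One possible way to derive such a file name in newLISP, replacing each dot in the address with an underscore. `ip-log-name` is a hypothetical helper, and the address is passed in explicitly so the function can be tried outside a CGI run:

```newlisp
; Hypothetical helper: builds "<script>.<address with dots as underscores>.dbg".
(define (ip-log-name script-base remote-addr)
  (let ((tag (string remote-addr)))
    (replace "." tag "_")            ; replace changes tag in place
    (string script-base "." tag ".dbg")))

(println (ip-log-name "myscript" "111.111.111.111"))
; prints myscript.111_111_111_111.dbg

; inside a CGI: (ip-log-name "myscript" (env "REMOTE_ADDR"))
```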

Thirdly, robust applications are "wrapped" in code that catches errors,
reproduces the state of the application at breakdown time as a URL with
a query string, logs the error and emails me the same.
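
A rough newLISP sketch of such a wrapper, covering only the catch-and-log part (the email step is left out). `run-guarded` and `error.log` are hypothetical names, and the sketch assumes the wrapped code has not printed any headers before it fails:

```newlisp
; Hypothetical wrapper: run-guarded and error.log are illustrative names.
(define (run-guarded handler)
  (let ((result nil))
    (if (catch (handler) 'result)
      result                         ; normal case: pass the handler's value through
      (begin
        ; on error, result holds the error message as a string
        (append-file "error.log"
          (string (date) " " (env "SCRIPT_NAME") "?" (env "QUERY_STRING")
                  " " result "\n"))
        (println "Content-Type: text/plain\r\n\r\nAn internal error was logged.")
        nil))))

; usage: wrap the page logic in a function and hand it to the guard
(run-guarded (fn () (println "Content-Type: text/html\r\n\r\n<p>ok</p>")))
```

Logging SCRIPT_NAME together with QUERY_STRING gives a line that can be pasted back into a browser to replay a GET request; POST data would have to be captured separately.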

And it gets really hairy when you're using AJAX :-).
I've started on a rewrite of the standard CGI module that comes with
the newlisp distro. Two things it doesn't handle are:
1) Redundant name/value pairs. Example: a series of check boxes
with the same name.
I modified the code so that a list is created for the values when this
occurs.
2) If a text or textarea field is not filled in, it is still transmitted. If a checkbox
is _not_ checked, the name is not transmitted.
This creates problems when you are using a form to edit an existing dataset or record.
Jeff · Post by **Jeff** » Tue Feb 05, 2008 6:02 pm

The cgi module also does not let you know if a parameter was obtained via GET or POST, which is not safe for submitted form data.

ale870 · Post by **ale870** » Tue Feb 05, 2008 6:10 pm

Rebol... I know Rebol very well, since I come from Rebol.
I wrote several CGIs over time, but my eyes were "opened" when I started to develop JSP, JavaBeans, etc...
When you use NetBeans (or even Eclipse) you can really debug, step by step, an application running on the server.
As you did, I used "print" statements, log files, etc... but I found this approach really slow compared to a real-time debugger.
Since I cannot create such a sophisticated tool, I was trying to create something similar.

Imagine if you could write a CGI this way:

(debugger-trace true)

(println "I'm here!")
(setq myName "Alessandro")
(debugger-stop) ; <-- at this point the system shows you a console on the client, where you can enter commands like: (println myName) (continue)

(debugger-print myName) ; <-- another option: instead of printing debugging information to the client (web browser?), I can redirect it, in real time, to my client console. I could even try to create a sophisticated printing system, similar to "probe" in Rebol.

Don't you think this could be a useful tool?

Tim Johnson · Post by **Tim Johnson** » Tue Feb 05, 2008 6:49 pm

Jeff wrote:The cgi module also does not let you know if a parameter was obtained via GET or POST, which is not safe for submitted form data.

Thanks for bringing that up Jeff. Duly noted.

Tim Johnson · Post by **Tim Johnson** » Tue Feb 05, 2008 6:58 pm

--Alessandro, that sounds like a great idea.
Rebol scares me, because RT (rebol development team)
provides no support for the existing distribution of rebol on 64-bit linux.
Rebol doesn't recognize or can't load the 64-bit SOs. For me to get it to
work I have to:
1)Install 32-bit SOs under different names, no difference in byte length.
2)Edit the names of the SOs inside of the binary, taking care not to
change byte alignment.
And then, I still couldn't get DNS to work; in other words, I had to reference
services by raw IP address instead of URI.
RT has shown no concern about this, and I really felt abandoned. A lot
of rebol users are going to feel abandoned when their hosting providers
switch to 64-bit OSs - which they will do eventually.
I do understand that rebol runs okay on 64-bit Windows Vista.
RT has a 64-bit compatible REBOL3 distro in the works, but you won't
know when that happens until after it happens.
This is one of the reasons I am looking at newlisp.
------
Tim

ale870 · Post by **ale870** » Tue Feb 05, 2008 9:20 pm

Tim Johnson wrote:
Jeff wrote:The cgi module also does not let you know if a parameter was obtained via GET or POST, which is not safe for submitted form data.
Thanks for bringing that up Jeff. Duly noted.

Hello guys, this is not correct. GET and POST data "arrive" at the CGI in two different ways.
GET uses the environment variable QUERY_STRING (all params are sent via the URL), but POST data is sent in the request body, which the CGI reads from standard input.

I decided to adopt newLisp and "abandon" Rebol for similar reasons:

1) Rebol is not GPL (no source code; source is available only for some functions).
2) The Rebol team does not develop components in a regular way (where is Rebol IOS? And the widgets? Where are the 25 platforms promised? etc...)
3) I don't agree with the use of AltMe. It creates a restricted developer team, and they are not interested in spreading Rebol.

(and more... but I'll stop here :-) )

I really like newLisp, and I like the fact that newLisp is GPL. I'm free to collaborate and I know other people are happy to do the same. I'm part of a giant group of developers, and my contribution is an Italian blog and some scripts that I'm creating :-)

Furthermore, when I need newLisp for 64-bit, I have the source code, and I can try to compile it for that platform ;-)

Tim Johnson · Post by **Tim Johnson** » Tue Feb 05, 2008 9:40 pm

I thought perhaps Jeff was referring to the way in which the existing newlisp CGI
module interpolates things. The "REQUEST_METHOD" env variable should
carry the transmission method also.

Yeah, I think it would be easier for me to transition from rebol to newlisp
than from rebol to python.
cheers
tim

Jeff · Post by **Jeff** » Tue Feb 05, 2008 9:46 pm

I was referring to the fact that the cgi module *stores* the parameters identically. If I post the data foo="bar" to the url /some.cgi?foo=baz, I will only get foo="bar", because POST data is added to the list after GET data. Additionally, there is no record kept of which params come from post and which come from get. I could intuit this since I can still access QUERY_STRING, but this is still bad practice.

I am working on a much more extensive and updated request/response module to be used for CGI. I will post it once it's complete.

ale870 · Post by **ale870** » Tue Feb 05, 2008 10:09 pm

Now I understand what you mean.
But why do you say that using "QUERY_STRING" is bad practice?
Is it because newLisp has another way to access GET and POST data?

<<I am working on a much more extensive and updated request/response module to be used for CGI. I will post it once it's complete.>>

Are you extending the standard cgi module or creating a new one?
(I think you are doing a great job!).

ale870 · Post by **ale870** » Tue Feb 05, 2008 10:55 pm

One question: if CGI is perhaps not the best way to use newLisp right now, what can I do to build a server application? Do I need to use the newLisp server?
Or... what? Can you help me find an alternative to CGI?

Jeff · Post by **Jeff** » Tue Feb 05, 2008 11:24 pm

The alternatives to CGI are to write a mod_newlisp for Apache or a FastCGI adapter. At the moment, CGI is all there is for newLISP. It's only the CGI module that I feel needs to be reworked. My version will be new, although I may steal some code here and there :).

There is no problem with using QUERY_STRING. You didn't understand what I meant. The cgi module first parses QUERY_STRING and fills the parameters list with those. Then it parses the POST data and fills the list with those, overwriting any similarly named GET variables. Both sets of key-value pairs are put into the same list. There is no way to know if a key came from GET or POST data, which is unsafe. It means that someone could confuse the server by posting a variable to a CGI that overwrites something in the query string.

I was saying that I could try and guess whether one of the keys in the parameters list was set in GET or POST by whether it appeared in QUERY_STRING. However, if a key appears in both, there is no way to get the GET value back (since it was overwritten by the POST version).
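
To illustrate the alternative, here is a minimal sketch that keeps the two sources in separate lists, so that neither value is lost. `parse-pairs`, `get-params`, and `post-params` are hypothetical names and not part of the cgi module; the strings are hard-coded here, whereas a real CGI would take the first from QUERY_STRING and the second from the request body.

```newlisp
; Hypothetical helper: splits "a=1&b=2" into (("a" "1") ("b" "2")).
; No URL-decoding is done and every pair is assumed to contain "=".
(define (parse-pairs str)
  (map (fn (pair) (parse pair "=")) (parse str "&")))

; Example request: POST foo=bar sent to /some.cgi?foo=baz
(set 'get-params  (parse-pairs "foo=baz"))
(set 'post-params (parse-pairs "foo=bar"))

(println "GET foo:  " (lookup "foo" get-params))
(println "POST foo: " (lookup "foo" post-params))
```

With separate lists the caller decides which source to trust for each parameter, instead of the parsing order deciding it silently.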

ale870 · Post by **ale870** » Tue Feb 05, 2008 11:53 pm

Thank you Jeff for the clarification.
One more question:

Is the newLisp server, with or without xinetd or inetd, a reliable system for an intranet or for distributed processes over a private (company) network?

For intranet use only, is it a good idea to use the newLisp server to serve private web / rich client applications, or is it better to use Apache and standard CGI?

Jeff · Post by **Jeff** » Wed Feb 06, 2008 1:00 pm

The documentation says it's ok to run behind a firewall. I think it's unsafe though, because there is absolutely no form of authentication.

If someone gets behind the firewall, the system is compromised. If you have a wireless access point behind the firewall (since there is no form of wireless security that can't be fairly easily broken at the moment), the system is compromised. If you don't want every single user on your LAN to be able to send potentially harmful commands to your newLISP server, the system is compromised.

Jeff · Post by **Jeff** » Wed Feb 06, 2008 9:19 pm

I've posted the work I have completed on the Request class. It's not complete on its own. It will be accompanied soon by a Response class for output as well.

http://www.artfulcode.net/projects/newl ... est-class/

ale870 · Post by **ale870** » Wed Feb 06, 2008 10:36 pm

Thank you!
I'm working on distributed computing just to see if I can get good results with persistent connections and not using CGI and Apache ;-)