Finding the attributes of Chinese filenames (Win32)

axtens
Posts: 28
Joined: Mon Apr 06, 2009 12:23 pm
Location: Perth, WA Australia

Post by axtens »

The following newLISP code shows me the file attributes of files under Win32. However, some of the filenames retrieved have Chinese characters in them. When GetFileAttributesA is called on one of those files, it returns -1 instead of the attributes. I looked at GetFileAttributesW but don't know how to pass the contents of fname to that function in a form it recognises.

How does one handle this situation?

Code:

(define (get-archive-flag file-name)
    (if (not GetFileAttributesA)
        (begin
        (import "kernel32.DLL" "GetFileAttributesA")
        )
    )
    (setq fname file-name file-attrib (GetFileAttributesA (address fname)))   
    (append fname " " ( string file-attrib))    
)

; walks a disk directory and prints all path-file names
;
(define (show-tree dir)
    (if (directory dir)
        (dolist (nde (directory dir))
            (if (and (directory? (append dir "/" nde))
                (!= nde ".") (!= nde ".."))
                (show-tree (append dir "/" nde))
                (println (get-archive-flag (append dir "/" nde)))
            )
        )
    )
)

(show-tree "z:\\working files\\Cathy")

m35
Posts: 171
Joined: Wed Feb 14, 2007 12:54 pm
Location: Carifornia

Post by m35 »

Is there a reason you can't use the built-in file-info function?

If you have to use the GetFileAttributes function with Unicode file names, you can use the UTF-8 version of newLISP together with the function below. It converts a UTF-8 path to UTF-16, which can then be passed to GetFileAttributesW.

Code:

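(constant 'SIZEOF_WCHAR 2) ; assumption: WCHAR is 2 bytes (UTF-16) on Windows
(constant 'CP_UTF8 65001)  ; Win32 code page identifier for UTF-8

(import "kernel32.dll" "MultiByteToWideChar")

(define (utf8->16 lpMultiByteStr , cchWideChar lpWideCharStr ret)

    ; calculate the size of the buffer (in WCHARs, including the terminating NULL)
    (setq cchWideChar (MultiByteToWideChar
        CP_UTF8 ; from UTF-8
        0       ; no flags necessary
        lpMultiByteStr
        -1      ; convert until NULL is encountered
        0       ; no output buffer yet
        0       ; size 0 = only report the required size
    ))
   
    ; allocate the buffer (dup counts bytes, so multiply by the WCHAR size)
    (setq lpWideCharStr (dup " " (* cchWideChar SIZEOF_WCHAR)))
   
    ; convert
    (setq ret (MultiByteToWideChar
        CP_UTF8 ; from UTF-8
        0       ; no flags necessary
        lpMultiByteStr
        -1      ; convert until NULL is encountered
        lpWideCharStr
        cchWideChar
    ))
    ; the call returns 0 on failure
    (if (> ret 0) lpWideCharStr nil)
)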

Post by axtens »

Wow, cool code!

m35 wrote: Is there a reason you can't use the built in file-info function?

Perhaps I'm not seeing something that's right in front of me, but the manual doesn't say anything about returning the 'archive bit' status when using file-info.

Post by m35 »

axtens wrote: Perhaps I'm not seeing something that's right in front of me, but the manual doesn't say anything about returning the 'archive bit' status when using file-info.

Oh OK, then you're right to use the Win32 API to get that platform-specific attribute.

Working with Unicode on Windows can be tricky. I wish I could direct you to a good, comprehensive source of information on how to deal with it, but I've never seen one (I had to figure it out on my own). This thread I wrote years ago might also help a bit. The functionality it describes has since been integrated directly into newLISP, which makes it obsolete, but it's still a nice reference.

Post by axtens »

@m35, your help is very much appreciated.

It seems a little weird to me doing the slice on the reverse of the bits but I couldn't find any bit_and functionality anywhere (quickly).

Thanks,
Bruce.

Code:

;code from m35
(constant 'SIZEOF_WCHAR 2) ; assumption
(constant 'CP_UTF8 65001)

(define (utf8->16 lpMultiByteStr , cchWideChar lpWideCharStr ret)
	(if (not MultiByteToWideChar)
		(begin
		(import "kernel32.DLL" "MultiByteToWideChar")
		)
	)
	; calculate the size of buffer (in WCHAR's)
	(setq cchWideChar 
		(
		MultiByteToWideChar
		CP_UTF8 ; from UTF-8
		0       ; no flags necessary
		lpMultiByteStr
		-1      ; convert until NULL is encountered
		0
		0
		)
	)
   
	; allocate the buffer
	(setq lpWideCharStr (dup " " (* cchWideChar SIZEOF_WCHAR)))
   
	; convert
	(setq ret 
		(
		MultiByteToWideChar
		CP_UTF8 ; from UTF-8
		0       ; no flags necessary
		lpMultiByteStr
		-1      ; convert until NULL is encountered
		lpWideCharStr
		cchWideChar
		)
	)
	(if (> ret 0) lpWideCharStr nil)
)

; returns the Win32 attribute flags of a file (-1 on failure)
; based on code by CaveGuy 2009

(define (get-archive-flag file-name)
	(if (not GetFileAttributesW)
		(begin
		(import "kernel32.DLL" "GetFileAttributesW")
		)
	)
	(setq fname file-name
		file-attrib (GetFileAttributesW (utf8->16 fname))
	)   
	file-attrib   
)

; walks a disk directory and prints all path-file names
;
(define (show-tree dir)
	(if (directory dir)
		(dolist (nde (directory dir))
			(if (and (directory? (append dir "/" nde)) (!= nde ".") (!= nde "..") )
				(show-tree (append dir "/" nde))
				(begin
					(setq fname (append dir "/" nde))
					(setq fflag (get-archive-flag fname))
					(setq fbits (bits fflag))
					(if (= (slice (reverse fbits) 5 1) "1") (println fname))
				)
			)
		)
	)
)

(show-tree "//iibt-spare/temp/Scans")

Post by m35 »

axtens wrote: It seems a little weird to me doing the slice on the reverse of the bits but I couldn't find any bit_and functionality anywhere (quickly).

In case you haven't stumbled across them yet, here are the bit operators.
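
For example, the reverse/slice test could be replaced by a bitwise AND with newLISP's & operator. This is only a minimal sketch: the names FILE_ATTRIBUTE_ARCHIVE and archive-bit-set? are made up for illustration, 0x20 is the Win32 value of the archive attribute (bit 5, the same bit the slice looks at), and the failure check assumes the imported call hands back either -1 or 0xFFFFFFFF depending on the newLISP build.

Code:

```newlisp
; hypothetical helper, not part of newLISP or of the code above
(constant 'FILE_ATTRIBUTE_ARCHIVE 0x20) ; Win32 archive attribute, bit 5

; returns true if the archive bit is set in attrib, nil otherwise
; a failed GetFileAttributesW call (-1 or 0xFFFFFFFF) also yields nil
(define (archive-bit-set? attrib)
    (and (!= attrib -1)
         (!= attrib 0xFFFFFFFF)
         (!= 0 (& attrib FILE_ATTRIBUTE_ARCHIVE))))

(archive-bit-set? 0x20) ; archive only             -> true
(archive-bit-set? 0x21) ; archive + read-only      -> true
(archive-bit-set? 0x01) ; read-only only           -> nil
(archive-bit-set? -1)   ; attribute lookup failed  -> nil
```

In show-tree, the test would then read (if (archive-bit-set? fflag) (println fname)), which also keeps files whose attribute lookup failed out of the output.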
