Puppy Linux Discussion Forum
Linux vs Windows: file names and file type associations
kethd

Joined: 20 Oct 2005
Posts: 451
Location: Boston MA USA

Posted: Sat 03 Dec 2005, 20:57

In Windows, file names matter -- the file extension is used to associate the file with the type of program that will be used to execute/process files of that type.

In Linux, files can be named most anything, and the file extensions are mostly not needed/used, except maybe by the humans looking at the files to keep track of things.

Instead, for executable scripts, the first line inside the file is a special #! line that tells the system what program to use to process the file; for other kinds of data, tools look at the contents of the file itself.

============
http://www.halley.cc/ed/linux/newcomer/filename.html

Don't Judge a File by its Filename

Many people just assume the computer will do the right thing with their files, without understanding how the computer could possibly know what the right thing may be. Double-click a file or a web link and it magically opens the correct program to load that data. How does a computer associate a given file with a given program?

There are three basic approaches used on today's computers to associate various files with application programs.

* explicit external type declarations (scripts and mime)
* implicit naming type declarations (filename extensions)
* implicit data analysis declarations (magic and shebang)

An explicit external type declaration is formed when some external piece of information explicitly describes the type of data, such as a command script or a MIME header like "text/plain".

Web browsers often rely on explicit types such as MIME headers to know how to render a given web page or image. The web server prefaces every file it sends with a few lines of header text, and the MIME type is one of the things included in that header. The web browser reads those lines but does not display them. Email attachments also usually need MIME types for the mail program to make sense of them.
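
As a rough illustration of such an explicit declaration, the sketch below pulls the MIME type out of a block of header lines. The header text is a made-up sample, and mime_from_headers is a hypothetical helper name, not part of any real tool.

```shell
# Hypothetical helper: print the MIME type named in a block of HTTP-style
# headers read from standard input (parameters such as charset are dropped).
mime_from_headers() {
    tr -d '\r' |
    sed -n 's/^[Cc]ontent-[Tt]ype:[ ]*\([^; ]*\).*/\1/p' |
    head -n 1
}

# Made-up sample of what a server might send ahead of a page.
sample_headers='HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 42'

printf '%s\n' "$sample_headers" | mime_from_headers
```

With the sample above this prints text/plain. The filename never enters into it: the type comes entirely from the header line.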

An implicit naming type declaration is managed through an external association table that is based on the naming conventions for the file, such as a filename's ending extension like ".bat" or ".jpg".

MS-DOS and Windows are heavily dependent upon the filename extension to know what application should be run. MS-DOS knows that .BAT, .COM and .EXE files are executable. Windows maintains a table of "file associations" that identifies that .TXT files should be opened with Notepad, that .XLS files should be opened with Microsoft Excel, and so on.

A typical Windows system may have a few hundred file types listed in that file association table. When developers create new software programs, they have to choose their filename extensions carefully to avoid conflicting with other existing programs. Indeed, some extensions may be ambiguous and some applications clobber that table's entries for accidental or competitive reasons.

Additionally, Windows often tries to hide file extensions from users. Trojan and virus programs often use this inconsistent behavior to their advantage: the filename BritneySpears.jpg.exe, shown with its last extension hidden, looks to the user like a harmless JPEG image, but the system treats it as an executable program.
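
A minimal sketch of how an extension-based lookup behaves, assuming a tiny made-up association table. handler_for_name and the handler descriptions are hypothetical and are not real Windows registry entries; the point is only that the text after the last dot decides the outcome.

```shell
# Hypothetical association table: map a filename's final extension to a handler.
# The handler descriptions are illustrative only.
handler_for_name() {
    name=$1
    case "$name" in
        *.*) ext=${name##*.} ;;   # text after the LAST dot
        *)   ext="" ;;            # no dot, so no extension at all
    esac
    # Compare case-insensitively, as DOS and Windows do.
    ext=$(printf '%s' "$ext" | tr 'A-Z' 'a-z')
    case "$ext" in
        txt)         echo "text editor" ;;
        xls)         echo "spreadsheet" ;;
        jpg|jpeg)    echo "image viewer" ;;
        bat|com|exe) echo "run as program" ;;
        *)           echo "unknown" ;;
    esac
}

handler_for_name "report.TXT"
handler_for_name "BritneySpears.jpg.exe"
```

The second call prints "run as program": the ".jpg" in the middle of the name plays no part in the lookup, which is why hiding the final extension from the user is risky.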

Some web server applications internally rely on such implicit type associations as well, when explicit type data are not available.

An implicit data analysis declaration is an internal or external association that is based on peeking at the actual data in the file, such as magic numbers or the shebang notation like "#!/usr/bin/perl". More about this shebang notation is in the following section.

Unix and Linux rely mostly on this scheme for file type determination. Any file may contain any kind of data, and the name itself doesn't matter. (In fact, many commands are just script files that have no filename extension.) When a new file data type becomes somewhat widespread, a developer adds some tips and heuristics to "fingerprint" the data so that files of that type are recognized correctly in the future. A command called file can apply those heuristics to any given file and report back the file's data type, if known. Since most of the heuristics rely on the first few bytes of a file, the check is quick and pretty reliable, it can't be fooled by just renaming the file, and it allows more possibilities than three or four letters in a filename. These identifying bytes are often called the file's 'magic number'.

The file command can be used at any time to find out the sort of data that is found in a given file. Just provide the name of the file to examine. Here is the output when given the bash executable file as a test.

$ file /bin/bash
/bin/bash: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
dynamically linked (uses shared libs), stripped

To demonstrate that it's the contents and not the filename, try something like the following.

$ file britney.jpg
britney.jpg: JPEG image data, JFIF standard 1.01, resolution (DPI), 72 x 72
$ mv britney.jpg britney.exe
$ file britney.exe
britney.exe: JPEG image data, JFIF standard 1.01, resolution (DPI), 72 x 72
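
What the file command does can be approximated by hand for a few well-known formats. The sketch below is not how file itself is implemented (it uses a much larger 'magic' database); sniff_type is a hypothetical helper that checks only the first four bytes against a short list of signatures.

```shell
# Hypothetical helper: guess a file's type from its first bytes only.
# Only a few well-known signatures are listed; file(1) knows far more.
sniff_type() {
    # Read the first 4 bytes as lowercase hex with no spaces, e.g. "7f454c46".
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        ffd8ff*)  echo "JPEG image" ;;
        89504e47) echo "PNG image" ;;
        7f454c46) echo "ELF binary" ;;
        2321*)    echo "script with shebang" ;;
        *)        echo "unknown" ;;
    esac
}

# Demo on a throwaway file that holds just a JPEG-style header.
demo_dir=$(mktemp -d)
printf '\377\330\377\340' > "$demo_dir/picture.jpg"
sniff_type "$demo_dir/picture.jpg"
mv "$demo_dir/picture.jpg" "$demo_dir/picture.exe"
sniff_type "$demo_dir/picture.exe"
rm -r "$demo_dir"
```

Both calls print "JPEG image", before and after the rename, because only the bytes inside the file are consulted.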

Some helpful manual pages on your Linux system may be (man ls), (man file), and (man chmod).

Some helpful google searches may be linux mime file types, and linux file formats and magic numbers.

Text, code, layout and artwork are Copyright 1996-2005 Ed Halley.
Copying in whole or in part, with author attribution, is expressly allowed.


http://www.halley.cc/ed/linux/newcomer/shebang.html
The Whole Shebang, or What's in a Script
A shell script is merely a list of commands to be executed in the proper order by a shell environment like bash. A shell script can do anything that you could type manually. Conversely, you could type anything that a shell script contained, to do the same tasks manually.

There are many script interpreters, and some are not even intended for interactive use as a command prompt where users type their commands. For example, Perl scripts may start with a shebang line similar to #!/usr/local/bin/perl. They're still interpreted by that program, and any code within is executed on behalf of the user who invoked the command.

Also, note that the filename has nothing to do with the type of shell required for running the script. The script could have been called sample, or sample.pl or even kernel.exe. In Unix and Linux, it is the contents (such as its shebang), and not the name, which determines how the system will go about executing or opening the file. Many of the commands you run in Linux are just shell scripts that have no filename extensions.
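
To make the point concrete, here is a small sketch that writes the same script under the three names mentioned above and reads back which interpreter its first line asks for. interpreter_of is a hypothetical helper that only mimics the lookup; on a real system the kernel does this itself when a file with execute permission is run. Nothing is executed here, so Perl does not need to be installed.

```shell
# Hypothetical helper: print the interpreter named on a file's shebang line,
# or nothing if the first line does not start with "#!".
# (Any argument after the interpreter path is ignored in this sketch.)
interpreter_of() {
    head -n 1 "$1" | sed -n 's/^#![ ]*\([^ ]*\).*/\1/p'
}

# Same contents, three different names (the directory is a throwaway).
demo_dir=$(mktemp -d)
for name in sample sample.pl kernel.exe; do
    printf '#!/usr/local/bin/perl\nprint "hello\\n";\n' > "$demo_dir/$name"
    printf '%s -> %s\n' "$name" "$(interpreter_of "$demo_dir/$name")"
done
rm -r "$demo_dir"
```

All three lines of output name /usr/local/bin/perl, whatever the file happens to be called.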

Some helpful manual pages on your Linux system may be (help source), (man bash), (man chmod), and (man perl).

Some helpful google searches may be linux shell commands scripts, linux shebang notation, linux bash PATH variable, linux file mode permissions, and common scripting languages.
Flash
Official Dog Handler


Joined: 04 May 2005
Posts: 11153
Location: Arizona USA

Posted: Sat 03 Dec 2005, 23:33

Hey, thanks! That is very educational.
Flash
Official Dog Handler


Joined: 04 May 2005
Posts: 11153
Location: Arizona USA

Posted: Sun 04 Dec 2005, 11:58

Would whoever moved this to the Howto section please explain why? It doesn't seem to me that it belongs here.
muskrat

Joined: 03 Jul 2005
Posts: 24
Location: Gulf Coast TX-MX

Posted: Mon 23 Jan 2006, 12:21

Flash, I didn't move it. But it seems like a good place to be.
Quote:
Would whoever moved this to the Howto section please explain why? It doesn't seem to me that it belongs here.


Howto understand file types!

Just my humble opinion. By the way I think it's very educational too.

_________________
Steve (Muskrat) McMullen
http://www.muskratsweb.com
Registered Linux User #305785
jmarsden


Joined: 31 Dec 2005
Posts: 263
Location: California, USA

Posted: Mon 23 Jan 2006, 16:39

It might be worth expanding on this concerning the way Rox, Puppy's default file manager, does operate more by "extension" than by file type... so those who want special actions by Rox when files of particular kinds are clicked on need to set up what are in effect "file associations" for Rox, and users will need to retain file extensions carefully when renaming files if they want Rox to continue to behave as expected.

The Unix/Linux approach, and the use of the file utility and its associated 'magic' data file to determine what sort of thing a particular file is, is well defined, extensible and powerful. But the chosen file manager in Puppy doesn't currently seem to make much use of that power, instead requiring that specific extensions on filenames be mapped to MIME types. This is why (for example) we need to have .pup files and .get files named with those extensions, if we want Rox to do something specific with them when they are clicked on, rather than just adding appropriate invariant data to our file formats and adding info on them to the 'magic' file.

The info in the article about how Linux generally handles figuring out file types is correct and informative, but it fails for one case most Puppy newcomers are likely to be interested in... how their file manager handles files and file type determination!

Adding pointers to the file(1) and magic(5) man pages for those who want to dig slightly deeper into the Unix approach would also be good:
  • http://www.die.net/doc/linux/man/man1/file.1.html
  • http://www.die.net/doc/linux/man/man5/magic.5.html

Jonathan

Last edited by jmarsden on Sun 29 Jan 2006, 03:20; edited 1 time in total
jmarsden


Joined: 31 Dec 2005
Posts: 263
Location: California, USA

Posted: Mon 23 Jan 2006, 17:02

A quick example of the power of 'file':
Code:
root@m1:/home/consulting# file sendmail-8.13.5-2.1.src.rpm
sendmail-8.13.5-2.1.src.rpm: RPM v3 src i386 sendmail-8.13.5-2.1
root@m1:/home/consulting# cp -p sendmail-8.13.5-2.1.src.rpm junk
root@m1:/home/consulting# file junk
junk: RPM v3 src i386 sendmail-8.13.5-2.1
root@m1:/home/consulting#

Even without any extension at all, the file command still knows that the file 'junk' is an RPM, version 3, source, for i386, and was created with the name sendmail-8.13.5-2.1 -- I could have renamed the file to junk.deb or junk.foo and got similar results. You can get colour depth and pixel dimensions from many image file formats this way, too.

While the above example isn't directly useful (!), it shows that files can be easily checked to see what sort of file they are, without artificially restricting how we name them. Oh, and if for some reason I actually need to know the MIME type for junk, I can find it out without having to resort to application-specific configuration files:
Code:
root@m1:/home/consulting# file -i junk
junk: application/x-rpm
root@m1:/home/consulting#

Jonathan