Thanks, that looks like what I need.
Further update...
OK, so I spent a fun evening encoding the current ELF e_machine (EM_*) list.
I started to freak out wondering where I was going to get code samples for 200+ device families. Feeling dejected, I thought: perhaps I can just tweak the ELF header and make the file look like it comes from a different machine family...
Loading the file up in tweak, I spot the e_machine number (03 for Intel) and change it to 02. Let's make this a file from a SPARC...
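For anyone who would rather script that edit than do it by hand in a hex editor, here is a minimal sketch. It assumes the standard ELF layout, where e_machine is the 2-byte field at offset 18 (right after the 16-byte e_ident and the 2-byte e_type). The function name `set_machine` and the 20-byte stand-in header are made up for illustration; this is not tiny.o itself.

```python
import struct

E_MACHINE_OFFSET = 18   # 16 bytes of e_ident + 2 bytes of e_type come first
EI_DATA = 5             # e_ident index of the byte-order flag: 1 = LSB, 2 = MSB

def set_machine(image: bytes, machine: int) -> bytes:
    """Return a copy of an ELF image with the e_machine field replaced."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Write the field in whatever byte order the file itself declares.
    fmt = "<H" if image[EI_DATA] == 1 else ">H"
    patched = bytearray(image)
    struct.pack_into(fmt, patched, E_MACHINE_OFFSET, machine)
    return bytes(patched)

# Hypothetical stand-in for the start of an object file:
# 32-bit, LSB, version 1, e_type 1 (relocatable), e_machine 3 (Intel 80386).
header = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 1, 3)
print(set_machine(header, 2)[16:20].hex())  # 01000200
```

Note that only e_machine changes; the class and byte-order bytes in e_ident are left exactly as they were.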
Success, well, partial. However, I get this message:
Code:
tiny.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
sh-3.00# tweak goat.o
sh-3.00# file goat.o
goat.o: ELF 32-bit LSB relocatable, SPARC - invalid byte order, version 1 (SYSV), not stripped
sh-3.00# tweak goat.o
sh-3.00# file goat.o
goat.o: ELF 32-bit LSB relocatable, AT&T WE32100 - invalid byte order, version 1 (SYSV), not stripped
What's up with the invalid byte order?
Thinking things through, it was probably a bit newbish to assume I could fool the file program into accepting this as a legit file from another machine type when the other header flags probably don't make sense for that machine. Going by the message, the byte-order flag is the first suspect, since the header still says LSB. (Other fields may clash too: perhaps SPARC does not have 32 bits? Perhaps it's 64-bit, etc.?) I don't know...
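A quick way to see which header flags are in play is to decode the first 20 bytes yourself. The sketch below assumes the standard ELF layout: e_ident[4] is the class (1 = 32-bit, 2 = 64-bit), e_ident[5] is the byte order (1 = LSB, 2 = MSB), and e_type and e_machine follow at offsets 16 and 18. The `describe` function and the tiny `MACHINES` table are illustrative only. SPARC is, as far as I know, conventionally big-endian, so an LSB flag next to a SPARC machine number is most likely what file is objecting to.

```python
import struct

EI_CLASS = {1: "32-bit", 2: "64-bit"}
EI_DATA = {1: "LSB", 2: "MSB"}
# Only the three machines that show up in this post; the real EM_* list is far longer.
MACHINES = {1: "AT&T WE32100", 2: "SPARC", 3: "Intel 80386"}

def describe(header: bytes) -> str:
    """Summarise the first 20 bytes of an ELF file, loosely in the style of file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class, data = header[4], header[5]
    endian = "<" if data == 1 else ">"
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return "ELF {} {}, type {}, machine {} ({})".format(
        EI_CLASS.get(elf_class, "unknown class"),
        EI_DATA.get(data, "unknown byte order"),
        e_type, e_machine, MACHINES.get(e_machine, "unknown"))

# Hypothetical stand-in for the tweaked object file: still 32-bit LSB, but machine 2.
tweaked = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 1, 2)
print(describe(tweaked))  # ELF 32-bit LSB, type 1, machine 2 (SPARC)
```

To try it on a real file, read the first 20 bytes with `open(path, "rb").read(20)` and pass them in.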
This presents a bit of a chore, as I now need to get the documentation for the 200-odd processor/chip families and figure out the ELF file setup for each.
Just assembling the documentation will take ages...
Update...
As I suspected.
Code:
sh-3.00# gcc -Wall tiny.c
sh-3.00# ./a.out ; echo $?
42
sh-3.00# wc -c a.out
7122 a.out
becomes...
Code:
sh-3.00# ld -s tiny.o
sh-3.00# ./a.out ; echo $?
42
sh-3.00# wc -c a.out
240 a.out
And that is without messing too much with the file. OK, I lost portability with this method, but it's now roughly a 30th of the size of the original file! Hooray!
But once you start it's hard to stop. Optimization, I have come to find, is the programming equivalent of crack.
Looking at the program, there is still a fair bit of fat on this... it can be reduced further, I think... According to the tutorial it's possible to get it to 45 bytes, or less than a 158th of the original size. Nice. Now I can see how embedded system programs can function on 4096 bytes of memory...
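For the record, here is the arithmetic behind those ratios, using the byte counts from the wc -c output above and the tutorial's 45-byte target. The helper name `shrink_factor` is just for illustration.

```python
original = 7122   # a.out from gcc -Wall tiny.c
stripped = 240    # a.out from ld -s tiny.o
target = 45       # size the tutorial says is reachable

def shrink_factor(before: int, after: int) -> float:
    """How many times smaller the new file is than the old one."""
    return before / after

print(round(shrink_factor(original, stripped), 1))  # 29.7
print(round(shrink_factor(original, target), 1))    # 158.3
```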
The program, as you can see, is very simple: it just returns the value 42 (the ultimate answer!).
Time to mess with the program some more.
Excellent tutorial!
http://www.muppetlabs.com/~breadbox/sof ... eensy.html
The only difference I found is that the comment is already removed in my version of gcc (all good, no need to remove it like in the example).
The only problem I have so far is...
Code:
sh-3.00# gcc -s -nostdlib tiny.s
tiny.s: Assembler messages:
tiny.s:1: Error: no such instruction:
Hmmm... time to go further down the rabbit hole.
In fairness, I can understand why the libraries in C are bloated like that: you need a lot of redundant functionality to provide the portability that's necessary for many things.
This is also only a trivial program. With a real application this would be a much more painful process, for sure.
Code:
sh-3.00# nasm -f bin -o a.out tiny.asm
sh-3.00# chmod +x a.out
sh-3.00# ./a.out ; echo $?
42
sh-3.00# wc -c a.out
91 a.out
91 bytes after rolling my own ELF header...
Nice size reduction from the 240 bytes. No real magic yet...
Like everything else in life, the last mile is the hardest, I guess...
Oh yes, from this point on it was serious mangling of the file format. But surprisingly, the current implementation of the Linux kernel does not actually do a whole lot with the ELF data, so you can use the portions it does not care about to store bytes for other uses.
As I suspected from the beginning, the padding could be used to store program code!
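Both the 91-byte figure and the padding trick are easier to follow with the header layout written out. The sketch below lists the fields of the 32-bit ELF header (Elf32_Ehdr) with their offsets and sizes, taken from the ELF specification. It assumes the 91-byte file is one ELF header plus one program header followed directly by the code, as in the tutorial (tiny.asm itself is not shown here). Which fields the kernel really ignores is the tutorial's finding; this sketch does not check that.

```python
import struct

# Elf32_Ehdr fields in order, with struct codes (H = 2 bytes, I = 4 bytes).
EHDR_FIELDS = [
    ("e_ident", "16s"), ("e_type", "H"), ("e_machine", "H"), ("e_version", "I"),
    ("e_entry", "I"), ("e_phoff", "I"), ("e_shoff", "I"), ("e_flags", "I"),
    ("e_ehsize", "H"), ("e_phentsize", "H"), ("e_phnum", "H"),
    ("e_shentsize", "H"), ("e_shnum", "H"), ("e_shstrndx", "H"),
]
PHDR_SIZE = struct.calcsize("<8I")   # Elf32_Phdr: eight 4-byte words
EI_PAD = 9                           # e_ident[9..15] is padding per the ELF spec

def layout(fields):
    """Return (name, offset, size) for each field, packed without gaps."""
    rows, offset = [], 0
    for name, code in fields:
        size = struct.calcsize("<" + code)
        rows.append((name, offset, size))
        offset += size
    return rows

rows = layout(EHDR_FIELDS)
ehdr_size = rows[-1][1] + rows[-1][2]
for name, offset, size in rows:
    print(f"{name:12} offset {offset:2}  size {size:2}")
print("ELF header:", ehdr_size, "program header:", PHDR_SIZE)
print("left for code in a 91-byte file:", 91 - ehdr_size - PHDR_SIZE)
print("padding bytes inside e_ident:", 16 - EI_PAD)
```

So the two headers alone account for 84 of the 91 bytes, which is why any further shrinking has to come from overlapping or reusing header fields rather than from the code.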
But in all honesty, 80% of the work was in getting rid of that last 46 bytes. It's just not worth it.
But the thing is, getting to the 91 bytes was not the massive slog I thought it would be. With that kind of size optimization I would be well chuffed even with this.
That said it is extremely cool.
Code:
sh-3.00# nasm -f bin -o a.out tiny.asm
sh-3.00# chmod +x a.out
sh-3.00# ./a.out ; echo $?
42
sh-3.00# wc -c a.out
45 a.out
45 BYTE executable....
But then I got to thinking... that exit call through the interrupt was pretty useful. I wonder what the other call numbers do...
Look up the file
"/usr/include/asm/unistd.h" 5L, 82C
What the dickens...
Code:
# ifdef __i386__
# include "unistd_32.h"
# else
# include "unistd_64.h"
# endif
No big deal. It looks like the tutorial was written before the 64-bit variant of this header came along...
Anyway, even a simpleton like myself can see it's just a straight choice between the 64-bit and 32-bit headers... let's go with the sane choice (as this is Puppy) and go for unistd_32.h.
Code:
#ifndef _ASM_X86_UNISTD_32_H
#define _ASM_X86_UNISTD_32_H
...
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
#define __NR_waitpid 7
#define __NR_creat 8
#define __NR_link 9
#define __NR_unlink 10
#define __NR_execve 11
#define __NR_chdir 12
#define __NR_time 13
#define __NR_mknod 14
#define __NR_chmod 15
#define __NR_lchown 16
#define __NR_break 17
#define __NR_oldstat 18
#define __NR_lseek 19
#define __NR_getpid 20
#define __NR_mount 21
#define __NR_umount 22
#define __NR_setuid 23
#define __NR_getuid 24
#define __NR_stime 25
#define __NR_ptrace 26
#define __NR_alarm 27
#define __NR_oldfstat 28
#define __NR_pause 29
#define __NR_utime 30
#define __NR_stty 31
#define __NR_gtty 32
#define __NR_access 33
#define __NR_nice 34
#define __NR_ftime 35
#define __NR_sync 36
#define __NR_kill 37
#define __NR_rename 38
#define __NR_mkdir 39
#define __NR_rmdir 40
#define __NR_dup 41
#define __NR_pipe 42
...
Wait, that's not such a mind melt: these are system calls, each with its own number! Hmmm, this might not be so bad!
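Since the header is just a list of name/number pairs, it is easy to pull into a lookup table. Here is a minimal sketch that only handles plain `#define __NR_name number` lines like the ones above. Headers on other platforms may define the numbers as expressions, which this regex would silently skip. `parse_syscalls` and `SAMPLE` are names made up for the example.

```python
import re

# Matches lines such as "#define __NR_exit 1" and captures ("exit", "1").
DEFINE = re.compile(r"^\s*#\s*define\s+__NR_(\w+)\s+(\d+)\s*$")

def parse_syscalls(text: str) -> dict:
    """Build a {name: number} table from unistd-style #define lines."""
    table = {}
    for line in text.splitlines():
        match = DEFINE.match(line)
        if match:
            table[match.group(1)] = int(match.group(2))
    return table

SAMPLE = """\
#ifndef _ASM_X86_UNISTD_32_H
#define _ASM_X86_UNISTD_32_H
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
"""

table = parse_syscalls(SAMPLE)
print(table["exit"], table["write"])  # 1 4
```

For the real thing, pass in the contents of unistd_32.h instead of `SAMPLE`.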
So here is a question: where can I find the system call lists for other systems? Perhaps I am being naive, but I would assume it follows a similar logic for, say, embedded devices, etc.?
As a simple exercise I would like to write a script to enable cross-platform compilation of the code.
All it would need is the table of system call numbers: I pass in the system I want to run the code on, and it writes an asm file for me after checking what the call number is on that particular platform.
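A rough sketch of what such a script could look like, covering only the exit call on two Linux targets. One catch worth flagging: the number is not the only thing that changes between platforms. The registers and the trap instruction differ too (32-bit x86 Linux uses int 0x80 with the number in eax, 64-bit x86 Linux uses the syscall instruction), so the table needs more than one column. The `TARGETS` table, the `exit_program` function and the NASM-style output are all assumptions for illustration; other platforms would need their own entries and quite possibly a different assembler syntax.

```python
# Exit call details for two Linux targets. Anything else would have to be
# filled in from that platform's own headers and ABI documentation.
TARGETS = {
    "i386":   {"bits": 32, "exit_nr": 1,  "nr_reg": "eax", "arg_reg": "ebx", "trap": "int 0x80"},
    "x86_64": {"bits": 64, "exit_nr": 60, "nr_reg": "eax", "arg_reg": "edi", "trap": "syscall"},
}

def exit_program(target: str, status: int) -> str:
    """Return NASM-style source for a program that exits with the given status."""
    if target not in TARGETS:
        raise ValueError("no table entry for target: " + target)
    t = TARGETS[target]
    lines = [
        "BITS {}".format(t["bits"]),
        "global _start",
        "section .text",
        "_start:",
        "    mov {}, {}".format(t["nr_reg"], t["exit_nr"]),   # system call number
        "    mov {}, {}".format(t["arg_reg"], status),        # exit status
        "    {}".format(t["trap"]),                           # enter the kernel
    ]
    return "\n".join(lines) + "\n"

print(exit_program("i386", 42))
```

Writing the result to a file is then a one-liner, for example `open("tiny.asm", "w").write(exit_program("i386", 42))`.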