Universal package database format

Poll: Would you support a new XDG spec for packages?
Yes: 100% [ 3 ]
No: 0% [ 0 ]
Maybe (with comment): 0% [ 0 ]
Total Votes : 3

technosaurus


Joined: 18 May 2008
Posts: 4756
Location: Kingwood, TX

Posted: Wed 25 Oct 2017, 00:15    Post subject: Universal package database format

It is way past time that Linux/BSD got a package specification format similar to the XDG desktop entry spec and the menu spec

I am bringing this up because I read this week's distrowatch article on Ravenports. The article makes it sound great, but I'm a "show-me-the-code" kinda guy and OMFG what a total mess - it makes the mozilla source tree look clean. Everything is in folders named bucketXX with no organization whatsoever, and it's a new project, so it will probably only get worse.

I'm not saying there isn't a niche to be filled; there is, but Ravenports does not appear to be the answer.

A better solution would be to implement a new XDG specification for package specs - modeled after the desktop entry specification so that individual packages could output a single file that any distro could grok into their own format (for backward compatibility) or use directly. It could be added into the auto-tools suite or provided as a standalone script.

In addition to the (localized) keys in .desktop files, you would have package sizes, various checksums, (build) dependencies, source url, VCS url, maintainer, license, etc... See links at the end of this post for the random stuff some distros have in their package control files.

The reason I mention the desktop files is that they already contain much of the useful information and it is commonly already localized. For example:
* Localization (as in the desktop files) would allow more user friendly package management for non-english users because each translatable field "Key=" can have a corresponding "Key[lang]=" equivalent
* Mimetype (as in the desktop files) could be used by the package manager to handle an xdg-open of a file with no default handler for its Mimetype. Then the Exec field (as in the desktop file) could open it directly when it is installed... Come on even MS can handle this one (OK, they just use extensions, but the concept is the same)
* XDG specified Categories (as in the desktop files) could be used directly by package managers or mapped to legacy groups for backward compatibility. Using the same categories in the package manager as in the menu would make it more user friendly because it would be in the same relative location in the menu as it was in the graphical package manager. (see my jwm_intstall_menu_create script in the jwm-tools thread)
* The Icon field (as in the desktop file) could be used in graphical package managers, if available, to help the user quickly find what they installed.
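
As a rough illustration of the points above, here is a minimal Python sketch that reads a desktop-entry-style package file and resolves a localized key. The `[Package Entry]` group name and the `Size` and `Depends` keys are assumptions made up for this example; only `Name`, `Comment`, `MimeType`, `Categories`, `Exec` and `Icon` come from the existing desktop entry spec.

```python
# Hypothetical package entry: the group name, Size and Depends are invented here.
SAMPLE = """\
[Package Entry]
Name=Image Viewer
Name[de]=Bildbetrachter
Comment=View and rename images
Comment[de]=Bilder ansehen und umbenennen
MimeType=image/png;image/jpeg;
Categories=Graphics;Viewer;
Exec=viewer %f
Icon=viewer
Size=372
Depends=gtk+;libpng;
"""

def parse_entry(text):
    """Parse 'Key=Value' lines into a dict, skipping group headers and comments."""
    entry = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entry[key.strip()] = value.strip()
    return entry

def localized(entry, key, lang=None):
    """Return the 'Key[lang]' value if present, else fall back to plain 'Key'."""
    if lang:
        # Try the full locale first (de_DE), then the bare language (de).
        for candidate in (lang, lang.split("_")[0]):
            value = entry.get(f"{key}[{candidate}]")
            if value is not None:
                return value
    return entry.get(key)

def as_list(entry, key):
    """Split a semicolon-separated value such as MimeType into a list."""
    return [item for item in entry.get(key, "").split(";") if item]

entry = parse_entry(SAMPLE)
print(localized(entry, "Name", "de_DE"))
print(as_list(entry, "MimeType"))
```

A package manager could use the same lookup for every translatable field, so a user with no translation available simply sees the unlocalized value.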

Integrating the package info into the build process would use information that is normally already available anyhow and thus eliminate a lot of cross-distro duplicated effort. Most of the rest of the data that isn't contained in the desktop files, is already part of the build process (dependencies, build dependencies, sizes, checksums, maintainer etc...). Just have a package.spec.in file similar to the already existing package.desktop.in files

Other fields that could be useful:
Memory usage, CPU usage, ProjectURI, DonationURI, DocumentationURI, ScreenshotURI, BugsURI, SupportURI, WikiURI, ForumURI, etc...

Then you have the various ways of package splitting: from a simple slackbuild that produces one package per source tarball to splitting out binaries, libraries, development (DEV) files, documentation (DOC), data and localization (NLS) into separate packages. These can be split even further: NLS to each language such as package-*-NLS-en_GB. Split DEV so includes, pkgconfig files, etc... are in one package and the *.so links in a DEV-shared and *.a libs in a DEV-static. Split DOC into man, info, html, etc...

Moving the majority of this stuff into the upstream repositories would be better for upstream developers too. That way they can directly update changes to things like links, repositories, contact info etc... I still see distros that list a package's homepage as freshmeat, berlios, codeplex or google code and have even seen dead guys as the contact. It's better for up to date localization too.

Don't get me wrong, I'm not a big fan of autotools, but an improved version with hooks for different types of build failures that could be integrated with the package manager would be nice. Instead of a cryptic failure message, get something like Can't find "X": _Install, _Build, _BuildStatic, _Disable, _Quit (with install being the default if available followed by _Disable if it can just be disabled) ... but we need _something_. With no de facto standard, each distro does its own thing - badly. For example: Alpine Linux started as a small distro so they don't yet have package categories - just one big pile, but at least they aren't arbitrarily divided into ambiguous bucket** folders. If you look through the build system of your favorite distro (all/any of them), 90% of the BS code is related to this one thing. If each package automatically output this to a specified format, most of that code could be eliminated.

Currently each distro painstakingly does the same thing differently. For example:
Alpine uses these
Arch uses these
Debian uses these
Fedora uses these and these
Solus uses these
Suse uses these
Ubuntu uses these

However, the actual databases contain even more fields (this list is probably incomplete; I just hit the limit in ideone a couple of times):
    Arch
    Architecture
    Breaks
    Bugs
    Build-Depends
    Build-Depends-Indep
    Build-Ids
    Changed-By
    Closes
    Conflicts
    Depends
    Description
    Description-md5
    Enhances
    Filename
    Homepage
    Installed-Size
    Keywords
    Maintainer
    MD5sum
    Multi-Arch
    Npp-Applications
    Npp-Description
    Npp-File
    Npp-Mimetype
    Npp-Name
    Origin
    Original-Maintainer
    Package
    Pre-Depends
    Priority
    Provides
    Recommends
    Replaces
    Section
    SHA1
    SHA256
    Size
    Source
    Standards-Version
    Suggests
    Tads2-Version
    Tads3-Version
    Tag
    Task
    Vcs-browser
    Vcs-Git
    Version


Obviously some of these are superfluous: *-Version - just put the version in the dependency (>=2<3 to require version 2.x); Npp-* - no reason to handle browser plugins differently (though I think Mimetype should be part of the package spec).
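
To make the `>=2<3` idea concrete, here is a minimal Python sketch of parsing and checking such a range. The constraint syntax is only what the post hints at, not an existing standard, and versions are compared as plain dotted integers, an assumption that real-world version strings often violate.

```python
import re

# Comparison operators accepted in a constraint such as ">=2<3"
OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "=": lambda a, b: a == b,
}

def parse_version(text):
    """Turn '2.10.1' into (2, 10, 1); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))

def parse_constraint(text):
    """Split '>=2<3' into [('>=', (2,)), ('<', (3,))]."""
    pairs = re.findall(r"(>=|<=|>|<|=)\s*([0-9]+(?:\.[0-9]+)*)", text)
    return [(op, parse_version(ver)) for op, ver in pairs]

def satisfies(version, constraint):
    """Return True if version meets every bound in the constraint."""
    v = parse_version(version)
    for op, bound in parse_constraint(constraint):
        # Pad with zeros so that 2 and 2.0 compare as equal.
        width = max(len(v), len(bound))
        a = v + (0,) * (width - len(v))
        b = bound + (0,) * (width - len(bound))
        if not OPS[op](a, b):
            return False
    return True

print(satisfies("2.4.1", ">=2<3"))
print(satisfies("3.0", ">=2<3"))
```

With a range in the dependency itself, separate fields like Tads2-Version would not be needed.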

What fields should be added/removed from the list?
I'd like to hear comments... I'd like to see CPU usage, RAM usage and Mimetype for obvious reasons, but let me know if you have any insightful ideas.

_________________
Check out my github repositories. I may eventually get around to updating my blogspot.
sc0ttman


Joined: 16 Sep 2009
Posts: 2548
Location: UK

Posted: Wed 25 Oct 2017, 20:21

Sorry to be a bit thick...

Are you proposing we replace the pet.specs file with a much more info filled, XDG-style/compliant alternative?

Are you proposing we ditch .PET filetypes and choose some other archive type?

Are you proposing the creation of build scripts which contain all this info, and a new tool to build the pkgs?

Are you proposing a totally new pkg manager written from scratch to install/remove them, etc?

I'm assuming "yes" to all, but just trying to get my head around it..

_________________
Akita Linux, VLC-GTK, Pup Search, Pup File Search
technosaurus


Joined: 18 May 2008
Posts: 4756
Location: Kingwood, TX

Posted: Wed 25 Oct 2017, 20:32

sc0ttman wrote:
Sorry to be a bit thick...

Are you proposing we replace the pet.specs file with a much more info filled, XDG-style/compliant alternative?

Are you proposing we ditch .PET filetypes and choose some other archive type?

Are you proposing the creation of build scripts which contain all this info, and a new tool to build the pkgs?

Are you proposing a totally new pkg manager written from scratch to install/remove them, etc?

I'm assuming "yes" to all, but just trying to get my head around it..

Yeah, sorry for the wordiness; I didn't condense my thoughts - just brainstorming at the moment. More like a simple standardized format that individual packages would produce that each distro could easily convert into their own database format (automatically using a script) instead of manually reproducing the work.

Puppy devs would be free to keep the pet.specs and petget and woof.

I am in the early stages of starting a new distro that starts with nothing but a browser, window manager and terminal and seamlessly installs what you need when you need it on the fly. That is why I want mime types in the package database (to bring up choices to handle any opened file) as well as resource usage (to help the user choose which program)
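
A minimal sketch of that lookup, assuming the package database already exposes a MIME type list and some memory figure per package. The package names, the `RamUsageMB` key and all numbers are invented for illustration.

```python
import mimetypes

# Invented sample data: names, MIME types and RAM figures are illustrative only.
PACKAGES = [
    {"Name": "bigpaint", "MimeType": ["image/png", "image/jpeg"], "RamUsageMB": 180},
    {"Name": "tinyview", "MimeType": ["image/png"], "RamUsageMB": 12},
    {"Name": "texted", "MimeType": ["text/plain"], "RamUsageMB": 8},
]

def handlers_for(mime_type, packages=PACKAGES):
    """Return names of packages that declare mime_type, lightest on RAM first."""
    matches = [p for p in packages if mime_type in p["MimeType"]]
    return [p["Name"] for p in sorted(matches, key=lambda p: p["RamUsageMB"])]

def handlers_for_file(filename, packages=PACKAGES):
    """Guess the MIME type from the file name, then look up handlers."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return handlers_for(mime_type, packages) if mime_type else []

print(handlers_for("image/png"))
```

The sorted list is what the user would be offered when opening a file that has no installed handler yet.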

amigo

Joined: 02 Apr 2007
Posts: 2611

Posted: Thu 26 Oct 2017, 15:32

This is one of my favorite topics, but it is late here, so I'll have to promise to come back tomorrow. I've designed three or four package formats with some advanced/optional database features. Of course, my src2pkg has been taught to make them so both sides of the problem are laid out together in synergy.
sc0ttman


Joined: 16 Sep 2009
Posts: 2548
Location: UK

Posted: Fri 27 Oct 2017, 10:38

I just found the GitHub issue you created back in 2015, techno, about improving PetGet and de-coupling it from X..

My Pkg thing actually covers nearly all of that ...

The only area it falls short is that Pkg tends to use grep/cut etc, rather than IFS="|" ..

It shouldn't be too hard to get into Pkg all the stuff you want from a pkg manager (listed here, and in ur Github issue) as it is *mostly* broken up into small funcs..

Anyway, I would be more than willing to add these features into Pkg, *especially* the code clean ups, the IFS stuff, reducing amount of code, etc .... but would prob need some hand holding

Oh and amigo, can you maybe PM me the "whats special" about the ".tpkg" format please... I will make Pkg support it, won't be much extra code (I think), but am just interested in whether or not the "tpkg" format is still in use, and what's diff compared to say a PET...

As a possibly related note, I would like to eventually make Pkg support compiling pkgs from many sources - src2pkg build scripts, Alpine build scripts, SlackBuilds, CRUX build scripts, Arch build scripts, etc, etc...

Thanks guys.

amigo

Joined: 02 Apr 2007
Posts: 2611

Posted: Sat 28 Oct 2017, 15:33

I think the proposal for a universal database or freedesktop spec can only work in the context of a 'platform', like 'android' or CoreOS -the latter being quite likely since FDO is very redhat-friendly. Of course that means that Poettering & Co. (systemd) will rule the roost.

Dependency chains are build-time specific, varying according to configuration options being enabled or disabled, and/or the presence/absence of certain libraries. Since this is so, any two distros are quite likely to have different builds of any given library/program.

Even legal jurisdiction could lead to differences. An easy example is ffmpeg or vlc. A distributor in the USA would be constrained from enabling support for certain codecs and would not include them in the build, thus they would not show in the dependency chain. Someone from Iceland would have no such constraints, would include these optional items and have them in their dep-chains.

A more appropriate-to-puppy example would be that here one might try to make things smaller by leaving out support for little-used features which depend on optional libs.

Incidentally, Thomas Leonard, author of rox, is one of the main instigators at freedesktop.org. He came up with two early schemes for standardizing and streamlining software deployment. The first was the AppDir -actually he didn't originate the idea, but did make it more popular than before. Apple's ClassicOS used Applications-in-a-directory. And most ancillary software for UNIX was delivered so as to be (compiled &) run from a separate, single dir.

Leonard's second attempt was ZeroInstall -it really was quite similar to the AppDir concept but uses local caching of software after retrieving it online.

When The One Distro to Rule Them All comes, and we have submitted, then we can have universal anything.
amigo

Joined: 02 Apr 2007
Posts: 2611

Posted: Sat 28 Oct 2017, 16:34

Scottman, your 'pkg' is based on the most faulty package specification/concept ever to hit us -the .pet -also known as .pup improved. My early work with src2pkg involved trying to 'enhance' the Slackware package format, in the same lame way that zenwalk, salix, porteus and others have.

Slackware has a pretty lame package format, with a lame naming system, lame categories, lame handling of packages and repos, with database files which are hard to parse and very sparse on information. Of course, Slackers, masochists that they are, deny the existence of dependencies -if you install the whole 6GB's, then (most) everything will work. And you can never reproduce the builds of packages since they are never sanely and completely re-compiled -even when the toolchain and runtime changes.

My frustration with the Slackware database made me start to experiment with creating my own package format. 'tpkg' was the first result. The current incarnation is called 'tpm' (Tar Package Manager). They both are tar.xz archives. But, archive compression method is only the tiniest bit of what a packaging system is.

Starting from scratch, I was able to leverage control of packaging naming, database location(think filesystem as a database) as well as all other aspects of packaging. The Slackware system and philosophy was always about the distro doing the least work possible to keep things running. I also built my own distros from scratch -in fact that was what motivated me to write src2pkg. I had the same need to do as much as possible as a one-man show.

That meant that src2pkg needed to be able to generate uniform packages with as little human input as possible. But, it had to be able to blend human input along with what it generated itself. And, it can leverage info from deb, rpm, SlackBuild, PKGBUILD and other sources.

This need, sometimes, for human input is still the bottleneck with all package-building. Say you want to have packages belong to certain categories and identify themselves that way. That means that you need to tell the build script for each package which category it should belong to, so that the info can be included in some spec file which gets included with the package, so that the info can be stored in the database and be available for use by package management.

Another tricky thing is in naming dependencies like pkgname >= 2.0, that is being able to give some _range_ of dependency. This info can only be supplied through human input. IIRC, tpkg was still trying to deal with/use that nuance, but tpm has simplified -it tells you exactly what was used/present for linking at build-time and doesn't try to deal with dep version ranges.

Here's a teaser dir listing from the tpkg database at /var/lib/tpkg:
conflicts
links
manifests
packages
postinst
postrm
preinst
prerm
provides
removed
required
setup
suggests
transactions

For tpm, I have cut that down a bit, by including some info in a general db file for the package as variables. I have designed all the db files and dir structure to make them as easy to parse (or not) as possible. For instance, requires for each package are listed in a separate file for each package, located in a directory where no other files are located. This allows me to search for dependencies and reverse dependencies, returning package names simply by using grep in the right dir -instead of trying to parse such info out of long files with mixed data in them. Also, where best, things are specified as bash-snippets. That is, instead of having this:
NAME: MyPackage
VERSION: 0.0.0
I use this:
NAME=MyPackage
VERSION=0.0.0
That way, I don't have to parse anything at all, I simply 'source' the file and those values are imported and ready to use.

tpm is using these dirs only:
file-lists
history
manifests
packages
requires
setup

Under file-lists are files named like this:
danpei-nls_2.9.7-noarch-1
danpei_2.9.7-i586-1

with content like this:
f:usr/bin/danpei
l:usr/bin/danpei2 -> danpei
d:usr/share/doc/danpei-2.9.7/
f:usr/share/doc/danpei-2.9.7/AUTHORS
f:usr/share/doc/danpei-2.9.7/ChangeLog.en
f:usr/share/doc/danpei-2.9.7/README.en
f:usr/share/doc/danpei-2.9.7/danpei.html
f:usr/share/doc/danpei-2.9.7/danpei.src2pkg
f:usr/share/pixmaps/danpei.xpm
f:usr/share/applications/danpei.desktop
d:usr/share/man/man1/
f:usr/share/man/man1/danpei.1.gz
l:bin/danpei3 -> ../usr/bin/danpei
h:bin/danpei4 -> usr/bin/danpei
tpkg used simple paths, but for tpm I do add extra info here which requires parsing to obtain the item type, path and (with l or h) the targets of the links.
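
The parsing this needs is light. Here is a minimal Python sketch, reading the type letters from the sample above as f = file, d = directory, l = symlink and h = hard link (that reading is an assumption based on the listing, not on tpm's documentation). A path that itself contains " -> " would confuse it.

```python
from typing import NamedTuple, Optional

# Meaning of the type letters, as inferred from the sample listing
KINDS = {"f": "file", "d": "directory", "l": "symlink", "h": "hardlink"}

class Entry(NamedTuple):
    kind: str
    path: str
    target: Optional[str]

def parse_file_list_line(line):
    """Split 'l:usr/bin/danpei2 -> danpei' into kind, path and link target."""
    code, _, rest = line.rstrip("\n").partition(":")
    target = None
    if code in ("l", "h"):
        # Only link entries carry a ' -> target' suffix.
        rest, _, target = rest.partition(" -> ")
    return Entry(KINDS[code], rest, target)

for sample in ("f:usr/bin/danpei",
               "l:usr/bin/danpei2 -> danpei",
               "d:usr/share/man/man1/"):
    print(parse_file_list_line(sample))
```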

manifests has files with content like this:
./| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/bin/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/bin/danpei| -rwxr-xr-x root/root 250904 2014-07-16 12:08
usr/bin/danpei2| lrwxrwxrwx root/root 0 2014-07-16 12:08 "danpei"
usr/share/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/share/doc/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/share/doc/danpei-2.9.7/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/share/doc/danpei-2.9.7/AUTHORS| -rw-r--r-- root/root 462 2004-08-01 14:44
usr/share/doc/danpei-2.9.7/ChangeLog.en| -rw-r--r-- root/root 758 2005-03-05 11:53
usr/share/doc/danpei-2.9.7/README.en| -rw-r--r-- root/root 17105 2005-03-08 10:34
usr/share/doc/danpei-2.9.7/danpei.html| -rw-r--r-- root/root 8295 2014-07-16 12:08
usr/share/doc/danpei-2.9.7/danpei.src2pkg| -rw-r--r-- root/root 1277 2014-07-16 12:08
usr/share/pixmaps/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/share/pixmaps/danpei.xpm| -rw-r--r-- root/root 9424 2014-07-16 12:08
usr/share/applications/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/share/applications/danpei.desktop| -rw-r--r-- root/root 570 2014-07-16 12:08
usr/share/man/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/share/man/man1/| drwxr-xr-x root/root 0 2014-07-16 12:08
usr/share/man/man1/danpei.1.gz| -rw-r--r-- root/root 1525 2014-07-16 12:08
bin/| drwxr-xr-x root/root 0 2014-07-16 12:08
bin/danpei3| lrwxrwxrwx root/root 0 2014-07-16 12:08 "../usr/bin/danpei"
bin/danpei4| hrwxr-xr-x root/root 0 2014-07-16 12:08 "usr/bin/danpei"
install/| drwxr-xr-x root/root 0 2014-07-16 12:08
install/pkg-spec| -rw-r--r-- root/root 1243 2014-07-16 12:08
install/postinst| -rw-r--r-- root/root 180 2014-07-16 12:08
install/postrm| -rw-r--r-- root/root 186 2014-07-16 12:08
install/pkg-requires| -rw-r--r-- root/root 202 2014-07-16 12:08
This is basically the output from the tar listing, rearranged for easier parsing and hmmm, tpkg also adds an md5sum for every file so one can always verify if a file has changed after installation.

packages files look like this:
package="danpei_2.9.7-i586-1.tpm"
packager="Gilbert Ashley <amigo@ibiblio.org>"
name="danpei"
version="2.9.7"
arch="i586"
release="1"
sig=""
pkg_creation_date="2014-07-16_12:08:16"
uncompressed_size="372 KB"
os_version="KISS 5.0"
host="kiss"
processor="i686 Intel(R) Pentium(R) 4 CPU 2.00GHz"
target_architecture="i586-kiss-linux"
src2pkg_version="3.0"
license="GPL-2"
summary="GTK-1.2 Image Viewer"
description="
Danpei is a Gtk+ based Image Viewer, works on X Window Sysytem.
You can look through your image files in Thumbnail form,
and can rename,cut and paste them easily.
"
sub_packages="danpei-nls_2.9.7-noarch-1.tpm "
kernel='3.14.5 #1 Sun Jun 15 20:28:04 GMT 2014'
toolchain="glibc-2.13 gcc-4.5.3 binutils-2.20.51.0.8.20100412"
source_name="danpei_2.9.7.orig.tar.gz"
source_md5sum="65c5352379d50c7a37a1118713c8dcd5"
patches="
danpei_2.9.7-1ubuntu1.diff.gz
"
build_configuration='
LDFLAGS="-Wl,-O1,-L/lib,-L/usr/lib,--relax,--sort-common,--no-keep-memory"
CFLAGS="-O2 -m32 -pipe -fomit-frame-pointer -fno-strict-aliasing -Wno-shadow -Wno-unused -march=i586 -mtune=i686"
CXXFLAGS="-O2 -m32 -pipe -fomit-frame-pointer -fno-strict-aliasing -Wno-shadow -Wno-unused -march=i586 -mtune=i686"
./configure --prefix=/usr --libdir=/usr/lib
'
(note this file is 'sourceable' so all that detail is easy to get out.)
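
For tools that are not written in shell, or that would rather not execute package-supplied text, the same file can be read without sourcing it. This is not what tpm does (tpm simply sources the file); it is a minimal Python sketch that relies on `shlex` to apply shell quoting rules, and it does no variable or command expansion.

```python
import shlex

# Shortened sample in the same style as the packages file above
SAMPLE = '''\
name="danpei"
version="2.9.7"
arch="i586"
kernel='3.14.5 #1 Sun Jun 15 20:28:04 GMT 2014'
description="
Danpei is a Gtk+ based Image Viewer.
You can look through your image files in Thumbnail form.
"
'''

def read_shell_vars(text):
    """Read simple KEY=value assignments using shell quoting, without running code."""
    values = {}
    # shlex.split keeps quoted text (even across newlines) inside one token.
    for token in shlex.split(text):
        key, sep, value = token.partition("=")
        if sep and key.isidentifier():
            values[key] = value
    return values

info = read_shell_vars(SAMPLE)
print(info["name"], info["version"], info["arch"])
print(info["description"].strip())
```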

requires files look like this:
gdk-pixbuf_0.22.0-i586-3
glib_1.2.10-i586-3
glibc_2.13-i586-4
gtk+_1.2.10-i586-3
libX11_1.4.3-i586-1
libXau_1.0.6-i586-1
libXdmcp_1.1.0-i586-1
libXext_1.2.0-i586-1
libpng_1.4.9-i586-1
libxcb_1.7-i586-1
(note that having no comments allows for simple grep usage without pipes.)
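
The reverse-dependency lookup described above is roughly what `grep -lFx` does inside the requires directory. Here is a minimal Python sketch of the same idea; the directory is a throwaway one and the two extra package names are invented for the demo.

```python
import tempfile
from pathlib import Path

def reverse_dependencies(requires_dir, wanted):
    """Return names of packages whose requires file lists `wanted` exactly."""
    result = []
    for req_file in sorted(Path(requires_dir).iterdir()):
        # One dependency per line and no comments, so a line match is enough.
        if wanted in req_file.read_text().splitlines():
            result.append(req_file.name)
    return result

# Build a throwaway requires/ directory with sample contents.
with tempfile.TemporaryDirectory() as tmp:
    requires = Path(tmp)
    (requires / "danpei_2.9.7-i586-1").write_text(
        "glib_1.2.10-i586-3\nlibpng_1.4.9-i586-1\n")
    (requires / "otherapp_1.0-i586-1").write_text("libpng_1.4.9-i586-1\n")
    (requires / "plainapp_0.1-i586-1").write_text("glibc_2.13-i586-4\n")
    found = reverse_dependencies(requires, "libpng_1.4.9-i586-1")

print(found)
```

Because the file name is the package name, no further parsing is needed to turn matches into package names.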

setup is simply a temp dir where tpkg/tpm creates files during procedures.

'history' is a flat file which records package transactions, like this:
2013-12-13_21:24:15 danpei_2.9.7-i586-1 installed . 0f5d000cc17df0be414556cf80c577c0 440K 107328 /usr/src/KISS-5/SOURCE/utils/tpm2
2013-12-13_21:24:18 danpei_2.9.7-i586-2 upgraded danpei_2.9.7-i586-1 0f5d000cc17df0be414556cf80c577c0 107328 /usr/src/KISS-5/SOURCE/utils/tpm2
2013-12-15_11:19:34 danpei_2.9.7-i586-1 downgraded danpei_2.9.7-i586-2 0f5d000cc17df0be414556cf80c577c0 440K 107328 /usr/src/KISS-5/SOURCE/utils/tpm2


tpkg is a scalable system, meaning that some database features are optional and can be turned off. For instance package transactions, auto-backup of packages when removing them, and others.

Whereas tpkg was still using installation scripts (like doinst.sh), tpm experiments with using bash functions, so that instead of having a doinst.sh, the commands are written as a function and can be easily included in the package description file. The reason for the effort is to avoid having any installation scripts - a rogue dev/user could put ANYTHING in there. A safer way is to use 'triggers' which can cause the package manager itself to perform operations -only a limited set of actions are available so no unsafe code can be imported from doinst.sh.
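
A minimal sketch of the trigger idea, in Python rather than bash: the package only names actions, and the package manager maps each name to a command it owns. The trigger names here are hypothetical (the thread does not list tpm's real ones), and the sketch only plans commands, it does not run them.

```python
# Hypothetical trigger names mapped to commands owned by the package manager
ALLOWED_TRIGGERS = {
    "refresh-library-cache": ["ldconfig"],
    "refresh-desktop-menu": ["update-desktop-database"],
}

def plan_triggers(requested):
    """Map requested trigger names to whitelisted commands.

    Anything not on the whitelist is rejected instead of executed.
    """
    commands, rejected = [], []
    for name in requested:
        if name in ALLOWED_TRIGGERS:
            commands.append(ALLOWED_TRIGGERS[name])
        else:
            rejected.append(name)
    return commands, rejected

commands, rejected = plan_triggers(["refresh-library-cache", "rm -rf /"])
print(commands)
print(rejected)
```

The package never supplies code, only names, so the worst a rogue package can do is request an action the manager already considers safe.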

I'll stop now for a while and give you both a chance to respond. I have lots more to discuss if you are still interested.
wanderer

Joined: 20 Oct 2007
Posts: 503

Posted: Sat 28 Oct 2017, 18:14

technosaurus

a minimal os core
that installs packages from various sources as needed

is that what you are saying ?

that would be totally awesome

i cant wait

wanderer
sc0ttman


Joined: 16 Sep 2009
Posts: 2548
Location: UK

Posted: Sun 29 Oct 2017, 10:22

amigo,

Cheers for the info, will download some packages from your ibiblio and have a look.. About Pkg - I'm not actually tied to any package specification (as a preference I mean)..

Pkg started life using the pre-woof format repos! ..Ideally I want it to use some *other* default repo db and pkg format (technos, yours) in future - I hate making/parsing petspec entries (in pkgs and repos) as much as anyone..

I already pondered forking Woof to try re-organising its ~/.package and petspecs stuff into something closer to your tpm setup.. I too figured a "filesystem as a database" approach would be better, with sourcing of files being frequently used.. (One of the things I don't like about Petbuild is that you can't just source the scripts.. They run stuff..)

Am happy to pick your brains further (if you don't mind), but don't wanna derail this thread to make it about Pkg..

And techno - still eagerly awaiting some kind of cool demo, to help me get my head around it and have a play!

amigo

Joined: 02 Apr 2007
Posts: 2611

Posted: Sun 29 Oct 2017, 14:23

I'm still checking out a couple of items from the list which I had not heard of. But, the link to solus led to an interesting link which I want to quickly share:
https://spdx.org/licenses/
A huge list of opensource licenses with their identifiers and a git archive of copies of all the licenses.

This is relevant as the handling of licenses is important to packaging and repos. src2pkg has a routine which can automatically detect some licenses and 'fill in the blanks' for you. The src2pkg routines for creating tpkg/tpm use a reference set of licenses from a debian package 'common-licenses'. debian uses this package in a sort of non-legal way -instead of including a copy of the right license *within* each package, the database files within the package list the license type. On installation I guess a link may be created in the docs dir for the package, which points to the copy of that license found in common-licenses.

Early on, src2pkg could reduce the number of copies of licenses by doing just that, *but* each package would contain a copy of the license. On installation, the real copy in the prog docs would be replaced with a link to the common license, hence no redundant copies after installation. I think this way indisputably obeys the GPL & Co. requirements for shipping the license *with* any binary which gets distributed.

Ughh, sorry for mentioning licenses -I'll get back to preparing for another assault at responding to the OP more directly.
technosaurus


Joined: 18 May 2008
Posts: 4756
Location: Kingwood, TX

Posted: Mon 30 Oct 2017, 19:50

@amigo - sorry for the late response. It took me a while to go through the src2pkg code again. You have done a lot of work making a universal receiver for package builds -awesome. Rather than supporting all the different formats of output though (great work with that too BTW), what I am proposing is one package format that retains enough information from the build that a fairly simple script can transform it into one or more .deb, .rpm, .tazpkg, or other package formats ... basically a universal donor package (which I am calling O negative or .o-pkg). Funny that you should mention SPDX, Rob Landley had a nice post about transitioning from developing for a universal receiver license (GPL & busybox) to a universal donor license (0BSD & toybox)

The repackager would take the spec file and other things that you are tracking in src2pkg like language files as well as others like .desktop files, /usr/share/doc/<package>/* /usr/lib/<package>/plugins/* and so forth.

With the O-pkg (O negative is the universal donor) it can be converted almost directly to a slack tgz or arch tar.xz or split up in a variety of ways and converted to other formats like .deb or .rpm

Here are some of the ways I would like to support splitting the packages (all splits optional).
DEV (includes, pkgconfig), DEV-shared (.so symlinks), DEV-static (.a)
DOC or DOC-man + DOC-info + DOC-html, etc...
NLS or multiple NLS-lang ... I even have an idea to provide google translation support for unsupported languages.
COMMON basically any data files (not binaries or libs) to allow package sharing between architectures
LIB(s) shared libs as separate package(s) ... though it would be more obvious that subversion and samba are a PITA
PLUGINS or PLUGIN-* (.so files in subdirectories of LD_LIBRARY_PATH)
BIN optionally separate packages for each binary; the main package would contain the binary that is executed in the desktop file
COPYING, AUTHORS, CHANGES and other requirements that are non-essential for program operation. When building an iso, they can be included outside of the root filesystem
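
A minimal sketch of how such a split could be driven by path patterns. The patterns, the rule order and the `MAIN` fallback are illustrative guesses, not part of any spec, and the DEV-shared rule goes by file name only rather than checking that the .so is actually a symlink.

```python
import fnmatch

# First matching rule wins; patterns are illustrative guesses, not a spec.
SPLIT_RULES = [
    ("PLUGINS", "usr/lib/*/*.so"),       # .so files in subdirectories
    ("DEV-static", "usr/lib/*.a"),
    ("DEV-shared", "usr/lib/*.so"),      # unversioned .so names
    ("LIB", "usr/lib/*.so.*"),           # versioned shared libraries
    ("DEV", "usr/include/*"),
    ("DEV", "usr/lib/pkgconfig/*"),
    ("DOC-man", "usr/share/man/*"),
    ("DOC-info", "usr/share/info/*"),
    ("DOC", "usr/share/doc/*"),
    ("NLS", "usr/share/locale/*"),
]

def classify(path, per_language_nls=False):
    """Return the sub-package a file would land in, or MAIN if no rule matches."""
    for group, pattern in SPLIT_RULES:
        if fnmatch.fnmatchcase(path, pattern):
            if group == "NLS" and per_language_nls:
                # usr/share/locale/<lang>/... -> NLS-<lang>
                return "NLS-" + path.split("/")[3]
            return group
    return "MAIN"

def split_package(paths, per_language_nls=False):
    """Group a flat file list into {sub-package: [paths]}."""
    groups = {}
    for path in paths:
        groups.setdefault(classify(path, per_language_nls), []).append(path)
    return groups

print(split_package([
    "usr/bin/abiword",
    "usr/lib/libabiword.so.3",
    "usr/lib/abiword/plugins/pdf.so",
    "usr/share/locale/de/LC_MESSAGES/abiword.mo",
], per_language_nls=True))
```

Since every split is just another rule, a distro that wants fewer packages can drop rules and let those files fall through to the main package.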

Depending on how these are split the resulting packages would/could get different "depends" and "suggests". For example, abiword-BIN-3.0.2 would have either nothing or abiword-PLUGINS or many abiword-PLUGIN-* entries in "SUGGESTS" depending on whether and how plugins are split... similar for "RDEPENDS" depending on how/whether LIB is split out and whether it was built with static or shared libs (ldd or objdump will give exact dependencies).

NEXT I need to:
* set up a table to map the various pieces/parts that are needed by each distro, but it seems that most distros support some undocumented specs, so I may need to download the database files and grep them through sort and uniq, so that I don't miss something important.
* write a script to collate all the data into simple format (like amigo, I prefer the shell variable method)
* write a script to split and convert the O-pkg to various formats using that data
* write config files for each major distro that people want support for
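
The mapping table and conversion step could start out as small as this Python sketch. The generic key names on the left are assumptions borrowed from the shell-variable style discussed in this thread; the Debian field names on the right appear in the field list earlier in the thread. The architecture mapping is a guess at how one distro's names would translate.

```python
# Generic key -> Debian control field (generic key names are assumptions)
DEBIAN_FIELDS = [
    ("name", "Package"),
    ("version", "Version"),
    ("arch", "Architecture"),
    ("packager", "Maintainer"),
    ("requires", "Depends"),
    ("summary", "Description"),
]

# Illustrative architecture name translation
ARCH_TO_DEBIAN = {"i586": "i386", "x86_64": "amd64", "noarch": "all"}

def to_debian_control(pkg):
    """Render a generic package dict as a Debian-control-style stanza."""
    lines = []
    for key, field in DEBIAN_FIELDS:
        value = pkg.get(key)
        if not value:
            continue  # leave out fields the package did not provide
        if key == "arch":
            value = ARCH_TO_DEBIAN.get(value, value)
        elif key == "requires":
            value = ", ".join(value)
        lines.append(f"{field}: {value}")
    return "\n".join(lines) + "\n"

stanza = to_debian_control({
    "name": "danpei",
    "version": "2.9.7",
    "arch": "i586",
    "requires": ["gtk+", "libpng"],
    "summary": "GTK-1.2 Image Viewer",
})
print(stanza)
```

A per-distro config file would then amount to one such table plus whatever value translations that distro needs.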

amigo

Joined: 02 Apr 2007
Posts: 2611

Posted: Tue 31 Oct 2017, 07:05

Support for many output formats in src2pkg has led to more bloat and complexity than I ever really wanted. src2pkg is ten years+ old and mission-creep set in from the beginning. What it does best that virtually no other packager does, is discover how to build something without using any human input at all -without any build recipe.

I implemented package splitting for my package formats for the same reason as others -wanting to achieve a smaller install. BTW, Slackware does split a handful of packages -glibc, gcc, mozilla. Anyway, splitting packages brings a nightmare to the database. A really minimal install of ~50 full packages becomes 350 packages if you split them extremely small. A system with 1,000 packages becomes a system with 7,000 packages. This makes quickly retrieving info from the database much slower. Plus, the decisions you make about exactly what goes in each package can greatly affect the dependency chains for the package.

In src2pkg, I implemented purging of unwanted nls files from packages at build time. The idea was from an old standalone script which would do that on an installed system. I have since thought that maybe it would be better to distribute complete, full packages like Slackware does, but trim them at installation time using routines in the package installer/manager and following a user-configured criterion for what to remove. It does mean that packages initially would be bigger, but would save lots of Hell later. The other thing for split packages, one needs a naming scheme which distinguishes between a main/bin/exe package and a 'full' package. Currently, using src2pkg to create tpkg/tpm lets you decide at build-time whether to split packages or not -but the above naming distinction is not in there yet.

I had never seen the keyword 'enhances' before. I assume that, for instance, where package abiword 'suggests' abiword-plugins, then abiword-plugins would 'enhance' abiword. I also had never seen Build-Depends-Indep. Perhaps that means an arch-independent build-time dependency? For tpkg/tpm, src2pkg generates some 'suggests' by detecting scripts or other stuff included with bins in a package. Many things that could use suggests/enhances need this info to be at least partially human-input.

Vis-à-vis plugins, src2pkg also does some (difficult) detection there, but not with the purpose of splitting them out, IIRC. Another thing, src2pkg does automatic splitting of standard split-packages: nls, docs and dev. It also can do solibs packages -an idea taken from Slack's mozilla-solibs package, which provides only the libs that are used by some proggies besides mozilla. But src2pkg can also perform special package splitting using any name/scheme you like, using a list of files to include in the package. Generally, anything src2pkg does automatically can be completely avoided, enhanced by human input, or replaced completely with human input.

As I understand your basic premise here, you'd have a *.desktop file as normal, but with the rpm.spec file appended. Not nice when it's a package like 'webmin' which includes over 40,000 files. Further, probably only rpm.spec files would contain all the info you mention (except for your additions). What I mean is, debian packages only ship with a couple of files from the build-side -the 'control' file and any installation scripts. None of the other build-time info gets through into the installation side.

I checked out the soleus project and they basically are doing what rpm does -use a single *.spec file to hold all build-time options and directions (macros etc.), plus all the data needed to mostly fill the local package database. But soleus uses its own specfile syntax -no less hideous than rpm-speak.

I have been pondering a long time over whether to convert the tpm package format to use a single pass-through file, but I really don't like the idea much. It would imply completely re-writing src2pkg to accommodate the new method or writing a new build tool.

I think it's much more workable to use as many input/output files as needed to contain and pass the data. Imagine a build script which, instead of applying a patch to some source file, used sed with some 37-character regex to do the job. Every file should be fairly easy to read by the human AND fairly easy to machine-read.

Another thing to consider, CPU usage, RAM usage are too subjective to really be very useful -much in the same way that T2's cache file tells you how long it took to compile the thing or gentoo gives a build-duration time estimate based on the build time (at OS HQ) relative to how long it takes to build binutils. Real CPU/RAM stats would be even harder to discover and make sensible -every machine and OS would be very different.
technosaurus



PostPosted: Thu 02 Nov 2017, 07:19    Post subject:  

@sc0ttman you might not like my first database version, since it is loosely based on the TSV-like petspec format (though it is built from a file per package). Mostly because each file takes up ~4kb + an inode and you still have to parse file listings. However, here are a few public domain helpers for shell that could be useful in your GPL apps (posting them separately here so I'm not accused of violating the GPL with my own code later):

Code:
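#usage: nth 'abiword|x.x|abiword-x.x.pet|etc....' 3 '|'
#prints abiword-x.x.pet
nth(){ #prints the nth($2) token in string($1) separated by delimiter($3)
   [ $# -eq 3 ] || return 1
   case "$2" in ''|*[!0-9]*) return 1;; esac   #index must be a number
   local String="$1" i="$2" IFS="$3"
   set -f            #no globbing while the string is split
   set -- $String    #split on IFS into $1, $2, ...
   set +f
   eval "printf '%s' \"\${$i}\""   #token is data, never the printf format
}

#usage: get_vars '|' Name Ver File MoreStuff <pet.spec
#reads 1 line and sets variables according to their "column"
get_vars(){
   local IFS="$1"
   shift
   read -r "$@"      #-r keeps backslashes in the data intact
}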

Note that these are even easier to parse in awk and can be searched 100s of times faster if we keep them sorted and use a binary search tree, but you need to keep the awk process running (a loop in the END{} block works). While this is also possible to do with bash arrays, puppy tends to use busybox ash which would require insanity like this: set $(cat $PackageDBfile) which may exceed the max number of arguments for distros with a large number of packages - maybe not though? Alternatively you could use eval Pkgs$count="$Line" in a while read Line loop, but that ends up making for some unreadable code.
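Here is a rough sketch of the "keep awk running" idea, assuming a '|'-separated database whose first field is the package name and which was sorted with LC_ALL=C sort. Instead of a loop in the END{} block it loads the database in BEGIN and then treats every line on stdin as a query, which has the same effect. The function name pkg_lookup_loop is made up for illustration.

```shell
# usage: pkg_lookup_loop SORTED_DB <queries   (one package name per input line)
# Loads the database once, then answers each query with a binary search.
# (hypothetical helper; assumes the db was sorted with LC_ALL=C sort)
pkg_lookup_loop(){
   LC_ALL=C awk -v db="$1" '
      BEGIN {
         while ((getline entry < db) > 0) {
            n++
            name[n] = substr(entry, 1, index(entry, "|") - 1)
            line[n] = entry
         }
         close(db)
      }
      {
         lo = 1; hi = n; found = 0
         while (lo <= hi) {
            mid = int((lo + hi) / 2)
            # appending "" forces a string (not numeric) comparison
            if ((name[mid] "") == ($0 "")) { print line[mid]; found = 1; break }
            if ((name[mid] "") < ($0 "")) lo = mid + 1; else hi = mid - 1
         }
         if (!found) print "not found: " $0
      }'
}
```

For a single lookup this is no faster than grep, since the whole file is read either way. The gain only shows up when one awk process answers many queries.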

BTW I used to think that Puppy's pet specs were horrible too, but after finding more efficient ways to parse its tab-separated-value-like format, I realized that it (in a modified form) can be an efficient way of tracking installed packages because it is easy to parse and insert/remove/modify package entries in most programming languages.
Using sed (I think my regexes are correct, as long as the package name has no regex metacharacters and the new entry has no / or &):
remove entry: sed -i "/^$PackageName|/d" "$PackageDBfile"
replace entry: sed -i "s/^$PackageName|.*/$PackageEntry/" "$PackageDBfile"
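A sketch of the same two sed edits with the name and the entry escaped first, so that a name like python2.7 or an entry containing / or & does not change what sed does. The helper names are made up for illustration, and sed -i is assumed to be available (GNU and busybox sed both have it).

```shell
# escape a string so sed treats it as a literal in a basic regex
re_escape(){ printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g'; }
# escape a string for the replacement side of s/// (backslash, & and /)
rep_escape(){ printf '%s\n' "$1" | sed 's|[\&/]|\\&|g'; }

# usage: pkg_remove NAME DBFILE
pkg_remove(){
   sed -i "/^$(re_escape "$1")|/d" "$2"
}

# usage: pkg_replace NAME 'NEW|ENTRY|...' DBFILE
pkg_replace(){
   sed -i "s/^$(re_escape "$1")|.*/$(rep_escape "$2")/" "$3"
}
```

Both helpers assume single-line names and entries. A newline in either would still break the sed script.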


@amigo, I cannot imagine how much work src2pkg must have been, especially starting with a distro that began before there were any "standards" except to be kinda like the old AT&T Unix. Slackware has such arbitrary, non-intuitive ... everything. For instance:

    A - The base system.
    AP - Apps that don't require X.
    D - Dev tools.
    E - Emacs.
    F - FAQs, HOWTOs, and other docs.
    GNOME - GNOME desktop environment.
    K - Linux kernel source.
    KDE - The K Desktop Environment and Qt widget library.
    KDEI - Language support for the K Desktop Environment.
    L - System libraries.
    N - Networking programs.
    T - teTeX document formatting system.
    TCL - The Tool Command Language, Tk, TclX, and TkDesk.
    X - The base X Window System.
    XAP - X apps that aren't part of a major desktop environment.
    Y - Games.


Debian OTOH uses only slightly more intuitive categories:

    Administration Utilities - Utilities to administer system resources, manage user accounts, etc.
    Mono/CLI - Everything about Mono and the Common Language Infrastructure.
    Communication Programs - Software to use your modem in the old fashioned style.
    Databases - Database Servers and Clients.
    debian-installer udeb packages - Special packages for building customized debian-installer variants. Do not install them on a normal system!
    Debug packages - Packages providing debugging information for executables and shared libraries.
    Development - Development utilities, compilers, development environments, libraries, etc.
    Documentation - FAQs, HOWTOs and other documents trying to explain everything related to Debian, and software needed to browse documentation (man, info, etc).
    Editors - Software to edit files. Programming environments.
    Education - Software for learning and teaching.
    Electronics - Electronics utilities.
    Embedded software - Software suitable for use in embedded applications.
    Fonts - Font packages.
    Games - Programs to spend a nice time with after all this setting up.
    GNOME - The GNOME desktop environment, a powerful, easy to use set of integrated applications.
    GNU R - Everything about GNU R, a statistical computation and graphics system.
    GNUstep - The GNUstep environment.
    Graphics - Editors, viewers, converters... Everything to become an artist.
    Ham Radio - Software for ham radio.
    Haskell - Everything about Haskell.
    Web Servers - Web servers and their modules.
    Interpreters - All kind of interpreters for interpreted languages. Macro processors.
    Introspection - Machine readable introspection data for use by development tools.
    Java - Everything about Java.
    JavaScript - JavaScript programming language, libraries, and development tools.
    KDE - The K Desktop Environment, a powerful, easy to use set of integrated applications.
    Kernels - Operating System Kernels and related modules.
    Library development - Libraries necessary for developers to write programs that use them.
    Libraries - Libraries to make other programs work. They provide special features to developers.
    Lisp - Everything about Lisp.
    Language packs - Localization support for big software packages.
    Mail - Programs to route, read, and compose E-mail messages.
    Mathematics - Math software.
    Meta packages - Packages that mainly provide dependencies on other packages.
    Miscellaneous - Miscellaneous utilities that didn't fit well anywhere else.
    Network - Daemons and clients to connect your system to the world.
    Newsgroups - Software to access Usenet, to set up news servers, etc.
    OCaml - Everything about OCaml, an ML language implementation.
    Old Libraries - Old versions of libraries, kept for backward compatibility with old applications.
    Other OS's and file systems - Software to run programs compiled for other operating systems, and to use their filesystems.
    Perl - Everything about Perl, an interpreted scripting language.
    PHP - Everything about PHP.
    Python - Everything about Python, an interpreted, interactive object oriented language.
    Ruby - Everything about Ruby, an interpreted object oriented language.
    Rust - Rust programming language, library crates, and development tools
    Science - Basic tools for scientific work
    Shells - Command shells. Friendly user interfaces for beginners.
    Sound - Utilities to deal with sound: mixers, players, recorders, CD players, etc.
    Tasks - Packages that are used by 'tasksel', a simple interface for users who want to configure their system to perform a specific task.
    TeX - The famous typesetting software and related programs.
    Text Processing - Utilities to format and print text documents.
    Utilities - Utilities for file/disk manipulation, backup and archive tools, system monitoring, input systems, etc.
    Version Control Systems - Version control systems and related utilities.
    Video - Video viewers, editors, recording, streaming.
    Virtual packages - Virtual packages.
    Web Software - Web servers, browsers, proxies, download tools etc.
    X Window System software - X servers, libraries, fonts, window managers, terminal emulators and many related applications.
    Xfce - Xfce, a fast and lightweight Desktop Environment.
    Zope/Plone Framework - Zope Application Server and Plone Content Management System.

Similar for other distros, but there is a cross-distro method of categorizing programs that could easily be adopted: specifically https://specifications.freedesktop.org/menu-spec/menu-spec-latest.html#category-registry

That spec is getting a bit dated and could stand to be coordinated with the icon naming specification.

We've been doing this too long and things that we have just picked up along the way are not intuitive to new users. Frankly every single distro that I have tried pretty much fails on user-friendliness (pre-unity Ubuntu almost had a C). Just because we have (horrible) package management, which windows was lacking until recently, we tell ourselves that at least we _have_ a package manager. Windows 10 even tries to handle unsupported files by providing a selection of possible handlers or using the app store ... granted it's pretty crappy, but we could have exceptional file handling if we just had MIME types in the package database. The only reason I mentioned having icons in the package database is that some people (myself included) are better at associating images with products than remembering exact names - especially with some of the jacked up names we often use in open source (GIMP, viewnior, midori, <NYB=NameYourBacronym>, etc...)

I realize how difficult it is to make something like src2pkg... that's why I would rather propose a spec that autotools and similar tools could generate to make packages more portable (and useful) across distros. However, if you have any input on how it could be patched into various build tools including src2pkg I am all ears. The last time I worked with src2pkg was in this thread, but the complexities got me too sidetracked to pursue it further at the time - it's been a few years though.

@all
For a proof of concept, I am thinking at this point it would be faster and easier for me to download an entire distro's repository and grok it from there.... I have more bandwidth (1 Gbps) and disk space (4 TB) than I do cpu time (intel celeron in a lenovo n22). Arch seems like a good candidate since their database is very verbose and has a directory for each package, which will simplify the whole process. The community db is a 17 MB tarball.

amigo

Joined: 02 Apr 2007
Posts: 2611

PostPosted: Thu 02 Nov 2017, 13:19    Post subject:  

In the OP you mentioned some repo where all stuff was dumped in one dir, as opposed to another which stored everything in non-sensically-named subdirs.

The Slackware scheme stores sources or packages in subdirs named after groups of stuff, according to their use. Packages were presented to the installer using these categories.

debian and others use, at repo-level, alphabetically and numerically sorted subdirs. This may even be necessary when the total number of items reaches a high number. And from the top-level it's properly sorted -but you can't immediately see into the subdirs.

The simplest thing would be to put everything under one dir, but then make subdirs (or new dirs) for any sort of group/whatever you like, which contain links to the real lumped-together packages. The filesystem as part of your database -without having any real database at all.

Of course, one could devise some scheme of listings which could be used by a pkg-manager, etc. In fact, there are good reasons for doing this at the repo-level: ftp, http & Co. are not good at providing listings. source repos are often ftp so this is relevant. Either way, one could use these pre-made dir listings and cross-reference them anyway you like.

Still, categories or groupings may also be important at build-time. And, if used, they would probably always be involved in finding/retrieving stuff, where you have a simple URL for a repo, combined with a group-name and with the package name.

"For a proof of concept" I'm not sure what you mean here.

I do have a lot more which could be added here, but we would get off-topic fast. However, this thread has spurred me into some writing about the whole subject of Software Deployment. I have an end-to-end deployment system which is quite advanced, while keeping a really simple data structure which is easily human and machine readable. The basic structure is well thought out so that one doesn't end up having to change every build script and package because the structure or 'API' has to be changed to accommodate some further need/want. No python, no perl, no ruby, no awk -I'm not even sure, off-hand, if there's any sed in there. Virtually no parsing at all: bash-syntax vars are sourced, while long lists, like package content, are in their own file.
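To illustrate the "sourced vars plus a separate list file" approach, here is a minimal sketch. The layout (an info file with NAME and VERSION, and a files list with one path per line) and the function name pkg_info are assumptions for the example, not the actual tpm format.

```shell
# usage: pkg_info PKG_DIR
# PKG_DIR/info  holds shell-syntax variables (NAME=..., VERSION=...)
# PKG_DIR/files holds the package content, one path per line
# (hypothetical layout, not the real tpm one)
pkg_info(){
   (
      # source in a subshell so the variables do not leak into the caller
      . "$1/info" || exit 1
      count=$(wc -l < "$1/files" | tr -d ' ')
      printf '%s %s (%s files)\n' "$NAME" "$VERSION" "$count"
   )
}
```

The trade-off is that sourcing executes whatever is in the file, so this only suits metadata from packages that are already trusted.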

Of course, src2pkg already handles the build-side. Most of the hard parts of the install/remove/upgrade are done -except for what hasn't yet been decided. The first level of tools are naturally cli and handle the brute installation. upgrade-pkg uses install-pkg and remove-pkg. Above them comes an also-cli package *manager* which also uses the low-level tools -but manages their use, options and output. It does not yet include the package retrieval side of things. Any GUI tool could easily rely on the same first-level cli tools, while incorporating whatever managing functions are wanted -the same way the cli-manager does.

You've mentioned archlinux -they seem to have started from a slackware-similar system and have evolved a lot since then -as opposed to slack itself! Their PKGBUILD scripts are like an old, defunct system for creating extra slack pkgs. Much like src2pkg build scripts, they are shell syntax and data-rich. Eventually, I might write a new src2pkg which would use similar build scripts -they have a very clean look.

I rather doubt that I would ever try to integrate everything into a single tool like rpm does -from builder to installer. As src2pkg bears witness, the build-side alone is extremely complex -most of the bulk of src2pkg is actually in the code for discovering how to configure and compile the software. The output into various units is more concise and uses only core-utils & Co. for all output formats except rpm. rpm archives have a binary header which I could never get my head around, in order to reproduce them without having rpm installed -Yuucck.

I've been thinking about proposing the tpm package format here to see if it would get traction. BK is always starting over with his projects and rewrites stuff a lot, so he might even be interested in a saner concept and method of delivery and/or construction -packages or units of software are indispensable both for system extendability and at build time. The definition of a core or minimal system should be made up of lists of software units. Only in this way can one come close to mastering dependency-resolution, upgrading, rollback, etc. Whether building an 'appliance' like a LiveCD, an installable system, a container, or VM image, the same units can be re-used, combined or whatever. That holds true whether they are packages, layer-able fs-images (like sfs), appdirs, apps.

I'm gonna try to get this writing done so it can be posted in one piece -in a new thread.
technosaurus



PostPosted: Thu 02 Nov 2017, 21:36    Post subject:  

amigo wrote:
The simplest thing would be to put everything under one dir, but then make subdirs (or new dirs) for any sort of group/whatever you like, which contain links to the real lumped-together packages. The filesystem as part of your database -without having any real database at all.

...


Yes, that makes total sense on the server side. Each database (maybe multiple due to main, testing, community, etc...) would have all packages in ./raw/* and links in <category>/<subcategory>/* and could even have separate directories based on license or MIME type ... and an extra directory for icons (with some default icons and package icons). All of this could be generated from the package database - including a json file to handle table generation/sorting in the browser (with a default table for noscript). This should match the hierarchy found in the start menu and ideally have similar theme support.
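Roughly, generating the link tree from the database could look like the sketch below. It assumes a '|'-separated listing of file|category|subcategory on stdin and packages already sitting in raw/. That field order and the function name make_links are made up for the example.

```shell
# usage: make_links REPO_DIR <db
# db lines: file|category|subcategory   (hypothetical field order)
# Creates REPO_DIR/<category>/<subcategory>/<file> as a relative symlink
# to the real package in REPO_DIR/raw/.
make_links(){
   local repo="$1" file category sub
   while IFS='|' read -r file category sub; do
      [ -f "$repo/raw/$file" ] || continue     # skip entries with no package
      mkdir -p "$repo/$category/$sub"
      ln -sf "../../raw/$file" "$repo/$category/$sub/$file"
   done
}
```

Relative links keep the tree valid when the repo is mirrored or mounted under a different path.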

On the topic of uniformity between the browser, package manager and start menu. I once played with a proof of concept that used Rox-Filer's .DirIcon for Categories and subcategories and .desktop files symlinked into each subcategory to support window managers without a start menu.

On the user end, there needs to be an easily parsable single file database for each repository for the simple fact that distros are fast approaching 6-digit packages (more if you count source). Having 1+ directory for each one and then possibly multiple files in each one and then possibly localization at ~4kb per file, you quickly end up with a distro that is mostly package management files. Contrast that with a single flat package database that could optionally be compressed with a streaming compressor like gz, xz or lz4hc and just piped through the decompressor when needed. Awk (even the busybox version) makes these kinds of files much more palatable, but even with shell scripts it takes less than 1s to parse the entire database and generate the jwm xml to create an install menu (see jwm_menu_create in jwm-tools) which only needs to be done if the database is newer than the xml file. If you have been following jwm development the last year or so, Joe added the ability to have different bindings for each mouse button, so in the regular start menu, you could have the right mouse click bring up a dialog to _Run_as, _Uninstall, _Hide, while in the install menu, the left click would bring up the install dialog and the right click could let you _Install_As, _Hide, _Create_Container, etc...
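A sketch of the "pipe it through the decompressor, regenerate only when stale" part, assuming a gzip-compressed '|'-separated database with the package name in the first field. This is not jwm_menu_create from jwm-tools: update_menu and install-dialog are placeholder names, and no XML escaping is done.

```shell
# usage: update_menu DB.gz MENU_FILE
# Rebuilds MENU_FILE from the compressed database, but only when the
# database is newer than the existing menu.
# (update_menu and install-dialog are hypothetical names)
update_menu(){
   local db="$1" menu="$2"
   # nothing to do if the menu exists and the db is not newer than it
   [ -e "$menu" ] && [ ! "$db" -nt "$menu" ] && return 0
   gzip -dc "$db" | awk -F'|' '
      BEGIN { print "<JWM>" }
      { print "   <Program label=\"" $1 "\">install-dialog " $1 "</Program>" }
      END { print "</JWM>" }' >"$menu"
}
```

The -nt test is not in POSIX but is supported by bash, dash and busybox ash, which covers the shells discussed here.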

I didn't get a full idea of the tpm format from the src2pkg source code... going on 48hours without sleep so my brain's lexer is working but it gets corrupted somewhere between there and my brain's parser. Is there a good example package?... ah nevermind, they're in KISS.

OK, had a look at the gtk1 rox tpkg.
The db files look like they are intended to be used manually by more advanced users to the point where they aren't as usable for grokking.
Just at first (really tired) glance
The date being in human readable format instead of time since epoch requires extra processing to check if package A is newer than package B
The requires field appears to be the output of ldd or continually following symlinks, but you can get the true library dependencies by using objdump -x <elf_file> |grep NEEDED
I'm unsure of whether having "supplied by" in the package is necessary or even useful - what if the distro has some kind of incremental build system that builds everything with i586, then does key packages with i686+mmx, then i686+mmx+sse+sse2 and so on and so forth so that users can use the most optimized version available.
It seems like if you update a library for a security issue, then everything that was built against it would be broken, but again - not thinking clearly.
Is the /usr/apps/ directory a KISS specific thing?
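For the objdump point above, a minimal sketch of pulling just the direct library dependencies out of an ELF file. The function names are made up; objdump -p prints the same NEEDED lines as -x with less surrounding output.

```shell
# reads "objdump -p" (or -x) output on stdin, prints only the NEEDED sonames
filter_needed(){
   awk '$1 == "NEEDED" { print $2 }'
}

# usage: needed_libs /path/to/elf_file
# prints the libraries the file links against directly (no recursion like ldd)
needed_libs(){
   objdump -p "$1" | filter_needed
}
```

Unlike ldd, this never loads or runs the binary, so it also works on files built for another architecture as long as objdump understands the format.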
