Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 

All times are UTC - 4
Local Test Repo Generation: Download Doc File,
s243a

Joined: 02 Sep 2014
Posts: 2381

PostPosted: Fri 27 Dec 2019, 20:01    Post subject:  Local Test Repo Generation: Download Doc File,
Subject description: Run & Maybe Install Web Server, Add test repo to sources.list
 

1.0 Introduction

This code is preliminary but works. As noted in the title, it:
1. Generates a test repo, and then
2. Adds the test repo to pkg

Part of this process involves running a webserver. You can specify both the webserver to run as well as fallback webservers to run (or install) if, for some reason, the specified webserver won't run.

The actual fallback logic might need more work, but currently the webserver specified is "busybox httpd", which I covered in a previous post. Since the puppy version of busybox has this webserver, no other webservers should be run (or installed) unless the user changes the specified webserver in the script.

I created this code to test the repo update scripts in sc0ttman's package manager (i.e. pkg). I'll also want to try it with other web servers, as a means to test package installation while at the same time testing the repo update scripts. The code which is the subject of this thread is a good demonstration of what can be done with "pkg".

2.0 Cherry Picking Items for the Test Repo

The code to select the repo items has three parts:
2.1. Identify the items of interest
2.2. Randomly pick a few of the items of interest
2.3. Filter the Repo DB Doc File to select only those randomly selected items of interest.

After the Repo Doc File has been filtered, then:
3.1 download only the items in the filtered repo db doc file
3.2 start the web server
3.3 add the new repo to pkg. This adds the repo to ~/.pkg/sources, ~/.pkg/sources-all and /etc/apt/sources.list, and then converts the repo doc file into puppy format.

There are two scripts which are part of pkg to convert the repo into puppy format: ppa2pup and ppa2pup_gawk. The latter gawk version is many times faster for a large repo, but not necessarily faster if there are only a few items. The gawk version is part of the main branch but not yet part of an official release of pkg.

2.1 Cherry Picking Items of Interest
As noted above, the first step is to identify the items of interest for testing. In our case we are interested in packages which include the epoch number in the version (see the deb-version manpage). Historically, the puppy package manager has stripped the epoch number from the repo database, but this information could be useful for version comparison. The following awk program extracts the first three fields from a puppy "repo db doc file" (e.g. /var/packages/Packages-ubuntu-bionic-main), but only for the packages of interest, which are the ones that have a colon in their version number. The colon means that the version number includes the epoch.

Code:

  AWK_PRG_1=\
'BEGIN {FS="|"; OFS="|"}
{ if ($1 ~ /^[^|]+:[^|]+$/  ){
    print $1 "|" $2 "|" $3 #We might want to use some of these other fields for a different application
 }}'
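To illustrate, here is a small example run of the program above on two made-up repo db lines (the field layout is assumed to be name-version|name|version|...); only the line whose version carries an epoch colon survives the filter:

```shell
#!/bin/sh
# Two invented sample lines in the assumed puppy repo-db layout:
# "bash" carries an epoch ("2:"); "sed" does not.
AWK_PRG_1='BEGIN {FS="|"; OFS="|"}
{ if ($1 ~ /^[^|]+:[^|]+$/  ){
    print $1 "|" $2 "|" $3
 }}'

printf '%s\n' \
  'bash-2:4.4.18-2|bash|2:4.4.18-2|utility' \
  'sed-4.4-2|sed|4.4-2|utility' \
| awk "$AWK_PRG_1"
# prints: bash-2:4.4.18-2|bash|2:4.4.18-2
```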


2.2 Randomly Pick a few of the Items of Interest for Testing

As noted above, step two is to randomly pick some of these packages of interest and programmatically generate AWK code to select only these randomly picked items of interest.

Code:

function echo_filter_line(){
    read a_pkg_name
    echo "pkg_filter[\""$a_pkg_name"\"]=\"true\""
}
  while read pkg_record; do
    echo "$pkg_record" | cut -f2 -d'|' | echo_filter_line
  done < <( cat $REPO_DB_DOC_FILE_in | awk "$AWK_PRG_1" ) \
  | sort -R | head -n 3 >> "$filter_lines_path"


The random packages of interest are selected in the above code by taking the first three records of a random sort:
Code:

 sort -R | head -n 3


Rather than output just the package name, we output an array which includes all the packages that we want to include in our filtered "repo db doc file". This array is an associative array (AKA a dictionary or, in some cases, a hashmap). Typically this type of data structure has fast lookup. The keys are simply the package names. If the array has a key equal to the package name, then we print the result. The purpose of the code generation here is, ironically, readability: in-lining the data like this is more readable when the amount of data is small. For large data sets it would be better for the program to read the data from an external file.
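As a quick sanity check, the generation pipeline turns one repo-db record into one filter line (the record contents here are invented for illustration):

```shell
#!/bin/sh
# Reproduce echo_filter_line() on one invented record; field 2 is the package name.
echo_filter_line(){
    read a_pkg_name
    echo "pkg_filter[\"$a_pkg_name\"]=\"true\""
}

echo 'bash-2:4.4.18-2|bash|2:4.4.18-2|utility' \
  | cut -f2 -d'|' | echo_filter_line
# prints: pkg_filter["bash"]="true"
```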

2.3 Filter the Repo DB doc file for only the items of interest.

Here is an example of the code generated by my script:

Code:

#!/usr/bin/gawk -f
function init_filter(){
pkg_filter["libreoffice-l10n-nso"]="true"
pkg_filter["libmythes-dev"]="true"
pkg_filter["libgcc1-ppc64el-cross"]="true"
}
function filter_accept(s){ #Return true if we are to print the result
  if ( pkg_filter[s] == "true" ){
    return "true"
  } else {
    return "false"
  }
}
BEGIN {init_filter()}
/^Package:/ { PKG=$0; sub(/^Package: /,"",PKG); FILTER_ACTION=filter_accept(PKG)}
{if (FILTER_ACTION == "true"){
    print $0
  }
}


Lines such as:
Code:

pkg_filter["libreoffice-l10n-nso"]="true"


were generated by the previously mentioned function "echo_filter_line()", and this output is written to a file. The file is then read back into a string representing the program with the following code:
Code:

$(cat $filter_lines_path)


Depending on the options, you can execute the program as a string or have it first written to a file. Executing it as a string might be faster, but if you write it to a file then it is easier to debug.
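Here is a minimal sketch of the string route, with invented file contents and variable names (pkg's real template differs): the generated pkg_filter lines are spliced into the program text, and the result is passed to awk as a string.

```shell
#!/bin/sh
# Invented example: one generated filter line, written to a temp file.
filter_lines_path=$(mktemp)
echo 'pkg_filter["bash"]="true"' > "$filter_lines_path"

# Splice the generated lines into the filter program, then run it as a string.
AWK_FILTER='
function filter_accept(s){ if (pkg_filter[s]=="true") return "true"; else return "false" }
BEGIN {'"$(cat "$filter_lines_path")"'}
/^Package:/ { PKG=$0; sub(/^Package: /,"",PKG); FILTER_ACTION=filter_accept(PKG) }
{ if (FILTER_ACTION == "true") print $0 }'

printf 'Package: bash\nVersion: 2:4.4.18-2\nPackage: sed\nVersion: 4.4-2\n' \
| awk "$AWK_FILTER"
# prints only the bash record:
# Package: bash
# Version: 2:4.4.18-2

rm -f "$filter_lines_path"
```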

3.1 download only the items in the filtered repo db doc file

The code to download only the filtered items is quite simple.

Code:

  AWK_PRG_3='
/^Filename:/ {
    FPATH=$2   # path of the .deb relative to the repo root (field 2 of "Filename: <path>")
    # The repo url is inlined below; directories under RROOT are assumed to already exist.
    system("wget --quiet \"'"$repo_url_in"'/" FPATH "\" -O \"" RROOT "/" FPATH "\" 1>/dev/null")
    }'
   cat "${doc_path}/Packages" | awk -v RROOT="$repo_root_path" \
    "$AWK_PRG_3"

This AWK code only processes lines that start with "Filename:". These lines give the path of the file to download. To download the file, the AWK code calls an external command using awk's system() function, which we use to call wget. The repo root on the local file system is passed as an input variable to awk, and the repo url is inlined. Whether we inline a value or alternatively use the -v (for variable) option is somewhat arbitrary.
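The part that matters is just pulling the relative path out of each "Filename:" line; a dry-run sketch (sample line invented) without the wget call:

```shell
#!/bin/sh
# Show the relative path the download rule would hand to wget.
printf 'Package: bash\nFilename: pool/main/b/bash/bash_4.4.18-2_amd64.deb\n' \
| awk '/^Filename:/ { print $2 }'
# prints: pool/main/b/bash/bash_4.4.18-2_amd64.deb
```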

3.2 start the web server

Given that there are fallback webservers both to run and/or to install, the full code to start the web server is quite complicated. But in my example the basic code to start the web server is as follows:

Code:

httpd -h /var/www/html


Currently the code uses a configuration file (the -c option), but the actual configuration file is empty. Also, as mentioned in my previous post, displaying the contents of a directory with busybox httpd requires CGI. Instructions on how to do this are in my previous post.

3.3 add the new repo to pkg

The code to add a new repo to pkg is straightforward. For instance, on Debian systems the node.js repo can be added as follows:
Code:

pkg add-repo https://deb.nodesource.com/node_9.x stretch main


As mentioned above, there are two alternative functions that pkg uses to add a Debian repo: ppa2pup and ppa2pup_gawk. In the test code you choose which one you want to use:

Code:

TEST_CMD=ppa2pup_gawk
...
    ( exec <<< "$repo_name_out"
      pkg add-repo "$repo_url_out" "$distro_ver_out" "$stream_out" )
...
case "$TEST_CMD" in
ppa2pup)      PKG_PPA2PUP_FN=ppa2pup pkg --repo-update ;;
ppa2pup_gawk) pkg --repo-update ;;
esac


Conclusion

This coding exercise has given me some examples of how I can filter a Debian repo and automatically create a mirror of the filtered packages with sc0ttman's package manager (i.e. pkg). It will be useful for testing sc0ttman's package manager, and I will also be able to adapt the code to other applications. The biggest weakness is perhaps the complexity of using fallback webserver packages, but I think this fallback approach will be useful for testing, and I think there are other things I can learn from these fallback techniques.

_________________
Find me on minds and on pearltrees.
s243a

Joined: 02 Sep 2014
Posts: 2381

PostPosted: Sat 28 Dec 2019, 01:46    Post subject:  

The above post is now ready for reading.
musher0

Joined: 04 Jan 2009
Posts: 14724
Location: Gatineau (Qc), Canada

PostPosted: Sat 28 Dec 2019, 02:48    Post subject:  

For now this is just a note to myself. TWYL.
_________________
musher0
~~~~~~~~~~
Je suis né pour aimer et non pas pour haïr. (Sophocle) /
I was born to love and not to hate. (Sophocles)
sc0ttman


Joined: 16 Sep 2009
Posts: 2798
Location: UK

PostPosted: Sat 28 Dec 2019, 08:25    Post subject:  

In case this is of interest to anyone (it's somewhat related), there is lots of
Bash CGI related stuff here:

http://murga-linux.com/puppy/viewtopic.php?t=115252

_________________
Pkg, mdsh, Woofy, Akita, VLC-GTK, Search