[closed] Do you already have an XML to comma-delimited script?

scsijon
Posts: 1596
Joined: Thu 24 May 2007, 03:59
Location: the australian mallee

#1 Post by scsijon »

Before I start to try to build it! I am NOT a good base coder, although I can adapt existing code fairly well.

Does anyone already have a script that can take an XML file and turn it into a comma-delimited file? I'm starting on an opensuse2ppm script, similar to barryk's mageia2ppm, and would like to start with an automated step 1, as there are 19098 packages in the openSUSE set at the moment (up from 18880 last month). I'm actually thinking of stripping out the lines we don't need as step 0, as the source file is some 76 MB and that should make the rest quicker to process.

The source XML is in this format, if anyone's interested:

Code:

<package type="rpm">
    <name>844-ksc-pcf</name>
    <arch>noarch</arch>
    <version epoch="0" ver="19990207" rel="784.1.1"/>
    <checksum type="sha256" pkgid="YES">ec26988a001df41bd1752aeb035608edbf1ef5ec646569d63a7d938228a6ff4d</checksum>
    <summary>Korean 8x4x4 Johab Fonts</summary>
    <description>Korean 8x4x4 johab fonts.</description>
    <packager>http://bugs.opensuse.org</packager>
    <url>http://www.debian.or.kr/~cwryu/archive/fonttools/</url>
    <time file="1319310585" build="1319310562"/>
    <size package="2592518" installed="4382509" archive="4403032"/>
    <location href="noarch/844-ksc-pcf-19990207-784.1.1.noarch.rpm"/>
    <format>
      <rpm:license>Public Domain, Freeware</rpm:license>
      <rpm:vendor>openSUSE</rpm:vendor>
      <rpm:group>System/X11/Fonts</rpm:group>
      <rpm:buildhost>build25</rpm:buildhost>
      <rpm:sourcerpm>844-ksc-pcf-19990207-784.1.1.src.rpm</rpm:sourcerpm>
      <rpm:header-range start="872" end="39087"/>
      <rpm:provides>
        <rpm:entry name="locale(xorg-x11:ko)"/>
        <rpm:entry name="844-ksc-pcf" flags="EQ" epoch="0" ver="19990207" rel="784.1.1"/>
      </rpm:provides>
      <rpm:requires>
        <rpm:entry name="perl" pre="1"/>
        <rpm:entry name="/bin/sh"/>
        <rpm:entry name="aaa_base" pre="1"/>
        <rpm:entry name="/bin/sh" pre="1"/>
      </rpm:requires>
    </format>
  </package>
regards
scsijon
Last edited by scsijon on Tue 19 Feb 2013, 03:30, edited 1 time in total.

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#2 Post by amigo »

Yeah, best to ask before trying to build it: parsing XML is 'heavy lifting'. A search would have gotten you lots of hits:
https://duckduckgo.com/?q=convert+xml+to+CSV

scsijon
Posts: 1596
Joined: Thu 24 May 2007, 03:59
Location: the australian mallee

#3 Post by scsijon »

amigo wrote: Yeah, best to ask before trying to build it: parsing XML is 'heavy lifting'. A search would have gotten you lots of hits:
https://duckduckgo.com/?q=convert+xml+to+CSV
Thanks amigo. I had already done a search via SourceForge and looked through the results (all 52 pages of them). I found everything except something that actually did what was wanted; the few that claimed to were for Windows or Mac, or required so many additional packages (for a puppyan) that it didn't make sense to use them. I shall try your link and see if I can do any better.

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#4 Post by amigo »

Using XSLT would be the most obvious:
http://stackoverflow.com/questions/2516 ... ile-to-csv

Other standard XML tools are XMLStarlet, xsltproc and Perl's xpath.


This is interesting:
http://www.freesoftwaremagazine.com/art ... ties_linux

http://stackoverflow.com/questions/8935 ... ml-in-bash

musher0
Posts: 14629
Joined: Mon 05 Jan 2009, 00:54
Location: Gatineau (Qc), Canada

#5 Post by musher0 »

Hello, scsijon.

If you're into Java, you may want to try one of these:

http://www.wenzlaff.de/xmltocsv.html
or
http://code.google.com/p/xml2csv-conv/

Best regards.

musher0
~~~~~~~~~~
"You want it darker? We kill the flame." (L. Cohen)

seaside
Posts: 934
Joined: Thu 12 Apr 2007, 00:19

#6 Post by seaside »

scsijon,

You may find the following XML utilities of interest. Below is a quote from the simple-icon-tray thread, about a weather program made with xml-printf.
Since there are libraries for many languages to handle XML parsing, I was wondering if any XML help existed for shell programs in Linux, and ran across this:
http://xml-coreutils.sourceforge.net/

It's a set of utilities aimed at emulating the standard shell text tools like sed, tr, cat, printf, find, etc., but specifically for XML.

I compiled xml-printf, which can be used to capture the content between tags, and made a tray icon program for weather.
Here's a download link for tweather.pet, which contains xml-printf.

http://murga-linux.com/puppy/viewtopic. ... h&id=63046

Regards,
s

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO

#7 Post by technosaurus »

XML is the most ridiculous format; I've no idea how it caught on.

That being said, if you are dealing with one tag per line, awk is pretty useful:

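awk '
BEGIN { FS = "<|>"; OFS = "," }                 # split each line on < and >
/<name>/      { name = $3 }                     # text between <name> and </name>
/<arch>/      { arch = $3 }
/<version /   { split($2, a, "\""); ver = a[4] "-" a[6] }  # assumes epoch, ver, rel order
/<\/package>/ { print name, arch, ver }         # one comma-separated row per package
' file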
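
If the one-tag-per-line assumption ever stops holding, or a field contains a comma (the license in the sample is "Public Domain, Freeware"), a streaming parser plus a real CSV writer is a sturdier take on the same idea. Below is a minimal Python sketch, not a tested opensuse2ppm step: the function name and the choice of columns are my own, and it assumes the full file wraps the packages in a root element that declares the rpm: prefix, which the posted fragment does not show.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Drop any '{namespace}' prefix ElementTree puts on a tag name."""
    return tag.rsplit("}", 1)[-1]


def packages_to_csv(xml_source, csv_out):
    """Stream <package> elements from xml_source, writing one CSV row each.

    The column selection is illustrative only; adjust it to whatever the
    PPM database actually needs.
    """
    writer = csv.writer(csv_out)
    writer.writerow(["name", "arch", "version", "release",
                     "summary", "license", "location"])
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if local(elem.tag) != "package":
            continue
        row = dict.fromkeys(
            ["name", "arch", "ver", "rel", "summary", "license", "href"], "")
        for node in elem.iter():
            tag = local(node.tag)
            if tag in ("name", "arch", "summary", "license"):
                row[tag] = (node.text or "").strip()
            elif tag == "version":
                row["ver"] = node.get("ver", "")
                row["rel"] = node.get("rel", "")
            elif tag == "location":
                row["href"] = node.get("href", "")
        writer.writerow([row["name"], row["arch"], row["ver"], row["rel"],
                         row["summary"], row["license"], row["href"]])
        elem.clear()  # drop the finished package's children to keep memory low
```

To use it, open the input in binary mode and the output with newline="" and pass both in, e.g. packages_to_csv(open("primary.xml", "rb"), open("packages.csv", "w", newline="")); both file names are placeholders. Because only one package is held in memory at a time, the separate "step 0" stripping pass may not be needed for a 76 MB file.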
Check out my github repositories (https://github.com/technosaurus). I may eventually get around to updating my blogspot (http://bashismal.blogspot.com).

scsijon
Posts: 1596
Joined: Thu 24 May 2007, 03:59
Location: the australian mallee

#8 Post by scsijon »

Thank you all, and I agree, technosaurus.

Unfortunately it's the better of the two options for openSUSE's packages! The other requires three SQL files to be opened and integrated before extraction for PPM, and together they come to 300 MB+.

I think I have a lead from one of amigo's new links for something that will work easily. Thank you.

But I'm not marking this solved quite yet!

jamesbond
Posts: 3433
Joined: Mon 26 Feb 2007, 05:02
Location: The Blue Marble

#9 Post by jamesbond »

I'm a little late to the game, but this is the tool that I use for getting stuff out of XML data: http://www.ofb.net/~egnor/xml2/. It converts XML into a flat file format which you can grep, sed and awk on.
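
To give a feel for what "flat" means here, the sketch below prints one path=value line per attribute and per piece of text, so each line can be matched on its own. It is a rough Python approximation of the idea, not xml2 itself, and its output is not claimed to match xml2's exactly; the function name is made up for the illustration.

```python
import xml.etree.ElementTree as ET


def flatten(elem, path=""):
    """Yield 'path=value' lines for elem, its attributes and its descendants."""
    here = f"{path}/{elem.tag}"
    # Attributes first, written as /path/@attribute=value
    for key, value in elem.attrib.items():
        yield f"{here}/@{key}={value}"
    # Then the element's own text, if it has any besides whitespace
    text = (elem.text or "").strip()
    if text:
        yield f"{here}={text}"
    # Finally recurse into child elements in document order
    for child in elem:
        yield from flatten(child, here)


if __name__ == "__main__":
    fragment = ('<package type="rpm"><name>844-ksc-pcf</name>'
                '<version epoch="0" ver="19990207" rel="784.1.1"/></package>')
    for line in flatten(ET.fromstring(fragment)):
        print(line)
```

Once the data is in that shape, pulling out, say, every package name is a one-line grep on the path.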
Fatdog64 forum links: Latest version (http://murga-linux.com/puppy/viewtopic.php?t=117546) | Contributed packages (https://cutt.ly/ke8sn5H) | ISO builder (https://cutt.ly/se8scrb)

scsijon
Posts: 1596
Joined: Thu 24 May 2007, 03:59
Location: the australian mallee

#10 Post by scsijon »

Thanks all,

I have changed methods and feel more than a bit of a fool after it was pointed out to me that all I need to do is modify the rpm2ppm script to match the openSUSE format.
