How to manipulate strings with bash?

Posted: Fri 27 Nov 2009, 20:43
by 2byte
I'm trying to teach myself bash but I'm stumped on this and I hope some kind soul can help me out. I can't figure out how to extract just the file names that have the extension ".txt" from strings like these.

FILES='hoover.dat mon.txt prev.dat fin.txt do.txt'
FILES='hoover.dat cal.dat mon.txt prev.dat fin.txt binding.old do.txt'

The "FILES=" is part of the string. The number and sequence of the file names vary, they are always enclosed in single quotes, and they are separated by a single space.

I have tried every combination of the bash string operations from here http://tldp.org/LDP/abs/html/refcards.html#AEN22098 that I can think of but I can't get it.

Thanks for your attention
2byte

Posted: Fri 27 Nov 2009, 23:28
by Pizzasgood
If you're willing to cheat and use things besides pure bash, you can do this:

Code:

# STRING="FILES='hoover.dat mon.txt prev.dat fin.txt do.txt'"
# echo $STRING | tr " '" '\n' | grep "\.txt$"
mon.txt
fin.txt
do.txt
Otherwise I suppose you could do it the hard way by locating the position of the first ' and the first space. Then extract the text between them. Locate the final . and look at the text after it and make sure it equals txt. If so, keep the string, otherwise chuck it. Locate the position of the next space in the original string and repeat until you run out of spaces. After the final space, look between the last space and the final '.
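
For what it's worth, here is a minimal sketch of that "hard way" in pure bash. It is only one possible reading of the steps above: instead of computing character positions, it leans on parameter expansion to cut the string at the quotes and spaces, and on a `case` pattern to check the ending. The function name `extract_txt` is made up for the example.

```shell
#!/bin/bash
# extract_txt is a hypothetical helper name used only for this sketch.
# It walks the quoted list one word at a time using parameter expansion.
extract_txt() {
	local rest word
	rest=${1#*\'}      # drop everything up to and including the first '
	rest=${rest%\'*}   # drop the final ' and anything after it

	while [ -n "$rest" ]; do
		word=${rest%% *}                 # text before the first space
		case $word in
			*.txt) echo "$word" ;;       # keep names ending in .txt
		esac
		case $rest in
			*' '*) rest=${rest#* } ;;    # move past the first space
			*)     rest= ;;              # no space left: that was the last word
		esac
	done
}

extract_txt "FILES='hoover.dat mon.txt prev.dat fin.txt do.txt'"
```

Run on the first sample string it should print mon.txt, fin.txt and do.txt, one per line, without starting any external program.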

Posted: Sat 28 Nov 2009, 01:15
by 2byte
Thank you so much. I think I actually deciphered that code :)
echo $STRING | tr " '" '\n' | grep "\.txt$"
STRING is piped to tr where the space and ' characters are translated into newline characters, then it's piped to grep and grep returns each line that contains ".txt". Correct?

One more question, if you don't mind. What is the significance of the escaped dot and the $ in your pattern for grep? Playing with the code, grep ".txt" returns the same output.

Code:

#!/bin/bash
STRING="FILES='hoover.2fs mon.txt prev.dat fin.txt do.txt'"

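# split on spaces and single quotes, keep the lines that contain ".txt"
array=$(echo "$STRING" | tr " '" '\n' | grep ".txt")
echo $array

# $array is a plain string, not a real array; the unquoted expansion is split into words
for i in $array; do
	echo "$i"
done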

#mon.txt fin.txt do.txt
#mon.txt
#fin.txt
#do.txt 
Thanks again, and I hope you had a nice supper :)

Posted: Sat 28 Nov 2009, 05:18
by Pizzasgood
The trailing $ stands for "end of line". It ensures that grep only returns filenames that end in .txt (as opposed to things like file.txt.log.gz).

As for the dot, it's because grep considers dot to be a wild card that stands for any single character. So a naked dot would let the pattern match files like file.blatxt. So I escaped it so that it would be treated as a literal period.

Grep uses regular expressions. Really useful things. (Sed and Awk also use regex, as does Perl.)
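
To see the difference for yourself, here is a small sketch that runs three versions of the pattern against the same list. The file names are made-up samples picked to trip up the looser patterns; they are not from the original string.

```shell
#!/bin/bash
# Hypothetical sample names chosen to show where the patterns differ.
names='mon.txt
file.txt.log.gz
file.blatxt
notes-txt-old'

printf '%s\n' '--- .txt   (dot is a wildcard, no anchor): matches all four'
printf '%s\n' "$names" | grep '.txt'

printf '%s\n' '--- \.txt  (literal dot, no anchor): mon.txt and file.txt.log.gz'
printf '%s\n' "$names" | grep '\.txt'

printf '%s\n' '--- \.txt$ (literal dot, anchored at end of line): mon.txt only'
printf '%s\n' "$names" | grep '\.txt$'
```

With the original sample string all three patterns happen to give the same answer, which is why dropping the backslash and the $ seemed to make no difference.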

Posted: Sat 28 Nov 2009, 09:23
by amigo
Bash can emulate simple grepping using 'case':

Code:

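#!/bin/bash

FILES='hoover.dat mon.txt prev.dat fin.txt do.txt'

for FILE in $FILES ; do
	# glob match: the name must end in .txt
	case $FILE in
		*.txt) echo "$FILE" ;;
	esac
	# this also works but is less accurate:
	# the match is not anchored to the end of the name
	# (both tests are shown together, so each match is printed twice)
	if [[ $FILE =~ '.txt' ]] ; then
		echo "$FILE"
	fi
done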

Posted: Sat 28 Nov 2009, 15:31
by 2byte
Wow. So much good information in just a couple of posts. That's a great link for regex, thanks Pizzasgood. Maybe now I can make sense of the regex Hieroglyphics I see in so many scripts.

Thank you too, Amigo. Your example actually makes more sense to me since I learned to program with BASIC so many years ago. I almost think that's a disadvantage. When it comes to programming I tend to think in BASIC, and was looking for the equivalents of instr, mid$, left$, right$, etc. It's hard to unlearn.
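
For anyone else coming from BASIC, here is a rough sketch of how those functions might be approximated with bash parameter expansion. These are loose analogues rather than exact equivalents: bash offsets start at 0 where BASIC counts from 1, and there is no built-in instr, so the position is worked out from the length of what comes before the match.

```shell
#!/bin/bash
s='hoover.dat'

left=${s:0:6}     # like LEFT$(s, 6)     -> hoover
right=${s: -3}    # like RIGHT$(s, 3)    -> dat (the space before the minus sign is required)
mid=${s:2:3}      # like MID$(s, 3, 3)   -> ove (offset 2 is the third character)

# INSTR-like search for ".": cut off the first dot and everything after it,
# then count the characters left in front.
prefix=${s%%.*}
if [ "$prefix" = "$s" ]; then
	pos=0                       # nothing was removed, so there is no dot
else
	pos=$(( ${#prefix} + 1 ))   # 1-based position, like INSTR(s, ".") -> 7
fi

echo "$left $right $mid $pos"
```

The substring forms `${s:offset:length}` are bash features and will not work in a plain POSIX sh.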

Posted: Sat 28 Nov 2009, 16:49
by amigo
I like doing things in 'pure shell' when it's possible and not too fancy or messy looking. In some cases it will run faster than calling external programs to do the job, because each external program adds startup latency. For really large or complex jobs you are usually better off using a separate tool, though.
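
As a rough illustration of that trade-off, the sketch below gets a file's extension two ways: once with a parameter expansion that runs inside the shell, and once by piping through sed, which starts a new process every time. The file name is just one of the samples from earlier in the thread.

```shell
#!/bin/bash
FILE='mon.txt'

# Pure shell: strip everything up to the last dot. No new process is started.
ext_builtin=${FILE##*.}

# External tool: same result, but each call forks a subshell and runs sed.
ext_sed=$(printf '%s\n' "$FILE" | sed 's/.*\.//')

echo "$ext_builtin $ext_sed"
```

For a single file the difference is invisible; it only starts to matter when a line like this sits inside a loop over many names, which you can check on your own machine by wrapping each version in a loop under `time`.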

Posted: Sun 29 Nov 2009, 01:53
by efiguy
Hello all,

This is very interesting, especially for some graphic programs that renumber (and may add to an existing name) images in the generation of thumbnails, and for use with other camera format naming.

Are the examples above the "heart" of the sort mechanism? There must be a lot more unstated code involved to deal with the eventual output.

Out of curiosity, and trying to learn by actually doing: is bash and/or regex capable of performing tasks like these?

a) How is the script used? Is it placed in a file located in the directory holding all the associated "file types" in order to be run?

b) Does it separate the "txt" files from the others, displaying just those files?

c) Does it generate a directory list?

d) Could it create directories, and copy and move files into those directories?

e) Could it create a list to be used in "href" tag production for indexes?

It's OK to refer me to other sites, as I'm a "newbie". I have bookmarked the regex link <;)

Thanks all
Jay

Posted: Sun 29 Nov 2009, 18:15
by 2byte
efiguy,
a) How is the script used? Is it placed in a file located in the directory holding all the associated "file types" in order to be run?
It was just an exercise to get a grip on bash string handling, in preparation for a utility I want to write. To answer the rest of your questions, yes, bash could be used to do all of those things.
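
As one small example of the last item (a list for "href" tags), something along these lines might do. It is only a sketch: `make_links` is a made-up name, and it does not escape characters such as & or < in file names, so it assumes plain names like the ones in this thread.

```shell
#!/bin/bash
# make_links is a hypothetical helper name, not something from the thread.
# It prints one HTML list item per .txt file found in the given directory.
make_links() {
	local dir path name
	dir=${1:-.}                         # default to the current directory
	for path in "$dir"/*.txt; do
		[ -e "$path" ] || continue      # the glob matched nothing
		name=${path##*/}                # strip the directory part
		printf '<li><a href="%s">%s</a></li>\n' "$name" "$name"
	done
}

make_links .
```

Redirecting the output to a file (for instance `make_links . > links.html`) gives a fragment that can be pasted into an index page.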

Speaking for myself, the basic bash commands are not the real difficulty. Executing the commands, putting the desired results into variables and working with those is the hard part. In other words, writing a script :). Here are a few links that are proving helpful.

An A-Z Index of the Bash command line for Linux.
http://ss64.com/bash/
Bash Guide for Beginners
http://tille.garrels.be/training/bash/index.html
Linux Shell Scripting Tutorial
http://steve-parker.org/sh/intro.shtml

Posted: Sun 29 Nov 2009, 19:48
by amigo
To make something really useful out of that code example, you'd need to replace the sample input with real data. For instance, to filter all the files and subdirs in the current directory, you could use:

Code:

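#!/bin/bash

# the glob * expands to every file and subdir in the current directory
for FILE in * ; do
   case $FILE in
      *.txt) echo "$FILE" ;;
   esac
   # this also works but is less accurate:
   # the match is not anchored to the end of the name
   if [[ $FILE =~ '.txt' ]] ; then
      echo "$FILE"
   fi
done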
And to do something useful with the filtered items, you'd use some other command instead of (or in addition to) the 'echo $FILE' command.
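
For example, the echo could be swapped for a copy into a separate directory. The sketch below is one way that might look; `copy_txt` and the default target name `txt_files` are arbitrary choices made for the illustration.

```shell
#!/bin/bash
# copy_txt is a hypothetical helper; "txt_files" as the default target is an arbitrary choice.
# It copies every regular file ending in .txt from one directory into another.
copy_txt() {
	local src dest FILE
	src=${1:-.}
	dest=${2:-txt_files}
	mkdir -p "$dest" || return 1        # create the target directory if needed
	for FILE in "$src"/*; do
		case $FILE in
			*.txt)
				if [ -f "$FILE" ]; then  # skip directories that happen to end in .txt
					cp "$FILE" "$dest"/
				fi
				;;
		esac
	done
}
```

Calling `copy_txt . txt_files` would then collect the .txt files from the current directory; replacing cp with mv would move them instead.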

Posted: Mon 30 Nov 2009, 00:56
by efiguy
Thank you, those are spectacular info sites, I have a lot to explore !!
jay

Posted: Mon 30 Nov 2009, 03:44
by sunburnt
These guys are great, aren't they efiguy?

This is where I started learning Bash scripting 4 years ago!

Posted: Fri 18 Dec 2009, 07:09
by shaily
Hi,
In order to make it easier to open my programming projects in vim, I have made a little bash script which starts in my projects directory, asks for a filename and then runs vim, to avoid having to cd all the way to it every time. The only thing I haven't been able to script is resizing the window to a more programming-friendly size. As I don't know much about bash scripting, I need your help.
Thanks.

Posted: Fri 18 Dec 2009, 15:24
by seaside
shaily wrote: Hi,
In order to facilitate opening my programming projects in vim, I have made a little bash script which starts in my projects directory, asks for a filename and then runs vim, to avoid having to cd all the way to it every time. The only thing I haven't been able to script is resizing the window to a more programming-friendly size. As I don't know much about bash scripting I need your help.
Thanks.
shaily,

You can control the window size from your "vimrc" file. You could put this in it:

Code:

set lines=50 columns=100
or -

Code:

if has("gui_running")
  " GUI is running or is about to start.
  " Maximize gvim window.
  set lines=99999 columns=99999
else
  " This is console Vim.
  if exists("+lines")
    set lines=50
  endif
  if exists("+columns")
    set columns=100
  endif
endif
Probably some other ways that I don't know about as well :D

cheers,
s