CLI jockey help??? finding file without particular string


cafe_latte
12-14-2009, 05:18 PM
I am new to Unix (Solaris 8) and have recently been assigned to update all our webpages. The site has about 12000 shtml files.

My problem: the person who was maintaining the pages died, and we (I) have no clue which files have been updated. He was in the process of adding a small snippet of code to each page, for example <!-- today's date -->.
I understand that the snippet is commented out, so it doesn't appear in the live pages.

How can I get a reverse list, i.e. the pages that don't have the above snippet?

Looking through the various man pages and online,
the following commands are close but not quite right.

find ./ -name "*.shtml" -exec egrep -v "<!-- today's date -->" {} \;

This displays every line of code in each webpage that doesn't contain the snippet.

find ./ -type f -name "*.shtml" | xargs perl -0 -n -e 'print $ARGV."\n" if /<!-- today'\''s date -->/'

This displays the files that contain the snippet.

Any help pointing me in the correct dir would be appreciated.

thanks,
Cafe

oesxyl
12-14-2009, 09:58 PM
I'm sorry, can you explain more clearly what you want to achieve? I understand that you want to find the names of the files without that comment, but I don't understand what is wrong with your first command line (the find ... egrep -v one).

best regards

cafe_latte
12-14-2009, 10:22 PM
Trying to locate all files that contain the specified code snippet and print a list of the exception files (the ones without it).

The CLI cmds provided work, except that they don't output the list of exceptions.

cafe_latte
12-14-2009, 10:27 PM
I just want to see a list of pages that don't contain the snippet.

Right now I'm seeing each document being opened, with the line # and raw code displayed for every line the snippet is not on.
This repeats until I get a buffer overflow.

oesxyl
12-14-2009, 10:31 PM
try this:

find / -name "*.shtml" -exec grep -H -c "<!-- today's date -->" {} \; | grep ':0$'

or replace the final grep ':0$' with some other filter.

best regards

cafe_latte
12-14-2009, 10:42 PM
grep not recognizing -H

supports:
grep [ -E | -F ] [ -c | -l | -q ] [-bhinsvwx ]
-e pattern_list ... [ -f pattern_file ] [ file ... ]

oesxyl
12-14-2009, 10:47 PM
-H shows the file name; it is probably a GNU extension or something, sorry.
Try without it, or replace it with whatever your man page says means "with filename".
I guess it will work without it. :)
-c is the important part here: it counts the matches, and that is how you filter only the lines with 0 matches. :)

Try replacing -H with -l.

best regards

cafe_latte
12-15-2009, 07:15 AM
OK, the first grep cmd with -l kinda worked, but what do I do with the second grep cmd???

oesxyl
12-15-2009, 02:05 PM
-c counts the matching lines in each file, and the second grep keeps only the output lines that report 0 matches.
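
A minimal sketch of that count-then-filter idea for a grep that has no -H, assuming a POSIX shell. The demo directory and file names are made up for illustration. The trick used here is not from the thread: when grep gets more than one file operand it prefixes each count with the file name, so /dev/null is passed as a second operand.

```shell
#!/bin/sh
# Hypothetical demo tree: two pages, only one carries the marker.
demo="${TMPDIR:-/tmp}/grepc_demo.$$"
mkdir -p "$demo"
printf '%s\n' '<html>' "<!-- today's date -->" '</html>' > "$demo/done.shtml"
printf '%s\n' '<html>' '</html>' > "$demo/todo.shtml"

# /dev/null as a second file operand makes grep print "name:count",
# which stands in for the missing -H option.
missing=$(find "$demo" -name '*.shtml' \
    -exec grep -c "<!-- today's date -->" {} /dev/null \; |
    grep -v '^/dev/null:' |   # drop the count reported for /dev/null
    grep ':0$' |              # keep only files with zero matching lines
    sed 's/:0$//')            # strip the count, leaving the file name

echo "$missing"
rm -rf "$demo"
```

Anchoring the filter as ':0$' ties it to the count at the end of the line, so a ":0" inside a file name is not picked up by accident.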

best regards

Spookster
12-15-2009, 05:55 PM
If you want to recursively search through directories of files for a particular string, I use this alias on Solaris.

alias grepr 'find ./ -name "*" | xargs grep'

so then you can just go: grepr -in "text to search for"

But in your case you can simply do this:
find ./ -name "*" | xargs grep -in "<!-- today's date -->"

This should return every line that contains that string, along with its file name and line number.

Never mind, I reread your post and you want the opposite.

cafe_latte
12-15-2009, 07:05 PM
Tried Spookster's code and it's closer, but still not it.

It's outputting the file w/line # where the code snippet is present.

cafe_latte
12-15-2009, 07:11 PM
Real-world example: I'm trying to locate all the files that do NOT have '<iframe src="http://www.servername.com/redirects/' and output a list of those files.
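
One possible way to get that list directly, sketched under assumptions: it relies on the -q and -F options shown in the grep usage earlier in the thread, and the demo directory and file names are invented for the example. grep -q prints nothing and only sets its exit status, so the file name is echoed when the string is NOT found.

```shell
#!/bin/sh
# Hypothetical demo tree: one page with the iframe, one without.
demo="${TMPDIR:-/tmp}/grepq_demo.$$"
mkdir -p "$demo/sub"
printf '%s\n' '<iframe src="http://www.servername.com/redirects/a.html">' > "$demo/has.shtml"
printf '%s\n' '<p>no frame here</p>' > "$demo/sub/lacks.shtml"

# grep -F treats the pattern as a fixed string; -q exits 0 on a match.
# The "||" branch therefore runs only for files without the string.
without=$(find "$demo" -name '*.shtml' -exec sh -c '
    grep -F -q "servername.com/redirects/" "$1" || echo "$1"
' sh {} \;)

echo "$without"
rm -rf "$demo"
```

This starts one small shell per file, which is slow but simple. Passing the name as "$1" keeps file names with spaces intact.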

oesxyl
12-16-2009, 04:58 AM
What is the output of this:

find . -name '*.shtml' -exec grep -lc "<!-- today's date -->" {} \;


Does it give you the file names and the number of matches in each file?

best regards

cafe_latte
12-16-2009, 05:29 AM
here's the exact cmd & output

find ./ -name "*.shtml" -exec grep -lc 'com/redirects.' {} \;
0
1
0
2
0

oesxyl
12-16-2009, 05:48 AM
and without -l?

find ./ -name "*.shtml" -exec grep -c 'com/redirects.' {} \;

If both the file names and the number of matches are in the output, the rest will be simple.

best regards

cafe_latte
12-16-2009, 07:13 AM
The -c gave the same output. But how do I match the number to a filename?

I created a script
#!/bin/sh
# comm needs sorted input, so sort both lists
find ./ -name "*.shtml" | sort > ./listall.txt
find ./ -name "*.shtml" -exec grep -l 'com/redirects.' {} \; | sort > ./stringlist.txt
# lines found only in listall.txt are the files without the string
comm -23 ./listall.txt ./stringlist.txt > ./x123.txt

oesxyl
12-16-2009, 07:38 AM
comm? I don't know if you have comm on Solaris.


comm -3 listall.txt ./stringlist.txt

comm -3 will suppress the lines common to both files; then you can use awk or mawk to output only the first or the second column.
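
A small sketch of how the comm columns behave, assuming a POSIX shell and two already sorted lists (comm expects sorted input). The list contents are hypothetical, and ./z.shtml is added to the second list only to make the second column visible.

```shell
#!/bin/sh
demo="${TMPDIR:-/tmp}/comm_demo.$$"
mkdir -p "$demo"

# Hypothetical file lists, sorted because comm requires it.
printf '%s\n' ./a.shtml ./b.shtml ./c.shtml | sort > "$demo/listall.txt"
printf '%s\n' ./b.shtml ./z.shtml | sort > "$demo/stringlist.txt"

# -3 hides the common lines but still prints two columns:
# lines found only in the second file are indented with a tab.
both_sides=$(comm -3 "$demo/listall.txt" "$demo/stringlist.txt")

# -23 hides columns 2 and 3, leaving only lines unique to the first file.
only_in_all=$(comm -23 "$demo/listall.txt" "$demo/stringlist.txt")

echo "$only_in_all"
rm -rf "$demo"
```

With -23 no awk step is needed afterwards, since only the first column is left.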

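I don't know how this will work on Solaris:

#!/bin/sh

# note: this loop breaks on file names that contain whitespace
for i in `find ./ -name '*.shtml'` ; do
# count the matching lines in this file
n=`grep -c 'com/redirects.' "$i"`
echo "${i}:${n}" >> ./mylist.txt
done
# you can put the next line here or, better, use it outside of this script.
grep ':0$' ./mylist.txt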


best regards

oesxyl
12-16-2009, 07:42 AM
Yes, this should work. :)
The next step is to use version control software, like CVS/SVN/RCS or others. :)

best regards