02-19-2006, 12:30 AM | #1
boohiss
Shorten text and HTML 4.01 strict

I asked this question in the HTML/CSS area a while back and didn't get a single idea of even where to start. Basically, I'm shortening text with a PHP function to display on a blog front page, and the cut is dropping closing HTML tags (like </blockquote> or </ul>).

The page displays just fine, but the unclosed tags can make it fail W3C HTML 4.01 Strict validation. I'm wondering if there's any way to fix or prevent this, or even just an idea of how to approach the problem.

Here's my original post:

http://codingforums.com/showthread.p...highlight=4.01
02-19-2006, 01:02 AM | #2
firepages
Most autogenerated 'teasers' are posted without formatting (strip_tags() etc.). I think half the reason is that it's not straightforward to do what you want, and the other half is that formatting used within a page may or may not work in the context of a small teaser.
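
As a rough illustration of the plain-text approach, here is a minimal sketch. The function name `plainTeaser` and the 30-word default are made up for the example; only `strip_tags()` comes from the post itself.

```php
<?php
// Hypothetical helper: build a plain-text teaser from an HTML post body.
function plainTeaser(string $html, int $maxWords = 30): string
{
    // Drop all markup, then decode entities so words are counted on real text.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Collapse runs of whitespace left behind by the removed tags.
    $text = trim(preg_replace('/\s+/', ' ', $text));

    $words = explode(' ', $text);
    if (count($words) > $maxWords) {
        $text = implode(' ', array_slice($words, 0, $maxWords)) . '...';
    }

    // Re-escape so the teaser is safe to place back into a page.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}
```

Since no tags survive, there is nothing left unclosed and the teaser cannot break validation. One caveat: `strip_tags()` does not insert spaces, so text from adjacent block elements such as `<p>a</p><p>b</p>` runs together.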

You could simply store a separate field (in your DB or however you are storing posts) just for the teaser, since you may often want to summarize the main content rather than grab the first $x words.

Parsing the content and repairing it is not that easy, since there may be nested tags etc.; there is no regex one-liner that covers that.
You could possibly use a third-party sanitizer like htmlTidy, but that seems overkill to me.
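
For reference, a sketch of what the sanitizer route might look like, assuming PHP's optional tidy extension is installed. The helper name `repairFragment` and the chosen options are assumptions, not something from the thread.

```php
<?php
// Hypothetical helper: repair a truncated HTML fragment with the tidy extension.
// Returns null when the extension is not available or the repair fails.
function repairFragment(string $fragment): ?string
{
    if (!function_exists('tidy_repair_string')) {
        return null;
    }

    $config = [
        'show-body-only' => true, // return just the fragment, not a whole document
        'wrap'           => 0,    // do not re-wrap long lines
    ];

    $fixed = tidy_repair_string($fragment, $config, 'utf8');

    return $fixed === false ? null : trim($fixed);
}
```

Tidy may also reformat or rewrite other parts of the markup, which is part of why it can feel like overkill for a short teaser.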

A separate field for the teaser, or strip_tags(), would be my choice (and in that order).
02-19-2006, 03:22 AM | #3
boohiss
Quote:
Originally Posted by firepages
A separate field for the teaser, or strip_tags(), would be my choice (and in that order).
Thanks for your reply, firepages.

I had thought of the strip_tags option, and decided I didn't want to lose anchor tags, bold, italic, etc, in the 'teaser' as you call it.
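
For what it's worth, `strip_tags()` accepts a second argument listing tags to keep, so a middle ground is possible. A minimal sketch, with a made-up helper name and an assumed whitelist:

```php
<?php
// Hypothetical helper: drop block-level markup but keep a whitelist of inline tags.
function inlineOnly(string $html): string
{
    // The second argument lists the tags strip_tags() should leave in place.
    return strip_tags($html, '<a><b><i><em><strong>');
}

echo inlineOnly('<blockquote><p>See <a href="/x">this</a> and <b>that</b></p></blockquote>');
// prints: See <a href="/x">this</a> and <b>that</b>
```

This removes the </blockquote> and </ul> cases, but a cut that lands inside a kept `<a>` or `<b>` still leaves that tag open, so the teaser would still need its open tags closed afterwards.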

I hadn't thought of storing a teaser in a separate field. I suppose that's the DB designer in me putting blinders on to anything that even comes close to data duplication. I'm not sure the extra work involved (an ongoing cost for every post, not a one-time cost like some magical function would be) is worth it just to have my page validate as HTML 4.01 Strict.

I was also considering that since I'm parsing every character anyway with my ShortenText function, why not keep track of which tags are 'open' (recursively maybe) and close any that remain open once I've parsed a long enough teaser string?
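
A rough sketch of that idea, not the actual ShortenText function (which isn't shown in the thread). The name `shortenHtml`, the void-tag list, and the `...` suffix are assumptions for illustration. A plain stack is enough here, so no recursion is needed.

```php
<?php
// Hypothetical helper: cut an HTML fragment after roughly $maxChars visible
// characters and close whatever tags are still open at the cut point.
function shortenHtml(string $html, int $maxChars): string
{
    // Elements that never take an end tag in HTML 4.01.
    $voidTags = ['area', 'base', 'br', 'col', 'hr', 'img', 'input', 'link', 'meta', 'param'];

    // Split into tags and text runs, keeping the tags as tokens.
    $tokens = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $open  = [];  // stack of currently open tag names
    $out   = '';
    $count = 0;   // visible characters (bytes) emitted so far

    foreach ($tokens as $token) {
        if ($token[0] === '<' && substr($token, -1) === '>') {
            if (preg_match('/^<(\/?)([a-zA-Z][a-zA-Z0-9]*)/', $token, $m)) {
                $name = strtolower($m[2]);
                if ($m[1] === '/') {
                    // End tag: pop back to its opener, discarding any inner
                    // elements whose own end tag was omitted (e.g. <li>, <p>).
                    if (in_array($name, $open, true)) {
                        while (array_pop($open) !== $name) {
                        }
                    }
                } elseif (!in_array($name, $voidTags, true)) {
                    $open[] = $name;
                }
            }
            $out .= $token; // markup does not count toward the limit
            continue;
        }

        $remaining = $maxChars - $count;
        if (strlen($token) > $remaining) {
            $cut = substr($token, 0, $remaining);
            // Back up to the last space so a word or entity is not split.
            $space = strrpos($cut, ' ');
            if ($space !== false) {
                $cut = substr($cut, 0, $space);
            }
            $out .= rtrim($cut) . '...';
            break;
        }

        $out   .= $token;
        $count += strlen($token);
    }

    // Close the remaining open tags, innermost first.
    while ($open) {
        $out .= '</' . array_pop($open) . '>';
    }

    return $out;
}

echo shortenHtml('<blockquote><p>Hello <b>brave new</b> world</p></blockquote>', 12);
// prints: <blockquote><p>Hello <b>brave...</b></p></blockquote>
```

Known gaps in this sketch: it counts bytes rather than characters (the mb_* functions would be needed for multibyte text), it assumes no `>` appears inside attribute values or comments, and it does not check that the input nesting was valid to begin with.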
All times are GMT +1.