regexp match <script></script>


Denis Perelyubskiy
hello,

how do I match <script>(something here)</script>, where (something here)
does not contain any other <script> tags and can span multiple lines?

along the same lines, I tried:
<script\_.*<\/script>, but did not know how to make the match stop on
the first <\/script> (I tried \(<\/script>\)\{-}, but somehow that did
not do what I wanted).

Also, I did not know how to exclude <script> from the \_.* match. I
tried changing this to something like this:
<script\_.*\(<script>\)\{0}\_.*<\/script>

Perhaps I am going about it completely the wrong way. Is this even
possible?

thanks!

-d
--
// mailto: Denis Perelyubskiy <lists at overwhelmTAKECAPITALSOUT dot net>
// icq   : 12359698


Re: regexp match <script></script>

Tim Chase-2
> along the same lines, I tried: <script\_.*<\/script>, but did
> not know how to make the match stop on the first <\/script> (i
> tried \(<\/script>\)\{-}, but somehow that did not do what I
> wanted.

You were very close. You likely wanted

        <script\_.\{-}<\/script>

because "*" is greedy: it slurps up as many "\_." items as it can, so
the match runs from your very first <script> through your very last
</script> tag. "\{-}" is the non-greedy counterpart and stops at the
first </script> it can reach.
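
[Editor's sketch] The greedy versus non-greedy difference can be checked outside Vim as well. The snippet below is only an analogue, assuming Python's `re` module: `.*` with `re.DOTALL` stands in for Vim's `\_.*`, and `.*?` stands in for `\_.\{-}`. The sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample: two script blocks, the second spanning several lines.
text = "<script>a()</script>\n<p>text</p>\n<script>\nb()\n</script>"

# Greedy, like Vim's \_.* : one match from the first <script to the last </script>.
greedy = re.findall(r"<script.*</script>", text, flags=re.DOTALL)

# Non-greedy, like Vim's \_.\{-} : each match stops at the nearest </script>.
lazy = re.findall(r"<script.*?</script>", text, flags=re.DOTALL)

print(len(greedy), "greedy match(es)")
print(len(lazy), "non-greedy match(es)")
```

The greedy pattern swallows the paragraph between the two blocks, which is the behaviour the original attempt ran into.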

-tim






Re: regexp match <script></script>

Gary Johnson
In reply to this post by Denis Perelyubskiy
On 2005-09-27, Denis Perelyubskiy <[hidden email]> wrote:

> hello,
>
> how do I match <script>(something here)</script>, where (something here)
> does not contain any other <script> tags and can span multiple lines?
>
> along the same lines, I tried:
> <script\_.*<\/script>, but did not know how to make the match stop on
> the first <\/script> (i tried \(<\/script>\)\{-}, but somehow that did
> not do what I wanted.
>
> Also, I did not know how to exclude <script> from the \_.* match. I
> tried changing this to something like this:
> <script\_.*\(<script>\)\{0}\_.*<\/script>
>
> Perhaps I am going about it completely the wrong way. Is this even
> possible?

So close.

    <script\_.\{-}<\/script>

From ":help /\{-":

    \{-} matches 0 or more of the preceding atom, as few as possible

So in the pattern above, "\{-}" will cause the previous atom, "\_.",
to be matched as few times as possible before the following part of
the pattern, "<\/script>", is matched.

HTH,
Gary

--
Gary Johnson                 | Agilent Technologies
[hidden email]     | Wireless Division
                             | Spokane, Washington, USA

Re: regexp match <script></script>

A.J.Mechelynck
In reply to this post by Denis Perelyubskiy
----- Original Message -----
From: "Denis Perelyubskiy" <[hidden email]>
To: "Vim" <[hidden email]>
Sent: Tuesday, September 27, 2005 10:19 PM
Subject: regexp match <script></script>


> hello,
>
> how do I match <script>(something here)</script>, where (something here)
> does not contain any other <script> tags and can span multiple lines?
>
> along the same lines, I tried:
> <script\_.*<\/script>, but did not know how to make the match stop on
> the first <\/script> (i tried \(<\/script>\)\{-}, but somehow that did
> not do what I wanted.
>
> Also, I did not know how to exclude <script> from the \_.* match. I
> tried changing this to something like this:
> <script\_.*\(<script>\)\{0}\_.*<\/script>
>
> Perhaps I am going about it completely the wrong way. Is this even
> possible?
>

To exclude the tags from the match itself:
    /<script>\zs\_.\{-}\ze<\/script>
        <script>
            matches itself
        \zs
            set start-of-match here
        \_.
            anything including an end-of-line
        \{-}
            repeated zero or more times, as few as possible
        \ze
            set end-of-match here
        <\/script>
            matches </script>

However, the above pattern doesn't handle mismatched or embedded tags: it
matches from any opening tag to the first closing tag that follows it.
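
[Editor's sketch] The original question also asked that the matched text contain no other <script> tag. One way to express that is a negative lookahead on every character of the body. In Vim syntax this would be something along the lines of <script>\%(\%(<script\)\@!\_.\)\{-}<\/script> (an untested assumption, not a pattern from this thread). The Python analogue below, with made-up sample text, shows the idea: the plain non-greedy pattern starts at the stray opening tag, while the stricter one skips ahead to the well-formed block.

```python
import re

# Hypothetical input with a stray opening tag before a well-formed block.
text = "<script>broken <script>inner()</script> tail"

# Plain non-greedy match: starts at the first opening tag it finds.
lazy = re.compile(r"<script>(.*?)</script>", re.DOTALL)

# Stricter variant: each body character must not begin another "<script>".
strict = re.compile(r"<script>((?:(?!<script>).)*?)</script>", re.DOTALL)

lazy_body = lazy.search(text).group(1)
strict_body = strict.search(text).group(1)

print(repr(lazy_body))
print(repr(strict_body))
```

This still does not validate the markup; it only refuses matches whose body contains another opening tag.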


Best regards,
Tony.



Re: regexp match <script></script>

A.J.Mechelynck
----- Original Message -----
From: "Denis Perelyubskiy" <[hidden email]>
To: "Tony Mechelynck" <[hidden email]>
Sent: Tuesday, September 27, 2005 10:45 PM
Subject: Re: regexp match <script></script>


> Thanks, Tony. My SCRIPT tags are all very well formed, so I can squeak
> by :)
>
> denis
[...]

My pleasure. If the opening tag can carry attributes (like
"language=JavaScript" or something) you'll have to match those too if you
want the match to exclude the tags:

    /<script\>\_.\{-}>\zs\_.\{-}\ze<\/script>

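The only new thing here is \> which matches an end-of-word, so that
"<script" is not taken as the start of a longer tag name. The "\_.\{-}>"
after it skips any attributes up to the closing ">" of the opening tag.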
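
[Editor's sketch] A rough Python analogue of the pattern above, for checking the behaviour outside Vim. The assumptions are labelled: `\b` stands in for Vim's `\>`, a capture group stands in for `\zs`/`\ze`, and the sample text (including the fake <scripting> tag) is invented for the demonstration.

```python
import re

# Hypothetical input: a look-alike tag, then a script tag with an attribute.
text = (
    "<scripting>no</scripting>\n"
    '<script language="JavaScript">\nrun();\n</script>'
)

# \b plays the role of Vim's \> ; ".*?>" skips the attributes; the group
# captures only the text between the tags.
pattern = re.compile(r"<script\b.*?>(.*?)</script>", re.DOTALL)
bodies = pattern.findall(text)

# Without the word boundary, "<scripting>" is accepted as an opening tag.
loose = re.compile(r"<script.*?>(.*?)</script>", re.DOTALL)
loose_bodies = loose.findall(text)

print(bodies)
print(loose_bodies)
```

With the boundary in place only the real script body is captured; without it the match begins inside the look-alike tag.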


Best regards,
Tony.