regex help

9 messages

regex help

Muskoka Auto Parts Limited

Hi All
        I'm trying to work out if it's possible to refer to 'the previous  
line' in a regex.
e.g.  if the first 8 characters of a line are blank,
    ^\s{8}
replace them with the 8 characters at the start of the previous line.

Ideally it would handle one line at a time, so that several consecutive
blank line starts would all be filled in with the last non-blank start.

Hoping that made sense...
Brian

Re: regex help

Muskoka Auto Parts Limited


On 16-May-07, at 12:49 PM, Gene Kwiecinski wrote:

> :g/^\(.{8}\)\(.*\)\n\(\s{8}\)/s//\1\2\r\1/


Well,
:g/^\(.\{8}\)\(.*\)\n\(\s{8}\)/s//\1\2\r\1/
        ^

works, but I'd have to run it 12 times if there are twelve blanks  
after the filled in line.

I suppose global picks all the lines, then operates on them.

That's a start - thanks!


RE: regex help

Gene Kwiecinski
>>:g/^\(.{8}\)\(.*\)\n\(\s{8}\)/s//\1\2\r\1/

>Well,
>:g/^\(.\{8}\)\(.*\)\n\(\s{8}\)/s//\1\2\r\1/
>        ^
>works, but I'd have to run it 12 times if there are twelve blanks  
>after the filled in line.

Hm?  Not sure why you escaped the '{'.  Apparently didn't need to after
the "\s".


>I suppose global picks all the lines, then operates on them.

Ideally, yeh.  Wondered if the op for lines 3-4 would fill in the chars,
then the op would also work on lines 4-5, or if it'd still think the 1st
chars were blank and skip to lines 5-6 as matching the pattern.


>That's a start - thanks!

No worries.

Re: regex help

Tim Chase-2
In reply to this post by Muskoka Auto Parts Limited
>> I'm trying to work out if it's possible to refer to 'the previous  
>> line' in a regex.
>> e.g.  if the first 8 characters of a line are blank,
>>     ^\s{8}
>> replace them with the 8 characters at the start of the previous line.
>>
>> Ideally it would handle a line a time, thus multiple blank line  
>> starts would be filled in with the last non blank start.
>
>
> Something like the following (untested) regexp should do the trick:
>
>    :%s/^\(.\{8}\).*\n\zs\s\{8}/\1
>
> or
>
>    :%s/^\(.\{8}\)\(.*\n\)\s\{8}/\1\2\1

Playing around with this a little more ("how did you spend your
morning, dear?"  "Oh, just playing around with some regular
expressions for a guy on an email list"), one could expand it from 8
spaces to N spaces with something like

  :g/^\s\+/k a|?^\S?y|'a|s/^\s\+/\=strpart(@", 0, strlen(submatch(0)))

It tromps on your scratch register, and your "a" mark (the "k a"
and "'a" bits) but it is a little more flexible.

If you need multiples of a given number of spaces, you could
alter it to

  :g/^\(\s\{4}\)\+/...

which would be multiples of 4 whitespace characters.

-tim
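
A small aside on the "multiples of 4" pattern, sketched in Python rather than Vim (so the regex syntax differs, and matched_indent is just an illustrative name): a repeated group of exactly 4 blanks matches the longest whole-multiple-of-4 prefix, so a line indented by 6 blanks still matches, but only on its first 4.

```python
import re

# Python equivalent of ^\(\s\{4}\)\+ : one or more groups of exactly 4 blanks.
INDENT_BY_4 = re.compile(r"^(?:[ \t]{4})+")

def matched_indent(line):
    """Length of the leading whitespace matched in whole groups of 4 (0 if none)."""
    m = INDENT_BY_4.match(line)
    return len(m.group(0)) if m else 0

for text in ["  x", "    x", "      x", "        x"]:
    print(repr(text), matched_indent(text))
```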



Re: regex help

Muskoka Auto Parts Limited
In reply to this post by Muskoka Auto Parts Limited

On 16-May-07, at 1:05 PM, Tim Chase wrote:

> :g/^\s\{8}/-s/^\(.\{8}\).*\n\zs\s\{8}/\1

I don't understand that regex completely - but it deletes lines of  
data :-)

Looks like it globally matches the 'blank start' lines, then searches  
in that for the pattern - thus deleting the third line...

What does the -s do, as compared to just s after the g/pattern/?

Brian

Re: regex help

Muskoka Auto Parts Limited
In reply to this post by Gene Kwiecinski


On 16-May-07, at 2:23 PM, Gene Kwiecinski wrote:

>>> :g/^\(.{8}\)\(.*\)\n\(\s{8}\)/s//\1\2\r\1/
>
>> Well,
>> :g/^\(.\{8}\)\(.*\)\n\(\s{8}\)/s//\1\2\r\1/
>>        ^
>> works, but I'd have to run it 12 times if there are twelve blanks
>> after the filled in line.
>
> Hm?  Not sure why you escaped the '{'.  Apparently didn't need to  
> after
> the "\s".
>
You are right - it was a typo - I actually need to escape both of them -

:g/^\(.\{8}\)\(.*\)\n\(\s\{8}\)/s//\1\2\r\1/
        ^                 ^

Brian

Re: regex help

Muskoka Auto Parts Limited
In reply to this post by Tim Chase-2

On 16-May-07, at 2:27 PM, Tim Chase wrote:

> :g/^\s\+/k a|?^\S?y|'a|s/^\s\+/\=strpart(@", 0, strlen(submatch(0)))


That works....  dunno what it does... but that works.... :-)

I'm going to record that little gem, and put it aside for a bedtime  
puzzle I think.

Thank you

Brian

PS  I promise to use that little sucker at least 30 or 40 times in  
the next month or so - please believe me when I say I appreciate it.
     Currently I'm importing into Excel and using a really really  
nasty couple of 'if then else' formulas to achieve the same result!

Re: regex help

Tim Chase-2
In reply to this post by Muskoka Auto Parts Limited
>> :g/^\s\{8}/-s/^\(.\{8}\).*\n\zs\s\{8}/\1
>
> I don't understand that regex completely - but it deletes
> lines of data :-)
>
> Looks like it globably matches the 'blank start' lines, then
> searches in that for the pattern - thus deleting the third
> line...

That's really odd...the "\zs" *should* be forcing the substitute
to start on the "next" line.  I suspect I've stumbled across an
odd bug?  Switching it to the following

   :g/^\s\{8}/-s/^\(.\{8}\)\(.*\n\)\s\{8}/\1\2\1

worked, though in theory *should* be the same sort of thing.
It's also much shorter and more readable than the "\=" version I
sent second.

> What's the -s do as compared to just s after the g/pattern/

The -s is a range/offset ("-", which is the same as "-1", meaning
"back one line") and the command ("s"ubstitute).

For your bedtime reading, they break down as

:g          on every line
/^\s\{8}/   starting with 8 whitespace characters
-           go to the previous line
s/          and substitute
^             from the beginning of the line
\(.\{8}\)     make note of 8 "whatever"s as "\1"

==========[non-working]========
.*            and skip the rest of the line
\n            and a newline
\zs           and start the replacement here
               treating everything before the \zs as
               merely required context
==========[working]============
\(.*\n\)      make note of the rest of the line as "\2"
===============================

\s\{8}     the 8 whitespace characters we want to replace


The replacement in the first [non-working] *should* simply be the
thing we tagged as \1.  In the second [working] version, the
replacement is "the first tagged thing, followed by the second
tagged thing (so the whole previous line remains untouched),
followed by the first tagged thing again".

>> > :g/^\s\+/k a|?^\S?y|'a|s/^\s\+/\=strpart(@", 0, strlen(submatch(0)))
>
>
> That works....  dunno what it does... but that works.... :-)
>
> I'm going to record that little gem, and put it aside for a bedtime  
> puzzle I think.

It helps to break it at the pipes :)

:g/^\s\+       on lines beginning with some whitespace
k a            mark that line as "a"
|              and
?^\S?          search backwards for a line beginning w/ non-ws
y              yanking that line into the scratch register
|              and
'a             jump back to the "a" mark we placed earlier
|              and
s/             do a substitute
^\s\+          of the leading whitespace[*]
/              with
\=             the results of this expression
strpart(           a piece of
@",                the stuff we yanked previously
0,                 starting at the beginning
strlen(            and running for the length of
submatch(0)        the whitespace[*] we're replacing
))

It's a bit terse to say the least, and has slightly odd behaviors
if you have staircased indentation like

first line
    second line
         third line

where you'll end up with

first line
firssecond line
firsseconthird line

I'm glad they should save you oodles of time...that's one of the
great things about Vim :)

-tim
(is this email a sign I need to get a life? :)
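
For anyone who would rather read the variable-width version as a loop, here is a rough Python sketch of the same steps (measure the leading whitespace, look backwards for a line beginning with non-whitespace, copy that many characters). It is an illustration, not a translation of Vim's behaviour in every corner: fill_variable is a made-up name, and leaving a line untouched when no earlier line qualifies is an assumption of this sketch. With indents of 4 and 9 blanks it reproduces the staircase result, because the backward search finds the line that was just filled in.

```python
import re

LEADING_WS = re.compile(r"^[ \t]+")

def fill_variable(lines):
    """Replace leading whitespace with the same-length start of an earlier line."""
    out = []
    for line in lines:
        m = LEADING_WS.match(line)
        if m:
            n = len(m.group(0))
            # Walk back through the already-processed lines for one that
            # begins with a non-whitespace character (the ?^\S? step).
            for prev in reversed(out):
                if prev and not prev[0].isspace():
                    line = prev[:n] + line[n:]
                    break
        out.append(line)
    return out

for row in fill_variable(["first line", "    second line", "         third line"]):
    print(row)
```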



RE: regex help

Gene Kwiecinski
In reply to this post by Muskoka Auto Parts Limited
>>>Well,
>>>:g/^\(.\{8}\)\(.*\)\n\(\s{8}\)/s//\1\2\r\1/
>>>       ^
>>>works, but I'd have to run it 12 times if there are twelve blanks
>>>after the filled in line.

>>Hm?  Not sure why you escaped the '{'.  Apparently didn't need to  
>>after
>>the "\s".

>You are right - it was a typo - I actually need to escape both of them -

>:g/^\(.\{8}\)\(.*\)\n\(\s\{8}\)/s//\1\2\r\1/
>        ^                 ^

Ah, lookit that...  '[' is normally magic unless you escape it to
literal text, and '{' is normally literal text unless you escape it to
magic.  Eerie, don't think I had occasion to put that to the test
before.

Hah, learn something new every day...

Been using "{#}" notation in lex/js/perl so long, don't think I had
occasion to actually use them in 'vim', else I would've run into that
problem before.
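
For contrast, a tiny Python check of the "{#}" convention from the other tools (Python's re follows the Perl style here). An unescaped brace is the count and an escaped brace is literal text, which is the reverse of what the thread found for Vim's default mode, where \{8} is the count and a bare { is literal. The names COUNTED, LITERAL and matches are illustrative only.

```python
import re

COUNTED = re.compile(r"a{3}")    # quantifier: exactly three "a"
LITERAL = re.compile(r"a\{3}")   # escaped brace: the literal text "a{3}"

def matches(pattern, text):
    """True if the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None

print(matches(COUNTED, "aaa"), matches(COUNTED, "a{3}"))
print(matches(LITERAL, "a{3}"), matches(LITERAL, "aaa"))
```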