pattern for substitution including linefeed and carriage return

Erhy
Hello!
I have a CSV file in which some fields span several lines,
and I want to delete them. These fields are also enclosed in text markers ("),
e.g.
30.11.2017;"Name";"Legend
for name
is not found";"New York";

How do I remove a field such as

"Legend
for name
is not found"

using wildcards that target only the fields containing linefeeds?

Thank you Erhy

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

---
You received this message because you are subscribed to the Google Groups "vim_use" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: pattern for substitution including linefeed and carriage return

Gary Johnson-4
On 2017-12-10, Erhy wrote:

> Hello!
> I have a CSV file in which some fields span several lines,
> and I want to delete them. These fields are also enclosed in text markers ("),
> e.g.
> 30.11.2017;"Name";"Legend
> for name
> is not found";"New York";
>
> How do I remove a field such as
>
> "Legend
> for name
> is not found"
>
> using wildcards that target only the fields containing linefeeds?

A pattern that will match that string is

    "Legend\nfor name\nis not found"

\n matches the end-of-line marker in the Vim buffer.  Vim converts
a file's end-of-line markers to its internal end-of-line markers
when it reads a file, and writes the appropriate end-of-line markers
to a file when it writes a file, according to the setting of
'fileformat', so you don't need to concern yourself with whether the
file uses LF or CR-LF at the ends of lines.

To delete all lines containing that string, you could use:

    :g/"Legend\nfor name\nis not found"/.,+2d

The :g/<pattern>/d command deletes only the line at which the
pattern matches, which would delete only the line containing
'"Legend'.  The .,+2 is needed to delete all three lines.

See

    :help /\n
    :help 'fileformat'
    :help /\_.
    :help :range
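
For readers who want to try the same idea outside Vim, here is a minimal Python sketch (not Vim script, and not part of the original reply). It assumes the file content is already in a string. The helper name delete_lines_at_matches and the first and last sample rows are invented for illustration. str.splitlines() stands in for Vim's end-of-line conversion by accepting both LF and CR-LF, and the loop mimics :g/<pattern>/.,+2d by dropping the line where a match starts plus the two lines after it.

```python
import bisect
import re


def delete_lines_at_matches(text, pattern, extra_lines=2):
    """Drop each line on which `pattern` starts to match, plus `extra_lines` more."""
    # splitlines() accepts LF and CR-LF alike, so the pattern only ever sees "\n".
    lines = text.splitlines()
    joined = "\n".join(lines)

    # Character offset at which each line starts inside `joined`.
    starts = []
    offset = 0
    for line in lines:
        starts.append(offset)
        offset += len(line) + 1  # +1 for the "\n" separator

    doomed = set()
    for match in re.finditer(pattern, joined):
        # Index of the line in which this match begins.
        first = bisect.bisect_right(starts, match.start()) - 1
        doomed.update(range(first, first + extra_lines + 1))

    return "\n".join(line for i, line in enumerate(lines) if i not in doomed)


# Hypothetical sample with CR-LF line ends; only the middle record is from the thread.
sample = (
    '01.12.2017;"A";"x";"Paris";\r\n'
    '30.11.2017;"Name";"Legend\r\n'
    'for name\r\n'
    'is not found";"New York";\r\n'
    '02.12.2017;"B";"y";"Rome";\r\n'
)
print(delete_lines_at_matches(sample, r'"Legend\nfor name\nis not found"'))
```

As with the :g command, the whole three-line record is dropped, including the date and the "New York" field, not just the quoted text.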

HTH,
Gary

Re: pattern for substitution including linefeed and carriage return

C.v.St.
In reply to this post by Erhy

On 12/10/2017 at 05:08 PM, Erhy wrote:
> ... These fields are also enclosed in text markers ("), e.g.
> 30.11.2017;"Name";"Legend
> for name
> is not found";"New York";
So you want to delete the contents in cases where
the last character of a line is not a ", followed
by one or more lines containing no " at all, up to a line that
contains a " but does not begin with one?
(That misses some cases and will only mostly work,
since it depends on a single ;" in the first line
and a single "; in the last line of the change.
Lines with unbalanced ", with an extra newline
outside quotes, or with a newline in the first or
last field would break the pattern or be ignored.)

The simple case might be done with:

:s/;"[^"]*\n\([^"]*\n\)[^"]*";/;"";/

Or do you want to remove only the newlines and keep
the text? I believe that would be more complicated,
because of the newlines inside the \(...\) group.
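
To make the scope of that pattern concrete, here is a rough Python translation (a sketch, not Vim syntax; the names THREE_LINE_FIELD and blank_three_line_fields are hypothetical). One difference has to be handled by hand: in Vim a collection such as [^"] does not match an end-of-line, whereas in Python it does, so "\n" is excluded explicitly.

```python
import re

# Python's [^"] would also match "\n", so the newline is excluded explicitly
# to reproduce Vim's behaviour, where a collection stays within one line.
THREE_LINE_FIELD = re.compile(r';"[^"\n]*\n[^"\n]*\n[^"\n]*";')


def blank_three_line_fields(text):
    """Replace every quoted field spanning exactly three lines with ""."""
    return THREE_LINE_FIELD.sub(';"";', text)


record = '30.11.2017;"Name";"Legend\nfor name\nis not found";"New York";\n'
print(blank_three_line_fields(record))
```

The pattern has exactly one middle line, so a quoted field spanning two lines, or four or more, is left untouched.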

Stucki

Re: pattern for substitution including linefeed and carriage return

Tim Chase
In reply to this post by Gary Johnson-4
On 2017-12-10 09:52, Gary Johnson wrote:
> A pattern that will match that string is
>
>     "Legend\nfor name\nis not found"
>
>     :help /\_.

The "\_" convention holds for things other than "." to add the "and
include newline" connotation, so you can change your spaces to

  Legend\_s\+for\_s\+name\_s\+is\_s\+not\_s\+found

which would allow a new-line as part of any of the whitespace in your
sentence.
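
The same trick can be sketched in Python, where \s already includes the newline, so no \_ prefix is needed. This is only an illustration outside Vim; whitespace_tolerant is a hypothetical helper that joins the words of a phrase with \s+.

```python
import re


def whitespace_tolerant(phrase):
    """Compile a pattern that allows any whitespace run, newlines included, between words."""
    words = phrase.split()
    return re.compile(r"\s+".join(re.escape(word) for word in words))


pattern = whitespace_tolerant("Legend for name is not found")

# The line breaks may fall between any two words.
print(bool(pattern.search('"Legend\nfor name\nis not found"')))
print(bool(pattern.search('"Legend for\nname is\nnot found"')))
```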

-tim


Re: pattern for substitution including linefeed and carriage return

Erhy
In reply to this post by C.v.St.
Stucki, thanks for your answer!

In my CSV file these multi-line fields have
different numbers of lines, at least two,
and contain various text.

Is it also possible to delete all such fields at once?

Thank you all

Erhy




Re: pattern for substitution including linefeed and carriage return

C.v.St.


On 12/10/2017 at 10:50 PM, Erhy wrote:

> In my CSV file these multi-line fields have
> different numbers of lines, at least two,
> and contain various text.
>
> Is it also possible to delete all such fields at once?

>> :s/;"[^"]*\n\([^"]*\n\)[^"]*";/;"";/

As usual, :%s/.../ applies the substitution
to all lines of the file (the complete syntax
of ranges is described in ":help cmdline-ranges").
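
Since the fields in question span a varying number of lines (at least two), the single middle-line group of the quoted pattern would need to repeat. A minimal Python sketch of that generalization follows. It is not Vim syntax, the names MULTILINE_FIELD and blank_multiline_fields_regex are hypothetical, and the sample rows are made up. Presumably the analogous change in Vim is to let the \(...\) group repeat, but that is an untested assumption here.

```python
import re

# ;" then text up to the line end, then one or more further lines
# without any quote character, then the closing ";
MULTILINE_FIELD = re.compile(r';"[^"\n]*(?:\n[^"\n]*)+";')


def blank_multiline_fields_regex(text):
    """Replace every quoted field that spans two or more lines with ""."""
    return MULTILINE_FIELD.sub(';"";', text)


# Made-up rows: a two-line field, a four-line field, a single-line field.
text = '1;"a\nb";x;\n2;"p\nq\nr\ns";y;\n3;"single";z;\n'
print(blank_multiline_fields_regex(text))
```

The earlier caveats still apply: a multi-line field in the first column, or one containing escaped quotes, is not matched by this sketch.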

Stucki

Re: pattern for substitution including linefeed and carriage return

Eike Rathke-3
In reply to this post by Erhy
Hi Erhy,

On Sunday, 2017-12-10 08:08:36 -0800, Erhy wrote:

> I have a CSV file in which some fields span several lines,
> and I want to delete them. These fields are also enclosed in text markers ("),
> e.g.
> 30.11.2017;"Name";"Legend
> for name
> is not found";"New York";

As a side note, are you aware that such multi-line field content is
valid in CSV if it is quoted? If for some reason the processing software
cannot cope with multi-line content, I'd rather suggest replacing only
the embedded newlines with spaces, so the actual field content
is preserved instead of stripped.
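
One way to do that replacement outside Vim is sketched below with Python's standard csv module. The semicolon delimiter and double-quote text marker are taken from the sample in this thread; flatten_multiline_fields is a hypothetical helper name, and the whole thing is an illustration under those assumptions rather than a tested converter for the bank's format.

```python
import csv
import io


def flatten_multiline_fields(text, delimiter=";"):
    """Replace line breaks inside quoted fields with single spaces."""
    # newline="" leaves the line ends alone so the csv module can see them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        # splitlines() handles LF and CR-LF inside a field.
        writer.writerow([" ".join(field.splitlines()) for field in row])
    return out.getvalue()


record = '30.11.2017;"Name";"Legend\nfor name\nis not found";"New York";\n'
print(flatten_multiline_fields(record), end="")
```

With the writer's default quoting, quotes are only emitted where a field still needs them, so the flattened fields come out unquoted.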

  Eike


Re: pattern for substitution including linefeed and carriage return

Erhy
On Monday, 11 December 2017 17:38:43 UTC+1, Eike Rathke wrote:
> Hi Erhy,
>
> As a side note, are you aware that such multi-line field content is
> valid in CSV if it is quoted?

It seems odd to me too, but the file comes from a banking institution.
The file contains only linefeeds, and finding the next logical line is an art.
The fields that span several lines are enclosed in " to mark them as text fields.

Erhy

Re: pattern for substitution including linefeed and carriage return

C.v.St.


On 12/11/2017 at 11:06 PM, Erhy wrote:
...
>> valid in CSV if it is quoted?
>
> It seems odd to me too, but the file comes from a banking institution.

Well, then it's not 'odd' but typical. You get something like:
-------------------------------- kind of symbolically:
first some text such as a title or explanations, then the column (field) names
field;names;separated;by semicolon;and;explaining the following lines
data1;data2;data3;data4 blanks OK;OK! OK;more data up to EOL
DATA1;DATA2;;;;" and like an address with quoted multiple lines
Who
Where
City
etc.etc.etc."
or;even;"a first field
with newline";"a
second
field
with
newlines";still in line 3 of data;end of field 6 of logical line 3
---------------------------------------------------------------------
As long as you have e.g. 5 separators (here ';', so 6 fields of data)
and every newline inside a field is enclosed in double quotes,
this is correct for so-called 'Comma Separated Values'.
(Banking exports mostly use a semicolon as the delimiter,
because '.' is often used in fields.)

See more at https://tools.ietf.org/html/rfc4180

But parsing such input with Vim (i.e. with regular expressions) seems to me
overly complex (or even unworkable?), since the number of fields and newlines
is unlimited. The structure is a lot easier to handle with
'real' CSV libraries (e.g. in Perl or Python) or spreadsheet programs
(or even the simple 'csvtool' on the Linux command line).
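
As a rough illustration of the library route, the following Python sketch uses the standard csv module to empty every field that contains a line break, which is what the original question asked for. The semicolon delimiter comes from the examples in this thread; drop_multiline_fields is a hypothetical name, and the sample rows are adapted from the symbolic listing above.

```python
import csv
import io


def drop_multiline_fields(text, delimiter=";"):
    """Empty every field that contains a line break; keep all other fields."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    cleaned = [
        ["" if ("\n" in field or "\r" in field) else field for field in row]
        for row in reader
    ]
    out = io.StringIO(newline="")
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerows(cleaned)
    return out.getvalue()


data = (
    'or;even;"a first field\nwith newline";"a\nsecond\nfield\nwith\nnewlines";'
    'still in line 3 of data;end of field 6\n'
    'data1;data2;data3;data4 blanks OK;OK! OK;more data\n'
)
print(drop_multiline_fields(data), end="")
```

The parser, not a regular expression, decides where each logical record ends, so the number of fields and of embedded newlines does not matter.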

Stucki

Re: pattern for substitution including linefeed and carriage return

Erhy
Thank you, Stucki,
but I don't have Linux.

With my old MS Word 2003 I was able to delete the fields containing linefeeds.

Erhy

