Basic text statistics calculation in Vim

tjg
I have written a small function which puts "WIP statistics" at the end of the file (pure text, no code) I am working on.

It looks like this (ts = 7)

Date    NbCar  NbWords  NbSent  NbLines
130813  21910  3640     310     180
130820  30310  5210     480     220

(NB : Date in the ymd format, Nb=number, Car=Characters, Sent=Sentences (separated by .!?…) , and Lines are, of course, non-blank lines and, thus, the equivalent of book paragraphs).

This function works. But I would like to add 2 "columns" :

- one about the final output : divide the NbCar by 1500 (in France a journalistic "feuillet"/page, I do not know if there is an equivalent elsewhere) ; here it would indicate that a week ago I had written 15 feuillets (rounded upwards), and this week, 20 feuillets : a 250-page book in a year, "In search of lost time" much later, genius not included…

- one about simple readability : divide the number of words by the number of sentences.

How should I proceed ?

For the first one (output) I tried to move the cursor on the number (e.g. 30310), enter insert mode then <Ctrl-R>= followed by <Ctrl-R><Ctrl-W>/1500, but failed miserably.

As for the second one (readability), I simply cannot figure it out.

Thanks in advance

Re: Basic calculation in Vim

Tim Chase
On 2013-08-20 09:16, tjg wrote:

> Date     NbCar  NbWords NbSent NbLines
> 130813  21910  3640      310      180
> 130820  30310  5210      480      220
> This function works. But I would like to add 2 "columns" :
>
> - one about the final output : divide the NbCar by 1500 (in France a
> journalistic "feuillet"/page, I do not know if there is an
> equivalent elsewhere) ; here it would indicate that a week ago I
> had written 15 feuillets (rounded upwards), and this week, 20
> feuillets : a 250 pages book in a year, "In search of lost time"
> much later, genius not included…
>
> - one about simple readability : divide the number of words by the
> number of sentences.

You could run something like this over the range of applicable lines:

:'<,'>s@^\d\{6\}\s\+\(\d\+\)\s\+\(\d\+\)\s\+\(\d\+\)\s\+\(\d\+\)\zs.*@\="\t".(submatch(1)/1500)."\t".(submatch(2)/submatch(3))

captures \date_/    \NbCar_/    \NbWord/   \NbSent/    \NbLine/

It separates the two new columns with tab characters, but you can
tweak the "\t"s to be whatever you need.
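
For anyone who wants to see what each capture group holds, here is a minimal sketch of the same integer computation written as a function. The name AppendStats is hypothetical (it is not part of the command above), and matchlist() is used only so that the groups can be inspected one at a time.

```vim
" Hypothetical helper, not part of the original command.
" Returns the data line with two tab-separated columns appended.
function! AppendStats(line) abort
  " The six-digit date is matched but not captured.
  let pat = '^\d\{6\}\s\+\(\d\+\)\s\+\(\d\+\)\s\+\(\d\+\)\s\+\(\d\+\)'
  " m[0] = whole match, m[1] = NbCar, m[2] = NbWords,
  " m[3] = NbSent, m[4] = NbLines
  let m = matchlist(a:line, pat)
  if empty(m)
    " Header or malformed line: leave it untouched.
    return a:line
  endif
  " Integer division, as in the :s command.
  return m[0] . "\t" . (str2nr(m[1]) / 1500) . "\t" . (str2nr(m[2]) / str2nr(m[3]))
endfunction
```

It could then be applied with something like :2,$s/.*/\=AppendStats(submatch(0))/ which skips the header on line 1.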

Note that Vim only deals in integers by default, so you might either
want to multiply the numerators by a fixed amount (such as 100) for
greater precision, or convert the various parts to floats using the
"str2float()" function.
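
To illustrate the first option, here is a small sketch of the multiply-by-100 trick. FixedPoint is a hypothetical name chosen for this example; the function stays in integer arithmetic and simply places the decimal point when formatting, so the last digit is truncated rather than rounded.

```vim
" Hypothetical helper: integer-only division that keeps two decimals by
" scaling the numerator by 100 first. The result is truncated, not rounded.
function! FixedPoint(num, den) abort
  let scaled = (a:num * 100) / a:den
  return printf('%d.%02d', scaled / 100, scaled % 100)
endfunction

" Plain integer division for comparison, then the scaled versions.
echo 3640 / 310
echo FixedPoint(3640, 310)
echo FixedPoint(21910, 1500)
```

With the sample data this should print 11, then 11.74 and 14.60.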

-tim


--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

---
You received this message because you are subscribed to the Google Groups "vim_use" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.

Re: Basic calculation in Vim

tooth pik
In reply to this post by tjg
On Tue, Aug 20, 2013 at 09:16:20AM -0700, tjg wrote:
> I have written a small function which puts "WIP statistics" at the end of the
> file (pure text, no code) I am working on.

> It looks like this (ts = 7)

> Date     NbCar  NbWords NbSent NbLines
> 130813  21910  3640      310      180
> 130820  30310  5210      480      220

> (NB : Date in the ymd format, Nb=number, Car=Characters, Sent=Sentences
> (separated by .!?…) , and Lines are, of course, non-blank lines and, thus,
> the equivalent of book paragraphs).

> This function works. But I would like to add 2 "columns" :

> - one about the final output : divide the NbCar by 1500 (in France a
> journalistic "feuillet"/page, I do not know if there is an equivalent
> elsewhere) ; here it would indicate that a week ago I had written 15
> feuillets (rounded upwards), and this week, 20 feuillets : a 250 pages book
> in a year, "In search of lost time" much later, genius not included…

> - one about simple readability : divide the number of words by the number of
> sentences.

> How should I proceed ?

I am of the opinion, and I freely admit it's my own bias, that it is
redundant to store calculated fields in data.  If your feuillet page
is always NbCar / 1500 I see no reason to store it.  Simply calculate
it on the fly when you display it.
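
For a Vim-only setup, the same idea might look like the sketch below: the stored lines stay as they are and the derived figures are built only for display. StatsReport is a hypothetical name, and the layout assumes the whitespace-separated columns from the original post.

```vim
" Hypothetical helper: builds display lines with derived columns,
" leaving the stored data untouched.
function! StatsReport(lines) abort
  let report = []
  for line in a:lines
    let f = split(line)
    " Skip anything that is not a data row (header, blank lines).
    if len(f) < 4 || f[1] !~ '^\d\+$' || str2nr(f[3]) == 0
      continue
    endif
    " Integer division rounded upward: add (divisor - 1) before dividing.
    let pages = (str2nr(f[1]) + 1499) / 1500
    let wps = str2float(f[2]) / str2float(f[3])
    call add(report, printf('%s  %3d feuillets  %5.1f words/sentence', f[0], pages, wps))
  endfor
  return report
endfunction
```

Something like :echo join(StatsReport(getline(1, '$')), "\n") would then show the report without changing the buffer.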

As an example, I keep gas mileage statistics for my motorcycle.  All I
store are the date, mileage, miles on tank (my mileage odometer
doesn't have tenths, so I use trip odometer A for gasoline purchases),
number of gallons purchased, and the miles remaining in tank reading
from the bike's computer (I like to test its accuracy - I find this
highly entertaining).  Then when I display the data I use cat and pipe
the output through awk (with a lot of other bells and whistles) and
let awk calculate my mpg and actual miles remaining in tank before it
shows it all to me.

It's a simple awk because my data is reliably static, so I can show it
here:

<awk>
{ # format one fill-up record; pass any other line through unchanged
  dt = $1
  mileage = $2
  miles = $3
  gal = $4
  rem = $5
  # gal + 0 forces a numeric test, so header or empty fields never divide by zero
  if (length(miles) > 0 && gal + 0 > 0) {
      mpg = miles / gal
      # the constant 6 is presumably the tank capacity in gallons
      arem = (6 - gal) * mpg
      if (length(rem) > 0) {
          printf "%10s %8s %7s %7s %6.1f %5.0f  %5d  %6.1f\n", dt, mileage, miles, gal, mpg, 6.0 * mpg, rem, arem
      } else {
          printf "%10s %8s %7s %7s %6.1f %5.0f\n", dt, mileage, miles, gal, mpg, 6.0 * mpg
      }
  } else {
      print
  }
}
</awk>

with apologies for mailer wrap.

You could easily modify this to your purpose...

--
_|_ _  __|_|_ ._ o|  
 |_(_)(_)|_| ||_)||<
              |      

Re: Basic calculation in Vim

tjg
In reply to this post by Tim Chase
@TimChase : thanks for your answer, but I run into an error E20 : Mark not set. I must have made a mistake (I put the cursor on the second data line and ran your command : was I supposed to do that, or must I insert something like :2,3'<,'>s etc...?).

About the floating point, I will be very satisfied with "crude" i.e. rounded results.

Re: Basic calculation in Vim

tjg
In reply to this post by tooth pik
@ToothPick : thanks for your answer, but I do not use awk, because I confess I do not have it on my device (Android, with only VimTouch, which can be quite efficient if you take into account that you can dictate your text and dictate it anywhere).

Nevertheless, thank you very much for your considerate answer.

Re: Basic calculation in Vim

Ben Fritz
In reply to this post by tjg
On Tuesday, August 20, 2013 11:16:20 AM UTC-5, tjg wrote:

> - one about the final output : divide the NbCar by 1500 (in France a
> journalistic "feuillet"/page, I do not know if there is an equivalent
> elsewhere) ; here it would indicate that a week ago I had written 15
> feuillets (rounded upwards), and this week, 20 feuillets : a 250 pages book
> in a year, "In search of lost time" much later, genius not included…
>
> - one about simple readability : divide the number of words by the number of
> sentences.
>
> How should I proceed ?
>
> For the first one (output) I tried to move the cursor on the number (e.g.
> 30310), enter insert mode then <Ctrl-R>= followed by <Ctrl-R><Ctrl-W>/1500,
> but failed miserably.

Once you enter insert mode the cursor may not be on the number anymore. I'd yank it to a register first, then paste that. This has the added benefit that you can paste it anywhere at all, not with the cursor over other text you might want to keep.

For example, place the cursor on the 30310 somewhere, do "cyiw to copy the number to the 'c' register/clipboard. Then enter insert mode wherever you like (doesn't need to be around the same number anywhere) and do <C-R>c.0/1500.

Note the ".0" I appended to the register contents. This forces Vim to do floating-point math instead of integer math, which simply discards any fractional part of the result.

Explanation:

"c just specifies that you want to use the 'c' register for the next operation
y means do a "yank" operation
iw specifies *what* to yank, in this case a single word, without any surrounding whitespace.

<C-R>c (i.e. hold down CTRL, press R, let go of CTRL, press C) inserts the contents of the 'c' register

>
> As for the second one (readability), I simply cannot figure it out.
>

Well, you already have number of words, and number of sentences. Repeat the process above using two different registers. E.g. yank the words into 'w' with "wyiw and yank the sentences into 's' with "syiw. Then use <C-R>w and <C-R>s to get the corresponding numbers into the expression register.

Re: Basic calculation in Vim

Tim Chase
In reply to this post by tjg
On 2013-08-20 12:19, tjg wrote:
> @TimChase : thanks for your answer, but I run into an error E20 :
> Mark not set. I must have made a mistake (I put the cursor on the
> second data line and ran your command : was I supposed to do that,
> or must I insert something like :2,3'<,'>s etc...?).

I'd presumed you were visually selecting the range of lines you
wanted to perform the calculations, so I used the range

  :'<,'>

which means "from the first line of the visual selection, through the
last line of the visual selection".  So common in fact, that when
you're in visual mode and hit the colon, it auto-populates that range
on the assumption that's what you want to operate on.

So you can specify any range you want.  If your example text is the
sole content of the file, you could do

  :2,$

to indicate that you want to operate from line#2 through the end of
the file (skipping line #1 which had the headers; if you included
line#1, it would just throw up peculiar warnings about the bad values
in the math for that line).  If you don't have headers and your
entire file consists of your data, you can just use the short-hand
range

  :%

to indicate the whole file.

There's a lot more power to Vim's ranges, which you can read about at

  :help :range
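
To experiment with ranges without changing any text, a sketch like the following may help. The function and command names (CountDataLines, CountData) are hypothetical; the command only reports how many lines in the given range look like data rows, that is, lines starting with a six-digit date.

```vim
" Hypothetical helper: counts lines in [first, last] that start with a
" six-digit date followed by whitespace.
function! CountDataLines(first, last) abort
  return len(filter(getline(a:first, a:last), 'v:val =~ ''^\d\{6\}\s'''))
endfunction

" -range=% makes the whole file the default when no range is typed.
command! -range=% CountData echo CountDataLines(<line1>, <line2>)
```

Trying :2,$CountData, :%CountData and, after a visual selection, :'<,'>CountData shows which lines each range covers before running the real substitution.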

-tim


Re: Basic calculation in Vim

tjg
@TimChase : it works perfectly, of course (I use visual mode very seldom, as you can guess...).
Thank you very much.
I am making a nuisance of myself, but would it be possible - when you have time - to explain your regex? For (my) education's sake!

Re: Basic calculation in Vim

tjg
In reply to this post by Ben Fritz
@BenFritz : thank you for your answer, but I face a problem : I proceeded as you told me, but ended - in insert mode - with 30310.0/1500 ... Sorry, but I must have misunderstood part of the process.

Re: Basic calculation in Vim

Christian Brabandt
In reply to this post by tjg
On Tue, 20 Aug 2013, tjg wrote:

> I have written a small function which puts "WIP statistics" at the end of the
> file (pure text, no code) I am working on.
>
> It looks like this (ts = 7)
>
> Date     NbCar  NbWords NbSent NbLines
> 130813  21910  3640      310      180
> 130820  30310  5210      480      220
>
> (NB : Date in the ymd format, Nb=number, Car=Characters, Sent=Sentences
> (separated by .!?…) , and Lines are, of course, non-blank lines and, thus,
> the equivalent of book paragraphs).
>
> This function works. But I would like to add 2 "columns" :
>
> - one about the final output : divide the NbCar by 1500 (in France a
> journalistic "feuillet"/page, I do not know if there is an equivalent
> elsewhere) ; here it would indicate that a week ago I had written 15
> feuillets (rounded upwards), and this week, 20 feuillets : a 250 pages book
> in a year, "In search of lost time" much later, genius not included…
>
> - one about simple readability : divide the number of words by the number of
> sentences.
>
> How should I proceed ?
>
> For the first one (output) I tried to move the cursor on the number (e.g.
> 30310), enter insert mode then <Ctrl-R>= followed by <Ctrl-R><Ctrl-W>/1500,
> but failed miserably.
>
> As for the second one (readability), I simply cannot figure it out.
>
> Thanks in advance

I just committed an update to the csv filetype plugin¹

Now you can do this:

1) [Append a column that is the result of column 2/1500]
:2,$s#$#\=printf("%.2f", (CSVField(2,line('.'))+0.0)/1500)#

This is one single command and it uses some expression evaluation of the
:s command to perform the calculation
(Note the use of # instead of the usual '/' delimiter, since we need
the / to divide the numbers. Note also that we need to leave out the 1st row,
as it does not contain numbers.)

2) [Append a column that is the result of column 3/column 4]
:2,$s#$#\=printf("%.2f%s",
(CSVField(3,line('.'))+0.0)/(CSVField(4,line('.'))+0.0), b:delimiter)#

Again, entered as one line. Note the usage of the buffer-local variable
"b:delimiter", which holds the actual delimiter (in your case a
"\t" tab character).
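
For setups where installing a plugin is not an option, a plugin-free sketch of the first command might look like this. It assumes whitespace-separated columns and uses only built-in functions; FeuilletColumn and AppendFeuillets are hypothetical names, not part of csv.vim.

```vim
" Hypothetical helper: returns a tab followed by NbCar / 1500 with two
" decimals, taking NbCar from the second whitespace-separated field.
function! FeuilletColumn(line) abort
  return "\t" . printf('%.2f', str2float(split(a:line)[1]) / 1500)
endfunction

" Hypothetical command: appends that column to every line in the range.
command! -range AppendFeuillets <line1>,<line2>s#$#\=FeuilletColumn(getline('.'))#
```

It would be run as :2,$AppendFeuillets so that the header row is skipped.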


See some more basics at
:h sub-replace-expression

And of course, if you have the csv filetype plugin installed:
:h ft-csv.txt
:h csv-calculate-column

Hope this helps a little. If you have questions feel free to mail me.

¹)https://github.com/chrisbra/csv.vim

regards,
Christian
--
You who are still young, listen to an old man to whom the old
listened when he was still young!
                -- Emperor Augustus, in Plutarch

Re: Basic calculation in Vim

Ben Fritz
In reply to this post by tjg
On Tuesday, August 20, 2013 3:21:33 PM UTC-5, tjg wrote:
> @BenFritz : thank you for your answer, but I face a problem : I proceeded as
> you told me, but ended - in insert mode - with 30310.0/1500 ... Sorry, but I
> must have misunderstood part of the process.

Nope, I accidentally omitted a <C-R>= which you had in there before. My fault.

Re: Basic calculation in Vim

Mikołaj Machowski
In reply to this post by Ben Fritz
On Tuesday, 20 August 2013 18:16, tjg <[hidden email]> wrote:

> I have written a small function which puts "WIP statistics" at the end of the
> file (pure text, no code) I am working on.
>
> It looks like this (ts = 7)
>
> Date     NbCar  NbWords NbSent NbLines
> 130813  21910  3640      310      180
> 130820  30310  5210      480      220
>
> (NB : Date in the ymd format, Nb=number, Car=Characters, Sent=Sentences
> (separated by .!?…) , and Lines are, of course, non-blank lines and, thus,
> the equivalent of book paragraphs).
>
> This function works. But I would like to add 2 "columns" :
>
> - one about the final output : divide the NbCar by 1500 (in France a
> journalistic "feuillet"/page, I do not know if there is an equivalent
> elsewhere) ; here it would indicate that a week ago I had written 15
> feuillets (rounded upwards), and this week, 20 feuillets : a 250 pages book
> in a year, "In search of lost time" much later, genius not included…
>

What about a more programmatic approach using VimL? You would need to wrap it in a whole function for that, though. Just to get the numbers (I'm not sure where you want to insert them):

Navigate to the line with your week's date and run:

:echo ceil(split(getline('.'), '\s\+')[1]/1500.0)
15.0

for 'feuillets' (in Poland it is called 'standardowy maszynopis' and has 1800 characters BTW)

> - one about simple readability : divide the number of words by the number of
> sentences.

:let a=split(getline('.'), '\s\+') | echo a[2]/a[3]
11

Note: here you get a whole number, because Vim's integer division always rounds down.

getline('.') - read the current line
split(getline('.'), '\s\+') - split the current line on whitespace into a list of fields
ceil() - round upward
a[2], a[3] - the NbWords and NbSent fields
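
As a rough sketch of the "whole function" mentioned above, the two expressions could be wrapped like this. Feuillets and WordsPerSentence are hypothetical names, and str2nr() is added only to make the string-to-number conversion explicit.

```vim
" Hypothetical wrappers around the two expressions above.
function! Feuillets(line) abort
  " Divide NbCar by 1500.0 (a Float) and round upward to a whole page count.
  return float2nr(ceil(str2nr(split(a:line, '\s\+')[1]) / 1500.0))
endfunction

function! WordsPerSentence(line) abort
  let a = split(a:line, '\s\+')
  " Integer division: the fractional part is discarded.
  return str2nr(a[2]) / str2nr(a[3])
endfunction
```

On the 130813 line, :echo Feuillets(getline('.')) should give 15 and :echo WordsPerSentence(getline('.')) should give 11, matching the results shown above.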

m.


Re: Basic calculation in Vim

tjg
@Bugzilla : Tried your 2 answers, both worked superbly. Thank you very much (this gave me - as it should - ideas for other uses in other files !)

Re: Basic calculation in Vim

tjg
In reply to this post by Ben Fritz
@BenFritz : tried again, failed again (got the answer c : undefined variable). Here is what I did :

a) (normal mode) : "cyiw (on 30310)
b) (insert mode) : <C-R>=c.0/1500

I am sure the problem is on my side, but where ?

Thanks again

Re: Basic calculation in Vim

tjg
In reply to this post by Christian Brabandt
@ChristianBrabandt : thank you for your answer, but I am limited (Android device) to Vim (VimTouch) only with no plugin (I suppose it is possible to install your plugin, but I freely admit I do not know how to do that).

Thank you very much, anyway

Re: Basic calculation in Vim

Ben Fritz
In reply to this post by tjg
On Wednesday, August 21, 2013 4:17:47 AM UTC-5, tjg wrote:

> @BenFritz : tried again, failed again (got the answer c : undefined
> variable). Here is what I did :
>
> a) (normal mode) : "cyiw (on 30310)
> b) (insert mode) : <C-R>=c.0/1500
>
> I am sure the problem is on my side, but where ?

Now you omitted the <C-R> to get the 'c' register.

Your insert-mode should be:

  <C-R>=<C-R>c.0/1500<Enter>

<C-R> insert a register
= choose the "expression register" which drops you into a special mode to enter the expression to evaluate
<C-R> insert a register in the special expression prompt mode
c insert contents of register 'c'
.0/1500 insert this text literally
<Enter> evaluate the resulting expression and put it into the buffer where you are inserting

<C-R>= is actually just a special case of <C-R> with a register. See :help @=, :help i_CTRL-R_=, :help registers.
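
For reference, the keystrokes above can be approximated in script form. This is only a sketch: EvalRegister is a hypothetical name, and setreg() stands in for the "cyiw yank, so it overwrites whatever register c held.

```vim
" Hypothetical helper: evaluates the text in a register followed by a suffix,
" which is roughly what <C-R>=<C-R>c.0/1500<Enter> does interactively.
function! EvalRegister(reg, suffix) abort
  return eval(getreg(a:reg) . a:suffix)
endfunction

" Stand-in for yanking 30310 into register c (this overwrites register c).
call setreg('c', '30310')
echo printf('%.2f', EvalRegister('c', '.0/1500'))
```

This should print 20.21; printf() is used only to fix the number of decimals shown.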

Re: Basic calculation in Vim

tjg
Thanks to your patience, I finally succeeded... And your explanations are very useful.

Thank you very much.